Re: [GSLL-devel] [Gsll-devel] Efficient access to externally generated double-float arrays?
OK, I've now committed what I think is my last attempt at this. This latest version includes optimization of matrices and of setting values in foreign arrays. The latter comes with a caveat, however. The compiler will macroexpand setf, and for SBCL at least, that means making temporary variables to bind the actual values. Those variables of course don't have any declarations, so no compiler macro expansion is done. The way around this is to funcall #'(setf gref*) instead of using the setf macro, something that I bet most people won't want to do. If I get some more energy to pursue this, I could define a setf macro to shadow cl:setf, but I think I'll let this be for now.

I've written a "timing test" function loosely based on yours. It is in foreign-array/tests/timing.lisp. I hope in the next few days to run a few of these tests to see the benefit of the optimizations that have been put in place. You can see I have commented out the funcall #'(setf grid:gref*) form at the end; in this example there are so few sets that the speed difference is undetectable.

Liam

___
GSLL-devel mailing list
GSLL-devel@common-lisp.net
http://common-lisp.net/cgi-bin/mailman/listinfo/gsll-devel
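For readers unfamiliar with calling a setf function directly, here is a minimal sketch in plain Common Lisp. It does not use GSLL or CFFI; toy-array and cell-ref are hypothetical stand-ins for a foreign array and grid:gref*. The only point is the calling convention: a function named (setf name) takes the new value as its first argument.

```lisp
;; Hypothetical stand-in for a foreign array: a struct wrapping a Lisp vector.
(defstruct toy-array
  (data (make-array 4 :element-type 'double-float :initial-element 0d0)
        :type (simple-array double-float (*))))

(defun cell-ref (array index)
  "Read element INDEX of ARRAY."
  (aref (toy-array-data array) index))

(defun (setf cell-ref) (new-value array index)
  "Write NEW-VALUE into element INDEX of ARRAY; the new value comes first."
  (setf (aref (toy-array-data array) index) new-value))

(defparameter *a* (make-toy-array))

;; Usual form: goes through the SETF macro and its expansion.
(setf (cell-ref *a* 0) 1.5d0)

;; Direct call of the setf function: an ordinary function call form,
;; so the argument forms appear in the call exactly as written.
(funcall #'(setf cell-ref) 2.5d0 *a* 1)
```

Both forms store the value; the second is the shape that the message above says a compiler macro can still recognize.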
Re: [GSLL-devel] [Gsll-devel] Efficient access to externally generated double-float arrays?
On Sun, Nov 14, 2010 at 10:24 AM, Sebastian Sturm wrote:
> My apologies for taking so long to answer. I have verified that the new gref*
> can now be used without consing and comes relatively close to the
> inconvenient cffi:mem-aref solution in terms of computation time. Many thanks
> for that! I also tried the functions on matrices, which still results in some
> consing, but this doesn't seem to be time-critical here. In my experiments,
> most of the consing could be eliminated by storing the linearized index in an
> auxiliary variable, but this even resulted in a slight runtime increase.
> For my original application, the GSLL solution is now comparable in speed to
> Mathematica for small dimensionality (dim < about 50); for a fair comparison
> of both solutions for larger dim, I'll have to rework my very naive
> computation of the Hessian matrix.
>
> Judging from the git commits, you have also changed something related to
> complex-valued arrays. Did you implement similar optimizations as in the
> real-valued case? I'll try that once I have repaired my Hessian.
>
> Best regards,
> Sebastian

The work is still ongoing. I don't know when you last pulled, but I've made a few commits in the last day or so, and I'm working on some more improvements. My previous email was incorrect in stating that the 'declare form doesn't work; it does, at least in newer versions of SBCL. Also, I've expanded use to CCL because that supports variable-information as well (this hasn't been committed yet). The other recent change is that inputs and outputs of gref* and (setf gref*) are now wrapped with the appropriate 'the forms to help the compiler optimize.

So far these mostly apply only to vectors (one-dimensional arrays), but I am working on compile-time linearization of indices for higher-dimensional arrays, so that at compile time the forms will also expand directly into mem-aref calls; that should help with your matrices if they are declared with their dimensions. I'm working on this now. Everything should apply to all foreign array types, including complex.

Liam
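As an illustration of what compile-time linearization of indices can look like, here is a minimal sketch in plain Common Lisp. It is not the GSLL implementation; linear-index and row-major-index-form are hypothetical names, and row-major storage with literal dimensions is assumed.

```lisp
(eval-when (:compile-toplevel :load-toplevel :execute)
  (defun row-major-index-form (dimensions indices)
    "Return a form computing the row-major index of INDICES for an array
whose DIMENSIONS are known at macroexpansion time (Horner's scheme)."
    (let ((form (first indices)))
      (loop for dim in (rest dimensions)
            for index in (rest indices)
            do (setf form `(+ (* ,form ,dim) ,index)))
      form)))

(defmacro linear-index ((&rest dimensions) &rest indices)
  "Expand into arithmetic on INDICES only; DIMENSIONS must be literal integers."
  (assert (= (length dimensions) (length indices)))
  (row-major-index-form dimensions indices))

;; The dimensions disappear from the run-time code; for a 3x4 matrix the
;; call (linear-index (3 4) i j) expands into (+ (* i 4) j).
(defun element-offset-3x4 (i j)
  (linear-index (3 4) i j))
```

With the dimensions folded in at expansion time, a two-index access needs one multiply and one add at run time and no generic arithmetic on a dimension list.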
Re: [GSLL-devel] [Gsll-devel] Efficient access to externally generated double-float arrays?
My apologies for taking so long to answer. I have verified that the new gref* can now be used without consing and comes relatively close to the inconvenient cffi:mem-aref solution in terms of computation time. Many thanks for that! I also tried the functions on matrices, which still results in some consing, but this doesn't seem to be time-critical here. In my experiments, most of the consing could be eliminated by storing the linearized index in an auxiliary variable, but this even resulted in a slight runtime increase.

For my original application, the GSLL solution is now comparable in speed to Mathematica for small dimensionality (dim < about 50); for a fair comparison of both solutions for larger dim, I'll have to rework my very naive computation of the Hessian matrix.

Judging from the git commits, you have also changed something related to complex-valued arrays. Did you implement similar optimizations as in the real-valued case? I'll try that once I have repaired my Hessian.

Best regards,
Sebastian
Re: [GSLL-devel] [Gsll-devel] Efficient access to externally generated double-float arrays?
OK, this turned out to be a lot harder than I thought. I have done three things:

1) I have defined gref* and (setf gref*) methods specific to each of the foreign-array types that call cffi:mem-aref. This gives about 20x speedup because I am now passing the literal type to cffi:mem-aref, so it can work that fact in at compile time.

2) There is now a compiler macro to turn a grid:gref into a grid:gref* if there's only one index. This gives about 2x speedup when gref is used.

3) There is another compiler macro that turns a grid:gref* into a cffi:mem-aref directly if the foreign array is declared. This gives about a 400x speedup overall, similar to your "hardwired" result. It is a bit slower because I'm not able to precompute the pointer; it has to be recomputed each time on the gref* call.

On the last point, there is a caveat. I tried to make it work when the foreign array has been declared with a standard (declare ...) form. This has a chance of working on SBCL because of its support for the CLtL2 function variable-information, which was removed from CL before it was sent to ANSI standardization. However, it did not work for me; I will continue to try to get this working. In the meantime, the only way to do a declaration to take advantage of 3) is with a 'the form, e.g. (grid:gref (the vector-double-float zvector) i). This is kind of annoying, but it is portable, and allows you to avoid going to lower level functions (i.e., cffi:mem-aref).

So for example see my rewrite of your function (in foreign-array/tests/fast-array-access.lisp)

(defun gref-access (dim)
  "Given an integer dim, this constructs a function that, when supplied with a
N-dimensional vector Z and some output vector (-> pointer?), yields the
corresponding forces"
  (let ((temp-values (make-array 2 :element-type 'double-float
                                   :initial-element 0.0d0)))
    (lambda (zvector output)
      (declare (fixnum dim) (optimize (speed 3) (safety 0) (debug 0))
               (type vector-double-float zvector)) ;;; <--- this is useless, but ought not to be!
      (do ((i 0 (1+ i))) ((= i dim))
        (declare (fixnum i))
        (setf (aref temp-values 0) 0.0d0)
        (do ((m 0 (1+ m))) ((> m i))
          (declare (fixnum m))
          (do ((n i (1+ n))) ((= n dim))
            (declare (fixnum n))
            (setf (aref temp-values 1) 0.0d0)
            (do ((k m (1+ k))) ((> k n))
              (declare (fixnum k))
              (incf (aref temp-values 1)
                    (grid:gref (the vector-double-float zvector) k))) ; This declaration does the work!
            (incf (aref temp-values 0) (expt (aref temp-values 1) -2))))
        (setf (grid:gref output i)
              (- (grid:gref (the vector-double-float zvector) i) ; This one does too!
                 (aref temp-values 0)))))))

is now (almost) as fast as your cffi-access. There is still a bunch of stuff to be done --- the optimizations only work for vectors, not higher dimensional arrays, and I haven't defined a compiler macro for setf yet on 3). But it's a start; try it in your problem and let me know how it performs.

Liam
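The mechanism behind point 3), a compiler macro that inspects a 'the form around its argument, can be sketched in plain Common Lisp as follows. This is not the GSLL code: toy-gref is a hypothetical generic accessor standing in for grid:gref*, and a Lisp double-float vector with aref stands in for the foreign array and cffi:mem-aref.

```lisp
(defgeneric toy-gref (object index)
  (:documentation "Generic element access; a stand-in for grid:gref*."))

(defmethod toy-gref ((object vector) index)
  (aref object index))

(define-compiler-macro toy-gref (&whole form object index)
  "Rewrite (toy-gref (the (simple-array double-float (*)) x) i) into AREF."
  (if (and (consp object)
           (eq (first object) 'the)
           (equal (second object) '(simple-array double-float (*))))
      ;; The type is visible in the source form: emit the direct access.
      `(aref ,object ,index)
      ;; No usable type information: decline by returning the original form.
      form))

(defun sum-vector (z)
  "Sum the elements of Z through the annotated accessor call."
  (let ((sum 0d0))
    (dotimes (i (length z) sum)
      (incf sum (toy-gref (the (simple-array double-float (*)) z) i)))))
```

The compiler macro only sees source forms, which is why the annotation has to sit textually at the call site; a compiler is also free to ignore compiler macros, so the generic method must remain a correct fallback.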
Re: [GSLL-devel] [Gsll-devel] Efficient access to externally generated double-float arrays?
Liam,

I'm sorry, I now see a difference between the original gref* and your modified version of it. I guess I forgot to reload the Lisp image last time, so both timings had been obtained using the new gref*. There is still consing, but it's already reduced by a factor of three. It can be reduced by a further ~60% by using (the double-float (grid:gref* ...)) instead of just gref*; I assume this is due to the float->pointer coercion done to gref*'s "<return value>" (?)

Here's the new timing data (dim = 15):

"gref"
Evaluation took:
  0.063 seconds of real time
  0.056171 seconds of total run time (0.044926 user, 0.011245 system)
  [ Run times consist of 0.006 seconds GC time, and 0.051 seconds non-GC time. ]
  88.89% CPU
  28 lambdas converted
  137,624,278 processor cycles
  4,032,960 bytes consed

"gref*"
Evaluation took:
  0.012 seconds of real time
  0.011932 seconds of total run time (0.011929 user, 0.000003 system)
  100.00% CPU
  26,178,823 processor cycles
  785,360 bytes consed

"modified gref*"
Evaluation took:
  0.001 seconds of real time
  0.001202 seconds of total run time (0.000884 user, 0.000318 system)
  100.00% CPU
  3,126,090 processor cycles
  278,512 bytes consed

"hardwired cffi:mem-aref"
Evaluation took:
  0.000 seconds of real time
  0.000033 seconds of total run time (0.000032 user, 0.000001 system)
  100.00% CPU
  66,462 processor cycles
  0 bytes consed

I have attached the lisp file used to obtain these timings.

thanks,
Sebastian

fast-array-access.lisp
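To put the four runs above on one scale, the processor-cycle counts can be turned into ratios relative to the hardwired cffi:mem-aref run. The snippet below is only arithmetic on the numbers quoted in the message; *cycles* and slowdown-vs-hardwired are hypothetical names introduced for this illustration.

```lisp
(defparameter *cycles*
  '(("gref" . 137624278)
    ("gref*" . 26178823)
    ("modified gref*" . 3126090)
    ("hardwired cffi:mem-aref" . 66462))
  "Processor cycles reported in the message above (dim = 15).")

(defun slowdown-vs-hardwired (name)
  "Exact ratio of NAME's cycle count to that of the hardwired run."
  (/ (cdr (assoc name *cycles* :test #'string=))
     (cdr (assoc "hardwired cffi:mem-aref" *cycles* :test #'string=))))

;; Print one line per variant with its slowdown factor.
(dolist (entry *cycles*)
  (format t "~25a ~10,1f x~%"
          (car entry)
          (float (slowdown-vs-hardwired (car entry)))))
```

A single run at dim = 15 is short enough that these ratios should be read as rough orders of magnitude rather than precise measurements.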
Re: [GSLL-devel] [Gsll-devel] Efficient access to externally generated double-float arrays?
Sebastian,

Trying to walk code correctly is a nasty business, and generally non-portable, so I'd stay away from it. We could do this with a macro/macrolet pair but I'd like to optimize via the CLOS route if we can.

Liam

On Thu, Oct 28, 2010 at 1:01 PM, Sebastian Sturm wrote:
> Here is something similar to what I suggested, although still very
> unfinished. It's probably not the proper way to write a macro, but I guess
> other people on the list will know how to do it correctly. Efficient
> linearization can be added in the same manner; also, the macro should be
> modified s.t. it accepts several array specifications at once, i.e.
> with-foreign-array ((array-1 :double) (array-2 :int) (array-3 :double) ...),
> etc.
> Of course, if you can figure out a way to incorporate the optimizations into
> gref without such a clumsy workaround, I'd be all for it.
> best regards,
> Sebastian
>
> (defun mapcons (fn x)
>   (if (atom x)
>       x
>       (funcall fn (let ((a (mapcons fn (car x)))
>                         (d (mapcons fn (cdr x))))
>                     (if (and (eql a (car x)) (eql d (cdr x)))
>                         x
>                         (cons a d))))))
>
> (defmacro with-fast-access-to-single-foreign-array ((array element-type)
>                                                     &body body)
>   (alexandria:with-unique-names (array-fptr)
>     `(let ((,array-fptr (grid::foreign-pointer ,array)))
>        ,@(mapcons
>           (lambda (expr)
>             (if (and (consp expr)
>                      (eq (first expr) 'grid:gref*)
>                      (eq (second expr) array))
>                 (list 'cffi:mem-aref array-fptr element-type (elt expr 2))
>                 expr))
>           body))))
>
> (defun macro-force-function (dim)
>   "Given an integer dim, this constructs a function that, when supplied with a
> N-dimensional vector Z and some output vector (-> pointer?), yields the
> corresponding forces"
>   (declare (fixnum dim))
>   (let ((temp-values (make-array 2 :element-type 'double-float
>                                    :initial-element 0.0d0)))
>     (lambda (zvector output)
>       (with-fast-access-to-single-foreign-array (output :double)
>         (with-fast-access-to-single-foreign-array (zvector :double)
>           (do ((i 0 (1+ i))) ((= i dim))
>             (declare (fixnum i))
>             (setf (aref temp-values 0) 0.0d0)
>             (do ((m 0 (1+ m))) ((> m i))
>               (declare (fixnum m))
>               (do ((n i (1+ n))) ((= n dim))
>                 (declare (fixnum n))
>                 (setf (aref temp-values 1) 0.0d0)
>                 (do ((k m (1+ k))) ((> k n))
>                   (declare (fixnum k))
>                   (incf (aref temp-values 1) (grid:gref* zvector k)))
>                 (incf (aref temp-values 0) (expt (aref temp-values 1) -2))))
>             (setf (grid:gref* output i)
>                   (- (grid:gref* zvector i)
>                      (aref temp-values 0)))))))))
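The macro/macrolet alternative mentioned in the reply above avoids walking the body: the outer macro binds the data once and installs a local accessor macro, and the Lisp macroexpander then finds the accessor calls itself. The following is a minimal sketch in plain Common Lisp, with a double-float vector standing in for a foreign array; with-fast-vector-access is a hypothetical name, not part of GSLL.

```lisp
(defmacro with-fast-vector-access ((accessor vector) &body body)
  "Bind ACCESSOR as a local macro that reads VECTOR with a typed AREF."
  (let ((vec (gensym "VEC")))
    `(let ((,vec ,vector))                       ; evaluate VECTOR only once
       (declare (type (simple-array double-float (*)) ,vec))
       (macrolet ((,accessor (index) `(aref ,',vec ,index)))
         ,@body))))

(defun sum-of-squares (z)
  "Sum of squares of the elements of the double-float vector Z."
  (with-fast-vector-access (zref z)
    (let ((sum 0d0))
      (declare (double-float sum))
      (dotimes (i (length z) sum)
        (incf sum (* (zref i) (zref i)))))))

(defun scale-in-place (z factor)
  "Multiply every element of Z by FACTOR; SETF works through the local macro."
  (with-fast-vector-access (zref z)
    (dotimes (i (length z) z)
      (setf (zref i) (* factor (zref i))))))
```

Because the accessor is a real macro, quoted data and nested scopes in the body are handled by the implementation's own expander, which sidesteps the portability concern raised about code walking.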
Re: [GSLL-devel] [Gsll-devel] Efficient access to externally generated double-float arrays?
Sebastian,

I'm very puzzled by this. You should now have the type :double hard-wired in the CFFI call. Can you run your big test with and without this function defined, instead of just the grid:gref* test? If it still shows no improvement, that says that the problem is not in the binding of type at runtime in the cffi:mem-aref call. Or if you'd like, send me the test (I know you posted pieces before but just to be sure, send the whole file again) and I will try it.

Liam

On Thu, Oct 28, 2010 at 10:51 AM, Sebastian Sturm wrote:
> Liam,
>
> I have tried this and (without using (the double-float (...)) since this is
> unnecessary for cffi:mem-aref with hardcoded :double data type), I get the
> following timing data:
>
> using the standard grid:gref routine:
> Evaluation took:
>   0.542 seconds of real time
>   0.536671 seconds of total run time (0.525229 user, 0.011442 system)
>   [ Run times consist of 0.037 seconds GC time, and 0.500 seconds non-GC time. ]
>   99.08% CPU
>   1,189,729,211 processor cycles
>   153,334,416 bytes consed
>
> using the standard grid:gref* routine:
> Evaluation took:
>   0.067 seconds of real time
>   0.066006 seconds of total run time (0.063179 user, 0.002827 system)
>   [ Run times consist of 0.008 seconds GC time, and 0.059 seconds non-GC time. ]
>   98.51% CPU
>   146,192,222 processor cycles
>   27,060,416 bytes consed
>
> using the modified grid:gref* routine (specialized to
> grid:vector-double-float)
> Evaluation took:
>   0.070 seconds of real time
>   0.068041 seconds of total run time (0.065151 user, 0.002890 system)
>   [ Run times consist of 0.011 seconds GC time, and 0.058 seconds non-GC time. ]
>   97.14% CPU
>   152,642,611 processor cycles
>   27,061,456 bytes consed
>
> Apparently my system didn't notice the difference. Also, SBCL complains about
> the argument (grid::foreign-pointer object) to cffi:mem-aref being of type
> NUMBER instead of integer or fixnum. Furthermore, it says it has to do float
> to pointer coercion to "<return value>". I checked that the redefined method is
> actually used by doing a second run with a print statement added to the
> defmethod.
>
> using cffi:mem-aref directly:
> Evaluation took:
>   0.002 seconds of real time
>   0.002483 seconds of total run time (0.002482 user, 0.000001 system)
>   100.00% CPU
>   5,442,756 processor cycles
>   0 bytes consed
>
> I guess that even using compiler macros or other trickery one would have to
> remove the allocation of linearized indices and foreign ptr addresses from
> the inner loops as I have done in my example by using auxiliary variables
> zvector-fptr and output-fptr. Maybe one can define something like
> (with-foreign-array (name-of-array :double) ...) that locally redefines
> (grid:gref name-of-array ...) and (grid:gref* name-of-array ...) as macros
> evaluating to cffi:mem-aref and storing the respective linearized indices and
> memory pointers at the level of with-foreign-array? Although not as
> convenient as some 'self-optimizing' grid:gref, I would consider this a
> satisfactory solution. Don't know how to do that without getting lost in a
> forest of commas and backquotes, though.
>
> best regards,
> Sebastian
>
> On 27.10.2010, at 05:22, Liam Healy wrote:
>
>> Sebastian,
>>
>> Can you temporarily define this and find the timing/consing for your
>> test case:
>>
>> (defmethod gref* ((object vector-double-float) linearized-index)
>>   (cffi:mem-aref
>>    (foreign-pointer object)
>>    :double
>>    linearized-index))
>>
>> (I think you don't use any matrices but if you do, define an analogous
>> function for matrix-double-float.)
>>
>> As you can see, it has the literal type declaration, and I'm hopeful
>> that CFFI will pick that up and make this competitive in speed with
>> the best that you saw. If that's so, it should be fairly easy for me
>> to make this generic and incorporate it into GSD. I'm still
>> interested in making the linearization more efficient if that's still
>> significant, but let's try this for now to see how much speed we can
>> squeeze out of gref*.
>>
>> Thanks,
>>
>> Liam
Re: [GSLL-devel] [Gsll-devel] Efficient access to externally generated double-float arrays?
Here is something similar to what I suggested, although still very unfinished. It's probably not the proper way to write a macro, but I guess other people on the list will know how to do it correctly. Efficient linearization can be added in the same manner; also, the macro should be modified s.t. it accepts several array specifications at once, i.e. with-foreign-array ((array-1 :double) (array-2 :int) (array-3 :double) ...), etc.

Of course, if you can figure out a way to incorporate the optimizations into gref without such a clumsy workaround, I'd be all for it.

best regards,
Sebastian

(defun mapcons (fn x)
  (if (atom x)
      x
      (funcall fn (let ((a (mapcons fn (car x)))
                        (d (mapcons fn (cdr x))))
                    (if (and (eql a (car x)) (eql d (cdr x)))
                        x
                        (cons a d))))))

(defmacro with-fast-access-to-single-foreign-array ((array element-type)
                                                    &body body)
  (alexandria:with-unique-names (array-fptr)
    `(let ((,array-fptr (grid::foreign-pointer ,array)))
       ,@(mapcons
          (lambda (expr)
            (if (and (consp expr)
                     (eq (first expr) 'grid:gref*)
                     (eq (second expr) array))
                (list 'cffi:mem-aref array-fptr element-type (elt expr 2))
                expr))
          body))))

(defun macro-force-function (dim)
  "Given an integer dim, this constructs a function that, when supplied with a
N-dimensional vector Z and some output vector (-> pointer?), yields the
corresponding forces"
  (declare (fixnum dim))
  (let ((temp-values (make-array 2 :element-type 'double-float
                                   :initial-element 0.0d0)))
    (lambda (zvector output)
      (with-fast-access-to-single-foreign-array (output :double)
        (with-fast-access-to-single-foreign-array (zvector :double)
          (do ((i 0 (1+ i))) ((= i dim))
            (declare (fixnum i))
            (setf (aref temp-values 0) 0.0d0)
            (do ((m 0 (1+ m))) ((> m i))
              (declare (fixnum m))
              (do ((n i (1+ n))) ((= n dim))
                (declare (fixnum n))
                (setf (aref temp-values 1) 0.0d0)
                (do ((k m (1+ k))) ((> k n))
                  (declare (fixnum k))
                  (incf (aref temp-values 1) (grid:gref* zvector k)))
                (incf (aref temp-values 0) (expt (aref temp-values 1) -2))))
            (setf (grid:gref* output i)
                  (- (grid:gref* zvector i)
                     (aref temp-values 0)))))))))
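To see what the tree rewrite in mapcons does on its own, here is a small self-contained sketch in plain Common Lisp. The mapcons definition is repeated so the block runs by itself; the symbols gref* and fast-ref are unqualified placeholders invented for the example, and rewrite-gref is a hypothetical helper.

```lisp
(defun mapcons (fn x)
  "Rebuild the tree X bottom-up, calling FN on every cons cell in it."
  (if (atom x)
      x
      (funcall fn (let ((a (mapcons fn (car x)))
                        (d (mapcons fn (cdr x))))
                    ;; Reuse the original cell when nothing below it changed.
                    (if (and (eql a (car x)) (eql d (cdr x)))
                        x
                        (cons a d))))))

(defun rewrite-gref (form array)
  "Replace every (gref* ARRAY index) in FORM by (fast-ref ARRAY index)."
  (mapcons (lambda (expr)
             (if (and (consp expr)
                      (eq (first expr) 'gref*)
                      (eq (second expr) array))
                 (cons 'fast-ref (rest expr))
                 expr))
           form))

;; Only accesses to the named array are rewritten; other forms are shared.
(defparameter *rewritten*
  (rewrite-gref '(incf acc (- (gref* z k) (gref* out k))) 'z))
```

Note that the function is applied to every cons, including list tails and quoted data, so a purely textual rewrite like this can touch things that are not calls; this is one concrete form of the code-walking fragility discussed in the thread.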
Re: [GSLL-devel] [Gsll-devel] Efficient access to externally generated double-float arrays?
Liam,

I have tried this and (without using (the double-float (...)) since this is unnecessary for cffi:mem-aref with hardcoded :double data type), I get the following timing data:

using the standard grid:gref routine:
Evaluation took:
  0.542 seconds of real time
  0.536671 seconds of total run time (0.525229 user, 0.011442 system)
  [ Run times consist of 0.037 seconds GC time, and 0.500 seconds non-GC time. ]
  99.08% CPU
  1,189,729,211 processor cycles
  153,334,416 bytes consed

using the standard grid:gref* routine:
Evaluation took:
  0.067 seconds of real time
  0.066006 seconds of total run time (0.063179 user, 0.002827 system)
  [ Run times consist of 0.008 seconds GC time, and 0.059 seconds non-GC time. ]
  98.51% CPU
  146,192,222 processor cycles
  27,060,416 bytes consed

using the modified grid:gref* routine (specialized to grid:vector-double-float)
Evaluation took:
  0.070 seconds of real time
  0.068041 seconds of total run time (0.065151 user, 0.002890 system)
  [ Run times consist of 0.011 seconds GC time, and 0.058 seconds non-GC time. ]
  97.14% CPU
  152,642,611 processor cycles
  27,061,456 bytes consed

Apparently my system didn't notice the difference. Also, SBCL complains about the argument (grid::foreign-pointer object) to cffi:mem-aref being of type NUMBER instead of integer or fixnum. Furthermore, it says it has to do float to pointer coercion to "<return value>". I checked that the redefined method is actually used by doing a second run with a print statement added to the defmethod.

using cffi:mem-aref directly:
Evaluation took:
  0.002 seconds of real time
  0.002483 seconds of total run time (0.002482 user, 0.000001 system)
  100.00% CPU
  5,442,756 processor cycles
  0 bytes consed

I guess that even using compiler macros or other trickery one would have to remove the allocation of linearized indices and foreign ptr addresses from the inner loops as I have done in my example by using auxiliary variables zvector-fptr and output-fptr. Maybe one can define something like (with-foreign-array (name-of-array :double) ...) that locally redefines (grid:gref name-of-array ...) and (grid:gref* name-of-array ...) as macros evaluating to cffi:mem-aref and storing the respective linearized indices and memory pointers at the level of with-foreign-array? Although not as convenient as some 'self-optimizing' grid:gref, I would consider this a satisfactory solution. Don't know how to do that without getting lost in a forest of commas and backquotes, though.

best regards,
Sebastian

On 27.10.2010, at 05:22, Liam Healy wrote:
> Sebastian,
>
> Can you temporarily define this and find the timing/consing for your
> test case:
>
> (defmethod gref* ((object vector-double-float) linearized-index)
>   (cffi:mem-aref
>    (foreign-pointer object)
>    :double
>    linearized-index))
>
> (I think you don't use any matrices but if you do, define an analogous
> function for matrix-double-float.)
>
> As you can see, it has the literal type declaration, and I'm hopeful
> that CFFI will pick that up and make this competitive in speed with
> the best that you saw. If that's so, it should be fairly easy for me
> to make this generic and incorporate it into GSD. I'm still
> interested in making the linearization more efficient if that's still
> significant, but let's try this for now to see how much speed we can
> squeeze out of gref*.
>
> Thanks,
>
> Liam
Re: [GSLL-devel] [Gsll-devel] Efficient access to externally generated double-float arrays?
Sebastian,

Can you temporarily define this and find the timing/consing for your test case:

(defmethod gref* ((object vector-double-float) linearized-index)
  (cffi:mem-aref
   (foreign-pointer object)
   :double
   linearized-index))

(I think you don't use any matrices but if you do, define an analogous function for matrix-double-float.)

As you can see, it has the literal type declaration, and I'm hopeful that CFFI will pick that up and make this competitive in speed with the best that you saw. If that's so, it should be fairly easy for me to make this generic and incorporate it into GSD. I'm still interested in making the linearization more efficient if that's still significant, but let's try this for now to see how much speed we can squeeze out of gref*.

Thanks,

Liam

On Tue, Oct 26, 2010 at 10:25 AM, Sebastian Sturm wrote:
> It seems that CFFI includes some compiler macros that use type information
> supplied at compile time to generate more efficient code (got that from the
> cffi mailing
> list, http://www.mail-archive.com/cffi-de...@common-lisp.net/msg01154.html).
> In my case, I'm using this optimization by supplying :double to
> cffi:mem-aref. If I replace this by (cl-cffi (element-type zvector)), as is
> done internally by gref, then (again with dim = 50), better-force-function
> uses around 1.8 GCycles and conses 80 MB in the process, whereas the :double
> version needs ~ 8.6 MCycles, not consing anything. The slow-but-flexible
> version of better-force-function reads as follows:
>
> (defun better-force-function (dim)
>   "Given an integer dim, this constructs a function that, when supplied with a
> N-dimensional vector Z and some output vector (-> pointer?), yields the
> corresponding forces"
>   (declare (fixnum dim))
>   (let ((temp-values (make-array 2 :element-type 'double-float
>                                    :initial-element 0.0d0)))
>     (lambda (zvector output)
>       (let ((zvector-fptr (grid::foreign-pointer zvector))
>             (output-fptr (grid::foreign-pointer output))
>             ;; this makes it worse
>             (elt-type (grid:cl-cffi (grid:element-type zvector))))
>         (macrolet ((quick-ref (the-vector n)
>                      `(cffi:mem-aref
>                        ,(case the-vector
>                           (zvector 'zvector-fptr)
>                           (output 'output-fptr))
>                        ;; :double
>                        elt-type ;; replace this by :double
>                        ,n)))
>           (do ((i 0 (1+ i))) ((= i dim))
>             (declare (fixnum i))
>             (setf (aref temp-values 0) 0.0d0)
>             (do ((m 0 (1+ m))) ((> m i))
>               (declare (fixnum m))
>               (do ((n i (1+ n))) ((= n dim))
>                 (declare (fixnum n))
>                 (setf (aref temp-values 1) 0.0d0)
>                 (do ((k m (1+ k))) ((> k n))
>                   (declare (fixnum k))
>                   (incf (aref temp-values 1) (quick-ref zvector k))) ;; generates efficiency warnings when using elt-type
>                 (incf (aref temp-values 0) (expt (aref temp-values 1) -2))))
>             (setf (quick-ref output i)
>                   (- (quick-ref zvector i)
>                      (aref temp-values 0)))))))))
>
> Also, with the variable type left unspecified at compile time, the innermost
> loop generates efficiency warnings telling me that generic-+ needs to be
> used. Writing (the double-float (quick-ref zvector k)) removes these and
> slightly reduces the consing amount of the slow variant to ~ 63 MB. I still
> have to try the SLIME profiler though.
> thanks,
> Sebastian
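The difference between a literal :double and a type held in a variable can be illustrated with a toy compiler macro that folds a lookup at compile time only when the type argument is a literal keyword. This is a sketch of the general mechanism, not CFFI's actual implementation; element-size and byte-offset are hypothetical names, and the two-entry size table is an assumption made for the example.

```lisp
(eval-when (:compile-toplevel :load-toplevel :execute)
  (defun element-size (type)
    "Size in bytes of one element of TYPE (illustrative table only)."
    (ecase type
      (:double 8)
      (:float 4))))

(defun byte-offset (type index)
  "Byte offset of element INDEX; TYPE is looked up at run time."
  (* (element-size type) index))

(define-compiler-macro byte-offset (&whole form type index)
  ;; A literal keyword is known at compile time, so do the lookup now.
  ;; Anything else (a variable, a function call) keeps the run-time path.
  (if (member type '(:double :float))
      `(* ,(element-size type) ,index)
      form))

(defun offset-of-double (index)
  (byte-offset :double index))   ; literal type: may compile to (* 8 index)

(defun offset-of (type index)
  (byte-offset type index))      ; type only known at run time
```

In the message above, elt-type plays the role of the non-literal argument: the call still works, but the per-access lookup and generic arithmetic stay in the inner loop.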
Re: [GSLL-devel] [Gsll-devel] Efficient access to externally generated double-float arrays?
On Fri, Oct 22, 2010 at 2:13 PM, Sebastian Sturm wrote:
>> I'm surprised about the grid:gref result. Please run a profiler and
>> confirm that the time consumed is within grid:gref and its callees,
>
> I tried using sb-sprof, but as a complete CL newbie I had a hard time making
> sense of its output and thus ended up ignoring the profiler and using
> trial-and-error again. It seems that a fair amount of consing is due to the
> index linearization. Simply replacing grid:gref by grid:gref* for
> one-dimensional arrays reduces consing by a factor of 3 in my experiments. It
> seems that the rest can be eliminated by specifying the data type (here it's
> :double) at compile-time; incorporating this as an option to grid:gref via
> some macro-trickery would probably uglify the GSLL framework somewhat (I
> guess?), but I'd be willing to pay that price in return for an
> order-of-magnitude speedup. Also, I haven't tried if similar problems arise
> (and if similar workarounds are available) for the case of complex-valued
> arrays.

Interesting. It's plausible to me that linearization is causing the problem; I think that there is room for improvement there. Your example with the mem-aref shows that it's got to be that and/or generic function dispatch, because as you can see on line 35 of http://repo.or.cz/w/gsd.git/blob/e537bd9be551b90c9fbfb6020a6119f3df00d650:/foreign-array/methods.lisp, gref* essentially just calls mem-aref. So maybe if we can add declarations and inlining, we can get the same effect with the current code.

If you can't make sense of sb-sprof and you use slime, you might try the slime profiler: http://lhealy.livejournal.com/8495.html
I find its output much clearer.

I'm not sure where you are putting a :double declaration to get that speedup; can you post some example code? It's possible that there is some loss of efficiency in using CLOS (generic function dispatch) that could be remedied by declarations or compiler macros. That's beyond my realm of expertise, but I'm willing to dig into it.

Liam