Re: [racket-dev] speeding up 16-bit integer adds

2010-09-24 Thread Noel Welsh
On Fri, Sep 24, 2010 at 3:42 AM, John Clements cleme...@brinckerhoff.org wrote: the inner loop. Grr! Any suggestions? Inline assembly? It works and is easy to do -- you'll need to extend http://github.com/noelwelsh/assembler/ with jumps. I'm serious. N.

[racket-dev] speeding up 16-bit integer adds

2010-09-23 Thread John Clements
I'm trying to add together big buffers. The following code creates two big fat buffers of 16-bit integers, and adds them together destructively. It looks to me like this code *could* run really fast, but it doesn't; this takes about 8.5 seconds. Changing + to unsafe-fx+ has no detectable
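
The excerpt above is cut off before the code appears. As a rough sketch only, and not the poster's actual program, a destructive add over two buffers of 16-bit integers might look like this with `ptr-ref` and `ptr-set!` from `ffi/unsafe`. The buffer size `n`, the fill values, and the helper name `add-buffers!` are assumptions made for illustration.

```racket
#lang racket/base
(require ffi/unsafe)

;; Hypothetical buffer length; the excerpt does not state the real size.
(define n 1000)

;; Two raw (non-GC) buffers of signed 16-bit integers.
(define src (malloc n _sint16 'raw))
(define dst (malloc n _sint16 'raw))

;; Fill with small illustrative values so the sums stay within 16 bits.
(for ([i (in-range n)])
  (ptr-set! src _sint16 i 1)
  (ptr-set! dst _sint16 i 2))

;; Hypothetical helper: dst[i] := dst[i] + src[i] for every index.
(define (add-buffers! dst src n)
  (for ([i (in-range n)])
    (ptr-set! dst _sint16 i
              (+ (ptr-ref dst _sint16 i)
                 (ptr-ref src _sint16 i)))))

(add-buffers! dst src n)

;; Copy a few results out before releasing the raw memory.
(define result
  (for/list ([i (in-range 4)])
    (ptr-ref dst _sint16 i)))

(free src)
(free dst)
```

Every iteration goes through two `ptr-ref` calls and one `ptr-set!` call, so the cost of those operations dominates the loop regardless of which addition operator is used.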

Re: [racket-dev] speeding up 16-bit integer adds

2010-09-23 Thread Matthew Flatt
I think the problem is that the `ptr-ref' and `ptr-set!' operations are slow. They are slow because they are not yet inlined by the JIT, and they're not yet inlined because they have complicated APIs (including a pointer datatype with many variants). I haven't worked out a way to make them faster or

Re: [racket-dev] speeding up 16-bit integer adds

2010-09-23 Thread John Clements
On Sep 23, 2010, at 7:55 PM, Matthew Flatt wrote: I think the problem is that the `ptr-ref' and `ptr-set!' operations are slow. They are slow because they are not yet inlined by the JIT, and they're not yet inlined because they have complicated APIs (including a pointer datatype with many

Re: [racket-dev] speeding up 16-bit integer adds

2010-09-23 Thread John Clements
On Sep 23, 2010, at 8:16 PM, Matthew Flatt wrote: One more thought: Do you get to pick whether you use 16-bit integers or 64-bit floating-point numbers? The `flvector-' and `f64vector-' operations are inlined by the JIT and recognized for unboxing, so using flonum vectors and operations
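
For comparison, a minimal sketch of the flonum-vector alternative suggested in the quoted message, assuming the data can be held as 64-bit floats rather than 16-bit integers. The length `n`, the fill values, and the helper name `flvector-add!` are illustrative assumptions; the operations themselves come from `racket/flonum`.

```racket
#lang racket/base
(require racket/flonum)

;; Hypothetical length and fill values, chosen only for illustration.
(define n 1000)
(define a (make-flvector n 1.0))
(define b (make-flvector n 2.0))

;; Hypothetical helper: dst[i] := dst[i] + src[i], using flonum-specific
;; operations throughout so no generic arithmetic appears in the loop.
(define (flvector-add! dst src)
  (for ([i (in-range (flvector-length dst))])
    (flvector-set! dst i
                   (fl+ (flvector-ref dst i)
                        (flvector-ref src i)))))

(flvector-add! a b)
```

The trade-off is memory: each sample takes eight bytes instead of two, so this only helps when the caller is free to choose the representation, which is the question the message asks.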

Re: [racket-dev] speeding up 16-bit integer adds

2010-09-23 Thread John Clements
On Sep 23, 2010, at 9:46 PM, John Clements wrote: On Sep 23, 2010, at 8:16 PM, Matthew Flatt wrote: One more thought: Do you get to pick whether you use 16-bit integers or 64-bit floating-point numbers? The `flvector-' and `f64vector-' operations are inlined by the JIT and recognized for