Re: [racket-dev] speeding up 16-bit integer adds

2010-09-24 Thread Noel Welsh

On Fri, Sep 24, 2010 at 3:42 AM, John Clements wrote:
> the inner loop. Grr! Any suggestions?

Inline assembly? It works and is easy to do -- you'll need to extend
http://github.com/noelwelsh/assembler/ with jumps. I'm serious.

N.
_
  For list-related administrative tasks:
  http://lists.racket-lang.org/listinfo/dev

Re: [racket-dev] speeding up 16-bit integer adds

2010-09-23 Thread John Clements

On Sep 23, 2010, at 9:46 PM, John Clements wrote:

> 
> On Sep 23, 2010, at 8:16 PM, Matthew Flatt wrote:
> 
>> One more thought: Do you get to pick whether you use 16-bit integers or
>> 64-bit floating-point numbers? The `flvector-' and `f64vector-'
>> operations are inlined by the JIT and recognized for unboxing, so using
>> flonum vectors and operations could be much faster than using raw
>> pointers and 16-bit integers.
> 
> Well, that's an option, albeit a somewhat unappetizing one; as the 44100 in 
> my code no doubt signaled, I'm reading and writing sound data here, and both 
> 16-bit ints and 32-bit floats are fairly common. 64-bit floats will be 
> another factor of 2 in memory, for a total of 42 megabytes per minute.
> 
> I ran some tests, using flvectors and unsafe operations everywhere. (Code 
> below.)

Update before going to bed; re-running the C tests with doubles everywhere and 
the same setup (simply adding together two big buffers) took about half a 
second, so in fact in this instance Racket is less than 10x slower, which is as 
fast as I would expect it to be.  So basically, it sounds like the flvectors 
are the way to go, if I can stomach the memory usage.

Thanks again,

John


Re: [racket-dev] speeding up 16-bit integer adds

2010-09-23 Thread John Clements

On Sep 23, 2010, at 8:16 PM, Matthew Flatt wrote:

> One more thought: Do you get to pick whether you use 16-bit integers or
> 64-bit floating-point numbers? The `flvector-' and `f64vector-'
> operations are inlined by the JIT and recognized for unboxing, so using
> flonum vectors and operations could be much faster than using raw
> pointers and 16-bit integers.

Well, that's an option, albeit a somewhat unappetizing one; as the 44100 in my 
code no doubt signaled, I'm reading and writing sound data here, and both 
16-bit ints and 32-bit floats are fairly common. 64-bit floats will be another 
factor of 2 in memory, for a total of 42 megabytes per minute.
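
The memory figure above can be checked with a little arithmetic. The sketch below assumes two channels at 44100 samples per second (which is what the `(* 44100 2 ...)` in the test code suggests); `bytes-per-minute` is a hypothetical helper name used only for illustration.

```racket
#lang racket

;; Assumed format: stereo audio at 44100 samples per second.
(define sample-rate 44100)
(define channels 2)

;; Hypothetical helper: bytes needed for one minute of audio
;; at the given sample width in bytes.
(define (bytes-per-minute bytes-per-sample)
  (* sample-rate channels bytes-per-sample 60))

(bytes-per-minute 2) ; 16-bit ints:   10584000 bytes
(bytes-per-minute 4) ; 32-bit floats: 21168000 bytes
(bytes-per-minute 8) ; 64-bit floats: 42336000 bytes, roughly 42 megabytes
```

By the same arithmetic, 400 seconds of 64-bit samples comes to 44100 * 2 * 8 * 400 = 282240000 bytes, which matches the 282 megabytes quoted for the test run.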

I ran some tests, using flvectors and unsafe operations everywhere. (Code 
below.)

My tests called for 400 seconds of audio, or 282 Megabytes, and this made 
DrRacket flustered.  Restarting and running with half that size yielded (quite 
variable) times between 1 and 3 seconds, so that appears about twice as fast as 
the fixed-point one.

I'm tempted to write a little C code, but then of course I have to compile it 
separately for every darn platform.

Thanks again for your help,

John

#lang racket

(require ffi/unsafe
         racket/flonum
         racket/unsafe/ops)

;; Despite the name, every slot is filled with the same constant (0.73).
(define (make-buffer-of-small-randoms len)
  (let ([buf (make-flvector len)])
    (for ([i (in-range len)])
      (unsafe-flvector-set! buf i 0.73))
    buf))

;; 100 seconds of 2-channel audio at 44100 samples per second
(define buf-len (* 44100 2 100))

(define b1 (make-buffer-of-small-randoms buf-len))
(define b2 (make-buffer-of-small-randoms buf-len))

;; Add b2 into b1 destructively, element by element.
(time
 (for ([i (in-range buf-len)])
   (unsafe-flvector-set! b1 i
                         (unsafe-fl+ (unsafe-flvector-ref b1 i)
                                     (unsafe-flvector-ref b2 i)))))


Re: [racket-dev] speeding up 16-bit integer adds

2010-09-23 Thread John Clements


On Sep 23, 2010, at 7:55 PM, Matthew Flatt wrote:

> I think the problem is that the `ptr-ref' and `ptr-set!' operations are
> slow. They are slow because they are not yet inlined by the JIT, and
> they're not yet inlined because they have complicated APIs (including a
> "pointer" datatype with many variants).
> 
> I haven't worked out a way to make them faster or a way to provide
> faster variants, but it's on my list.

Okay, thanks.  FWIW, my attempt to use the s16vector variants performs 
similarly; perhaps these primitives call the same code.

John Clements
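
The s16vector attempt mentioned above is not included in the thread. A minimal sketch of what such a variant might look like, assuming the `ffi/vector` library's `make-s16vector`, `s16vector-ref`, and `s16vector-set!`, follows; the helper name `make-constant-s16vector` and the shorter buffer length are illustrative choices, not taken from the original code.

```racket
#lang racket

(require ffi/vector)

;; Hypothetical helper: an s16vector with every slot set to v.
(define (make-constant-s16vector len v)
  (let ([buf (make-s16vector len)])
    (for ([i (in-range len)])
      (s16vector-set! buf i v))
    buf))

;; 10 seconds of 2-channel audio; shorter than the buffers in the thread.
(define buf-len (* 44100 2 10))

(define b1 (make-constant-s16vector buf-len 73))
(define b2 (make-constant-s16vector buf-len 73))

;; Add b2 into b1 destructively, element by element.
(time
 (for ([i (in-range buf-len)])
   (s16vector-set! b1 i
                   (+ (s16vector-ref b1 i)
                      (s16vector-ref b2 i)))))
```

With real sample data the sum can exceed the signed 16-bit range, so a production version would need to clamp or scale before storing.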
 


Re: [racket-dev] speeding up 16-bit integer adds

2010-09-23 Thread Matthew Flatt

One more thought: Do you get to pick whether you use 16-bit integers or
64-bit floating-point numbers? The `flvector-' and `f64vector-'
operations are inlined by the JIT and recognized for unboxing, so using
flonum vectors and operations could be much faster than using raw
pointers and 16-bit integers.

At Thu, 23 Sep 2010 19:42:15 -0700, John Clements wrote:
> I'm trying to add together big buffers. The following code creates two big fat
> buffers of 16-bit integers, and adds them together destructively. It looks to 
> me like this code *could* run really fast, but it doesn't; this takes about 
> 8.5 seconds. Changing + to unsafe-fx+ has no detectable effect.  Is there 
> allocation going on in the inner loop? I'd hoped that since an _sint16 fits 
> safely in 31 bits, that no memory would be allocated in the inner loop. Grr! 
> Any suggestions? (I ran a similar test on floats, and C ran about 64x faster, 
> about a tenth of a second).
> 
> Doc pointers appreciated as always,
> 
> John
> 
> #lang racket 
> 
> (require ffi/unsafe)
> 
> (define (make-buffer-of-small-random-ints len)
>   (let ([buf (malloc _sint16 len)])
>     (for ([i (in-range len)])
>       (ptr-set! buf _sint16 i 73))
>     buf))
> 
> (define buf-len (* 44100 2 200))
> 
> (define b1 (make-buffer-of-small-random-ints buf-len))
> (define b2 (make-buffer-of-small-random-ints buf-len))
> 
> (time
>  (for ([i (in-range buf-len)])
>    (ptr-set! b1 _sint16 i
>              (+ (ptr-ref b1 _sint16 i)
>                 (ptr-ref b2 _sint16 i)))))

Re: [racket-dev] speeding up 16-bit integer adds

2010-09-23 Thread Matthew Flatt

I think the problem is that the `ptr-ref' and `ptr-set!' operations are
slow. They are slow because they are not yet inlined by the JIT, and
they're not yet inlined because they have complicated APIs (including a
"pointer" datatype with many variants).

I haven't worked out a way to make them faster or a way to provide
faster variants, but it's on my list.

At Thu, 23 Sep 2010 19:42:15 -0700, John Clements wrote:
> I'm trying to add together big buffers. The following code creates two big fat
> buffers of 16-bit integers, and adds them together destructively. It looks to 
> me like this code *could* run really fast, but it doesn't; this takes about 
> 8.5 seconds. Changing + to unsafe-fx+ has no detectable effect.  Is there 
> allocation going on in the inner loop? I'd hoped that since an _sint16 fits 
> safely in 31 bits, that no memory would be allocated in the inner loop. Grr! 
> Any suggestions? (I ran a similar test on floats, and C ran about 64x faster, 
> about a tenth of a second).
> 
> Doc pointers appreciated as always,
> 
> John
> 
> #lang racket 
> 
> (require ffi/unsafe)
> 
> (define (make-buffer-of-small-random-ints len)
>   (let ([buf (malloc _sint16 len)])
>     (for ([i (in-range len)])
>       (ptr-set! buf _sint16 i 73))
>     buf))
> 
> (define buf-len (* 44100 2 200))
> 
> (define b1 (make-buffer-of-small-random-ints buf-len))
> (define b2 (make-buffer-of-small-random-ints buf-len))
> 
> (time
>  (for ([i (in-range buf-len)])
>    (ptr-set! b1 _sint16 i
>              (+ (ptr-ref b1 _sint16 i)
>                 (ptr-ref b2 _sint16 i)))))

