Re: [racket-dev] [plt] Push #28239: master branch updated
On Tue, Feb 25, 2014 at 8:34 AM, mfl...@racket-lang.org wrote:
> eff53cd Matthew Flatt mfl...@racket-lang.org 2014-02-24 16:42
> :
> | treat FFI primitives like other primitives internally
> |
> | This change paves the way for JIT-inlining FFI operations
> | such as `ptr-ref`. Even without JIT treatment, the change
> | slightly reduces the overhead for calling FFI primitives.

Do you have benchmarks for the FFI that you used here?

Sam
_
Racket Developers list: http://lists.racket-lang.org/dev
Re: [racket-dev] [plt] Push #28239: master branch updated
At Tue, 25 Feb 2014 09:22:54 -0500, Sam Tobin-Hochstadt wrote:
> On Tue, Feb 25, 2014 at 8:34 AM, mfl...@racket-lang.org wrote:
> > eff53cd Matthew Flatt mfl...@racket-lang.org 2014-02-24 16:42
> > :
> > | treat FFI primitives like other primitives internally
> > |
> > | This change paves the way for JIT-inlining FFI operations
> > | such as `ptr-ref`. Even without JIT treatment, the change
> > | slightly reduces the overhead for calling FFI primitives.
>
> Do you have benchmarks for the FFI that you used here?

The claim of reduced overhead was based on removing indirections through scheme_do_eval(), as visible in a low-level profile of a drawing loop.

Now that you ask, though, the change cuts about 25% of the time for the loop below on my machine.

#lang racket/base
(require ffi/unsafe)
(define N 1000)
(define p (malloc N))
(time (for ([i (in-range 1000)])
        (for ([j (in-range N)])
          (ptr-set! p _byte j 42))))
Re: [racket-dev] [plt] Push #28239: master branch updated
On Tue, Feb 25, 2014 at 9:32 AM, Matthew Flatt mfl...@cs.utah.edu wrote:
> The claim of reduced overhead was based on removing indirections
> through scheme_do_eval(), as visible in a low-level profile of a
> drawing loop.
>
> Now that you ask, though, the change cuts about 25% of the time for
> the loop below on my machine.

Nice! What code in the drawing loop were you profiling, though? I'm really looking for a larger-scale Racket program where the FFI is performance-important, rather than a micro-benchmark.

Would the slideshow benchmark from our OOPSLA paper be a reasonable choice here?

Sam
Re: [racket-dev] [plt] Push #28239: master branch updated
At Tue, 25 Feb 2014 09:37:27 -0500, Sam Tobin-Hochstadt wrote:
> Nice! What code in the drawing loop were you profiling, though? I'm
> really looking for a larger-scale Racket program where the FFI is
> performance-important, rather than a micro-benchmark.

See below, but I don't think FFI performance turns out to be crucial. Streamlining calls of FFI functions was more about reducing noise in the information that I was looking at (i.e., making sure I understood where scheme_do_eval() indirections came from).

I have been looking at editor refresh in DrRacket, and while time spent in foreign functions is significant, it doesn't appear that the path to get there is a bottleneck.

> Would the slideshow benchmark from our OOPSLA paper be a reasonable
> choice here?

No, that benchmark mostly constructs picts without rendering them. Offhand, I can't think of an existing program that makes a good benchmark, but I'll keep it in mind.

#lang racket/gui
(define bm (make-bitmap 600 600))
(define dc (send bm make-dc))
(define (copy n s)
  (apply string-append
         (for/list ([i (in-range n)]) s)))
(define (go)
  (for ([i (in-range 100)])
    (for ([j (in-range 100)])
      (send dc draw-text "hello" (* i 5) (* j 5)))))
(time (go))
(time (for ([i (in-range 50)]) (go)))