Do any of these functions turn out to have case-> contracts by the time TR gets done with them?
Robby On Tue, Jan 14, 2014 at 12:22 PM, Neil Toronto <neil.toro...@gmail.com>wrote: > An update on my math-centered tests. The first is the built-in `magnitude' > vs. a TR near-transliteration of its C implementation, called from > *untyped* Racket. The numbers are in milliseconds, for 5 million calls, by > input type: > > > Function Flonum Rational Fixnum Integer Float-Complex > ------------------------------------------------------------------- > Pre-static contracts: > > magnitude* 385 419 378 414 686 > magnitude 59 44 40 40 390 > Post-static contracts: > magnitude* 175 196 164 195 475 > magnitude 68 43 36 40 406 > > All but the Float-Complex case just return the absolute value of the > argument, so there's not much computation. They're dominated by contract > checking, and the speedup is 0.45 to 0.5. Nice. The Float-Complex case does > something substantive, and is close to C. That's awesome. > > So for really small functions (i.e. just a test and a call to `abs'), > writing them in C is still better. But for anything much larger (that part > of `magnitude*' is about 10 lines and 7-13 numeric ops), a TR > implementation isn't much slower than C (about 17%). > > -- > > 1 million iterations of some math library flonum functions, in TR, untyped > before Robby's changes, untyped after Robby's 12/12 changes, and untyped > after Eric's static contract changes, in milliseconds: > > Function TR Untyped Robby's Robby's+Eric's > -------------------------------------------------------------- > flrational? 5 322 98 37 > flsinh 55 343 121 86 > fllog1p 47 351 117 80 > lg+ 61 384 154 115 > flgamma 165 521 262 234 > > So calling TR floating-point functions from untyped Racket takes only 0.11 > to 0.45 times the amount of time it took only a month ago, with typical > cases being about 0.25. That's awesome. > > Neil ⊥ > > > On 01/14/2014 10:27 AM, Eric Dobson wrote: > >> The changes to TR contract generation are now in at HEAD. >> >> If you can find any cases where contracts are still slowing programs >> down by a lot I'd like to take a look at them. (Excepting the case of >> structs where I know it is still an issue). >> >> On Thu, Dec 12, 2013 at 10:52 AM, Robby Findler >> <ro...@eecs.northwestern.edu> wrote: >> >>> Re-reading your message I see that you're not actually asserting >>> something >>> different from what I said, but just for some precision here I wish to >>> point >>> out that I wasn't basing my opinion on intuition from the code, but on >>> some >>> microbenchmark timings. (There was a much more substantial difference >>> yesterday because the loop inside any-wrap/c wasn't as cheap as it could >>> have been.) >>> >>> I'd be interested to see if your improvements to type->contract improve >>> the >>> situation any! I expect they will make things better again for the Number >>> case, but at the moment, there isn't a big difference. >>> >>> Program 1: >>> >>> >>> #lang racket/base >>> (module m typed/racket/base >>> (: f (Any -> Any)) >>> (define (f x) 1) >>> (provide f)) >>> (require 'm) >>> (time >>> (for ([x (in-range 20000)]) >>> (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7) >>> (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7) >>> (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7) >>> (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7) >>> (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7) >>> (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7) >>> (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7) >>> (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7))) >>> >>> Timings: >>> >>> cpu time: 142 real time: 142 gc time: 8 >>> cpu time: 144 real time: 144 gc time: 7 >>> cpu time: 144 real time: 143 gc time: 6 >>> cpu time: 142 real time: 142 gc time: 6 >>> cpu time: 142 real time: 142 gc time: 7 >>> cpu time: 146 real time: 146 gc time: 6 >>> >>> Program 2: >>> >>> >>> #lang racket/base >>> (module m typed/racket/base >>> (: f (Any -> Integer)) >>> >>> (define (f x) 1) >>> (provide f)) >>> (require 'm) >>> (time >>> (for ([x (in-range 20000)]) >>> (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7) >>> (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7) >>> (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7) >>> (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7) >>> (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7) >>> (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7) >>> (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7) >>> (f 0) (f 1) (f 2) (f 3) (f 4) (f 5) (f 6) (f 7))) >>> >>> >>> Timings: >>> >>> cpu time: 139 real time: 138 gc time: 7 >>> cpu time: 145 real time: 144 gc time: 7 >>> cpu time: 140 real time: 140 gc time: 6 >>> cpu time: 151 real time: 150 gc time: 6 >>> cpu time: 139 real time: 138 gc time: 6 >>> cpu time: 139 real time: 139 gc time: 8 >>> >>> >>> >>> >>> >>> >>> >>> >>> On Thu, Dec 12, 2013 at 12:33 PM, Eric Dobson <eric.n.dob...@gmail.com> >>> wrote: >>> >>>> >>>> any-wrap/c still requires the check for one value, while any (which is >>>> from Number not Any) does not. So I would still guess at Number being >>>> faster, but Robby's changes may make it so that inlining and dead code >>>> elimination can see through everything and turn it into the same code. >>>> >>>> On Thu, Dec 12, 2013 at 10:27 AM, Robby Findler >>>> <ro...@eecs.northwestern.edu> wrote: >>>> >>>>> FWIW, my push speeds up the any-wrap/c implementation a bunch. Those >>>>> two >>>>> should have similar speeds after you get that, I guess. >>>>> >>>>> Robby >>>>> >>>>> >>>>> On Thu, Dec 12, 2013 at 11:03 AM, Neil Toronto <neil.toro...@gmail.com >>>>> > >>>>> wrote: >>>>> >>>>>> >>>>>> I tried your branch that implements it and saw about 3.5x speedup for >>>>>> the >>>>>> `magnitude*' test. This is of course without Robby's recent >>>>>> first-order >>>>>> contract changes. >>>>>> >>>>>> (I think it's about 3.5x: I tried with magnitude* : Number -> Any >>>>>> first >>>>>> and got 2400ms on the easy tests. I changed it to magnitude* : Number >>>>>> -> >>>>>> Number and got 690ms. Apparently, for an `Any' return type, an >>>>>> `any-wrap/c' >>>>>> contract is generated instead of nothing. If that's much faster to >>>>>> check >>>>>> than `number?', though, the speedup is even better.) >>>>>> >>>>>> I'd love to see this with Robby's recent changes. Hint? Nudge? Please? >>>>>> >>>>>> I didn't see very much speedup with arrays (about 1.2x). Speed tests >>>>>> on >>>>>> the math library's distribution objects were very interesting, though, >>>>>> and >>>>>> indicate why the arrays might not be much faster. Here's my test >>>>>> program: >>>>>> >>>>>> >>>>>> #lang racket >>>>>> >>>>>> (require math/distributions) >>>>>> >>>>>> (define d (normal-dist 0 1)) >>>>>> >>>>>> (printf "pdf d 0~n") >>>>>> (for ([_ (in-range 5)]) >>>>>> (time (for ([_ (in-range 100000)]) >>>>>> (pdf d 0)))) >>>>>> (newline) >>>>>> >>>>>> (define p (distribution-pdf d)) >>>>>> (printf "p 0~n") >>>>>> (for ([_ (in-range 5)]) >>>>>> (time (for ([_ (in-range 100000)]) >>>>>> (p 0)))) >>>>>> (newline) >>>>>> >>>>>> >>>>>> The two tests are equivalent, as `pdf' just pulls the pdf function out >>>>>> of >>>>>> the distribution struct and applies it. In TR, the tests are exactly >>>>>> the >>>>>> same speed (extremely fast). In untyped Racket, on the main branch, >>>>>> the >>>>>> second test is 16x faster, and on your branch, it's 44x faster. (It's >>>>>> still >>>>>> 10x slower than TR on your branch, so again... I'd love to see your >>>>>> changes >>>>>> and Robby's together. :D) >>>>>> >>>>>> Neil ⊥ >>>>>> >>>>>> >>>>>> On 12/12/2013 12:40 AM, Eric Dobson wrote: >>>>>> >>>>>>> >>>>>>> Removing the return value checking is in the works. It actually is >>>>>>> removing all of the checks that would blame typed code, so higher >>>>>>> order functions/datastructure get improvements too. It is actually >>>>>>> functional the last time I checked, but lacking documentation which >>>>>>> is >>>>>>> what is holding up merging with mainline. >>>>>>> >>>>>>> https://github.com/plt/racket/pull/453 >>>>>>> >>>>>>> On Wed, Dec 11, 2013 at 7:57 PM, Robby Findler >>>>>>> <ro...@eecs.northwestern.edu> wrote: >>>>>>> >>>>>>>> >>>>>>>> I see that TR's type->contract returns >>>>>>>> >>>>>>>> (-> (flat-named-contract (quote Float) flonum?) >>>>>>>> (flat-named-contract >>>>>>>> (quote >>>>>>>> Float) flonum?)) >>>>>>>> >>>>>>>> for the type (Float -> Float), but it could return >>>>>>>> >>>>>>>> (-> (flat-named-contract (quote Float) flonum?) any) >>>>>>>> >>>>>>>> which wouldn't do any result value checking (this being different >>>>>>>> from >>>>>>>> any/c >>>>>>>> as the range of the arrow contract). >>>>>>>> >>>>>>>> Robby >>>>>>>> >>>>>>>> >>>>>>>> On Wed, Dec 11, 2013 at 6:18 PM, Neil Toronto >>>>>>>> <neil.toro...@gmail.com> >>>>>>>> wrote: >>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On 12/11/2013 02:49 PM, Neil Toronto wrote: >>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 12/11/2013 01:55 PM, Stephen Bloch wrote: >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Dec 11, 2013, at 2:36 PM, Neil Toronto wrote: >>>>>>>>>>>> >>>>>>>>>>>> numeric primitives implemented in Typed Racket are faster than >>>>>>>>>>>>> the >>>>>>>>>>>>> same primitives implemented in C. >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Whoa! How did that happen? >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Whoa! That's not what I meant! O_o >>>>>>>>>> >>>>>>>>>> I said "we might be getting close" to that. I haven't tried >>>>>>>>>> porting >>>>>>>>>> a >>>>>>>>>> numeric C primitive to TR yet, but I have a hunch that it'll still >>>>>>>>>> be >>>>>>>>>> slower. I'll try one now and report what I find. >>>>>>>>>> >>>>>>>>>> Neil ⊥ >>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> I can't figure out why `flsinh' is faster to call from untyped >>>>>>>>> Racket >>>>>>>>> than >>>>>>>>> `sinh'. All my tests with a Typed Racket `magnitude' show calls >>>>>>>>> from >>>>>>>>> untyped >>>>>>>>> code are significantly slower, except in the one case that it >>>>>>>>> computes >>>>>>>>> Euclidean distance. That case is only twice as slow. >>>>>>>>> >>>>>>>>> I've attached the benchmark program. The `magnitude*' function is >>>>>>>>> more >>>>>>>>> or >>>>>>>>> less a direct translation of `magnitude' from "number.c" into Typed >>>>>>>>> Racket. >>>>>>>>> Here's a summary of the results I get on my computer, in >>>>>>>>> milliseconds, >>>>>>>>> for 5 >>>>>>>>> million calls from untyped Racket, by data type. >>>>>>>>> >>>>>>>>> >>>>>>>>> Function Flonum Rational Fixnum Integer Float-Complex >>>>>>>>> ------------------------------------------------------------ >>>>>>>>> ------- >>>>>>>>> magnitude* 385 419 378 414 686 >>>>>>>>> magnitude 59 44 40 40 390 >>>>>>>>> >>>>>>>>> >>>>>>>>> The only one that's close in relative terms is Float-Complex. The >>>>>>>>> others >>>>>>>>> just call `abs'. The decompiled code doesn't show any inlining of >>>>>>>>> `magnitude', so this comparison should be good. >>>>>>>>> >>>>>>>>> I'll bet checking the return value contract (which is unnecessary) >>>>>>>>> is >>>>>>>>> the >>>>>>>>> main slowdown. It has to check for number of values. >>>>>>>>> >>>>>>>>> For comparison, here are the timings for running the benchmarks in >>>>>>>>> TR >>>>>>>>> with >>>>>>>>> #:no-optimize: >>>>>>>>> >>>>>>>>> >>>>>>>>> Function Flonum Rational Fixnum Integer Float-Complex >>>>>>>>> ------------------------------------------------------------ >>>>>>>>> ------- >>>>>>>>> magnitude* 45 70* 37 102* 318 >>>>>>>>> magnitude 61 45 39 91* 394 >>>>>>>>> >>>>>>>>> * = unexpectedly >>>>>>>>> high >>>>>>>>> >>>>>>>>> >>>>>>>>> Here's what I understand from comparing the numbers: >>>>>>>>> >>>>>>>>> * Except for non-fixnum integers, calling `magnitude' in TR is >>>>>>>>> just >>>>>>>>> as >>>>>>>>> fast as in untyped Racket. I have no idea why it would be slower on >>>>>>>>> big >>>>>>>>> integers. That's just weird. >>>>>>>>> >>>>>>>>> * Calling `abs' in Racket is faster than calling `scheme_abs' in >>>>>>>>> C, >>>>>>>>> except on rationals and big integers. >>>>>>>>> >>>>>>>>> * Operating on flonums in Typed Racket, using generic numeric >>>>>>>>> functions, >>>>>>>>> is faster than doing the same in C. >>>>>>>>> >>>>>>>>> Overall, it looks like the TR code is within the same order of >>>>>>>>> magnitude >>>>>>>>> (pun not intended) as the C code. I would love to try this >>>>>>>>> benchmark >>>>>>>>> with >>>>>>>>> either 1) a `magnitude*' with an `AnyValues' return type; or 2) a >>>>>>>>> contract >>>>>>>>> boundary that doesn't check TR's return types for first-order >>>>>>>>> functions. >>>>>>>>> >>>>>>>>> (I managed to make a `magnitude*' with type Number -> AnyValues, >>>>>>>>> but >>>>>>>>> TR >>>>>>>>> couldn't make a contract for it.) >>>>>>>>> >>>>>>>>> Neil ⊥ >>>>>>>>> >>>>>>>>> >>>>>>>>> _________________________ >>>>>>>>> Racket Developers list: >>>>>>>>> http://lists.racket-lang.org/dev >>>>>>>>> >>>>>>>>> >>>>>>>> _________________________ >>>>>>>> Racket Developers list: >>>>>>>> http://lists.racket-lang.org/dev >>>>>>>> >>>>>>>> >>>>>> >>>>> >
_________________________ Racket Developers list: http://lists.racket-lang.org/dev