Nicolas, you'd mentioned a log_2 n implementation for number parsing. I think by that you mean the division of digits into two "packets" that are processed and then added?
I tried a straight-forward fold (using the * 10 that you said you didn't like!) Time millisecondsToRun: [10000 timesRepeat: ['123456789012345678901234567890' inject: 0 into: [:acc :each | 10 * acc + each digitValue]]] => 818 and what I think you meant: | f | f := 1. "Just to shut the compiler up about the self-reference." f := [:someDigits | | len | len := someDigits size. (len = 1) ifTrue: [someDigits first digitValue] ifFalse: [ | leftHalf rightHalf | leftHalf := someDigits first: len // 2. rightHalf := someDigits allButFirst: len // 2. ((f value: (leftHalf)) * (10 raisedTo: rightHalf size)) + (f value: rightHalf)]]. Time millisecondsToRun: [10000 timesRepeat: [f value: '123456789012345678901234567890']] => 1508 I went with the straight-forward fold because it was simpler and my (quite possibly wrong/inefficient) attempt at a log_2 n version was a lot less readable. I'd greatly appreciate being gently corrected! frank On 14 September 2011 20:26, Frank Shearar <frank.shea...@gmail.com> wrote: > On 3 September 2011 19:35, Nicolas Cellier > <nicolas.cellier.aka.n...@gmail.com> wrote: >> 2011/9/3 Frank Shearar <frank.shea...@gmail.com>: >>> On 3 September 2011 18:50, Lukas Renggli <reng...@gmail.com> wrote: >>>> I think it is a good idea to have the number parser separate, after >>>> all it might also make sense to use it separately. >>>> >>>> It seems that the new Smalltalk grammar is significantly slower. The >>>> benchmark PPSmalltalkClassesTests class>>#benchmark: that uses the >>>> source code of the collection hierarchy and does not especially target >>>> number literals runs 30% slower. >>>> >>>> Also I see that "Number readFrom: ..." is still used within the >>>> grammar. This seems to be a bit strange, no? >>> >>> Yes: it's a double-parse, which is a bit lame. First, we parse the >>> literal with PPSmalltalkNumberParser, which ensures that the thing >>> given to Number class >> #readFrom: is a well-formed token (so, in >>> particular, Squeak's Number doesn't get to see anything other than a >>> well-formed token). >>> >>> It sounds like you're happy with the basic concept, so maybe I should >>> remove the Number class >> #readFrom: stuff, see if I can't remove the >>> performance issues, and resubmit the patch. >>> >>> frank >>> >> >> Yes, a NumberParser is essentially parsing, and this duplication sounds >> useless. >> The main feature of interest in NumberParser that I consider a >> requirement and should find its equivalence in a PetitNumberParser is: >> - round a decimal representation to nearest Float >> It's simple, just convert a Fraction asFloat in a single final step to >> avoid cumulating round off errors - see >> #makeFloatFromMantissa:exponent:base: >> >> The second feature of interest in NumberParser is the ability to >> parser LargeInteger efficiently by avoiding (10 * largeValue + >> digitValue) loops, and replacing them with a log(n) cost. >> This would be a simple thing to implement in a functional language. > > Hopefully this won't offend your sensibilities too much :). It does, > in fact, use 10* loops - I wrote an experimental "front half * rear > half" recursion, which was slower in my benchmarks. > > This version has the grammar and parser doing no string->number > conversion at all. PPSmalltalkNumberMaker supplies a number of utility > methods designed to stop one from making malformed numbers. It also > supplies a builder interface that the parser uses to construct > numbers. > > frank > >> Nicolas >> >>>> Lukas >>>> >>>> >>>> On 3 September 2011 17:18, Frank Shearar <frank.shea...@gmail.com> wrote: >>>>> On 3 September 2011 15:56, Lukas Renggli <reng...@gmail.com> wrote: >>>>>> On 3 September 2011 16:51, Frank Shearar <frank.shea...@gmail.com> wrote: >>>>>>> Hi Lukas, >>>>>>> >>>>>>> I haven't :) mainly because I'm unsure where to put it - is there >>>>>>> perhaps a PP Inbox, or shall I just post the merged version, or what's >>>>>>> your preference? (How about an mcd between my merge and PP's head?) >>>>>> >>>>>> Just put the .mcz at some public URL (dropbox, squeak source, ...) or >>>>>> attach it to a mail. >>>>> >>>>> Ah, great - here it is. You'll see I've written the grammar as a >>>>> separate class. That was really more to make what I'd done more >>>>> obvious and to minimise the change to PPSmalltalkGrammar, but perhaps >>>>> it's not a bad idea anyway: it's easy to see the number literal >>>>> subgrammar. >>>>> >>>>> frank >>>>> >>>>>> Lukas >>>>>> >>>>>> -- >>>>>> Lukas Renggli >>>>>> www.lukas-renggli.ch >>>>>> >>>>>> >>>>> >>>> >>>> >>>> >>>> -- >>>> Lukas Renggli >>>> www.lukas-renggli.ch >>>> >>>> >>> >>> >> >> >