Re: [Pharo-project] PetitSmalltalk: now with a self-contained grammar for number literals

Frank Shearar Sun, 25 Sep 2011 05:23:27 -0700

Nicolas, you'd mentioned a log_2 n implementation for number parsing.
I think by that you mean the division of digits into two "packets"
that are processed and then added?


I tried a straight-forward fold (using the * 10 that you said you didn't like!)

Time millisecondsToRun: [10000 timesRepeat:
['123456789012345678901234567890' inject: 0 into: [:acc :each | 10 *
acc + each digitValue]]] => 818

and what I think you meant:

| f |
f := 1. "Just to shut the compiler up about the self-reference."
f := [:someDigits | | len |
        len := someDigits size.
        (len = 1)
                ifTrue: [someDigits first digitValue]
                ifFalse: [ | leftHalf rightHalf |
                        leftHalf := someDigits first: len // 2.
                        rightHalf := someDigits allButFirst: len // 2.
                        ((f value: (leftHalf)) * (10 raisedTo: rightHalf size)) 
+ (f value:
rightHalf)]].
Time millisecondsToRun: [10000 timesRepeat: [f value:
'123456789012345678901234567890']] => 1508

I went with the straight-forward fold because it was simpler and my
(quite possibly wrong/inefficient) attempt at a log_2 n version was a
lot less readable.

I'd greatly appreciate being gently corrected!

frank

On 14 September 2011 20:26, Frank Shearar <frank.shea...@gmail.com> wrote:
> On 3 September 2011 19:35, Nicolas Cellier
> <nicolas.cellier.aka.n...@gmail.com> wrote:
>> 2011/9/3 Frank Shearar <frank.shea...@gmail.com>:
>>> On 3 September 2011 18:50, Lukas Renggli <reng...@gmail.com> wrote:
>>>> I think it is a good idea to have the number parser separate, after
>>>> all it might also make sense to use it separately.
>>>>
>>>> It seems that the new Smalltalk grammar is significantly slower. The
>>>> benchmark PPSmalltalkClassesTests class>>#benchmark: that uses the
>>>> source code of the collection hierarchy and does not especially target
>>>> number literals runs 30% slower.
>>>>
>>>> Also I see that "Number readFrom: ..." is still used within the
>>>> grammar. This seems to be a bit strange, no?
>>>
>>> Yes: it's a double-parse, which is a bit lame. First, we parse the
>>> literal with PPSmalltalkNumberParser, which ensures that the thing
>>> given to Number class >> #readFrom: is a well-formed token (so, in
>>> particular, Squeak's Number doesn't get to see anything other than a
>>> well-formed token).
>>>
>>> It sounds like you're happy with the basic concept, so maybe I should
>>> remove the Number class >> #readFrom: stuff, see if I can't remove the
>>> performance issues, and resubmit the patch.
>>>
>>> frank
>>>
>>
>> Yes, a NumberParser is essentially parsing, and this duplication sounds 
>> useless.
>> The main feature of interest in NumberParser that I consider a
>> requirement and should find its equivalence in a PetitNumberParser is:
>> - round a decimal representation to nearest Float
>> It's simple, just convert a Fraction asFloat in a single final step to
>> avoid cumulating round off errors - see
>> #makeFloatFromMantissa:exponent:base:
>>
>> The second feature of interest in NumberParser is the ability to
>> parser LargeInteger efficiently by avoiding (10 * largeValue +
>> digitValue) loops, and replacing them with a log(n) cost.
>> This would be a simple thing to implement in a functional language.
>
> Hopefully this won't offend your sensibilities too much :). It does,
> in fact, use 10* loops - I wrote an experimental "front half * rear
> half" recursion, which was slower in my benchmarks.
>
> This version has the grammar and parser doing no string->number
> conversion at all. PPSmalltalkNumberMaker supplies a number of utility
> methods designed to stop one from making malformed numbers. It also
> supplies a builder interface that the parser uses to construct
> numbers.
>
> frank
>
>> Nicolas
>>
>>>> Lukas
>>>>
>>>>
>>>> On 3 September 2011 17:18, Frank Shearar <frank.shea...@gmail.com> wrote:
>>>>> On 3 September 2011 15:56, Lukas Renggli <reng...@gmail.com> wrote:
>>>>>> On 3 September 2011 16:51, Frank Shearar <frank.shea...@gmail.com> wrote:
>>>>>>> Hi Lukas,
>>>>>>>
>>>>>>> I haven't :) mainly because I'm unsure where to put it - is there
>>>>>>> perhaps a PP Inbox, or shall I just post the merged version, or what's
>>>>>>> your preference? (How about an mcd between my merge and PP's head?)
>>>>>>
>>>>>> Just put the .mcz at some public URL (dropbox, squeak source, ...) or
>>>>>> attach it to a mail.
>>>>>
>>>>> Ah, great - here it is. You'll see I've written the grammar as a
>>>>> separate class. That was really more to make what I'd done more
>>>>> obvious and to minimise the change to PPSmalltalkGrammar, but perhaps
>>>>> it's not a bad idea anyway: it's easy to see the number literal
>>>>> subgrammar.
>>>>>
>>>>> frank
>>>>>
>>>>>> Lukas
>>>>>>
>>>>>> --
>>>>>> Lukas Renggli
>>>>>> www.lukas-renggli.ch
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Lukas Renggli
>>>> www.lukas-renggli.ch
>>>>
>>>>
>>>
>>>
>>
>>
>

Re: [Pharo-project] PetitSmalltalk: now with a self-contained grammar for number literals

Reply via email to