On 25 April 2012 21:11, Levente Uzonyi <[email protected]> wrote:
> On Fri, 20 Apr 2012, Frank Shearar wrote:
>
>> On 20 April 2012 03:51, Levente Uzonyi <[email protected]> wrote:
>>>
>>> On Thu, 19 Apr 2012, Frank Shearar wrote:
>>>
>>>> I found a serious bug in parsing numbers with negative exponents. It
>>>> was completely broken, in fact, parsing 1e-1 as 10, not 1 / 10. Anyway.
>>>> This version fixes that, and adds a bunch of tests demonstrating that
>>>> number parsing will return rationals if it can.
>>>>
>>>> It's significantly slower than Squeak's SqNumberParser:
>>>>
>>>> Time millisecondsToRun: [100000 timesRepeat: [SqNumberParser parse:
>>>> '1234567890']] => 466
>>>>
>>>> Time millisecondsToRun: [100000 timesRepeat: [PPSmalltalkNumberParser
>>>> parse: '1234567890']] => 32082
>>>>
>>>> I've attached a MessageTally spying on the latter: I've not much skill
>>>> in reading these, but nothing leaps out at me as being obviously
>>>> awful.
>>>
>>>
>>>
>>> Didn't check the code, just the tally, and I think that
>>> PPSmalltalkNumberParser(PPSmalltalkNumberGrammar)>>digitsBase: is begging
>>> for optimization. It's probably also the cause of the large amount of
>>> garbage, which causes a significant amount of time to be spent on garbage
>>> collection. It's also interesting that the finalization process does so
>>> much work; there may be something wrong with your image.
>>
>>
>> Thanks for taking a look, Levente.
>>
>> I'd expect digitsBase: to dominate the running costs, given that we're
>> parsing numbers.
>
>
> I finally checked the code and there's plenty of space for optimization.
> Note that the code can't be loaded into Squeak, because there's an invalid
> symbol #__gen__binding, and some methods with nil category.

Indeed: you might recall I've mentioned some incompatibilities between
Pharo and Squeak where PetitParser is concerned: Symbol >> isBinary
doesn't exist, you MUST have Scanner prefAllowUnderscoreSelectors:
true and Scanner allowUnderscoreAsAssignment: false, you need to
fiddle with RB so you can load AST-Core and AST-Compiler, and so on.
It's _possible_ (I wrote/am writing the number grammar in a Squeak
image), and once I've addressed _this_ incompatibility, I'll address
the others!
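
For anyone trying the load in Squeak, the two Scanner settings above can be evaluated in a workspace first. This is only a sketch: it assumes both selectors are class-side setters in your image, which may vary between Squeak versions.

```smalltalk
"Sketch only: the two Scanner settings named above, evaluated before loading.
 Assumes both are class-side setters in the image being used."
Scanner prefAllowUnderscoreSelectors: true.
Scanner allowUnderscoreAsAssignment: false.
```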

>> I do make a large number of throwaway "immutable" values with a
>> Builder-like pattern... in PPSmalltalkNumberParser >>
>> #makeNumberFrom:base:. That, I would imagine, could explain the
>> garbage?
>
>
> The current garbage collector is not optimal for large images and large
> amounts of garbage, so you should try to avoid creating it in
> performance-critical parts of your code.

I foolishly forgot to add the message tally. You'll notice the absence
of the weak finalization process: that might have been because the
image predated the recent weak finalization thrashing fix.

frank

>> If I may, what do you look for when reading the MessageTally? How do
>> you tell, for instance, that there's excessive garbage production?
>
>
> In your tally GC time was 20% of total time and another 20% for the
> finalization process. These numbers should be much lower, usually less than
> 1%.
>
>
>> That the incremental GCs take 7ms? (I'm reading Andreas' comments on
>> http://wiki.squeak.org/squeak/4210 again.)
>
>
> 7ms for an incremental GC is also a bit high, it should be around 1-2ms.
>
>
> Levente
>
>
>>
>> frank
>>
>>> Levente
>>>
>>>
>>>>
>>>> frank
>>>>
>>>> On 14 September 2011 20:26, Frank Shearar <[email protected]>
>>>> wrote:
>>>>>
>>>>>
>>>>> On 3 September 2011 19:35, Nicolas Cellier
>>>>> <[email protected]> wrote:
>>>>>>
>>>>>>
>>>>>> 2011/9/3 Frank Shearar <[email protected]>:
>>>>>>>
>>>>>>>
>>>>>>> On 3 September 2011 18:50, Lukas Renggli <[email protected]> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> I think it is a good idea to have the number parser separate, after
>>>>>>>> all it might also make sense to use it separately.
>>>>>>>>
>>>>>>>> It seems that the new Smalltalk grammar is significantly slower. The
>>>>>>>> benchmark PPSmalltalkClassesTests class>>#benchmark: that uses the
>>>>>>>> source code of the collection hierarchy and does not especially
>>>>>>>> target number literals runs 30% slower.
>>>>>>>>
>>>>>>>> Also I see that "Number readFrom: ..." is still used within the
>>>>>>>> grammar. This seems to be a bit strange, no?
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Yes: it's a double-parse, which is a bit lame. First, we parse the
>>>>>>> literal with PPSmalltalkNumberParser, which ensures that the thing
>>>>>>> given to Number class >> #readFrom: is a well-formed token (so, in
>>>>>>> particular, Squeak's Number doesn't get to see anything other than a
>>>>>>> well-formed token).
>>>>>>>
>>>>>>> It sounds like you're happy with the basic concept, so maybe I should
>>>>>>> remove the Number class >> #readFrom: stuff, see if I can't remove
>>>>>>> the performance issues, and resubmit the patch.
>>>>>>>
>>>>>>> frank
>>>>>>>
>>>>>>
>>>>>> Yes, a NumberParser is essentially parsing, and this duplication
>>>>>> sounds useless.
>>>>>> The main feature of interest in NumberParser that I consider a
>>>>>> requirement and should find its equivalent in a PetitNumberParser is:
>>>>>> - round a decimal representation to nearest Float
>>>>>> It's simple, just convert a Fraction asFloat in a single final step to
>>>>>> avoid accumulating round-off errors - see
>>>>>> #makeFloatFromMantissa:exponent:base:
>>>>>>
>>>>>> The second feature of interest in NumberParser is the ability to
>>>>>> parse LargeInteger efficiently by avoiding (10 * largeValue +
>>>>>> digitValue) loops, and replacing them with a log(n) cost.
>>>>>> This would be a simple thing to implement in a functional language.
>>>>>
>>>>>
>>>>>
>>>>> Hopefully this won't offend your sensibilities too much :). It does,
>>>>> in fact, use 10* loops - I wrote an experimental "front half * rear
>>>>> half" recursion, which was slower in my benchmarks.
>>>>>
>>>>> This version has the grammar and parser doing no string->number
>>>>> conversion at all. PPSmalltalkNumberMaker supplies a number of utility
>>>>> methods designed to stop one from making malformed numbers. It also
>>>>> supplies a builder interface that the parser uses to construct
>>>>> numbers.
>>>>>
>>>>> frank
>>>>>
>>>>>> Nicolas
>>>>>>
>>>>>>>> Lukas
>>>>>>>>
>>>>>>>>
>>>>>>>> On 3 September 2011 17:18, Frank Shearar <[email protected]>
>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 3 September 2011 15:56, Lukas Renggli <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 3 September 2011 16:51, Frank Shearar <[email protected]>
>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Hi Lukas,
>>>>>>>>>>>
>>>>>>>>>>> I haven't :) mainly because I'm unsure where to put it - is there
>>>>>>>>>>> perhaps a PP Inbox, or shall I just post the merged version, or
>>>>>>>>>>> what's your preference? (How about an mcd between my merge and
>>>>>>>>>>> PP's head?)
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Just put the .mcz at some public URL (dropbox, squeak source, ...)
>>>>>>>>>> or attach it to a mail.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Ah, great - here it is. You'll see I've written the grammar as a
>>>>>>>>> separate class. That was really more to make what I'd done more
>>>>>>>>> obvious and to minimise the change to PPSmalltalkGrammar, but
>>>>>>>>> perhaps it's not a bad idea anyway: it's easy to see the number
>>>>>>>>> literal subgrammar.
>>>>>>>>>
>>>>>>>>> frank
>>>>>>>>>
>>>>>>>>>> Lukas
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Lukas Renggli
>>>>>>>>>> www.lukas-renggli.ch
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Lukas Renggli
>>>>>>>> www.lukas-renggli.ch
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>
>>>
>>
>>
>
 - 16181 tallies, 16206 msec.

**Tree**
--------------------------------
Process: (40s) 80032: nil
--------------------------------
66.5% {10777ms} PPSmalltalkNumberParser class(PPCompositeParser class)>>parse:startingAt:
  66.2% {10728ms} PPSmalltalkNumberParser class(PPCompositeParser class)>>newStartingAt:
    66.2% {10728ms} PPSmalltalkNumberParser(PPCompositeParser)>>initializeStartingAt:
      27.0% {4376ms} PPSmalltalkNumberParser(PPCompositeParser)>>productionNames
        |8.2% {1329ms} Dictionary>>at:put:
        |  |3.9% {632ms} Association class>>key:value:
        |  |  |3.3% {535ms} Association>>key:value:
        |  |2.3% {373ms} primitives
        |  |1.9% {308ms} Dictionary(HashedCollection)>>atNewIndex:put:
        |  |  1.8% {292ms} Dictionary(HashedCollection)>>grow
        |7.2% {1167ms} ByteString(String)>>asSymbol
        |  |6.9% {1118ms} Symbol class>>intern:
        |  |  6.6% {1070ms} Symbol class>>lookup:
        |  |    6.5% {1053ms} WeakSet>>like:
        |  |      6.4% {1037ms} WeakSet>>scanFor:
        |  |        5.6% {908ms} ByteSymbol(Symbol)>>=
        |  |          5.0% {810ms} primitives
        |4.4% {713ms} Array(SequenceableCollection)>>includes:
        |  |2.7% {438ms} primitives
        |  |1.7% {276ms} Array(SequenceableCollection)>>indexOf:
        |  |  1.6% {259ms} Array(SequenceableCollection)>>indexOf:ifAbsent:
        |1.9% {308ms} PPSmalltalkNumberParser class(Behavior)>>allInstVarNames
        |  |1.4% {227ms} PPSmalltalkNumberGrammar class(Behavior)>>allInstVarNames
        |  |  1.1% {178ms} PPCompositeParser class(Behavior)>>allInstVarNames
        |1.8% {292ms} PPSmalltalkNumberParser class(PPCompositeParser class)>>ignoredNames
        |  |1.5% {243ms} PPCompositeParser class(Behavior)>>allInstVarNames
        |  |  1.1% {178ms} PPDelegateParser class(Behavior)>>allInstVarNames
        |1.3% {211ms} Array(SequenceableCollection)>>collect:
      10.1% {1637ms} PPSmalltalkNumberParser(PPSmalltalkNumberGrammar)>>exponent
        |8.5% {1378ms} PPPredicateObjectParser class>>anyOf:
        |  5.0% {810ms} ByteString(Object)>>printString
        |    |4.5% {729ms} ByteString(Object)>>printStringLimitedTo:
        |    |  3.0% {486ms} String class(SequenceableCollection class)>>streamContents:limitedTo:
        |    |    |2.0% {324ms} LimitedWriteStream class(PositionableStream class)>>on:
        |    |    |  1.8% {292ms} primitives
        |    |  1.3% {211ms} ByteString(String)>>printOn:
        |    |    1.3% {211ms} ByteString(String)>>storeOn:
        |  2.9% {470ms} ByteString(String)>>,
        |    2.2% {357ms} primitives
      6.6% {1070ms} PPDelegateParser class(PPParser class)>>named:
        |6.4% {1037ms} PPDelegateParser(PPParser)>>name:
        |  5.0% {810ms} PPDelegateParser(PPParser)>>propertyAt:put:
        |    |4.1% {664ms} Dictionary>>at:put:
        |    |  2.4% {389ms} Association class>>key:value:
        |    |    |1.9% {308ms} primitives
        |    |  1.1% {178ms} primitives
        |  1.4% {227ms} primitives
      4.7% {762ms} PPSmalltalkNumberParser(PPSmalltalkNumberGrammar)>>sign
        |4.3% {697ms} Character>>asParser
        |  4.2% {681ms} PPLiteralObjectParser class(PPLiteralParser class)>>on:
        |    2.7% {438ms} Character(Object)>>printString
        |      2.7% {438ms} Character(Object)>>printStringLimitedTo:
        |        2.3% {373ms} String class(SequenceableCollection class)>>streamContents:limitedTo:
        |          1.7% {276ms} LimitedWriteStream class(PositionableStream class)>>on:
        |            1.5% {243ms} primitives
      4.6% {745ms} PPSmalltalkNumberParser>>decimalNumber
        |3.8% {616ms} PPSmalltalkNumberParser(PPSmalltalkNumberGrammar)>>decimalNumber
        |  2.1% {340ms} Character>>asParser
        |    2.1% {340ms} PPLiteralObjectParser class(PPLiteralParser class)>>on:
        |      1.1% {178ms} Character(Object)>>printString
        |        1.1% {178ms} Character(Object)>>printStringLimitedTo:
      4.3% {697ms} PPSmalltalkNumberParser(PPSmalltalkNumberGrammar)>>signedDigits
        |4.1% {664ms} PPSmalltalkNumberParser(PPSmalltalkNumberGrammar)>>signedDigitsBase:
        |  2.8% {454ms} Character>>asParser
        |    2.8% {454ms} PPLiteralObjectParser class(PPLiteralParser class)>>on:
        |      1.7% {276ms} Character(Object)>>printString
        |        1.6% {259ms} Character(Object)>>printStringLimitedTo:
        |          1.2% {194ms} String class(SequenceableCollection class)>>streamContents:limitedTo:
      3.1% {502ms} Dictionary>>keysAndValuesDo:
        |2.3% {373ms} Dictionary>>associationsDo:
      2.2% {357ms} PPSmalltalkNumberParser(PPSmalltalkNumberGrammar)>>scale
        |2.0% {324ms} Character>>asParser
        |  2.0% {324ms} PPLiteralObjectParser class(PPLiteralParser class)>>on:
        |    1.1% {178ms} Character(Object)>>printString
        |      1.1% {178ms} Character(Object)>>printStringLimitedTo:
      1.2% {194ms} PPSmalltalkNumberParser(PPSmalltalkNumberGrammar)>>number
        1.1% {178ms} BlockClosure>>asParser
          1.0% {162ms} PPPluggableParser class>>on:
7.9% {1280ms} PPDelegateParser>>parseOn:
  |7.9% {1280ms} PPActionParser>>parseOn:
  |  7.8% {1264ms} PPSequenceParser>>parseOn:
  |    6.4% {1037ms} PPDelegateParser>>parseOn:
  |      |6.2% {1005ms} PPSequenceParser>>parseOn:
  |      |  5.4% {875ms} PPFlattenParser>>parseOn:
  |      |    5.0% {810ms} PPPossessiveRepeatingParser>>parseOn:
  |      |      3.5% {567ms} PPPredicateObjectParser>>parseOn:
  |      |        2.6% {421ms} PPFailure class>>message:at:
  |      |          2.6% {421ms} PPFailure>>initializeMessage:at:
  |    1.2% {194ms} PPOptionalParser>>parseOn:
6.9% {1118ms} PPAndParser>>parseOn:
  |6.9% {1118ms} PPSequenceParser>>parseOn:
  |  5.3% {859ms} PPFlattenParser>>parseOn:
  |    |5.0% {810ms} PPDelegateParser>>parseOn:
  |    |  4.9% {794ms} PPFlattenParser>>parseOn:
  |    |    4.4% {713ms} PPPossessiveRepeatingParser>>parseOn:
  |    |      2.7% {438ms} PPPredicateObjectParser>>parseOn:
  |    |        1.9% {308ms} PPFailure class>>message:at:
  |    |          1.8% {292ms} PPFailure>>initializeMessage:at:
  |  1.0% {162ms} PPOptionalParser>>parseOn:
4.1% {664ms} Character>>asParser
  4.1% {664ms} PPLiteralObjectParser class(PPLiteralParser class)>>on:
    2.5% {405ms} Character(Object)>>printString
      |2.4% {389ms} Character(Object)>>printStringLimitedTo:
      |  2.0% {324ms} String class(SequenceableCollection class)>>streamContents:limitedTo:
      |    1.2% {194ms} LimitedWriteStream class(PositionableStream class)>>on:
    1.0% {162ms} ByteString(String)>>,
7.1% {1151ms} Array(SequenceableCollection)>>includes:
  7.0% {1134ms} Array(SequenceableCollection)>>indexOf:
    6.8% {1102ms} Array(SequenceableCollection)>>indexOf:ifAbsent:
      5.1% {827ms} Array(SequenceableCollection)>>indexOf:startingAt:ifAbsent:
        |3.2% {519ms} Character>>=
        |1.9% {308ms} primitives
      1.7% {276ms} primitives
4.0% {648ms} PPSmalltalkNumberParser>>makeNumberFrom:base:
  3.3% {535ms} PPSmalltalkNumberMaker>>makeNumber
    3.1% {502ms} PPSmalltalkNumberMaker>>makeInteger
      3.0% {486ms} PPSmalltalkNumberMaker>>toInteger:
        3.0% {486ms} PPSmalltalkNumberMaker>>toInteger:base:
          1.7% {276ms} Character>>digitValue

**Leaves**
5.8% {940ms} LimitedWriteStream class(PositionableStream class)>>on:
5.6% {908ms} ByteSymbol(Symbol)>>=
5.2% {843ms} ByteString(String)>>,
4.9% {794ms} PPFailure>>initializeMessage:at:
3.8% {616ms} Association>>key:value:
3.7% {600ms} Character>>=
3.4% {551ms} Dictionary>>at:put:
3.2% {519ms} PPSequenceParser class(PPListParser class)>>withAll:
2.8% {454ms} Array(SequenceableCollection)>>includes:
2.5% {405ms} Association class>>key:value:
2.4% {389ms} PPSmalltalkNumberGrammar class(ClassDescription)>>instVarNames
2.3% {373ms} Array(SequenceableCollection)>>indexOf:ifAbsent:
2.3% {373ms} Array(SequenceableCollection)>>indexOf:startingAt:ifAbsent:
2.3% {373ms} Dictionary>>associationsDo:
2.1% {340ms} PPOptionalParser(PPDelegateParser)>>setParser:
1.8% {292ms} ByteString class(String class)>>new:
1.7% {276ms} Character>>digitValue
1.6% {259ms} PPLiteralObjectParser class(PPLiteralParser class)>>on:message:
1.5% {243ms} LimitedWriteStream>>nextPut:
1.4% {227ms} PPDelegateParser(PPParser)>>name:
1.3% {211ms} PPPossessiveRepeatingParser>>parseOn:
1.3% {211ms} Array(SequenceableCollection)>>collect:
1.3% {211ms} LimitedWriteStream>>setLimit:limitBlock:
1.2% {194ms} PPSequenceParser>>parseOn:
1.1% {178ms} PPFlattenParser>>create:start:stop:
1.0% {162ms} PPPredicateObjectParser class>>anyOf:

**Memory**
        old                     +0 bytes
        young           -312,908 bytes
        used            -312,908 bytes
        free            +312,908 bytes

**GCs**
        full                    0 totalling 0ms (0.0% uptime)
        incr            807 totalling 3,660ms (23.0% uptime), avg 5.0ms
        tenures         0
        root table      0 overflows
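
Reading the tree above, about 66% of the samples sit under PPCompositeParser class>>newStartingAt:, reached from the class-side parse:startingAt:. That suggests each class-side parse: builds a fresh parser before parsing anything. A minimal sketch of a benchmark that separates construction from parsing follows. It assumes that PPSmalltalkNumberParser new answers a ready-to-use instance and that the instance-side parse: can be called repeatedly on it; I have not measured this variant.

```smalltalk
"Sketch: build the parser once and reuse it, so that the cost of
 initializeStartingAt: is paid a single time rather than per parse.
 Assumes the instance can safely be reused across calls."
| parser |
parser := PPSmalltalkNumberParser new.
Time millisecondsToRun: [
	100000 timesRepeat: [parser parse: '1234567890']]
```

If the reused instance is much faster, the remaining time is the actual parsing (the parseOn: and makeNumberFrom:base: branches), which makes a fairer comparison with SqNumberParser.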
