On Wed, 25 Apr 2012, Eliot Miranda wrote:
On Wed, Apr 25, 2012 at 1:11 PM, Levente Uzonyi <[email protected]> wrote:
On Fri, 20 Apr 2012, Frank Shearar wrote:
On 20 April 2012 03:51, Levente Uzonyi <[email protected]> wrote:
On Thu, 19 Apr 2012, Frank Shearar wrote:
I found a serious bug in parsing numbers with negative
exponents. It
was completey broken, in fact, parsing 1e-1 as 10, not
1 / 10. Anyway.
This version fixes that, and adds a bunch of tests
demonstrating that
number parsing will return rationals if it can.
It's significantly slower than Squeak's SqNumberParser:
Time millisecondsToRun: [100000 timesRepeat:
[SqNumberParser parse:
'1234567890']] => 466
Time millisecondsToRun: [100000 timesRepeat:
[PPSmalltalkNumberParser
parse: '1234567890']] => 32082
I've attached a MessageTally spying on the latter: I've not much skill
in reading these, but nothing leaps out at me as being obviously
awful.
Didn't check the code, just the tally, and I think that
PPSmalltalkNumberParser(PPSmalltalkNumberGrammar)>>digitsBase: is begging
for optimization. It's probably also the cause of the high amount of garbage
which causes significant amount of time spent with garbage collection.
It's also interesting is that the finalization process does so much work,
there may be something wrong with your image.
Thanks for taking a look, Levente.
I'd expect digitsBase: to dominate the running costs, given that we're
parsing numbers.
I finally checked the code and there's plenty of space for optimization. Note
that the code can't be loaded into Squeak, because there's an invalid
symbol #__gen__binding, and some methods with nil category.
I do make a large number of throwaway "immutable" values with a
Builder-like pattern... in PPSmalltalkNumberParser >>
#makeNumberFrom:base:. That, I would imagine, could explain the
garbage?
The current garbage collector is not optimal for large images and large amount
of garbage, so you should try avoid creating it in performance
critial parts of your code.
If I may, what do you look for when reading the MessageTally? How do
you tell, for instance, that there's excessive garbage production?
In your tally GC time was 20% of total time and another 20% for the
finalization process. These numbers should be much lower, usually less than 1%.
That the incremental GCs take 7ms? (I'm reading Andreas' comments on
http://wiki.squeak.org/squeak/4210 again.)
7ms for an incremental GC is also a bit high, it should be around 1-2ms.
Careful. Absolute GC times (as any run-time) depend on the machine. On a slow
machine GC runs slow...
Of course, but the incremental GC is intended to be fast enough to
be unnoticable even during audio playback, so it shouldn't take more than
1-2ms on a slower system.
Levente
Levente
frank
Levente
frank
On 14 September 2011 20:26, Frank Shearar
<[email protected]> wrote:
On 3 September 2011 19:35, Nicolas Cellier
<[email protected]> wrote:
2011/9/3 Frank Shearar
<[email protected]>:
On 3 September 2011 18:50, Lukas Renggli
<[email protected]> wrote:
I think it is a good idea to
have the number parser
separate, after
all it might also make sense to
use it separately.
It seems that the new Smalltalk
grammar is significantly
slower. The
benchmark PPSmalltalkClassesTests
class>>#benchmark: that
uses the
source code of the collection
hierarchy and does not
especially target
number literals runs 30% slower.
Also I see that "Number readFrom:
..." is still used
within the
grammar. This seems to be a bit
strange, no?
Yes: it's a double-parse, which is a
bit lame. First, we parse the
literal with PPSmalltalkNumberParser,
which ensures that the thing
given to Number class >> #readFrom:
is a well-formed token (so, in
particular, Squeak's Number doesn't
get to see anything other than a
well-formed token).
It sounds like you're happy with the
basic concept, so maybe I should
remove the Number class >> #readFrom:
stuff, see if I can't remove the
performance issues, and resubmit the
patch.
frank
Yes, a NumberParser is essentially parsing,
and this duplication sounds
useless.
The main feature of interest in
NumberParser that I consider a
requirement and should find its equivalence
in a PetitNumberParser is:
- round a decimal representation to nearest
Float
It's simple, just convert a Fraction
asFloat in a single final step to
avoid cumulating round off errors - see
#makeFloatFromMantissa:exponent:base:
The second feature of interest in
NumberParser is the ability to
parser LargeInteger efficiently by avoiding
(10 * largeValue +
digitValue) loops, and replacing them with
a log(n) cost.
This would be a simple thing to implement
in a functional language.
Hopefully this won't offend your sensibilities
too much :). It does,
in fact, use 10* loops - I wrote an experimental
"front half * rear
half" recursion, which was slower in my
benchmarks.
This version has the grammar and parser doing no
string->number
conversion at all. PPSmalltalkNumberMaker
supplies a number of utility
methods designed to stop one from making
malformed numbers. It also
supplies a builder interface that the parser uses
to construct
numbers.
frank
Nicolas
Lukas
On 3 September 2011 17:18,
Frank Shearar
<[email protected]>
wrote:
On 3 September 2011
15:56, Lukas Renggli
<[email protected]> wrote:
On 3 September 2011
16:51, Frank
Shearar
<[email protected]>
wrote:
Hi Lukas,
I haven't :)
mainly
because I'm
unsure
where to put
it - is
there
perhaps a PP
Inbox, or
shall I just
post the
merged
version, or
what's
your
preference? (How
about an mcd
between
my merge and
PP's
head?)
Just put the .mcz
at some public
URL (dropbox,
squeak source, ...)
or
attach it to a mail.
Ah, great - here it is.
You'll see I've
written the grammar as a
separate class. That was
really more to make
what I'd done more
obvious and to minimise
the change to
PPSmalltalkGrammar, but
perhaps
it's not a bad idea
anyway: it's easy to see
the number literal
subgrammar.
frank
Lukas
--
Lukas Renggli
www.lukas-renggli.ch
--
Lukas Renggli
www.lukas-renggli.ch
--
best,Eliot