I have an ArbitraryPrecisionFloatTest doing an exhaustive test of
printing and re-evaluating all positive half-precision floats.

That's about 2^15, i.e. roughly a 32k-iteration loop, which evaluates snippets like

    (ArbitraryPrecisionFloat readFrom: '1.123' readStream numBits: 10)

The test was naively written with Compiler evaluate:, and was thus using
the legacy Compiler.

If I rewrite it as self class compiler evaluate: the test times out.
Let's see what increase is necessary:

    [ ArbitraryPrecisionFloatTest new testPrintAndEvaluate  ] timeToRun.
    -> 3s with legacy Compiler
    -> 14s with OpalCompiler
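
For reference, the two forms under comparison look like this (a sketch;
the source string stands in for each printed float, and self class
compiler is as written inside the test method):

    "Original form: the legacy Compiler"
    Compiler evaluate: '(ArbitraryPrecisionFloat readFrom: ''1.123'' readStream numBits: 10)'.

    "Rewritten form: goes through the class's configured compiler (Opal)"
    self class compiler evaluate: '(ArbitraryPrecisionFloat readFrom: ''1.123'' readStream numBits: 10)'.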

It's not unexpected that reifying an intermediate representation (IR) has
a cost, but here the 4.5x factor is a bit too much...
This test did account for 1/4 of total test duration already (3s out of
12s).
With Opal, the total test duration doubles... (14s out of 23s)

So let's analyze the hot spot with:

    MessageTally spyOn: [ ArbitraryPrecisionFloatTest new testPrintAndEvaluate ].

(I didn't use AndreasSystemProfiler because its output seems a bit
garbled; no matter, since the primitives do not account for much here, a
MessageTally will do the job.)

I first see a hot spot which does not seem that necessary:

      |    |24.6% {3447ms} RBMethodNode(RBProgramNode)>>formattedCode

From the comments I understand that AST-based evaluation requires a
method pattern (DoIt) and an explicit return (^), but this expensive
formatting seems too much for just evaluating. I think that we should
change that.

Then comes:

      |    |20.7% {2902ms} RBMethodNode>>generate:

which is split in two halves, AST->IR and IR->bytecode:

      |    |  |9.3% {1299ms} RBMethodNode>>generateIR

      |    |  |  |11.4% {1596ms} IRMethod>>generate:

But then I see this cost a second time, which also leaves room for improvement:

      |                |10.9% {1529ms} RBMethodNode>>generateIR

      |                |  |12.9% {1814ms} IRMethod>>generate:

The first occurrence is in RBMethodNode>>generateWithSource, the second
in OpalCompiler>>compile.

Last comes the parse time (sourceCode -> AST):

      |                  13.2% {1858ms} OpalCompiler>>parse

Along with semantic analysis:

      |                  6.0% {837ms} OpalCompiler>>doSemanticAnalysis
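
Putting those tallies together, a single Opal evaluation runs roughly
this pipeline (a sketch; the selectors are the ones from the profile,
while the receivers, ordering, and the elided trailer argument are my
assumptions):

    "Approximate shape of one OpalCompiler evaluate: round trip"
    ast := compiler parse.            "source -> AST             (~13.2%)"
    compiler doSemanticAnalysis.      "scope and name analysis   (~6.0%)"
    ast formattedCode.                "pretty-printing the DoIt  (~24.6%)"
    ir := ast generateIR.             "AST -> IR      (~10%, done twice)"
    method := ir generate: trailer.   "IR -> bytecode (~12%, done twice)"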

-----------------------------------

For comparison, the legacy Compiler decomposes into:

      |        |61.5% {2223ms} Parser>>parse:class:category:noPattern:context:notifying:ifFail:

which more or less covers parse time plus semantic analysis time.
That means Opal performs reasonably for this stage.

Then, the direct AST->bytecode phase is:

     |      16.9% {609ms} MethodNode>>generate

The IR costs almost 5x on this phase, but we know it's the price to pay
for the additional features it potentially offers. If only we did it
once...

And that's all for the legacy one...

--------------------------------------

This little exercise shows that a 2x acceleration of OpalCompiler
evaluation seems achievable:
- simplify the uselessly expensive formatted-code step
- generate bytecodes once, not twice

That would still leave it a bit more than 2x slower than the legacy
Compiler, but a better trade for the superior features that Opal may yet
bring.
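
Progress on those two items could be tracked with a micro-benchmark
along these lines (a sketch; that Smalltalk compiler answers the Opal
compiler is an assumption about the image):

    "Time 1000 evaluations of one representative snippet, per compiler"
    | src |
    src := '(ArbitraryPrecisionFloat readFrom: ''1.123'' readStream numBits: 10)'.

    "Legacy Compiler, as the test originally used"
    [ 1000 timesRepeat: [ Compiler evaluate: src ] ] timeToRun.

    "Default (Opal) compiler"
    [ 1000 timesRepeat: [ Smalltalk compiler evaluate: src ] ] timeToRun.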

It would be interesting to carry out the same analysis on method compilation.
