2017-11-23 23:37 GMT+01:00 Clément Bera <[email protected]>:

> Hi Nicolas.
>
> Just some comments:
>
> Another thing you can try is to remove the allocation of Opal's IR. It
> seems people use the IR only through the IRBuilder, so the API can be kept
> but it can generate bytecode directly instead of IR then bytecode. Removing
> those allocations would speed things up. It means merging IRFix /
> IRTranslator / IRBytecodeGenerator somehow and having the IRBuilder API
> directly call the new resulting merged class.
>

Yes, I was thinking of this optimization: use IR only when we need advanced
tools/instrumentation.
> Another thing is that when Opal became the default compiler, I evaluated
> the speed and saw it was slower, but when loading large projects it seemed
> loading time was dominated by Monticello / source reading / source loading
> and the compilation time was overall not that significant (< 20% of time).
> I don't know if this is still the case with Git. I have problems currently
> when editing some large methods (it seems in the IDE the method is compiled
> at each keystroke...) and when doing OpalCompiler recompileAll (which I do
> often since I edit bytecode sets), but otherwise the performance of Opal
> seems to be OK. Evaluation performance may be relevant in some cases, but
> I've never found such cases outside of the IDE in production.
>
> Best !
>
> Clement
>

Right, my case was generating the strings, so the Compiler was the main
contributor. For the more general case, it should be benchmarked. I was
thinking of a tool for checking the example comments
"some code example >>> some result".

> On Thu, Nov 23, 2017 at 9:41 PM, Nicolas Cellier <
> [email protected]> wrote:
>
>>
>>
>> 2017-11-22 0:31 GMT+01:00 Ben Coman <[email protected]>:
>>
>>>
>>>
>>> On 22 November 2017 at 05:49, Nicolas Cellier <
>>> [email protected]> wrote:
>>>
>>>>
>>>>
>>>> 2017-11-21 14:19 GMT+01:00 Nicolas Cellier <
>>>> [email protected]>:
>>>>
>>>>> I have an ArbitraryPrecisionFloatTest doing an exhaustive test for
>>>>> printing and reevaluating all positive half-precision floats.
>>>>>
>>>>> That's about 2^15, or approximately a 32k loop, which evaluates
>>>>> snippets like
>>>>>
>>>>> (ArbitraryPrecisionFloat readFrom: '1.123' readStream numBits: 10)
>>>>>
>>>>> The test was naively written with Compiler evaluate: and was using
>>>>> the legacy Compiler.
>>>>>
>>>>> If I rewrite it as self class compiler evaluate:, the test times out.
>>>>> Let's see what increase is necessary:
>>>>>
>>>>> [ ArbitraryPrecisionFloatTest new testPrintAndEvaluate ]
>>>>> timeToRun.
>>>>> -> 3s with legacy Compiler
>>>>> -> 14s with OpalCompiler
>>>>>
>>>>> It's not unexpected that intermediate representation (IR) reification
>>>>> has a cost, but here the 4.5x is a bit too much...
>>>>> This test already accounted for 1/4 of the total test duration (3s out
>>>>> of 12s).
>>>>> With Opal, the total test duration doubles... (14s out of 23s)
>>>>>
>>>>> So let's analyze the hot spot with:
>>>>>
>>>>> MessageTally spyOn: [ ArbitraryPrecisionFloatTest new
>>>>> testPrintAndEvaluate ].
>>>>>
>>>>> (I didn't use AndreasSystemProfiler because the output seems a bit
>>>>> garbled; no matter, since the primitives do not account for that much,
>>>>> a MessageTally will do the job)
>>>>>
>>>>> I first see a hot spot which does not seem that necessary:
>>>>>
>>>>> | |24.6% {3447ms} RBMethodNode(RBProgramNode)>>formattedCode
>>>>>
>>>>> From the comments I understand that AST-based stuff requires a pattern
>>>>> (DoIt) and an explicit return (^), but this expensive formatting seems
>>>>> too much for just evaluating. I think that we should change that.
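As an aside, the formatting cost can be confirmed in isolation with a quick
sketch like this (RBParser parseExpression: and formattedCode are the
standard Pharo messages; the 32768 count just mirrors the test loop):

```smalltalk
"Measure the cost of formattedCode alone, outside the compiler pipeline.
The iteration count mirrors the ~2^15 evaluations of the test."
| ast |
ast := RBParser parseExpression:
	'(ArbitraryPrecisionFloat readFrom: ''1.123'' readStream numBits: 10)'.
[ 32768 timesRepeat: [ ast formattedCode ] ] timeToRun.
```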
>>>>>
>>>>> Then comes:
>>>>>
>>>>> | |20.7% {2902ms} RBMethodNode>>generate:
>>>>>
>>>>> which is split in two halves, AST->IR and IR->bytecode:
>>>>>
>>>>> | | |9.3% {1299ms} RBMethodNode>>generateIR
>>>>> | | | |11.4% {1596ms} IRMethod>>generate:
>>>>>
>>>>> But then I see this cost a 2nd time, which also leaves room for
>>>>> progress:
>>>>>
>>>>> | |10.9% {1529ms} RBMethodNode>>generateIR
>>>>> | | |12.9% {1814ms} IRMethod>>generate:
>>>>>
>>>>> The first is in RBMethodNode>>generateWithSource, the second in
>>>>> OpalCompiler>>compile
>>>>>
>>>>> Last comes the parse time (sourceCode -> AST):
>>>>>
>>>>> | 13.2% {1858ms} OpalCompiler>>parse
>>>>>
>>>>> Along with semantic analysis:
>>>>>
>>>>> | 6.0% {837ms} OpalCompiler>>doSemanticAnalysis
>>>>>
>>>>> -----------------------------------
>>>>>
>>>>> For comparison, the legacy Compiler decomposes into:
>>>>>
>>>>> | |61.5% {2223ms} Parser>>parse:class:category:noPattern:context:notifying:ifFail:
>>>>>
>>>>> which more or less covers parse time + semantic analysis time.
>>>>> That means that Opal does a fair job at this stage.
>>>>>
>>>>> Then, the direct AST->bytecode phase is:
>>>>>
>>>>> | 16.9% {609ms} MethodNode>>generate
>>>>>
>>>>> IR costs almost 5x on this phase, but we know it's the price to pay
>>>>> for the additional features that it potentially offers. If only we
>>>>> would do it once...
>>>>>
>>>>> And that's all for the legacy one...
>>>>>
>>>>> --------------------------------------
>>>>>
>>>>> This little exercise shows that a 2x acceleration of OpalCompiler
>>>>> evaluate seems achievable:
>>>>> - simplify the uselessly expensive code formatting
>>>>> - generate bytecodes once, not twice
>>>>>
>>>>> Then it will be a bit more than 2x slower than legacy, which is a
>>>>> better trade for the superior features potentially brought by Opal.
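A rough sketch of the "generate once" idea, as a memoization on the AST node
(the compiledMethodCache variable and generateCached: selector are
hypothetical names, not existing Opal API):

```smalltalk
"Hypothetical memoization so that translate and compile share one
bytecode generation instead of triggering IRMethod>>generate: twice.
compiledMethodCache would be a new instance variable of RBMethodNode."
RBMethodNode >> generateCached: aTrailer
	^ compiledMethodCache ifNil: [
		compiledMethodCache := self generate: aTrailer ]
```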
>>>>>
>>>>> It would be interesting to carry out the same analysis on method
>>>>> compilation.
>>>>>
>>>>
>>>> Digging further, here is what I find:
>>>>
>>>> compile sends generate: and answers a CompiledMethod.
>>>> translate sends compile but throws the CompiledMethod away, and just
>>>> answers the AST.
>>>> Most senders of translate will also generate: (thus we generate: twice
>>>> quite often, losing a 2x factor in compilation).
>>>>
>>>> A 2x gain is a huge gain when installing big code bases, especially if
>>>> the custom is to throw the image away and reconstruct.
>>>> No matter if a bot does the job, it does it for twice as many watts,
>>>> and in the end we're waiting for the bot.
>>>>
>>>> However, before changing anything, further clarification is required:
>>>> translate does one more thing, it catches ReparseAfterSourceEditing and
>>>> retries compilation (once).
>>>> So my question: are there some cases where generate: will cause
>>>> ReparseAfterSourceEditing?
>>>>
>>>
>>> I don't know the full answer about other cases, but I can provide the
>>> background on why ReparseAfterSourceEditing was added.
>>>
>>> IIRC, a few years ago with the move to an AST-based system, there was a
>>> problem with syntax highlighting where the AST referenced its original
>>> source, which caused highlighting offsets when the referenced source was
>>> modified in the editor.
>>> Trying to work backwards from modified source to update all AST
>>> elements' source locations proved an intractable problem.
>>> The workaround I found was to move only in a forward direction,
>>> regenerating the AST from source on every keystroke.
>>> Performance was acceptable, so this became the permanent solution.
>>>
>>> I don't have access to an image to check, but you should find
>>> ReparseAfterSourceEditing raised in only one location, near editor
>>> #changed:
>>> Maybe this should activate only for interactively modified code, and be
>>> disabled/bypassed for bulk code loading.
>>> For testing purposes, commenting it out should not harm the system, just
>>> produce visual artifacts in syntax highlighting.
>>>
>>>
>>>
>>>> That could happen in the generation phase if some bytecode limit is
>>>> exceeded and an interactive handling corrects the code...
>>>> I did not see any such condition, but the code base is huge...
>>>>
>>>
>>> At worst, the impact should only be a temporary visual artifact,
>>> corrected on the next keystroke.
>>> (unless ReparseAfterSourceEditing has been adopted for other than its
>>> original purpose, but I'd guess not)
>>>
>>> cheers -ben
>>>
>>
>> Hi Ben,
>> Thanks for the information.
>> We must keep ReparseAfterSourceEditing; it does its job.
>>
>> But it just sounds like we have an inversion:
>>
>> translate (source code -> AST) does call compile (source
>> code -> AST -> bytecode in a CompiledMethod)
>>
>> I would expect the other way around: if we want to compile, we need to
>> translate first.
>> If we want to translate, we don't really need to compile, unless there's
>> a hidden reason...
>> Thus my question: is there a hidden reason?
>>
>
> --
> Clément Béra
> Pharo consortium engineer
> https://clementbera.wordpress.com/
> Bâtiment B 40, avenue Halley 59650 Villeneuve d'Ascq
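P.S. The layering I would expect could look roughly like this (a sketch
only: the selector bodies are illustrative, not the current Opal
implementation; the single ReparseAfterSourceEditing retry is preserved):

```smalltalk
"Sketch: translate stops at the AST; compile builds on translate,
so bytecode is generated exactly once per compilation."
OpalCompiler >> translate
	^ [ self parse ]
		on: ReparseAfterSourceEditing
		do: [ :ex | self parse ]

OpalCompiler >> compile
	^ self translate generate: CompiledMethodTrailer empty
```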
