Hi Shaping,

> On May 18, 2020, at 6:52 PM, Shaping <shap...@uurda.org> wrote:
> 
> 
> 1.  Double-click text selection in both Squeak and Pharo shows a 75-100 ms 
> latency (eye-balled, estimated) between end of double click (button up on 
> second click) and time of highlighting of selected text.  It could be as low 
> as 60 ms, but I doubt it, and that’s still too long.  I can’t track the 
> latency in VW 8.3.2.  It’s too short, probably 30 ms or less, and is under my 
> noise floor.  Notepad latencies are even lower.  The difference between VW 
> and Notepad is not enough to complain about.  Neither is noticeable in 
> passing.  The difference between VW and Pharo/Squeak latencies is a little 
> painful/distracting.  It’s very much in your face, and you are keenly aware 
> that you are waiting for something to happen before you can resume your 
> thoughts about the code.
>  
> 2.  Stepping in the Pharo debugger is slow (Squeak is fine).  The latencies 
> between the step-click event and selection of the next evaluable is a solid 
> 100 ms (again estimated).  Feels more like 150-175 ms much of the time.  This 
> is actually hard to work with. 
>  
> Neither of these unequivocally demonstrates VM performance.
>  
> I know.  This comment is not about the VM.  VM performance is another issue.  
> This comment is only about general usability as a function of the latencies, 
> whatever the cause. 
>  
>  Both are more likely to derive from overall GUI architecture.
>  
> Yup.
>  
> In particular, VW’s display architecture is a direct stimulus-response i/o 
> model where input results in a transformation producing immediate rendering, 
> whereas Morphic is an animation architecture where input results in a new 
> state but no rendering.  The Morphic GUI is rendered separately on every 
> “step” of the system.
>  
> Okay.
>  
>  Hence graphics output necessarily lags input on Morphic. So these speed 
> differences have nothing to do with vm performance and everything to do with 
> GUI architecture.
>  
> Both Squeak and Pharo show the same delay for text selection latency.   The 
> architecture difference is not likely causing that. 

Given that both Pharo and Squeak use Morphic, and hence both have the same 
render-in-step architecture, isn’t the fact that they show the same performance 
issue evidence that points to precisely this being the cause?

>  How do we index or look up the word rectangle to render?  I’m thinking that is 
> more likely the cause.  Is a map created at method compile time and updated  
> after text is moved during edits?

My understanding is that damage rectangles are retrieved, combined to produce a 
smaller (non-overlapping?) set, and that the entire morph tree is asked to 
render within these damage rectangles.  You can read the code for yourself.
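
Here’s a purely illustrative sketch of that idea (not the actual Morphic code; 
the class and selector choices are my own): each change records a damage 
rectangle, overlapping rectangles are merged into a smaller set, and the morph 
tree is then asked to redraw only within that set.

    | damage merged |
    damage := OrderedCollection new.
    damage add: (10@10 corner: 50@30).    "e.g. the old selection highlight"
    damage add: (40@10 corner: 120@30).   "e.g. the new selection highlight"
    merged := OrderedCollection new.
    damage do: [:r |
        | hit |
        hit := merged detect: [:m | m intersects: r] ifNone: [nil].
        hit
            ifNil: [merged add: r]
            ifNotNil: [merged remove: hit; add: (hit merge: r)]].
    merged   "fewer, larger rectangles; morphs redraw only inside these"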

>  Where is VisualWorks significantly faster than either Squeak or Pharo?  
>  
> VW 8.3.2 faster:
>  
> 1. Text selection. 
>  
> 2. Repeat-key rate in VW is smoother (not perfect; I see a few pauses).  
> Pharo’s repeat-key rate is the same or a little slower, with more pauses, and 
> the distribution of those pause-times is slightly wider for Pharo 9, as if 
> event flow isn’t as smooth as it could be (because text/cursor rendering is 
> not efficient?).  This is a minor issue, not a practical 
> problem.  I did the test in a workspace in both cases.
>  
>  
> Pharo 9 same or faster:
>  
> Everything else in the GUI, like window openings/closings, menu 
> openings/closings work at nearly the same speed, or Pharo 9 is faster.
>  
> Opening a system browser in VW 8.3.2 and Pharo 9 takes about the same time.  
> If you scrutinize, you can see that Pharo system browser open times are often 
> about 2/3 to 4/5 of the VW times.  This action is never faster in VW. 
>  
> Popup menus in Pharo 9 are noticeably faster than those in VW 8.3.2.   
> Instant--delightful.
>  
>  
> Specifically which VisualWorks VM or lower level facilities are much faster 
> than the Cog VM?  Do you have benchmarks?
>  
> No, I don’t, but I find the subject interesting, and would like to pursue it. 
>  I’m trying to get some pressing work done in VW (as I contemplate jumping 
> ship to Pharo/Squeak).  It’s not a good time for excursions, but here I am 
> playing with Squeak/Pharo, anyway.  I want to dig deeper at some future date.
>  
> Do you have a specific procedure you like to use when benchmarking the VW VM? 
>  
> Any VM.  Express the benchmark as a block.  If the benchmark is not trying to 
> measure JIT and/or GC overhead then before the block is run make sure to put 
> the vm in some initialized state wrt JITting and/or GC, eg by voiding the JIT 
> code cache,
>  
> How is the JIT code cache cleared?

Dialect dependent.  In Squeak/Pharo/Cuis it’s IIRC Smalltalk voidCogVMState.  Can’t 
remember how it’s done in VW.
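
A minimal sketch of the whole warm-up-then-measure pattern in Squeak/Pharo 
(selectors from memory; the workload is just an example):

    | bench |
    bench := [1 to: 1000000 do: [:i | i sqrt]].   "example workload only"
    Smalltalk voidCogVMState.   "discard JITted code"
    Smalltalk garbageCollect.   "start from a freshly collected heap"
    bench value.                "first run pays the JIT compilation cost"
    Transcript show: (Time microsecondsToRun: bench) printString, ' µs (warm)'; cr.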

> and/or forcing a scavenge or a global GC.  Then run the block twice, 
> reporting its second iteration, to ensure all code is JITted.
>  
> Okay, so the above procedure tests execution-engine efficiency apart from JIT 
> and GC efficiency.
>  
> If attempting to measure JIT and/or GC overhead then do the same wrt getting 
> the vm to some baseline consistent initial state
>  
> Baseline state:  the only thing that comes to mind here is Collect All 
> Garbage.

There’s also Smalltalk garbageCollectMost which just runs a scavenge.  IIRC 
someInstance has a side effect of running a scavenge in VW.
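
For example (Squeak/Pharo; from memory), either of these before the timed run 
gives a more consistent heap baseline:

    Smalltalk garbageCollect.       "full GC: slower, but the most reproducible baseline"
    Smalltalk garbageCollectMost.   "scavenge only: cheap, collects just new space"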

>  
> and then ensure, through the relevant introspection primitives,
>  
> What are these?  What state features am I introspecting after the test?  
> Sizes of heap subspaces?  I can do Time microsecondsToRun: on the blocks.

In Squeak/Pharo/Cuis see Smalltalk vmParameterAt: or Smalltalk vm parameterAt: 
and senders.
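
For example, a sketch of that kind of check, using the Squeak form (Pharo is 
Smalltalk vm parameterAt:).  The parameter indices here are from memory -- check 
the vmParameterAt: method comment in your image:

    | scavengesBefore fullGCsBefore |
    scavengesBefore := Smalltalk vmParameterAt: 9.   "scavenges since startup"
    fullGCsBefore := Smalltalk vmParameterAt: 7.     "full GCs since startup"
    1 to: 100000 do: [:i | Array new: 10].          "workload expected to allocate"
    Transcript
        show: 'scavenges: ', ((Smalltalk vmParameterAt: 9) - scavengesBefore) printString; cr;
        show: 'full GCs: ', ((Smalltalk vmParameterAt: 7) - fullGCsBefore) printString; cr.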

> that after the benchmark has run the events desired to be benchmarked have 
> actually taken place.
>  
> I’m thinking most checks on state will involve running more Smalltalk, not 
> just primitives.
>  
> If a microbenchmark then ensure that eg loop, block-invocation, and arithmetic 
> overheads are either minimised wrt the code being benchmarked or subtracted 
> from the code being benchmarked.
>  
> Okay.
>  
> I.e., make sure the benchmark is repeatable (benchmark from an initialized 
> state), and make sure the benchmark measures what is intended to be benchmarked 
> and not some overhead.
>  
> Right.  I don’t see how to guarantee some known starting state except to 
> collect all garbage.

See answers above.
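
Putting the above together, a sketch of a repeatable microbenchmark that also 
subtracts loop/block overhead (Squeak/Pharo; selectors from memory):

    | n work empty raw overhead |
    n := 1000000.
    work := [1 to: n do: [:i | i sqrt]].   "code being measured, plus loop overhead"
    empty := [1 to: n do: [:i | i]].       "same loop with no work: the overhead to subtract"
    Smalltalk voidCogVMState.
    Smalltalk garbageCollect.
    work value. empty value.               "warm-up run so both blocks are JITted"
    raw := Time microsecondsToRun: work.
    overhead := Time microsecondsToRun: empty.
    (raw - overhead) / n                   "approximate microseconds per sqrt"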

> Shaping
