I have put a slightly improved version here. At the end it adds a discussion
showing that the mapping still works (you can inspect
ConstantBlockClosure allSubInstances, and it can even show the block
highlighted in the home method).

        https://blog.marcusdenker.de/constant-blocks-in-pharo11

I will put this in the queue for the Pharo Dev blog next.

        Marcus

> On 20 May 2023, at 11:02, Marcus Denker <marcus.den...@inria.fr> wrote:
> 
> You might have come across code like this:
> 
> ```
> minHeight
>       "answer the receiver's minHeight"
>       ^ self
>               valueOfProperty: #minHeight
>               ifAbsent: [2]
> ```
> 
> In the case the #minHeight property is not set, it returns 2.
> 
> Code like this is quite common; another example is the empty ifAbsent: block:
> 
> 
> ```
> someDictionary remove: anObject ifAbsent: []
> ```
> 
> If we analyse the system, we can easily find all of them. The best way is to
> use the AST for this:
> 
> ```
> allBlocks := Smalltalk globals methods flatCollect: [:method | method ast blockNodes ].
> allBlocks size. "86805"
> 
> nonInlinedBlocks := allBlocks select: [:blockNode | blockNode isInlined not].
> nonInlinedBlocks size.  "36661"
> 
> "the blocks that are actually just constant"
> constantBlocks := nonInlinedBlocks select: [:blockNode | blockNode isConstant].
> constantBlocks size. "2572"
> ```
> 
> So there are 2572 constant (literal) blocks. You can inspect constantBlocks 
> to explore them:
> 
> <Constant.jpeg>
> 
> Constant or empty blocks ([] is just [nil]) do not feel like something to 
> think too much about.
> 
> After all, they just return the literal when you send #value to them. What
> could be the problem?
> 
> But they are blocks, and in a system without clean blocks they are full
> blocks, which means a new closure is created at runtime *every* time the
> [] expression is evaluated. And because they are blocks, there is a
> CompiledBlock for each of them; sending #value executes its bytecode, and
> the JIT has to generate machine code for it.
> 
> For Morph>>#minHeight the bytecode would be:
> 
> ```
> "'49 <4C> self
> 50 <20> pushConstant: #minHeight
> 51 <F9 01 00> fullClosure:a CompiledBlock: [2] NumCopied: 0
> 54 <A2> send: valueOfProperty:ifAbsent:
> 55 <5C> returnTop'"
> ```
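> 
> To look at the bytecode of a method yourself, something like this in a
> Playground should do. This is only a sketch: the exact output depends on the
> image version and on the compiler options that were active when the method
> was compiled.
> 
> ```
> "Print the symbolic bytecode of the method discussed above"
> (Morph >> #minHeight) symbolic.
> ```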
> 
> This is expensive! [2] is the same as 2 (the only thing we can do with the 
> block is to send #value, and we can do that with the literal directly).
> 
> ```
> [ 2 value ] bench.
> [ [2] value ] bench
> 
> 218625362.000/25750416.833 "8.490167884188363"
> ```
> 
> So there is a factor of more than 8 between the two for "create and evaluate"!
> 
> This has led people to actually rewrite code to use the literal directly,
> e.g. we could just change it to
> 
> 
> ```
> minHeight
>       "answer the receiver's minHeight"
>       ^ self
>               valueOfProperty: #minHeight
>               ifAbsent: 2
> ```
> 
> I am guilty of using this sometimes when optimizing for performance, but it
> does not feel nice. It is yet another performance rule to think about, and
> the number of constant blocks in the system shows that this is not how
> people want to write code. And, most importantly: it only works for 0-arg
> constant blocks, as literals understand #value, but not #value:,
> #value:value: and so on.
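> 
> A quick way to see the limitation in a Playground (a minimal sketch; the
> symbol #notUnderstood is just a marker chosen for this example):
> 
> ```
> "A literal answers itself when sent #value..."
> 2 value. "2"
> 
> "...but it does not understand #value:, unlike a 1-arg constant block"
> [:each | 2] value: 10. "2"
> [2 value: 10]
>       on: MessageNotUnderstood
>       do: [:ex | ex return: #notUnderstood]. "#notUnderstood"
> ```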
> 
> 
> So what can we do? The first thing (and I am sure you are thinking about it
> already) is the idea of clean blocks. Clean blocks are blocks that, to be
> created, only need information that the compiler has statically at compile
> time. You can look at RBProgramNode>>#isClean and the overrides in
> RBBlockNode>>#isClean and RBVariableNode>>#isClean to see the exact cases,
> but for this case, all you need to know is that a constant block, as it
> accesses nothing, is of course the trivial case of a clean block.
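> 
> As a sketch, the two properties can be queried on the AST of the example
> method. This assumes an image where Morph>>#minHeight still contains the
> [2] block, and uses only the selectors mentioned above:
> 
> ```
> "Collect isConstant -> isClean for every block node in the method"
> blocks := (Morph >> #minHeight) ast blockNodes.
> blocks collect: [:blockNode | blockNode isConstant -> blockNode isClean].
> ```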
> 
> If we compile them as clean blocks, we immediately move creation to compile
> time, and the runtime cost of creation will be the same as using a literal,
> with the added benefit that constant blocks with arguments are supported,
> too.
> 
> But: using "2" instead of [2] is not only faster for *creation*, it is
> faster when evaluating, too. The reason is that "2 value" sends #value,
> which executes Object>>#value, which is
> 
> ```
> value
> 
>       ^self
> ```
> 
> This is a "Quick return self" method, aka a primitive:
> 
> 
> ```           
> self symbolic   "'Quick return self'"
> ```
> 
> This is *very fast*. In contrast, even as a clean block, every constant
> block has its own method (a CompiledBlock) that the VM has to execute and
> thus create code for:
> 
> ```
> self symbolic
> 
> "'25 <20> pushConstant: 2
> 26 <5E> blockReturn'"
> ```
> 
> 
> It seems that for the JIT it does not make much of a difference that one is
> a quick return and the other a push/return; it matters more for the
> interpreter. But the JIT has to create code for *every* constant block, and
> #value means executing BlockClosure>>#value, which triggers execution of
> that CompiledBlock.
> 
> We thus have to execute two methods, not one. And the JIT has to cache all 
> the generated code.
> 
> So can we do better? It is actually easy to implement a class
> ConstantBlockClosure, a subclass of CleanBlockClosure, that implements all
> the #value methods to just return the constant value:
> 
> ```
> value
>       ^literal
> ```
> 
> Thus we get the same as with sending #value to the literal directly: we send 
> #value, we execute one method that is a quick return.
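> 
> The idea can be tried out in a Playground with a stand-in class. This is
> only a sketch: SketchConstantBlock, its package name and the literal:
> setter are hypothetical names made up for this example, it is not the real
> ConstantBlockClosure, and it assumes the classic class definition message
> is available in the image.
> 
> ```
> "Hypothetical stand-in for illustration, not the real ConstantBlockClosure"
> sketchClass := Object
>       subclass: #SketchConstantBlock
>       instanceVariableNames: 'literal'
>       classVariableNames: ''
>       package: 'Sketch-ConstantBlocks'.
> 
> "A setter for the constant, and #value as a plain return of it"
> sketchClass compile: 'literal: anObject
>       literal := anObject'.
> sketchClass compile: 'value
>       ^ literal'.
> 
> block := sketchClass new.
> block literal: 2.
> block value. "2"
> 
> "Print the bytecode of the sketch's #value method"
> (sketchClass >> #value) symbolic.
> ```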
> 
> And the good news: there is #optionConstantBlockClosure in the compiler, and 
> it is enabled by default in Pharo11!
> 
> The reason why we can turn on constant blocks without problems is that they
> are never on the stack, so we do not need to fix all the tools to know how
> to deal with them.
> (Constant blocks actually do have a CompiledBlock, so that e.g. for
> "senders of" we can check the literals just as if it were a normal clean
> block; it is just never executed.)
> 
> If we go back to our method #minHeight, this means the bytecode now looks
> like this:
> 
> 
> ```
> self symbolic "'49 <4C> self
> 50 <20> pushConstant: #minHeight
> 51 <21> pushConstant: [2]
> 52 <A2> send: valueOfProperty:ifAbsent:
> 53 <5C> returnTop'"
> ```
> 
> Thus, in Pharo11, the execution path of all the >2500 constant blocks ends
> up in one of the #value methods of ConstantBlockClosure. To get all the
> exceptions correct when sending e.g. #value to a 1-arg block, there are
> subclasses for 1/2/3 args; we do not support 4-arg cleanBlocks for now
> (there are not many).
> 
> If you want to check that this really works, go to
> ConstantBlockClosure>>#value and add a Counter via the Debug menu: it is
> really called a lot!
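> 
> Another quick check, as a sketch that assumes a Pharo11 image with the
> option enabled: ask a constant block for its class, and count the existing
> instances.
> 
> ```
> "Expected to answer ConstantBlockClosure (or one of its subclasses)"
> [2] class.
> 
> "Number of constant blocks currently alive in the image"
> ConstantBlockClosure allSubInstances size.
> ```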
> 
>       Marcus
> 
> 
