I love the idea of a blog post. 
I like to be able to read about it when I want. 

S

> On 22 May 2023, at 14:12, Marcus Denker <[email protected]> wrote:
> 
> I have put a slightly improved version here; at the end it adds a discussion 
> showing that the mapping still works (you can inspect 
> ConstantBlockClosure allSubInstances and it can even show the block 
> highlighted in the home method).
> 
>       https://blog.marcusdenker.de/constant-blocks-in-pharo11
> 
> I will put this in the Queue for the Pharo Dev blog next.
> 
>       Marcus
> 
>> On 20 May 2023, at 11:02, Marcus Denker <[email protected]> wrote:
>> 
>> You might have come across code like this:
>> 
>> ```
>> minHeight
>>      "answer the receiver's minHeight"
>>      ^ self
>>              valueOfProperty: #minHeight
>>              ifAbsent: [2]
>> ```
>> 
>> If the #minHeight property is not set, it returns 2.
>> 
>> Code like this is quite common; another example is an empty ifAbsent: block:
>> 
>> 
>> ```
>> someDictionary remove: anObject ifAbsent: []
>> ```
>> 
>> If we analyse the system, we can easily find all of them. The best way is 
>> to use the AST for this:
>> 
>> ```
>> allBlocks := Smalltalk globals methods flatCollect: [:method | method ast blockNodes ].
>> allBlocks size. "86805"
>> 
>> nonInlinedBlocks := allBlocks select: [:blockNode | blockNode isInlined not].
>> nonInlinedBlocks size.  "36661"
>> 
>> "the blocks that are actually just constant"
>> constantBlocks := nonInlinedBlocks select: [:blockNode | blockNode isConstant].
>> constantBlocks size. "2572"
>> 
>> ```
>> 
>> So there are 2572 constant (literal) blocks. You can inspect constantBlocks 
>> to explore them:
>> 
>> <Constant.jpeg>
>> 
>> Constant or empty blocks ([] is just [nil]) do not feel like something to 
>> think too much about.
>> 
>> After all, they just return the literal when you send #value to them. What 
>> can be the problem?
>> 
>> But: they are blocks, and in a system without clean blocks, they are full 
>> blocks, which means they are created at runtime for *every* execution of the 
>> [] block. And they are blocks, so there is a CompiledBlock created for each 
>> and sending #value will execute that bytecode, with the JIT having to create 
>> binary code.
>> 
>> For Morph>>#minHeight the bytecode would be:
>> 
>> ```
>> "'49 <4C> self
>> 50 <20> pushConstant: #minHeight
>> 51 <F9 01 00> fullClosure:a CompiledBlock: [2] NumCopied: 0
>> 54 <A2> send: valueOfProperty:ifAbsent:
>> 55 <5C> returnTop'"
>> ```
>> 
>> This is expensive! [2] is the same as 2 (the only thing we can do with the 
>> block is to send #value, and we can do that with the literal directly).
>> 
>> ```
>> [ 2 value ] bench.
>> [ [2] value ] bench
>> 
>> 218625362.000/25750416.833 "8.490167884188363"
>> ```
>> 
>> So there is a factor of more than 8 between the two for "create and evaluate"!
>> 
>> This led people to actually rewrite code to use the literal directly, 
>> e.g. we could just change it to
>> 
>> 
>> ```
>> minHeight
>>      "answer the receiver's minHeight"
>>      ^ self
>>              valueOfProperty: #minHeight
>>              ifAbsent: 2
>> ```
>> 
>> I am guilty of using this sometimes when optimizing for performance, but it 
>> does not feel nice. It is yet another performance rule to think about, and 
>> the number of constant blocks in the system shows that this is not how 
>> people want to write code. And, most importantly: it only works for 0-arg 
>> constant blocks, as literals understand #value, but not #value:, 
>> #value:value: and so on.
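>> 
>> A minimal sketch of that limitation, assuming a stock Pharo image where 
>> numbers do not implement #value: (the symbol #notUnderstood is only a 
>> marker made up for this example):
>> 
>> ```
>> "Object>>#value answers the receiver, so the literal can stand in for [2]"
>> 2 value.
>> 
>> "There is no #value: on the literal, so it cannot replace [:each | 2]"
>> [ 2 value: 3 ]
>>      on: MessageNotUnderstood
>>      do: [ :error | error return: #notUnderstood ].
>> ```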
>> 
>> 
>> So what can we do? The first thing (and I am sure you are thinking about 
>> that already) is the idea of clean blocks. Clean blocks are blocks whose 
>> creation only needs information that the compiler has statically at 
>> compile time. You can look at RBProgramNode>>#isClean and the overrides 
>> RBBlockNode>>#isClean and RBVariableNode>>#isClean to see the exact cases, 
>> but for this case, all you need to know is that a constant block, as it 
>> accesses nothing, is of course the trivial case of a clean block.
>> 
>> If we compile them as clean blocks, we immediately move creation to 
>> compile time, and the runtime cost of creation will be the same as for a 
>> literal. With the added benefit that constant blocks with arguments are 
>> supported, too.
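>> 
>> To check a single block against these predicates, something like the 
>> following might work in a playground. It is only a sketch: it assumes 
>> that Morph>>#minHeight still contains the [2] block shown earlier and 
>> that it is the first block node of the method.
>> 
>> ```
>> blockNode := (Morph >> #minHeight) ast blockNodes first.
>> 
>> "Is the block just answering a literal?"
>> blockNode isConstant.
>> 
>> "Can the block be created from compile-time information only?"
>> blockNode isClean.
>> ```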
>> 
>> But: using "2" instead of [2] is not only faster for *creation*, it is 
>> faster when evaluating, too. The reason is that "2 value" sends #value, which 
>> executes Object>>#value, which is
>> 
>> ```
>> value
>> 
>>      ^self
>> ```
>> 
>> Which is a Quick return self method, aka a primitive:
>> 
>> 
>> ```          
>> self symbolic   "'Quick return self'"
>> ```
>> 
>> This is *very fast*. In contrast, even when compiled as clean blocks, 
>> every block has its own method (a CompiledBlock) that the VM has to 
>> execute and thus create code for:
>> 
>> ```
>> self symbolic "'25 <20> pushConstant: 2
>> 26 <5E> blockReturn'"
>> ```
>> 
>> 
>> It seems that, for the JIT, it makes little difference that one is a quick 
>> return and the other a push/return; it matters more for the interpreter. 
>> But the JIT has to create code for *every* constant block, and #value means 
>> executing BlockClosure>>#value, which triggers execution of that 
>> CompiledBlock.
>> 
>> We thus have to execute two methods, not one. And the JIT has to cache all 
>> the generated code.
>> 
>> So can we do better? It is actually easy to implement a class 
>> ConstantBlockClosure, a subclass of CleanBlockClosure, that implements all 
>> the #value methods to just return the constant value:
>> 
>> ```
>> value
>>      ^literal
>> ```
>> 
>> Thus we get the same as with sending #value to the literal directly: we send 
>> #value, we execute one method that is a quick return.
>> 
>> And the good news: there is #optionConstantBlockClosure in the compiler, and 
>> it is enabled by default in Pharo11!
>> 
>> The reason why we can turn on Constant Blocks without problems is that they 
>> are never on the stack, so we do not need to take care to fix all the tools 
>> to know how to deal with them.
>> (Constant Blocks actually do have a CompiledBlock, so that e.g. “senders 
>> of” can check the literals just as for a normal clean block; it is just 
>> never executed.)
>> 
>> If we go back to our method #minHeight, this means the bytecode looks like 
>> this:
>> 
>> 
>> ```
>> self symbolic "'49 <4C> self
>> 50 <20> pushConstant: #minHeight
>> 51 <21> pushConstant: [2]
>> 52 <A2> send: valueOfProperty:ifAbsent:
>> 53 <5C> returnTop'"
>> ```
>> 
>> Thus, in Pharo11, the execution path of all the >2500 constant blocks ends 
>> up executing one of the #value methods of ConstantBlockClosure. To get all 
>> the exceptions correct when sending e.g. #value to a 1-arg block, there are 
>> subclasses for 1/2/3 args; we do not support 4-arg clean blocks for now 
>> (there are not many).
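>> 
>> A rough way to see this from a playground (a sketch; which exception 
>> class is signalled is not stated here, so the handler catches Error in 
>> general):
>> 
>> ```
>> "A 1-arg constant block ignores its argument and answers the literal"
>> [ :each | 2 ] value: 42.
>> 
>> "Sending #value to the same block must still fail: wrong argument count"
>> [ [ :each | 2 ] value ]
>>      on: Error
>>      do: [ :error | error class ].
>> 
>> "Inspect which closure class the compiler picked"
>> [ :each | 2 ] class.
>> ```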
>> 
>> If you want to check that this really works, go to 
>> ConstantBlockClosure>>#value and add a Counter via the Debug menu, it's 
>> really called a lot!
>> 
>>      Marcus
>> 
>> 
> 
