I love the idea of a blog post. I like to be able to read about it when I want.
S

> On 22 May 2023, at 14:12, Marcus Denker <[email protected]> wrote:
>
> I have put a slightly improved version here. At the end it adds a discussion
> showing that the mapping still works (you can inspect
> ConstantBlockClosure allSubInstances and it can even show the block
> highlighted in the home method).
>
> https://blog.marcusdenker.de/constant-blocks-in-pharo11
>
> I will put this in the queue for the Pharo Dev blog next.
>
> Marcus
>
>> On 20 May 2023, at 11:02, Marcus Denker <[email protected]> wrote:
>>
>> You might have come across code like this:
>>
>> ```
>> minHeight
>>     "answer the receiver's minHeight"
>>     ^ self
>>         valueOfProperty: #minHeight
>>         ifAbsent: [2]
>> ```
>>
>> In case the #minHeight property is not set, it returns 2.
>>
>> Code like this is quite common; another example is empty ifAbsent blocks:
>>
>> ```
>> someDictionary remove: anObject ifAbsent: []
>> ```
>>
>> If we analyse the system, we can easily find all of them. The best way is
>> to use the AST for this:
>>
>> ```
>> allBlocks := Smalltalk globals methods flatCollect: [:method | method ast blockNodes ].
>> allBlocks size. "86805"
>>
>> nonInlinedBlocks := allBlocks select: [:blockNode | blockNode isInlined not].
>> nonInlinedBlocks size. "36661"
>>
>> "the blocks that are actually just constant"
>> constantBlocks := nonInlinedBlocks select: [:blockNode | blockNode isConstant].
>> constantBlocks size. "2572"
>> ```
>>
>> So there are 2572 constant (literal) blocks. You can inspect constantBlocks
>> to explore them:
>>
>> <Constant.jpeg>
>>
>> Constant or empty blocks ([] is just [nil]) do not feel like something to
>> think too much about.
>>
>> After all, they just return the literal when you send #value to them. What
>> could be the problem?
>>
>> But: they are blocks, and in a system without clean blocks, they are full
>> blocks, which means they are created at runtime for *every* execution of
>> the [] block.
>> And they are blocks, so there is a CompiledBlock created for each, and
>> sending #value will execute that bytecode, with the JIT having to create
>> binary code.
>>
>> For Morph>>#minHeight the bytecode would be:
>>
>> ```
>> "'49 <4C> self
>> 50 <20> pushConstant: #minHeight
>> 51 <F9 01 00> fullClosure:a CompiledBlock: [2] NumCopied: 0
>> 54 <A2> send: valueOfProperty:ifAbsent:
>> 55 <5C> returnTop'"
>> ```
>>
>> This is expensive! [2] is the same as 2 (the only thing we can do with the
>> block is to send #value, and we can do that with the literal directly).
>>
>> ```
>> [ 2 value ] bench.
>> [ [2] value ] bench
>>
>> 218625362.000/25750416.833 "8.490167884188363"
>> ```
>>
>> So there is a factor of more than 8 between the two for "create and
>> evaluate"!
>>
>> This led people to actually rewrite code to use the literal directly,
>> e.g. we could just change it to
>>
>> ```
>> minHeight
>>     "answer the receiver's minHeight"
>>     ^ self
>>         valueOfProperty: #minHeight
>>         ifAbsent: 2
>> ```
>>
>> I am guilty of using this sometimes when optimizing for performance, but it
>> does not feel nice. It is yet another performance rule to think about, and
>> the number of constant blocks in the system shows that this is not how
>> people want to do it. And, most important: it only works for 0-arg constant
>> blocks, as literals understand #value, but not #value:, #value:value: and
>> so on.
>>
>> So what can we do? The first thing (and I am sure you are thinking about
>> that already) is the idea of clean blocks. Clean blocks are blocks that, to
>> be created, only need information that the compiler has statically at
>> compile time. You can look at RBProgramNode>>#isClean and the overrides in
>> RBBlockNode>>#isClean and RBVariableNode>>#isClean to see the exact cases,
>> but for this case, all you need to know is that a constant block, as it
>> accesses nothing, is of course the trivial case of a clean block.
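
The point about constant blocks being the trivial case of clean blocks can be checked with the same AST query. This is only a sketch, assuming a Pharo 11 image and a Playground; the selectors #isInlined, #isConstant and #isClean are the ones named in the post, while the variable names cleanBlocks and constantBlocks are just picked for illustration.

```
"Sketch only: assumes a Pharo 11 image, evaluated in a Playground."
allBlocks := Smalltalk globals methods flatCollect: [:method | method ast blockNodes ].
nonInlinedBlocks := allBlocks select: [:blockNode | blockNode isInlined not].

"clean blocks: everything needed to create them is known at compile time"
cleanBlocks := nonInlinedBlocks select: [:blockNode | blockNode isClean].
cleanBlocks size.

"if every constant block is also clean, this should answer true"
constantBlocks := nonInlinedBlocks select: [:blockNode | blockNode isConstant].
constantBlocks allSatisfy: [:blockNode | blockNode isClean].
```

The sizes depend on the image, so none are given here.
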
>>
>> If we compile them as clean blocks, we immediately move creation to
>> compile time, and the runtime properties will be the same as using a
>> literal. With the added benefit that constant blocks with arguments are
>> supported, too.
>>
>> But: using "2" instead of [2] is not only faster for *creation*, it is
>> faster when evaluating, too. The reason is that "2 value" sends #value,
>> which executes Object>>#value, which is
>>
>> ```
>> value
>>
>>     ^self
>> ```
>>
>> This is a "Quick return self" method, aka a primitive:
>>
>> ```
>> self symbolic "'Quick return self'"
>> ```
>>
>> This is *very fast*. In contrast, even as a clean block, every constant
>> block has its own method (a CompiledBlock) that the VM has to execute and
>> thus create code for:
>>
>> self symbolic
>>
>> "'25 <20> pushConstant: 2
>> 26 <5E> blockReturn'"
>>
>> It seems that the fact that one is a quick return and the other a
>> push/return does not make that much of a difference for the JIT; it
>> matters more for the interpreter. But the JIT has to create code for
>> *every* constant block, and #value means executing BlockClosure>>#value,
>> which triggers execution of that CompiledBlock.
>>
>> We thus have to execute two methods, not one. And the JIT has to cache all
>> the generated code.
>>
>> So can we do better? It is actually easy to implement a class
>> ConstantBlockClosure, a subclass of CleanBlockClosure, that implements all
>> the #value methods to just return the constant value:
>>
>> ```
>> value
>>     ^literal
>> ```
>>
>> Thus we get the same as with sending #value to the literal directly: we
>> send #value, we execute one method that is a quick return.
>>
>> And the good news: there is #optionConstantBlockClosure in the compiler,
>> and it is enabled by default in Pharo11!
>>
>> The reason why we can turn on Constant Blocks without problems is that they
>> are never on the stack, so we do not need to fix all the tools to know how
>> to deal with them.
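
A quick way to see this in an image, as a sketch: it assumes a Pharo 11 Playground with #optionConstantBlockClosure left at its default, and the variable name block is just made up here.

```
"Sketch only: assumes Pharo 11 with the default compiler options."
block := [ 2 ].

"evaluating the block answers the literal"
block value. "2"

"inspect the class; per the post this should be ConstantBlockClosure"
block class.

"all constant blocks currently alive in the image, as suggested in the post"
ConstantBlockClosure allSubInstances size.
```
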
>> (Constant Blocks actually do have a CompiledBlock so that, e.g. for
>> "senders of", we can check the literals just as if it were a normal clean
>> block; it is just never executed.)
>>
>> If we go back to our method #minHeight, this means the bytecode looks like
>> this:
>>
>> ```
>> self symbolic "'49 <4C> self
>> 50 <20> pushConstant: #minHeight
>> 51 <21> pushConstant: [2]
>> 52 <A2> send: valueOfProperty:ifAbsent:
>> 53 <5C> returnTop'"
>> ```
>>
>> Thus, in Pharo11, the execution paths of all the >2500 constant blocks end
>> up executing one of the #value methods of ConstantBlockClosure. To get all
>> the exceptions correct when sending e.g. #value to a 1-arg block, there are
>> subclasses for 1/2/3 args; we do not support 4-arg cleanBlocks for now
>> (there are not many).
>>
>> If you want to check that this really works, go to
>> ConstantBlockClosure>>#value and add a Counter via the Debug menu; it's
>> really called a lot!
>>
>> Marcus
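
The part about arguments and exceptions can be tried out like this. Again just a sketch for a Pharo 11 Playground; oneArg is a made-up name, and the exact error class and message text are not taken from the post, so the handler only catches a generic Error.

```
"Sketch only: a constant block with one argument."
oneArg := [ :each | 2 ].

oneArg numArgs. "1"

"the argument is ignored, the literal is answered"
oneArg value: #anything. "2"

"sending #value to a 1-arg block should still signal an error"
[ oneArg value ] on: Error do: [ :ex | ex messageText ].
```

This is the case that passing a plain literal such as 2 cannot cover, since literals understand #value but not #value:.
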
