Re: [Pharo-dev] Problems with a large block

Eliot Miranda Thu, 08 Oct 2015 00:39:29 -0700

Hi Nicolai,

On Thu, Oct 8, 2015 at 12:09 AM, Nicolai Hess <[email protected]> wrote:

>
>
> 2015-10-08 1:30 GMT+02:00 Eliot Miranda <[email protected]>:
>
>> Hi Nicolai,
>>
>> On Wed, Oct 7, 2015 at 2:13 PM, Nicolai Hess <[email protected]> wrote:
>>
>>>
>>>
>>> 2015-10-07 16:49 GMT+02:00 Marcus Denker <[email protected]>:
>>>
>>>>
>>>> > On 07 Oct 2015, at 15:49, Marco Naddeo <[email protected]> wrote:
>>>> >
>>>> > Hi,
>>>> >
>>>> > I have some problems with the large block at bottom:
>>>> >
>>>> > -sending the message argumentNames gives me an empty array #()
>>>> > -sending the message sourceNode gives me a strange result and not a
>>>> > RBBlockNode, as I would expect
>>>> >
>>>>
>>>> This happens already with:
>>>>
>>>> [  :h :s :v |    | min chroma hdash X red green blue | ] sourceNode
>>>>
>>>> I will check… it is a bug in the mapping pc -> AST.
>>>>
>>>
>>> That's interesting, the way the pc is mapped to the AST (ir
>>> instructionForPC: )
>>> takes the "length" of the instruction bytecode into account and the
>>> length of a pushclosure includes
>>> all bytes needed for pushing the local temps (pushConstant nil.
>>> pushConstant nil ....)
>>>
>>> But the startPC of a block context starts at the first push, not after,
>>> that means
>>>
>>> self method sourceNodeForPC: self startpc - 1
>>> starts the search after the push closure bytecode but before any push
>>> local temp byte code.
>>>
>>> Either we don't include the bytecodes for the "pushConstant:nil" in the
>>> bytecode offset, or we start the search at
>>>
>>> self method sourceNodeForPC: self startpc - 1 + self numLocalTemps
>>>
>>
>> No, no, no, no, no :-).
>>
>
> I think I will always get confused by BlockClosure code at the bytecode
> level :)
>

What exactly do you find confusing?  The byte codes are simple if you read
the right source.  Perhaps you could talk with Clément; he's local and
understands the byte code set as well as I do.  If you can't talk to him
you could read the class comment of EncoderForV3PlusClosures in a Squeak
4.6 or 5.0 image, or the class comment of EncoderForSistaV1 in the
BytecodeSets package on http://source.squeak.org/VMMaker.

If you're confused by the scheme for closing over variables have you read
my blog posts on the closure compiler?

I wish you hadn't deleted this bit, it is "the truth":

"*The PC in question is the pc of the block creation byte code.  The
startpc of a block is the first bytecode following the block creation byte
code.  That *includes* any pushConstant: nil byte codes establishing temps
in the block.  Instead, it is expected that "self method sourceNodeForPC:
self startpc - 1" answers the same as "self method sourceNodeForPC: self
blockCreationPC", where self blockCreationPC is the pc of the block's block
creation bytecode.*

*In Squeak we have*
BlockClosure>>blockCreationPC
    "Answer the pc for the bytecode that created the receiver."
    | method |
    method := self method.
    ^method encoderClass
        pcOfBlockCreationBytecodeForBlockStartingAt: startpc
        in: method
*This is because we support (as you will soon enough) different byte code
sets so it is wrong to assume the size of the block creation bytecode is 4.*
"

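A rough workspace sketch of the layout described above may help. The bytes are hand-made for illustration, not taken from a real method, and the encoding is an assumption based on the V3PlusClosures set as commonly documented: 143 llllkkkk jjjjjjjj iiiiiiii for push closure, 115 for pushConstant: nil, 125 for blockReturn.

```smalltalk
| bytes creationPC startpc |
"Hand-made bytes (assumption, not a real method): a 4-byte push closure
 with one argument and a 4-byte body, followed by three pushConstant: nil
 and a blockReturn."
bytes := #[143 2r00000001 0 4 115 115 115 125].
creationPC := 1.
startpc := creationPC + 4.	"first byte after the 4-byte creation bytecode"
{ bytes at: startpc - 4.	"143, the block creation bytecode"
  bytes at: startpc - 1.	"4, the last byte of the creation bytecode"
  bytes at: startpc }	"115, a pushConstant: nil that already belongs to the block"
```

In this sketch every pc from startpc - 4 to startpc - 1 still lies inside the creation bytecode, so any of them should map to the node for the whole block, while startpc and everything after it belongs to code inside the block.
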
>
>>
>> self method sourceNodeForPC: self startpc - 4
>>
>
> We already have this little heuristic
>
> instructionForPC: aPC
>   0 to: -3 by: -1 do: [ :off |
>         (self firstInstructionMatching: [:ir | ir bytecodeOffset = (aPC - off) ]) ifNotNil: [:it |^it]]
>
> AFAIU this 0 to -3 offset is used because we have 1, 2, 3 or 4 byte-length
> bytecodes.
>

Well this will break down with the Sista byte code set because one could
conceivably have 7 or 8 byte bytecodes when one includes prefixes.  In the
Newspeak and Sista sets we extend the ranges of byte codes by allowing
prefixes.  Notionally the number of prefixes is unlimited but in practice
we're using only one of each of the two prefix bytecodes Prefix A & Prefix
B (currently E0 and E1 in all sets).

But in the Sista set there is a field in the block creation byte code that
reveals how many prefixes the byte code has so one can compute the actual
start of the block creation byte code.  i.e.

EncoderForSistaV1 class>>pcOfBlockCreationBytecodeForBlockStartingAt: startpc in: method
    "Answer the pc of the push closure bytecode whose block starts at startpc in method.
     May need to back up to include extension bytecodes."

    "*  224  11100000  aaaaaaaa  Extend A (Ext A = Ext A prev * 256 + Ext A)
     *  225  11100001  bbbbbbbb  Extend B (Ext B = Ext B prev * 256 + Ext B)
     ** 250  11111010  eeiiikkk  jjjjjjjj  Push Closure Num Copied iii (+ExtA//16*8) Num Args kkk (+ ExtA\\16*8) BlockSize jjjjjjjj (+ExtB*256). ee = num extensions"
    | numExtensions |
    self assert: (method at: startpc - 3) = 250.
    numExtensions := (method at: startpc - 2) >> 6.
    ^startpc - 3 - (numExtensions * 2)
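
The same arithmetic can be tried in a workspace on a plain ByteArray. This is only a sketch: the bytes are made up for illustration (a single Extend B prefix in front of a push closure with ee = 1, iii = 0, kkk = 3), the block body is left out, and the variable names are hypothetical.

```smalltalk
| bytes startpc numExtensions creationPC |
"Made-up bytes, not from a real method:
 225 1 is Extend B 1; 250 2r01000011 20 is push closure with
 ee = 1, iii = 0, kkk = 3, jjjjjjjj = 20."
bytes := #[225 1 250 2r01000011 20].
startpc := 6.	"the block would start right after the three-byte push closure at 3..5"
self assert: (bytes at: startpc - 3) = 250.
numExtensions := (bytes at: startpc - 2) >> 6.	"2r01000011 >> 6 = 1"
creationPC := startpc - 3 - (numExtensions * 2).	"6 - 3 - 2 = 1, the Extend B byte"
{ numExtensions. creationPC. bytes at: creationPC }
```

With both an Extend A and an Extend B prefix, ee would be 2 and the answer moves back four bytes instead of two, so the whole creation sequence spans seven bytes, which is more than a fixed window of one to four bytes can cover.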

>> and make sure that the compiler maps a block creation bytecode to the
>> source node for the entire block.
>>
>
> in the above code (instructionForPC) we try to find the instruction for
> the pc (here the pc of the closure creation code), but I don't really
> understand what
> "bytecodeOffset" is, it looks like (bytecodeIndex+startPC), but for a
> IRPushClosureCopy, it looks like the bytecodeIndex does include
> the number of local temps resp. (numberOfLocalTemps times the
> bytecodelength for pushConstant:nil).
>

Marcus, what is bytecodeOffset?  Is it the distance from the first byte
code or the distance from the first literal or...?

>> The way to think about "self method sourceNodeForPC: self startpc - 1 +
>> self numLocalTemps" not making sense is that a pc inside the block must map
>> to a statement inside the block, not the block itself.
>>
>
> Hm, I don't know :)
>

Why?  I designed the closure byte code set so I'm telling you :-).  It's
safe to believe me.  For example, an initial pushConstant: nil byte code in
a block either establishes the first local temp in that block, or is an
argument of a send in the block.  In either case it refers to an element
inside the block, not to the entire block itself.  The thing that defines
the entire block is the block creation byte code, which includes the span
(the size of) the block's bytecodes.  I chose to live with this annoying
ambiguity because there were very few unused byte codes and I didn't want to
use all of them for the closure byte code set.  And I chose the closure
architecture so I could write a faster VM than one that accessed temps
directly in outer scopes (which I go into at length on my blog).

>> HTH
>>
>>
>>>
>>>
>>>>
>>>>         Marcus
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>
>>
>> --
>> _,,,^..^,,,_
>> best, Eliot
>>
>
>

-- 
_,,,^..^,,,_
best, Eliot
