Re: [Pharo-dev] Stack size for compiled methods (issue 13854 Crash/hang on array access)

Nicolas Cellier Fri, 22 Aug 2014 13:51:43 -0700

2014-08-21 9:09 GMT+02:00 Nicolai Hess <[email protected]>:

> 2014-08-21 7:14 GMT+02:00 Eliot Miranda <[email protected]>:
>
>
>>
>>
>> On Wed, Aug 20, 2014 at 10:15 PM, Nicolai Hess <[email protected]>
>> wrote:
>>
>>>
>>> 2014-08-19 19:02 GMT+02:00 Eliot Miranda <[email protected]>:
>>>
>>> Hi Nicolai,
>>>>
>>>>
>>>> On Aug 19, 2014, at 11:58 AM, Nicolai Hess <[email protected]> wrote:
>>>>
>>>> Thank you eliot,
>>>>
>>>>
>>>> 2014-08-19 7:29 GMT+02:00 Eliot Miranda <[email protected]>:
>>>>
>>>>> Hi Nicolai,
>>>>>
>>>>>      the stack starts as deep as the method's number of temporaries,
>>>>
>>>>
>>>> ok,
>>>>
>>>>
>>>>> which is the sum of the number of arguments
>>>>
>>>>
>>>> ok,
>>>>
>>>>
>>>>> plus the number of temporary variables that can exist in the stack
>>>>
>>>>
>>>> ok (what does "can exist in the stack" mean? They always do?)
>>>>
>>>>
>>>> Not necessarily.  The closure implementation moves temps that need it
>>>> into an indirect temp vector.  See eg my blog on the closure compiler.
>>>>
>>>> http://www.mirandabanda.org/cogblog/2008/06/07/closures-part-i/
>>>>
>>>>  plus one if there are any closed-over temporary variables that need to
>>>>> be in an indirection vector.  Then as execution proceeds the receiver and
>>>>> arguments are pushed on the stack, and are replaced by intermediate
>>>>> results by sends or by the create array bytecode.
>>>>
>>>>
>>>> So, for a method with no blocks, the stack is just the number of
>>>> temporaries plus the number of args for the message send with the maximum
>>>> number of args?
>>>>
>>>>
>>>>
>>>> No.  What about this:
>>>>
>>>> ^Point x: 1 y: (self a: 1 b: 2 c: 3)
>>>>
>>>>
>>>> Before sending a:b:c: the stack is
>>>>
>>>> Point
>>>> 1
>>>> self
>>>> 1
>>>> 2
>>>> 3
>>>>
>>>>
>>>>  Any blocks within the method start with the sum of their number of
>>>>> arguments, their number of copied values (temp values they access
>>>>> read-only) plus their local temporaries.
>>>>>
>>>>
>>>> But this is not just added to the stack size, right?
>>>> I have a method with 9 local temporaries and a block in this method with
>>>> 8 local temporaries and the frameSize is still 16, (with the old compiler/
>>>> 56 with the new compiler).
>>>> So, method and block local temporaries do not just sum up?
>>>> I tried different variations
>>>> - numberOfMethod temps smaller/equal/greater numberOfBlockTemp
>>>> - no/some/all method temporaries are accessed in the block closure.
>>>>
>>>> But I can not see a pattern :)
>>>>
>>>>
>>>> May be a bug in the old compiler.  The stack size is the max of the
>>>> separate sizes in the method and each block.
>>>>
>>>>
>>>>
>>>>
>>>>>
>>>>> In the method and each block scope stack depth is hence the sum of
>>>>> the number of temporaries plus the max execution depth. And the method's
>>>>> depth is the max of the method and that of any blocks within it.
>>>>
>>>>
>>>> What is the execution depth of a method ? The number of "nested blocks"?
>>>>
>>>>
>>>> No, it is how many things it pushes in the stack at the deepest point.
>>>>  See my example above.
>>>>
>>>>
>>>>
>>>>> Then if that depth is 17 or greater it gets the LargeFrame flag set
>>>>> which means the VM allocates a 56 slot context, the compiler raising an
>>>>> error if the depth is greater than 56.
>>>>>
>>>>> HTH
>>>>> Eliot (phone)
>>>>>
>>>>
>>>> Here are two carefully handcrafted methods :)
>>>>
>>>>
>>>> fooSmall
>>>> |t1 t2 t3 t4 t5 t6 t7 t8|
>>>> t1:=1.
>>>> t2:=2.
>>>> t3:=3.
>>>> t4:=4.
>>>> t5:=5.
>>>> t6:=6.
>>>> t7:=7.
>>>> t8:= 8.
>>>> t1:=[:i | |b1 b2 b3 b4 c1 c2 c3 c4 x|
>>>>     b1:=1. b2:=2. b3:=3. b4:=4.
>>>>     c1:=1. c2:=2. c3:=3. c4:=4.
>>>>     x:=1.
>>>>     x+t1 + b1+b2+b3+b4 + c1 + c2 + c3 + c4] value:1.
>>>> ^ t1 + t2 + t3 + t4 + t5 + t6 + t7 + t8
>>>>
>>>>
>>>> fooLarge
>>>> |t1 t2 t3 t4 t5 t6 t7|
>>>> t1:=1.
>>>> t2:=2.
>>>> t3:=3.
>>>> t4:=4.
>>>> t5:=5.
>>>> t6:=6.
>>>> t7:=7.
>>>> t1:=[:i | |b1 b2 b3 b4 c1 c2 c3 c4 x|
>>>>     b1:=1. b2:=2. b3:=3. b4:=4.
>>>>     c1:=1. c2:=2. c3:=3. c4:=4.
>>>>     x:=1.
>>>>     x+t1 +t2 + t3 + t4 + t5+ b1+b2+b3+b4 + c1 + c2 + c3 + c4] value:1.
>>>> ^ t1 + t2 + t3 + t4 + t5 + t6 + t7
>>>>
>>>>
>>>>
>>>> They differ only in the number of temporaries (t1-t8 / t1-t7) and the
>>>> number of copied values for the block closure (1 / 5).
>>>>
>>>> with the old compiler:
>>>> fooSmall frameSize -> 16
>>>> fooLarge frameSize -> 56
>>>>
>>>>
>>>>
>>>> the opal compiler computes the opposite sizes
>>>> fooSmall frameSize -> 56
>>>> fooLarge frameSize -> 16
>>>>
>>>>
>>>>
>>>> Looks like a bug in the Opal compiler :-).  Well found.
>>>>
>>>>
>>>>
>>>> I am confused.
>>>>
>>>>
>>>> No you're not.  You've found a bug.  Now find its cause....
>>>>
>>>
>>>
>>> No, I am still confused :)
>>>
>>> Maybe you can help me understand how the old compiler computes the stack
>>> frame in these examples:
>>>
>>> I changed CompiledMethod>>#needsFrameSize:
>>> to write the value for self numTemps and newFrameSize to the Transcript
>>> and compiled some simple functions:
>>>
>>> foo
>>>     |a b|
>>>     a:=1.
>>>     b:=1.
>>>     ^ a+b
>>>
>>> numTemps:2
>>> frameSize: 2
>>>
>>> ok, two temps and two pushes on the stack
>>>
>>>
>>> foo
>>>     ^ [ 1+1 ]
>>>
>>> numTemps:0
>>> frameSize: 2
>>>
>>> ok, no temps and two pushes (push constant:1/push constant:1) on the stack
>>>
>>>
>>>
>>>
>>>
>>> foo
>>>     ^ [|a b| a:=1. b:=1. a+b ]
>>>
>>> numTemps:0
>>> frameSize: 4
>>>
>>> ok, no (method) temps, why is the stackframe 4? Two block local temps
>>> and two pushes.
>>>
>>>
>>> |x y|
>>> x:=1.
>>> y:=1.
>>>     ^ [|a b| a:=1. b:=1. a+b ]
>>>
>>> numTemps:2
>>> frameSize: 2
>>>
>>> Now what? Adding method temps enlarges the number of temps, ok. But the
>>> stackframe decreases?
>>>
>>
>> Looks like a bug.  The stack size needed in
>>
>> |x y|
>> x:=1.
>> y:=1.
>>     ^ [|a b| a:=1. b:=1. a+b ]
>>
>> is 4, unless the compiler is optimizing away the a+b and is replacing the
>> block with [1] ?  The stack size of the outer method is 3 (2 temps + 1 for
>> the push of either 1 or the block).
>>
>
>
> No optimization, the bytecode is:
>
> 13 <76> pushConstant: 1
> 14 <68> popIntoTemp: 0
> 15 <76> pushConstant: 1
> 16 <69> popIntoTemp: 1
> 17 <8F 00 00 0A> closureNumCopied: 0 numArgs: 0 bytes 21 to 30
> 21     <73> pushConstant: nil
> 22     <73> pushConstant: nil
> 23     <76> pushConstant: 1
> 24     <68> popIntoTemp: 0
> 25     <76> pushConstant: 1
> 26     <69> popIntoTemp: 1
> 27     <10> pushTemp: 0
> 28     <11> pushTemp: 1
> 29     <B0> send: +
> 30     <7D> blockReturn
> 31 <7C> returnTop
>
>
>
The pushConstant: nil bytecodes are there to initialize the block-local
temps, so they should be followed by a popIntoTemp: 0 (resp. 1).
If those pops have been optimized away, the pushes should be too; otherwise
a depth of 6 is necessary (2 temps + 4 pushes) and the stack is left
imbalanced, which is probably harmless if blockReturn unwinds correctly (?).
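
To make that count easy to check, here is a rough workspace sketch that
walks the block body from the listing above (bytes 21 to 30) and tracks the
depth. The per-bytecode stack effects are my assumption for illustration,
not something read out of the compiler: each push adds one slot, each
popIntoTemp: removes one, send: + replaces two operands by one result, and
blockReturn is counted as 0. The variable names are made up.

```smalltalk
"Net stack effect of each bytecode in the block body, in listing order:
 pushConstant: nil, pushConstant: nil, pushConstant: 1, popIntoTemp: 0,
 pushConstant: 1, popIntoTemp: 1, pushTemp: 0, pushTemp: 1, send: +,
 blockReturn. These effects are assumed, see the note above."
| effects depth peak methodTemps |
effects := #(1 1 1 -1 1 -1 1 1 -1 0).
depth := 0.
peak := 0.
effects do: [:delta |
    depth := depth + delta.
    peak := peak max: depth].

"The two method temps (x and y) counted as separate slots."
methodTemps := 2.

Transcript
    show: 'peak pushes in block: ', peak printString; cr;
    show: 'with method temps: ', (peak + methodTemps) printString; cr.
```

The depth peaks just before the send of +, which is where the 4 pushes come
from; adding the 2 temps on top of that gives the 6. If the two nil pushes
are themselves the slots that popIntoTemp: 0 and 1 write to inside the
block, the count would be different, and the sketch does not try to decide
that.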




>
>>
>>
>>
>>> Nicolai
>>>
>>>
>>>
>>>>
>>>>
>>>>
>>>>
>>>>>
>>>>> On Aug 18, 2014, at 11:32 PM, Nicolai Hess <[email protected]> wrote:
>>>>>
>>>>> > Hi,
>>>>> >
>>>>> > on what depends the stack size for a compiled method?
>>>>> > I try to figure out, why the old compiler and the opal compiler
>>>>> generate different
>>>>> > compiled method headers.
>>>>> > I think this comes from a wrong stack size computed by opal, but I
>>>>> can not figure
>>>>> > out how the stack size is computed.
>>>>> >
>>>>> > Old Compiler
>>>>> > PolygonMorph>>#lineSegmentsDo:
>>>>> > header -> "primitive: 0
>>>>> >  numArgs: 1
>>>>> >  numTemps: 3
>>>>> >  numLiterals: 23
>>>>> >  frameSize: 56"
>>>>> >
>>>>> > Opal compiler:
>>>>> > PolygonMorph>>#lineSegmentsDo:
>>>>> > header -> "primitive: 0
>>>>> >  numArgs: 1
>>>>> >  numTemps: 3
>>>>> >  numLiterals: 23
>>>>> >  frameSize: 16"
>>>>> >
>>>>>
>>>>
>>>>
>>>> Eliot (phone)
>>>>
>>>
>>>
>>
>>
>> --
>> best,
>> Eliot
>>
>
>
