Re: [Pharo-dev] Stack size for compiled methods (issue 13854 Crash/hang on array access)

Eliot Miranda Sat, 23 Aug 2014 03:05:22 -0700

On Fri, Aug 22, 2014 at 11:35 PM, Nicolai Hess <[email protected]> wrote:


> 2014-08-22 22:50 GMT+02:00 Nicolas Cellier <
> [email protected]>:
>
>
>>
>>
>> 2014-08-21 9:09 GMT+02:00 Nicolai Hess <[email protected]>:
>>
>> 2014-08-21 7:14 GMT+02:00 Eliot Miranda <[email protected]>:
>>>
>>>
>>>>
>>>>
>>>> On Wed, Aug 20, 2014 at 10:15 PM, Nicolai Hess <[email protected]>
>>>> wrote:
>>>>
>>>>>
>>>>> 2014-08-19 19:02 GMT+02:00 Eliot Miranda <[email protected]>:
>>>>>
>>>>> Hi Nicolai,
>>>>>>
>>>>>>
>>>>>> On Aug 19, 2014, at 11:58 AM, Nicolai Hess <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>> Thank you eliot,
>>>>>>
>>>>>>
>>>>>> 2014-08-19 7:29 GMT+02:00 Eliot Miranda <[email protected]>:
>>>>>>
>>>>>>> Hi Nicolai,
>>>>>>>
>>>>>>>      the stack starts as deep as the method's number of temporaries,
>>>>>>
>>>>>>
>>>>>> ok,
>>>>>>
>>>>>>
>>>>>>> which is the sum of the number of arguments
>>>>>>
>>>>>>
>>>>>> ok,
>>>>>>
>>>>>>
>>>>>>> plus the number of temporary variables that can exist in the stack
>>>>>>
>>>>>>
>>>>>> ok (what does "can exist in the stack" mean? They always do?)
>>>>>>
>>>>>>
>>>>>> Not necessarily.  The closure implementation moves temps that need it
>>>>>> into an indirect temp vector.  See eg my blog on the closure compiler.
>>>>>>
>>>>>> http://www.mirandabanda.org/cogblog/2008/06/07/closures-part-i/
>>>>>>
>>>>>>  plus one if there are any closed-over temporary variables that need
>>>>>>> to be in an indirection vector.  Then as execution proceeds the receiver
>>>>>>> and arguments are pushed on the stack, and are replaced by intermediate
>>>>>>> results by sends or by the create array bytecode.
>>>>>>
>>>>>>
>>>>>> So, for a method with no blocks, the stack is just the number of
>>>>>> temporaries plus the number of args for the message send with the maximum
>>>>>> number of args?
>>>>>>
>>>>>>
>>>>>>
>>>>>> No.  What about this:
>>>>>>
>>>>>> ^Point x: 1 y: (self a: 1 b: 2 c: 3)
>>>>>>
>>>>>>
>>>>>> Before sending a:b:c: the stack is
>>>>>>
>>>>>> Point
>>>>>> 1
>>>>>> self
>>>>>> 1
>>>>>> 2
>>>>>> 3
>>>>>>
>>>>>>
>>>>>>  Any blocks within the method start with the sum of their number of
>>>>>>> arguments, their number of copied values (temp values they access
>>>>>>> read-only) plus their local temporaries.
>>>>>>>
>>>>>>
>>>>>> But this is not just added to the stack size, right?
>>>>>> I have a method with 9 local temporaries and a block in this method
>>>>>> with 8 local temporaries and the frameSize is still 16 (with the old
>>>>>> compiler; 56 with the new compiler).
>>>>>> So, method and block local temporaries do not just sum up?
>>>>>> I tried different variations
>>>>>> - numberOfMethod temps smaller/equal/greater numberOfBlockTemp
>>>>>> - no/some/all method temporaries are accessed in the block closure.
>>>>>>
>>>>>> But I can not see a pattern :)
>>>>>>
>>>>>>
>>>>>> May be a bug in the old compiler.  The stack size is the max of the
>>>>>> separate sizes in the method and each block.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> In the method and in each block scope, the stack depth is hence the sum
>>>>>>> of the number of temporaries plus the max execution depth.  And the
>>>>>>> method's depth is the max of that of the method and that of any blocks
>>>>>>> within it.
>>>>>>
>>>>>>
>>>>>> What is the execution depth of a method ? The number of "nested
>>>>>> blocks"?
>>>>>>
>>>>>>
>>>>>> No, it is how many things it pushes on the stack at the deepest
>>>>>> point.  See my example above.
>>>>>>
>>>>>>
>>>>>>
>>>>>>> Then if that depth is 17 or greater it gets the LargeFrame flag set
>>>>>>> which means the VM allocates a 56 slot context, the compiler raising an
>>>>>>> error if the depth is greater than 56.
>>>>>>>
>>>>>>> HTH
>>>>>>> Eliot (phone)
>>>>>>>
>>>>>>
>>>>>> Here are two carefully handcrafted methods :)
>>>>>>
>>>>>>
>>>>>> fooSmall
>>>>>> |t1 t2 t3 t4 t5 t6 t7 t8|
>>>>>> t1:=1.
>>>>>> t2:=2.
>>>>>> t3:=3.
>>>>>> t4:=4.
>>>>>> t5:=5.
>>>>>> t6:=6.
>>>>>> t7:=7.
>>>>>> t8:= 8.
>>>>>> t1:=[:i | |b1 b2 b3 b4 c1 c2 c3 c4 x|
>>>>>>     b1:=1. b2:=2. b3:=3. b4:=4.
>>>>>>     c1:=1. c2:=2. c3:=3. c4:=4.
>>>>>>     x:=1.
>>>>>>     x+t1 + b1+b2+b3+b4 + c1 + c2 + c3 + c4] value:1.
>>>>>> ^ t1 + t2 + t3 + t4 + t5 + t6 + t7 + t8
>>>>>>
>>>>>>
>>>>>> fooLarge
>>>>>> |t1 t2 t3 t4 t5 t6 t7|
>>>>>> t1:=1.
>>>>>> t2:=2.
>>>>>> t3:=3.
>>>>>> t4:=4.
>>>>>> t5:=5.
>>>>>> t6:=6.
>>>>>> t7:=7.
>>>>>> t1:=[:i | |b1 b2 b3 b4 c1 c2 c3 c4 x|
>>>>>>     b1:=1. b2:=2. b3:=3. b4:=4.
>>>>>>     c1:=1. c2:=2. c3:=3. c4:=4.
>>>>>>     x:=1.
>>>>>>     x+t1 +t2 + t3 + t4 + t5+ b1+b2+b3+b4 + c1 + c2 + c3 + c4] value:1.
>>>>>> ^ t1 + t2 + t3 + t4 + t5 + t6 + t7
>>>>>>
>>>>>>
>>>>>>
>>>>>> They differ only in the number of temporaries (t1-t8 / t1-t7) and the
>>>>>> number of copied values for the block closure (1 / 5).
>>>>>>
>>>>>> with the old compiler:
>>>>>> fooSmall frameSize -> 16
>>>>>> fooLarge frameSize -> 56
>>>>>>
>>>>>>
>>>>>>
>>>>>> the opal compiler computes the opposite sizes
>>>>>> fooSmall frameSize -> 56
>>>>>> fooLarge frameSize -> 16
>>>>>>
>>>>>>
>>>>>>
>>>>>> Looks like a bug in the Opal compiler :-).  Well found.
>>>>>>
>>>>>>
>>>>>>
>>>>>> I am confused.
>>>>>>
>>>>>>
>>>>>> No you're not.  You've found a bug.  Now find its cause....
>>>>>>
>>>>>
>>>>>
>>>>> No, I am still confused :)
>>>>>
>>>>> Maybe you can help me understand how the old compiler computes the
>>>>> stack frame size in these examples:
>>>>>
>>>>> I changed CompiledMethod>>#needsFrameSize:
>>>>> to write the value for self numTemps and newFrameSize to the
>>>>> Transcript and compiled some simple functions:
>>>>>
>>>>> foo
>>>>>     |a b|
>>>>>     a:=1.
>>>>>     b:=1.
>>>>>     ^ a+b
>>>>>
>>>>> numTemps:2
>>>>> frameSize: 2
>>>>>
>>>>> ok, two temps and two pushes on the stack
>>>>>
>>>>>
>>>>> foo
>>>>>     ^ [ 1+1 ]
>>>>>
>>>>> numTemps:0
>>>>> frameSize: 2
>>>>>
>>>>> ok, no temps and two pushes (push constant:1/push constant:1) on the
>>>>> stack
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> foo
>>>>>     ^ [|a b| a:=1. b:=1. a+b ]
>>>>>
>>>>> numTemps:0
>>>>> frameSize: 4
>>>>>
>>>>> ok, no (method) temps, why is the stackframe 4? Two block local temps
>>>>> and two pushes.
>>>>>
>>>>>
>>>>> |x y|
>>>>> x:=1.
>>>>> y:=1.
>>>>>     ^ [|a b| a:=1. b:=1. a+b ]
>>>>>
>>>>> numTemps:2
>>>>> frameSize: 2
>>>>>
>>>>> Now what? Adding method temps enlarges the number of temps, ok. But
>>>>> the stackframe decreases?
>>>>>
>>>>
>>>> Looks like a bug.  The stack size needed in
>>>>
>>>> |x y|
>>>> x:=1.
>>>> y:=1.
>>>>     ^ [|a b| a:=1. b:=1. a+b ]
>>>>
>>>> is 4, unless the compiler is optimizing away the a+b and is replacing
>>>> the block with [1] ?  The stack size of the outer method is 3 (2 temps + 1
>>>> for the push of either 1 or the block).
>>>>
>>>
>>>
>>> No optimization, the bytecode is:
>>>
>>> 13 <76> pushConstant: 1
>>> 14 <68> popIntoTemp: 0
>>> 15 <76> pushConstant: 1
>>> 16 <69> popIntoTemp: 1
>>> 17 <8F 00 00 0A> closureNumCopied: 0 numArgs: 0 bytes 21 to 30
>>> 21     <73> pushConstant: nil
>>> 22     <73> pushConstant: nil
>>> 23     <76> pushConstant: 1
>>> 24     <68> popIntoTemp: 0
>>> 25     <76> pushConstant: 1
>>> 26     <69> popIntoTemp: 1
>>> 27     <10> pushTemp: 0
>>> 28     <11> pushTemp: 1
>>> 29     <B0> send: +
>>> 30     <7D> blockReturn
>>> 31 <7C> returnTop
>>>
>>>
>>>
>> The pushConstant: nil are here to initialize the local temps, so they
>> should be followed by a popIntoTemp: 0 (resp. 1).
>> If those pops have been optimized away, so should the pushes; otherwise
>> this makes a depth of 6 necessary (2 temps + 4 pushes), and an imbalanced
>> stack - probably harmless, since the blockReturn unwinds correctly (?).
>>
>>
> No, as far as I understand block local temps, the "pushConstant: nil" *are*
> the temp vars.
>

exactly.


> stack
> stack
> stack <- method stack pointer
> stack <- block temp 0 (initialized with pushConstant: nil)
> stack <- block temp 1 (initialized with pushConstant: nil)
> stack <- block stack pointer
> stack ....
>
>
>
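
A minimal sketch of that layout, meant for a Pharo workspace, assuming each
push counts +1 and each pop or binary send counts -1.  The maxDepth block and
the delta arrays are hypothetical: they are transcribed by hand from the
bytecode listing above and are not how either compiler computes the size.

```smalltalk
"Hypothetical helper, not part of Pharo: deepest stack level reached by a
 sequence of net stack effects, starting from startDepth slots in use."
| maxDepth methodDepth blockDepth |
maxDepth := [:startDepth :deltas | | depth deepest |
    depth := startDepth.
    deepest := startDepth.
    deltas do: [:delta |
        depth := depth + delta.
        deepest := deepest max: depth].
    deepest].

"Method: 2 temps (x, y) already on the stack;
 push 1, pop, push 1, pop, push the closure."
methodDepth := maxDepth value: 2 value: #(1 -1 1 -1 1).

"Block: starts empty; the two pushConstant: nil are the block temps;
 then push 1, pop, push 1, pop, push temp 0, push temp 1,
 send + (two operands replaced by one result)."
blockDepth := maxDepth value: 0 value: #(1 1 1 -1 1 -1 1 1 -1).

Transcript
    show: 'method: ', methodDepth printString,
        ' block: ', blockDepth printString,
        ' needed: ', (methodDepth max: blockDepth) printString;
    cr.
```

Under these assumptions the method needs 3 slots and the block 4, so the
method's stack size is 4, which matches the figures given earlier in the
thread.
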
>>
>>
>>>
>>>>
>>>>
>>>>
>>>>> Nicolai
>>>>>
>>>>>
>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> On Aug 18, 2014, at 11:32 PM, Nicolai Hess <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>> > Hi,
>>>>>>> >
>>>>>>> > what does the stack size for a compiled method depend on?
>>>>>>> > I am trying to figure out why the old compiler and the opal compiler
>>>>>>> > generate different compiled method headers.
>>>>>>> > I think this comes from a wrong stack size computed by opal, but I
>>>>>>> > cannot figure out how the stack size is computed.
>>>>>>> >
>>>>>>> > Old Compiler
>>>>>>> > PolygonMorph>>#lineSegmentsDo:
>>>>>>> > header -> "primitive: 0
>>>>>>> >  numArgs: 1
>>>>>>> >  numTemps: 3
>>>>>>> >  numLiterals: 23
>>>>>>> >  frameSize: 56"
>>>>>>> >
>>>>>>> > Opal compiler:
>>>>>>> > PolygonMorph>>#lineSegmentsDo:
>>>>>>> > header -> "primitive: 0
>>>>>>> >  numArgs: 1
>>>>>>> >  numTemps: 3
>>>>>>> >  numLiterals: 23
>>>>>>> >  frameSize: 16"
>>>>>>> >
>>>>>>>
>>>>>>
>>>>>>
>>>>>> Eliot (phone)
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> best,
>>>> Eliot
>>>>
>>>
>>>
>>
>

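
For reference, the frame size rule described earlier in the thread (a small
frame of 16 slots, the LargeFrame flag and a 56 slot context once the depth
reaches 17, a compile error above 56) can be sketched as follows.
frameSizeFor is a hypothetical name, not a Pharo API, and the thresholds are
taken only from the figures quoted in this thread.

```smalltalk
"Hypothetical sketch: map a computed stack depth to a context size."
| frameSizeFor |
frameSizeFor := [:depth |
    depth > 56
        ifTrue: [Error signal: 'stack depth exceeds the large frame']
        ifFalse: [depth >= 17 ifTrue: [56] ifFalse: [16]]].

frameSizeFor value: 4.     "small frame, as for the block example above"
frameSizeFor value: 17.    "first depth that needs the large frame"
```

With this rule a method only switches between 16 and 56 when its deepest
scope crosses the 17 slot boundary, so two compilers that disagree on
frameSize for the same source must disagree on the computed depth.
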

-- 
best,
Eliot
