2014-08-21 9:09 GMT+02:00 Nicolai Hess <[email protected]>:

> 2014-08-21 7:14 GMT+02:00 Eliot Miranda <[email protected]>:
>
>>
>> On Wed, Aug 20, 2014 at 10:15 PM, Nicolai Hess <[email protected]>
>> wrote:
>>
>>>
>>> 2014-08-19 19:02 GMT+02:00 Eliot Miranda <[email protected]>:
>>>
>>> Hi Nicolai,
>>>>
>>>> On Aug 19, 2014, at 11:58 AM, Nicolai Hess <[email protected]> wrote:
>>>>
>>>> Thank you Eliot,
>>>>
>>>> 2014-08-19 7:29 GMT+02:00 Eliot Miranda <[email protected]>:
>>>>
>>>>> Hi Nicolai,
>>>>>
>>>>> the stack starts as deep as the method's number of temporaries,
>>>>
>>>> ok,
>>>>
>>>>> which is the sum of the number of arguments
>>>>
>>>> ok,
>>>>
>>>>> plus the number of temporary variables that can exist in the stack
>>>>
>>>> ok (what does "can exist in the stack" mean? They always do?)
>>>>
>>>> Not necessarily. The closure implementation moves temps that need it
>>>> into an indirect temp vector. See e.g. my blog on the closure compiler.
>>>>
>>>> http://www.mirandabanda.org/cogblog/2008/06/07/closures-part-i/
>>>>
>>>>> plus one if there are any closed-over temporary variables that need to
>>>>> be in an indirection vector. Then as execution proceeds the receiver and
>>>>> arguments are pushed on the stack, and are replaced by intermediate results
>>>>> by sends or by the create array bytecode.
>>>>
>>>> So, for a method with no blocks, the stack is just the number of
>>>> temporaries plus the number of args for the message send with the maximum
>>>> number of args?
>>>>
>>>> No. What about this:
>>>>
>>>> ^Point x: 1 y: (self a: 1 b: 2 c: 3)
>>>>
>>>> Before sending a:b:c: the stack is
>>>>
>>>> Point
>>>> 1
>>>> self
>>>> 1
>>>> 2
>>>> 3
>>>>
>>>>> Any blocks within the method start with the sum of their number of
>>>>> arguments, their number of copied values (temp values they access
>>>>> read-only) plus their local temporaries.
>>>>>
>>>>
>>>> But this is not just added to the stack size, right?
>>>> I have a method with 9 local temporaries and a block in this method with
>>>> 8 local temporaries and the frameSize is still 16 (with the old compiler;
>>>> 56 with the new compiler).
>>>> So, method and block local temporaries do not just sum up?
>>>> I tried different variations
>>>> - numberOfMethod temps smaller/equal/greater numberOfBlockTemp
>>>> - no/some/all method temporaries are accessed in the block closure.
>>>>
>>>> But I can not see a pattern :)
>>>>
>>>> May be a bug in the old compiler. The stack size is the max of the
>>>> separate sizes in the method and each block.
>>>>
>>>>>
>>>>> In the method and each block scope, stack depth is hence the sum of
>>>>> the number of temporaries plus the max execution depth. And the method's
>>>>> depth is the max of the method and that of any blocks within it.
>>>>
>>>> What is the execution depth of a method? The number of "nested blocks"?
>>>>
>>>> No, it is how many things it pushes on the stack at the deepest point.
>>>> See my example above.
>>>>
>>>>> Then if that depth is 17 or greater it gets the LargeFrame flag set
>>>>> which means the VM allocates a 56 slot context, the compiler raising an
>>>>> error if the depth is greater than 56.
>>>>>
>>>>> HTH
>>>>> Eliot (phone)
>>>>>
>>>>
>>>> Here are two carefully handcrafted methods :)
>>>>
>>>> fooSmall
>>>> |t1 t2 t3 t4 t5 t6 t7 t8|
>>>> t1:=1.
>>>> t2:=2.
>>>> t3:=3.
>>>> t4:=4.
>>>> t5:=5.
>>>> t6:=6.
>>>> t7:=7.
>>>> t8:= 8.
>>>> t1:=[:i | |b1 b2 b3 b4 c1 c2 c3 c4 x|
>>>> b1:=1. b2:=2. b3:=3. b4:=4.
>>>> c1:=1. c2:=2. c3:=3. c4:=4.
>>>> x:=1.
>>>> x+t1 + b1+b2+b3+b4 + c1 + c2 + c3 + c4] value:1.
>>>> ^ t1 + t2 + t3 + t4 + t5 + t6 + t7 + t8
>>>>
>>>> fooLarge
>>>> |t1 t2 t3 t4 t5 t6 t7|
>>>> t1:=1.
>>>> t2:=2.
>>>> t3:=3.
>>>> t4:=4.
>>>> t5:=5.
>>>> t6:=6.
>>>> t7:=7.
>>>> t1:=[:i | |b1 b2 b3 b4 c1 c2 c3 c4 x|
>>>> b1:=1. b2:=2.
b3:=3. b4:=4.
>>>> c1:=1. c2:=2. c3:=3. c4:=4.
>>>> x:=1.
>>>> x+t1 +t2 + t3 + t4 + t5+ b1+b2+b3+b4 + c1 + c2 + c3 + c4] value:1.
>>>> ^ t1 + t2 + t3 + t4 + t5 + t6 + t7
>>>>
>>>> They differ only in the number of temporaries (t1-t8 / t1-t7) and the
>>>> number of copied values for the block closure (1 / 5).
>>>>
>>>> with the old compiler:
>>>> fooSmall frameSize -> 16
>>>> fooLarge frameSize -> 56
>>>>
>>>> the Opal compiler computes the opposite sizes
>>>> fooSmall frameSize -> 56
>>>> fooLarge frameSize -> 16
>>>>
>>>> Looks like a bug in the Opal compiler :-). Well found.
>>>>
>>>> I am confused.
>>>>
>>>> No you're not. You've found a bug. Now find its cause....
>>>>
>>>
>>> No, I am still confused :)
>>>
>>> Maybe you can help me with how the old compiler computes the stack
>>> frame in these examples:
>>>
>>> I changed CompiledMethod>>#needsFrameSize:
>>> to write the value for self numTemps and newFrameSize to the Transcript
>>> and compiled some simple functions:
>>>
>>> foo
>>> |a b|
>>> a:=1.
>>> b:=1.
>>> ^ a+b
>>>
>>> numTemps:2
>>> frameSize: 2
>>>
>>> ok, two temps and two pushes on the stack
>>>
>>> foo
>>> ^ [ 1+1 ]
>>>
>>> numTemps:0
>>> frameSize: 2
>>>
>>> ok, no temps and two pushes (push constant:1/push constant:1) on the stack
>>>
>>> foo
>>> ^ [|a b| a:=1. b:=1. a+b ]
>>>
>>> numTemps:0
>>> frameSize: 4
>>>
>>> ok, no (method) temps, why is the stackframe 4? Two block local temps
>>> and two pushes.
>>>
>>> |x y|
>>> x:=1.
>>> y:=1.
>>> ^ [|a b| a:=1. b:=1. a+b ]
>>>
>>> numTemps:2
>>> frameSize: 2
>>>
>>> Now what? Adding method temps enlarges the number of temps, ok. But the
>>> stackframe decreases?
>>>
>>
>> Looks like a bug. The stack size needed in
>>
>> |x y|
>> x:=1.
>> y:=1.
>> ^ [|a b| a:=1. b:=1. a+b ]
>>
>> is 4, unless the compiler is optimizing away the a+b and is replacing the
>> block with [1] ?
The stack size of the outer method is 3 (2 temps + 1 for
>> the push of either 1 or the block).
>>
>
> No optimization, the bytecode is:
>
> 13 <76> pushConstant: 1
> 14 <68> popIntoTemp: 0
> 15 <76> pushConstant: 1
> 16 <69> popIntoTemp: 1
> 17 <8F 00 00 0A> closureNumCopied: 0 numArgs: 0 bytes 21 to 30
> 21 <73> pushConstant: nil
> 22 <73> pushConstant: nil
> 23 <76> pushConstant: 1
> 24 <68> popIntoTemp: 0
> 25 <76> pushConstant: 1
> 26 <69> popIntoTemp: 1
> 27 <10> pushTemp: 0
> 28 <11> pushTemp: 1
> 29 <B0> send: +
> 30 <7D> blockReturn
> 31 <7C> returnTop
>

The two pushConstant: nil bytecodes are there to initialize the block's
local temps, so each should be followed by a popIntoTemp: 0 (resp. 1).
If those pops have been optimized away, so should the pushes; otherwise
this makes a depth of 6 necessary (2 temps + 4 pushes) and leaves an
imbalanced stack. That is probably harmless, assuming the blockReturn
unwinds correctly (?).
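
To make the depth arithmetic concrete, here is a small workspace sketch. It is only an illustration, not code taken from either compiler. It walks the block body of the listing above (pcs 21 to 29) and tracks how many values are pushed at each step. The per-bytecode stack effects are assumptions read off the mnemonics: every push counts +1, every popIntoTemp: counts -1, and send: + counts -1 net (two operands popped, one result pushed).

```smalltalk
"Hypothetical helper snippet: net stack effect of pcs 21..29, in order.
 pushConstant: nil, pushConstant: nil, pushConstant: 1, popIntoTemp: 0,
 pushConstant: 1, popIntoTemp: 1, pushTemp: 0, pushTemp: 1, send: +"
| deltas depth maxDepth |
deltas := #(1 1 1 -1 1 -1 1 1 -1).
depth := 0.
maxDepth := 0.
deltas do: [:d |
    depth := depth + d.                  "running number of pushed values"
    maxDepth := maxDepth max: depth].    "remember the deepest point"
Transcript show: 'peak pushes in block: ', maxDepth printString; cr.
Transcript show: 'peak plus 2 separately counted temps: ', (maxDepth + 2) printString; cr.
```

Under these assumptions the running depth goes 1, 2, 3, 2, 3, 2, 3, 4, 3, so the peak is 4 pushed values. Counting two separately reserved temp slots on top of that gives the 6 mentioned above; if the two pushConstant: nil are themselves the slots for a and b, the block needs only 4, which is the figure Eliot gives.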
>
>>
>>> Nicolai
>>>
>>>>
>>>>>
>>>>> On Aug 18, 2014, at 11:32 PM, Nicolai Hess <[email protected]> wrote:
>>>>>
>>>>> > Hi,
>>>>> >
>>>>> > what does the stack size of a compiled method depend on?
>>>>> > I am trying to figure out why the old compiler and the Opal compiler
>>>>> > generate different compiled method headers.
>>>>> > I think this comes from a wrong stack size computed by Opal, but I
>>>>> > can not figure out how the stack size is computed.
>>>>> >
>>>>> > Old Compiler
>>>>> > PolygonMorph>>#lineSegmentsDo:
>>>>> > header -> "primitive: 0
>>>>> > numArgs: 1
>>>>> > numTemps: 3
>>>>> > numLiterals: 23
>>>>> > frameSize: 56"
>>>>> >
>>>>> > Opal compiler:
>>>>> > PolygonMorph>>#lineSegmentsDo:
>>>>> > header -> "primitive: 0
>>>>> > numArgs: 1
>>>>> > numTemps: 3
>>>>> > numLiterals: 23
>>>>> > frameSize: 16"
>>>>> >
>>>>>
>>>>
>>>> Eliot (phone)
>>>>
>>>
>>
>> --
>> best,
>> Eliot
>>
>
