Re: [Pharo-project] Some bugs

Eliot Miranda Tue, 06 Sep 2011 14:50:22 -0700

On Tue, Sep 6, 2011 at 2:37 PM, Eliot Miranda <[email protected]>wrote:


>
>
> On Tue, Sep 6, 2011 at 2:10 PM, Michael Roberts <[email protected]> wrote:
>
>> Hi Eliot,
>>
>> Nicolas' change that we put on the tracker didn't really work as
>> intended (Nicolas did you see that?), but i'll let him respond to the
>> points he was making.
>>
>> For me (point #1) It would be better for you to try and tell us the
>> classes/methods that are either obviously broken or need the attention
>> to get to a series of fixes.
>>
>> e.g., ignoring the loop highlight for the moment, the pc mapping goes
>> wrong right at the start of this example:
>>
>> toDoOutsideTemp
>>        | temp collection |
>>        collection := OrderedCollection new.
>>        1 to: 5 do: [ :index |
>>                temp := index.
>>                 collection add: [ temp ] ].
>>
>> by highlighting pc=2 (IIRC) which is the assignment, but the first
>> highlight should be pc=3 which is the send of #new. It doesn't
>> correctly map/avoid the extra instructions for the array
>> initialisation for copying the values outside the block. is it
>> possible to fix that first? where do we look?
>>
>
> Well, this works in Squeak.  The pc ranges must be determined after
> generating the method *with closure analysis*.  i.e. the code ranges must be
> derived form the same resulting bytecode.  To derive 2 for the pc of the
> send of new the system I guess must be compiling for source ranges
> differently from normal compilation.  Haver you compares Squeak to Pharo for
> this example?
>

and then things get better, but not ideal, if the send of
startOfLastStatement: is removed from
Parser>>statements:innerBlock:blockNode:, i.e. if

hereType ~~ #rightBracket ifTrue:
[[theBlockNode startOfLastStatement: (start := self startOfNextToken).
  (returns := self matchReturn)
 ...
becomes
hereType ~~ #rightBracket ifTrue:
[[start := self startOfNextToken.
  (returns := self matchReturn)
 ...

I think my confusion was that I wanted to remember in the block node the
/end/ of the last statement, not the beginning, and I wasn't thinking at all
clearly at the time.  Looks like startOfLastStatement in BlockNode should be
removed.

With the above change the debugger highlights the "to: limit do:", but not
the entire block as it would for a non-optimized block.  For me, I want the
debugger to highlight the entire thing, from to: expr though to the closing
']'.  So I think more needs to be done (i.e. note the position of the
closing ']' and pass this to the BlockNode).


>> I am just wondering how this can be approached in stages if possible.
>> i know it is all related at some level.  stepping into the code/method
>> that does the majority of the pc mapping is a funny experience ;-)
>> because the method is structured in such a way to expose a number of
>> these bugs.
>>
>> for point #2 ideally we would not ask for a step and have nothing
>> happen. However i wonder if we can get the highlight areas correct,
>> even if we have to "step" through hidden parts. This assumes we can
>> suppress the highlight for the hidden bits which i think we already
>> have...but only if we can do one without the other.  I would then
>> write tests for the highlights, and we would have some protection for
>> further changes.
>>
>
> Right.  First thing is to get the source ranges correct.  Next thing is to
> modify the debugger so that step has a visible effect, i.e. that the
> debugger steps until the source range changes.
>
>
>>
>> thanks,
>> Mike
>>
>>
>> On Mon, Sep 5, 2011 at 5:14 PM, Eliot Miranda <[email protected]>
>> wrote:
>> > Hi Nicolas,
>> >     that's quite a lot to absorb.  Exactly what do you want me to
>> review?
>> >  The first thing it seems to me is to define what we want to be
>> highlighted
>> > at different bytecodes in an optimized loop.  A second thing is to think
>> > about how the debugger can avoid stepping through the hidden sends that
>> are
>> > part of a loop.  What I mean is that
>> >     1 to: n do: [:i| ....]
>> > is actually
>> >      i := 1.
>> >      [i <= n] whileTrue: [
>> > and so the debugger steps through i:= 1 and i <= n, even though these
>> aren't
>> > visible in the source.  That's why one has to click several times to get
>> > into the body of the loop.  The debugger could for example use the
>> source
>> > ranges for the bytecodes to step until the source range changes.  So if
>> > there is the same source range info for all the bytecodes involved in
>> the
>> > loop iteration (not the loop body) the debugger will be able to respond
>> to
>> > step properly.  What do you think?
>> >
>> > [sorry for not having responded sooner; esug and personal issues have
>> got in
>> > the way. apologies]
>> > On Thu, Aug 25, 2011 at 1:06 PM, Nicolas Cellier
>> > <[email protected]> wrote:
>> >>
>> >> A few more notes:
>> >>
>> >> - Encoder>>rawSourceRanges answer a mapping AST Node -> sourceRange
>> >>  Gnerally, I would expect sourceRange to be an Interval, but this has
>> >> to be confirmed.
>> >>  Theses ranges are constructed at source code Parse time (see senders
>> >> of #noteSourceRange:forNode:)
>> >> - The program counters are assigned to AST nodes at byte code #generate
>> >> time
>> >> - for inlined macros, like #to:do: in our case, this pc is after the
>> >> initialize and test statements.
>> >>
>> >> in CompiledMethod>>#rawSourceRangesAndMethodDo:
>> >> I just evaluated this:
>> >>
>> >> methNode encoder rawSourceRanges collect: [:ival |
>> >>        sourceText copyFrom: ival first to: ival last]
>> >>
>> >> And what I see is that the original to:do: message (before macros
>> >> inlining) has the right range
>> >> {1
>> >>        to: 5
>> >>        do: [:index |
>> >>                temp := index.
>> >>                collection
>> >>                        add: [temp]]}->a Text for 'to: 5 do: [ :index |
>> >>                temp := index.
>> >>                collection add: [ temp ] ]'
>> >> This node is the original #to:do: and has pc=42
>> >>
>> >> There is also
>> >> {a LeafNode}->a Text for '[ :index |
>> >>                temp := index.
>> >>                collection add: [ temp ] ]'
>> >> which has a nil pc so it won't be taken into account in the source map.
>> >>
>> >> But one of the messages produced by the inlining has this curious
>> range:
>> >> {index <= 5}->a Text for 'to: 5 do: [ :index |
>> >>                temp := index.
>> >>                c'
>> >> This node has pc = 40, so it will be selected before the correct one
>> >> above has a chance to be.
>> >> {index <= 5} is the test statement produced by inlining, and this
>> >> seems to be the incorrect highlighting we see.
>> >> Thus we know we have to concentrate on macro inlining.
>> >> This happens in MessageNode>>transformToDo:
>> >> We'll see this.
>> >>
>> >> In the interim, I played with eliminating the nodes having
>> unitilialized
>> >> pc
>> >> Let us try to evaluate this snippet in debugger's inspector:
>> >> (methNode encoder rawSourceRanges keys select: [:e | e pc > 0])
>> >> collect: [:node |
>> >>        | ival |
>> >>        ival := methNode encoder rawSourceRanges at: node.
>> >>        node -> (sourceText copyFrom: ival first to: ival last)]
>> >>
>> >> Oh god, a bug in the debugger:
>> >> DoItIn: homeContext
>> >>        ^ ((homeContext namedTempAt: 2) encoder rawSourceRanges keys
>> >>                select: [:e | e pc > 0])
>> >>                collect: [:node |
>> >>                        | ival |
>> >>                        ival := (node namedTempAt: 2) encoder
>> >> rawSourceRanges at: node.
>> >>                        node
>> >>                                -> (sourceText copyFrom: ival first to:
>> >> ival last)]
>> >> (node namedTempAt: 2) does not mean a thing...
>> >> A confusion occurred between the homeContext (DoItIn: method argument)
>> >> and node (the block argument)
>> >> Nevermind... forget about it.
>> >>
>> >> Now let's just concentrate on MessageNode>>transformToDo:
>> >> We see this code:
>> >>        test := MessageNode new
>> >>                                receiver: blockVar
>> >>                                selector: (increment key > 0 ifTrue:
>> [#<=]
>> >> ifFalse: [#>=])
>> >>                                arguments: (Array with: limit)
>> >>                                precedence: precedence from: encoder
>> >>                                sourceRange: (myRange first to:
>> blockRange
>> >> first).
>> >>
>> >> So the intention seems to select 'to: 5 do: '
>> >>
>> >> But we see this:
>> >> BlockNode>>noteSourceRangeStart:end:encoder:
>> >>        "Note two source ranges for this node.  One is for the debugger
>> >>         and is of the last expression, the result of the block.  One is
>> >> for
>> >>         source analysis and is for the entire block."
>> >>        encoder
>> >>                noteSourceRange: (start to: end)
>> >>                forNode: self closureCreationNode.
>> >>        startOfLastStatement
>> >>                ifNil:
>> >>                        [encoder
>> >>                                noteSourceRange: (start to: end)
>> >>                                forNode: self]
>> >>                ifNotNil:
>> >>                        [encoder
>> >>                                noteSourceRange: (startOfLastStatement
>> to:
>> >> end - 1)
>> >>                                forNode: self]
>> >>
>> >> So it seems intentional to select only the last instruction of the
>> >> block for the debugger.
>> >> We'd better not change this without prior asking Eliot.
>> >> But obviously, this is not what is expected by the #transformToDo:
>> >>
>> >> Nicolas
>> >>
>> >> 2011/8/25 Stéphane Ducasse <[email protected]>:
>> >> > tx nicolas this is a cool way to help :)
>> >> >
>> >> > Stef
>> >> >
>> >> > On Aug 24, 2011, at 9:54 PM, Nicolas Cellier wrote:
>> >> >
>> >> >> So the entries of interest for highlighting are
>> >> >>
>> >> >> Debugger>>contentsSelection
>> >> >> Debugger>>pcRange
>> >> >> CompiledMethod>>debuggerMap
>> >> >> DebuggerMethodMap class>>forMethod:
>> >> >> DebuggerMethodMap>>rangeForPC:contextIsActiveContext:
>> >> >>
>> >> >> Then you see the DebuggerMethodMap>>forMethod:methodNode: takes both
>> a
>> >> >> CompiledMethod and its #methodNode.
>> >> >> CompiledMethod>>methodNode invokes the Parser to get the
>> >> >> AbstractSyntaxTree from method source, and if it ever fails ends up
>> by
>> >> >> trying to decompile the byteCodes.
>> >> >>
>> >> >> This is the easy part. Now we to deal with #abstractPCForConcretePC:
>> >> >> and #abstractSourceMap.
>> >> >>
>> >> >> By reading CompiledMethod>>abstractPCForConcretePC: you should
>> quickly
>> >> >> understand that a concrete PC is a byte offset of current byteCode
>> >> >> (the offsets displayed in the byteCode view) while the abstractPC is
>> >> >> just the rank of current byteCode in the list of byteCodes
>> >> >> instructions composing the CompiledMethod. This is just because
>> >> >> "byteCodes" may spread on several bytes beside their name...
>> >> >> This will use InstructionStream and InstructionClient which are just
>> >> >> an iterator and a sort of visitor on byteCode instructions.
>> >> >> So this is not really interesting.
>> >> >>
>> >> >> The more interesting part is #abstractSourceMap
>> >> >> There is a first step to obtain
>> >> >> CompiledMethod>>rawSourceRangesAndMethodDo:
>> >> >> This is the most important part.
>> >> >> The rest is again a mapping from concretePC (instruction byte
>> offset)
>> >> >> to abstractPC (instruction rank).
>> >> >> And some build of a dictionary mapping instruction rank (abstractPC)
>> >> >> -> selected range.
>> >> >>
>> >> >> Note that the last trick seems to use a regenerated CompiledMethod
>> >> >> (theMethodToScan) rather than the original CompiledMethod. There is
>> no
>> >> >> assertion whether these two are equivalent or not. A priori, they
>> >> >> should, unless the Compiler changed since last compilation or if its
>> >> >> behaviour is affected by some Preferences... Would we introduce some
>> >> >> customizable Compiler optimizations that this could become a problem
>> >> >> (We would then add to map decompiled AST to source code AST,
>> probably
>> >> >> with guesses, unless the CompiledMethod would contain debugger
>> >> >> friendly hints...)
>> >> >> We will consider this is not a problem by now.
>> >> >>
>> >> >> So let's now concentrate on rawSourceRangesAndMethodDo:
>> >> >> The nice thing is that you now can just debug this
>> >> >>
>> >> >> (ClosureTests>>#testToDoOutsideTemp) methodNode
>> >> >> rawSourceRangesAndMethodDo: [:rs :mth | ]
>> >> >>
>> >> >> and see how it goes in Squeak. I did not look in Pharo yet, but I
>> >> >> would be amazed to see it much different.
>> >> >> It's now late, and my spare time is off, but you have clues to get
>> >> >> more insights. I wish you good debugging, and come back to me if it
>> >> >> ever goes in deeper complications.
>> >> >>
>> >> >> Cheers
>> >> >>
>> >> >> Nicolas
>> >> >>
>> >> >> 2011/8/24 Michael Roberts <[email protected]>:
>> >> >>>
>> >> >>>>
>> >> >>>> Ok I'm curious to know then.
>> >> >>>
>> >> >>> Here is a little trace from this example method:
>> >> >>>
>> >> >>> toDoOutsideTemp
>> >> >>> | temp collection |
>> >> >>> collection := OrderedCollection new.
>> >> >>> 1 to: 5 do: [ :index |
>> >> >>> temp := index.
>> >> >>> collection add: [ temp ] ]
>> >> >>>
>> >> >>> Trace is start,stop position of the highlight for each 'step over'.
>> >> >>>
>> >> >>> Whilst the numbers are hard to visualise, below you can see how
>> they
>> >> >>> slightly diverge.
>> >> >>> Left Pharo  Right  Squeak
>> >> >>>
>> >> >>> 50, 73     71, 73       diff
>> >> >>> 71, 73     71, 73
>> >> >>> 50, 73     50, 73
>> >> >>> 108, 115   79, 121      diff
>> >> >>> 79, 121    79, 121
>> >> >>> 108, 115   108, 115
>> >> >>> 132, 144   132, 144
>> >> >>> 147, 146   146, 146     (diff negative size means no highlight)
>> >> >>> 146, 146   146, 146
>> >> >>> 79, 121    79, 121
>> >> >>> 108, 115   108, 115
>> >> >>> 132, 144   132, 144
>> >> >>> 147, 146   146, 146
>> >> >>> 146, 146   146, 146
>> >> >>> 79, 121    79, 121
>> >> >>> 108, 115   108, 115
>> >> >>> 132, 144   132, 144
>> >> >>> 147, 146   146, 146
>> >> >>> 146, 146   146, 146
>> >> >>> 79, 121    79, 121
>> >> >>> 108, 115   108, 115
>> >> >>> etc...
>> >> >>> For example the first difference is because Pharo shows the whole
>> >> >>> assignment
>> >> >>> of the first line as the first send, even though it is not.
>> >> >>> The second difference is that Pharo shows the assignment inside the
>> >> >>> block as
>> >> >>> the first highlight of the loop even though the to:do should be
>> >> >>> highlighted....but both Pharo & Squeak get the to:do: wrong when
>> they
>> >> >>> choose
>> >> >>> to show it.
>> >> >>> hope you get the idea...
>> >> >>> Mike
>> >> >>
>> >> >
>> >> >
>> >> >
>> >>
>> >
>> >
>> >
>> > --
>> > best,
>> > Eliot
>> >
>>
>>
>
>
> --
> best,
> Eliot
>
>


-- 
best,
Eliot

Re: [Pharo-project] Some bugs

Reply via email to