[Pharo-dev] Esteban's ChangeLog week of 14 January 2019

2019-01-20 Thread estebanlm
Hello!

This is my weekly ChangeLog, from 14 January 2019 to 20 January 2019.
You can see it in a better format by going here: 
http://log.smallworks.eu/web/search?from=14/1/2019&to=20/1/2019

ChangeLog
=========

14 January 2019:


* I just promoted a new VM as stable for Pharo 7.0.

It is "201901051900-7a3c6b6" which is not the latest-latest (which is from 
10/01), but it seems very likely to be stable enough :) 


cheers! 
Esteban



Re: [Pharo-dev] DebugSession>>activePC:

2019-01-20 Thread Eliot Miranda
Hi Marcus,

On Fri, Jan 18, 2019 at 5:42 AM Marcus Denker 
wrote:

>
> > On 18 Jan 2019, at 14:26, ducasse  wrote:
> >
> > I simply love the dynamic rewriting this is just too cool. We should
> systematically use it.
> > I will continue to use it in any deprecation.
> >
>
> On my TODO list is to make it stand-alone and provide it as a “compatibility
> transform”, too.
>
> So we can add it to methods that we want to keep for compatibility, but
> they will nevertheless transform the code automatically
> (this might then be disabled in production so that nothing is transformed).
>
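
(As an aside, a minimal sketch of such a transforming deprecation in the
Pharo 7 style; #foo and #bar are placeholder selectors, purely illustrative:)

	foo
		"Hypothetical deprecated method: the rewrite rule patches the
		sender's source from #foo to #bar, then forwards the call."
		self
			deprecated: 'Use #bar instead'
			transformWith: '`@receiver foo' -> '`@receiver bar'.
		^ self bar
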
> > Now I have a simple question (You can explain it to me over lunch one of
> these days).
> >
> > I do not get why the RB AST would not be a good representation for the
> compiler?
> > I would like to know what is the difference.
> >
> I think it is a good one. I have not yet seen a reason why not. But
> remember, Roel left Squeak because his visitor pattern for the compiler was
> rejected as a dumb idea… so there are definitely different views on core
> questions.
>
> E.g. the RB AST is annotated, and the whole thing for sure uses a bit more
> memory than the compiler designed for a machine from 1978.
>
> > You mean that, before, going from BC to AST was difficult?
>
> You need to do the mapping somehow: the compiler needs to remember the BC
> offset in the code generation phase, and the AST (somehow) needs to store
> that information (either in every node or in some table).
>
> > How does Opal perform it? It does not use the source of the method to
> recreate the AST, but it can do it from the BC?
> >
>
> It uses the IR (which I am still not 100% sure about; it came from the old
> “ClosureCompiler” design), and it turned out to be quite useful, for example
> for the mapping: every IR node retains the offset of the BC it creates,
> and the IR nodes retain the AST node that created them.
>
> -> so we just do a query: “IRMethod, give me the IRInstruction that
> created BC offset X”, then “IRInstruction, which AST node created you?”,
> then ask the AST node: “what is your highlight interval in the source?”
>
> The devil is in the detail, as one IR node can produce multiple bytecode
> offsets (and bytecodes) and one bytecode might be created by two IR
> nodes, but it does seem to work with some tricks,
> which I want to remove by improving the mapping and the IR even more…
> there is even the question: do we need the IR? Could we not do it simpler?
>
> The IR was quite nice back when we tried to do things with bytecode
> manipulation (Bytesurgeon); now it feels a bit like overkill. But it
> simplifies e.g. the BC mapping.
>
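
(For illustration, the query chain above as a Smalltalk sketch;
instructionForBCOffset:, sourceNode and sourceInterval are assumed
selectors, not necessarily Opal's real API:)

	sourceRangeForBC: bcOffset in: anIRMethod
		| instruction astNode |
		"IRMethod, give me the IRInstruction that created BC offset X"
		instruction := anIRMethod instructionForBCOffset: bcOffset.
		"IRInstruction, which AST node created you?"
		astNode := instruction sourceNode.
		"AST node, what is your highlight interval in the source?"
		^ astNode sourceInterval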

I find Bytesurgeon functionality, specifically a bytecode dis/assembler,
very useful, but wouldn't use it for the back end of the bytecode
compiler.  It adds overhead that has no benefit.  But I use my version,
MethodMassage, for lots of things:

- transporting compiled methods from one dialect to another, e.g. to do
in-image JIT compilation of a method from Pharo in Squeak.
- generating JIT test cases.
- generating methods that can't be generated from Smalltalk source, e.g. an
accessor for a JavaScript implementation above Smalltalk where inst var 0
is the prototype slot and nil is unbound, and so in a loop one wants to
fetch the Nth inst var from a temporary initialized to self, and if the
value is non-nil return it, otherwise set the temporary to
the prototype slot, hence walking up the prototype chain until an
initialized inst var is found.

I based mine around the messages that InstructionStream sends to the client
in the interpretFooInstructionFor: methods; a client that catches
doesNotUnderstand: then forms the basis of the disassembler.  Simple and
light-weight.
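
(A minimal sketch of that approach, using only InstructionStream's existing
interpretNextInstructionFor: protocol; BytecodeLogger is a hypothetical
class name, not part of MethodMassage:)

	Object subclass: #BytecodeLogger
		instanceVariableNames: ''
		classVariableNames: ''
		package: 'MethodMassage-Sketch'

	BytecodeLogger >> doesNotUnderstand: aMessage
		"Every decoded instruction arrives as a message send
		(pushReceiver, send:super:numArgs:, ...); log it as one
		disassembled line instead of failing."
		Transcript
			show: aMessage selector; show: ' ';
			show: aMessage arguments printString; cr

	"Driving it over a method:"
	| scanner logger |
	scanner := InstructionStream on: Integer >> #factorial.
	logger := BytecodeLogger new.
	[ scanner atEnd ] whileFalse:
		[ scanner interpretNextInstructionFor: logger ]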

> Marcus

_,,,^..^,,,_
best, Eliot


Re: [Pharo-dev] DebugSession>>activePC:

2019-01-20 Thread Eliot Miranda
Hi Marcus,

On Fri, Jan 18, 2019 at 5:15 AM Marcus Denker via Pharo-dev <
pharo-dev@lists.pharo.org> wrote:

>
> > On 11 Jan 2019, at 20:28, Eliot Miranda  wrote:
> >
> > Hi Thomas,
> >
> >  forgive me, my first response was too terse.  Having thought about it
> in the shower it becomes clear :-)
> >
> >> On Jan 11, 2019, at 6:49 AM, Thomas Dupriez <
> tdupr...@ens-paris-saclay.fr> wrote:
> >>
> >> Hi,
> >>
> >> Yes, my question was just of the form: "Hey there's this method in
> DebugSession. What is it doing? What's the intention behind it? Does
> someone know?". There was no hidden agenda behind it.
> >>
> >> @Eliot
> >>
> >> After taking another look at this method, there's something I don't
> understand:
> >>
> >> activePC: aContext
> >> ^ (self isLatestContext: aContext)
> >>ifTrue: [ interruptedContext pc ]
> >>ifFalse: [ self previousPC: aContext ]
> >>
> >> isLatestContext: checks whether its argument is the suspended context
> (the context at the top of the stack of the interrupted process). And if
> that's true, activePC: returns the pc of **interruptedContext**, not of the
> suspended context. These two contexts are different when the debugger opens
> on an exception, so this method is potentially returning a pc for a
> context other than its argument...
> >>
> >> Another question I have to improve the comment for this method is:
> what's the high-level meaning of this concept of "activePC". You gave the
> formal definition, but what's the point of defining this so to speak? What
> makes this concept interesting enough to warrant defining it and giving it
> a name?
> >
> > There are two “modes” in which a pc is mapped to a source range.  One is
> when stepping a context in the debugger (the context is on top and is
> actively executing bytecodes).  Here the debugger stops immediately before
> a send or assignment or return, so that for sends we can step into or over,
> or for assignments or returns check stack top to see what will be assigned
> or returned.  In this mode we want the pc of the send, assign or return to
> map to the source range for the send, or the expression being assigned or
> returned.  Since this is the “common case”, and since this is the only
> choice that makes sense for assignments and returns, the bytecode
> compiler constructs its pc to source range map in terms of the pc of the
> first byte of the send, assign or return bytecode.
> >
> > The second “mode” is when selecting a context below the top context.
> The pc for any context below the top context will be the return pc for a
> send, because the send has already happened.  The compiler could choose to
> map this pc to the send, but it would not match what works for the common
> case.  Another choice would appear to be to have two map entries, one for the
> send and one for the return pc, both mapping to the source range.  But this
> wouldn’t work because the result of a send might be assigned or returned,
> and so there is a potential conflict.  Instead the reasonable solution is
> to select the previous pc for contexts below the top context, which will
> be the pc for the start of the send bytecode.
> >
>
>
> I checked with Thomas
>
> -> for source mapping, we use the API of the method map. The map does the
> “get the mapping for the instruction before” itself; it just needs to be
> told that we are asking for the range of an active context:
>
> #rangeForPC:contextIsActiveContext:
>
> It is called like this:
>
> ^ aContext debuggerMap
>     rangeForPC: aContext pc
>     contextIsActiveContext: (self isLatestContext: aContext) ]
>
> So the logic was moved from the debugger to the map (I think this is even
> your design?), and thus the logic inside the debugger is not needed
> anymore.
>
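
(For completeness, a sketch of what the “instruction before” lookup amounts
to for a context below the top; pcPreviousTo: is an assumed helper selector,
not necessarily the real API:)

	previousPC: aContext
		"aContext is below the top of the stack, so its pc is the return
		pc of the send in progress; answer the pc of the send bytecode
		itself, i.e. the instruction immediately before the return pc."
		^ aContext method pcPreviousTo: aContext pc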

"Design" is giving my code a little too much respect.  I was desperately
trying to get something to work to be able to deploy Cog with the new
closure model.  I happily admit that DebuggerMethodMap in Squeak is ugly
code.  It had to be extended recently to handle full blocks. But it would
be great to rewrite it.

I dream of a harmonisation of the Squeak/Pharo/Cuis execution classes such that
we have the same Context, CompiledCode, CompiledBlock, CompiledMethod,
debuggerMap and BytecodeEncoder (which is effectively the back end of the
compiler that generates bytecode, and the interface to the debugger when
bytecode is analysed or executed in the debugger), which would make my life
easier maintaining the VM and execution classes, especially as we introduce
Sista.  I think Opal as a separate compiler is great; the work you've done
with mustBeBooleanMagic is superb.  At the same time the Squeak compiler is
fine for its job, and I still find it easier to understand than Opal
(probably because I'm not spending enough time with it).

One major reason for the harmonization is bug fixes.  Right now I use
SistaV1 and 64-bits as my default work environment, in a 64-bit Squeak
image that was compiled with V3PlusClosures, so that half the