Dear David,

On Thu, Sep 2, 2010 at 6:20 AM, David Matthews
<[email protected]> wrote:
> Dear Yue,
> You've obviously made quite a lot of progress.
>
   Thanks! It was a pleasant experience to read the Poly/ML's compiler code.

> Yue Li wrote:
>>
>> 1. In the printed IR of a program, seems the print out has two
>> sections, for example:
>>
>>> 2 * 3;
>>
>> BLOCK(DECL #1{0 uses} =
>>  GLOBAL (FUN "run_call2(1)",
>>      LAMBDAINLINE(
>>       run_call2(1)
>>       CL=false CR=0 LEV=0 LOCALS=0
>>       ARGS=G
>>       ARGLIVES=
>>       RES=G CLOS=()
>>       LOCAL(1,1) G $(INDIRECT(0, PARAM(0,1)) G, INDIRECT(1,
>> PARAM(0,1)) G)){LAMBDA}
>>      ) (*GLOBAL*){early} G $(LIT <long> G);
>>  DECL #2{0 uses} = LOCAL(0,1);
>>  MUTUAL();
>>  RECCONSTR(LOCAL(0,2))
>>  )
>> BLOCK(DECL #1{0 uses} =
>>  GLOBAL (FUN "run_call2(1)",
>>      LAMBDAINLINE(
>>       run_call2(1)
>>       CL=false CR=0 LEV=0 LOCALS=0
>>       ARGS=G
>>       ARGLIVES=
>>       RES=G CLOS=()
>>       LOCAL(1,1) G $(INDIRECT(0, PARAM(0,1)) G, INDIRECT(1,
>> PARAM(0,1)) G)){LAMBDA}
>>      ) (*GLOBAL*){early} G $(LIT <long> G);
>>  DECL #2{0 uses} = LOCAL(0,1);
>>  MUTUAL();
>>  RECCONSTR(LOCAL(0,2))
>>  )
>> POLY_SYS_amul{early} G $(LIT2 G, LIT3 G)
>
> I've had to go back to the code to try to understand what's going on here.
>  It is actually printing the tree twice and that's a mistake.  I added code
> to print the tree in the CODETREE functor, I think because in a very few
> cases it wasn't being printed, and omitted to remove the original printing
> code.  I'll fix that.
>
   OK.

>> The first section contains two same blocks, and always the block is
>> duplicated in the first section for every input ML program. The second
>> section contains an expression. It seems that the second expression is
>> included in the blocks of the first section, and I'm a little bit
>> confused here, so why the first section duplicates the blocks, and why
>> the expression
>>
>> POLY_SYS_amul{early} G $(LIT2 G, LIT3 G)
>>
>> is separately printed out?
>
> This is actually the output of the first phase of processing of the
> generated code-tree.  In this case the original code contained some inline
> functions which have been stripped out leaving this code.  It will actually
> be code-generated and evaluated by the compiler to reduce it to a constant.
>  That's not going to provide any saving in this particular case since it's
> going to be evaluated only once but it may well do if this occurred inside a
> more complex function.
>

   I see, no wonder such optimized expression is generated for some
program, and is not generated for some other of my examples.

>> 2. TAGTEST(x, y, z) is used in several different places such as
>> generation of the boolean constants true and false, or testing whether
>> a list is empty, but I still couldn't find documentation of TagTest
>> from the source code, and I'm wondering what is the semantics of
>> TAGTEST?
>
> TAGTEST is really just an equality operation.  It occurs primarily with
> datatypes but that does include the bool type which is essentially
> datatype bool = false | true
> TAGTEST(c1, c2, exp)  is essentially the same as "c1 = exp".  c2 is the

   I see.

> maximum value of the tag (i.e. one less than the number of tags since
> they're numbered from zero) and is used if a sequence of tests is converted
> into an indexed jump.
>
   Ok, no problem.


>> 3. When I printed the IR of a signature, say:
>>
>>> signature Set =
>>
>> # sig
>> #   type element
>> #   type set
>> #   val empty : set
>> # end;
>> BLOCK(LIT0)
>> BLOCK(LIT0)
>>
>> I suppose here again BLOCK(LIT0) is duplicated, and I'm wondering why
>> this signature is represented like this simple instead of using a
>> tuple whose elements are the decls, and its internal representations?
>
> Declaring a signature doesn't generate any code so it's putting in zero here
> as a place-holder.

   Are signatures preserved only until type checking phase before the
code generation phase?

>Try declaring a structure.
>
    Yes, and I got a tuple containing IR for each function/value
definitions of the structure.

>> 4. I found some documentation about the members of LAMBDA, but still
>> couldn't figure out how
>> CL, CR and LOCALS are determined. In all my testing cases, always
>> CL=false, and CR=0. Also for LOCALS, I thought it indicates the times
>> that local variables are used. For instance:
>
>> Here I don't quite understand why LOCALS in the first Block is always
>> 0, and why the LOCALS below is 3.
>
> There are some values that are only used in later phases of the
> code-generator.  It's simpler to have a single datatype and printer for the
> code-tree for all the phases and have the early phases just set the unused
> values to zero.  CL, CR, LEV, LOCALS, CLOS and ARGLIVES are all of this
> sort.

   I see, seems these flags are set with initial values in the
generated blocks (the duplicated ones), and are computed in the
stripped version later.

>  ARGS and RES are set in the initial phase.  ARGS defines the the
> number and type of the arguments: F for Floating point and G for General
> (i.e. everything else) and RES the type of the result, again either F or G.
>  ML functions are all expected to have precisely one argument and one result
> but the front end generates LAMBDAs that take more arguments for curried
> functions.
>

  Could you introduce a little bit on how CR CL and LOCALS are
determined in the later phase of code generation? Thanks again!

Best,

Yue


> David
>
_______________________________________________
polyml mailing list
[email protected]
http://lists.inf.ed.ac.uk/mailman/listinfo/polyml

Reply via email to