Re: pdd21 vs. find_global

2006-07-02 Thread Patrick R. Michaud
On Sat, Jul 01, 2006 at 05:10:59PM -0500, Chip Salzenberg wrote:
 Darn, find_global has collided with pdd21.
 
 Currently find_global is prepared to accept a key or a namespace, and
 distinguishing namespaces from arrays is starting to get just a little
 too polymorphic for an opcode.

Agreed.  And find_global gets a bit overloaded anyway.

 you change to
 
 $P99 = get_namespace key_or_array
 $P0 = $P99['foo']
 
 which also incidentally encourages(!) compilers to cache namespace pointers.

Ooh.  I like it very much!

Pm


Re: [perl #39696] perl6 makefile should relocate

2006-07-03 Thread Patrick R. Michaud
On Mon, Jul 03, 2006 at 07:37:00PM -0700, Will Coleda wrote:
 # New Ticket Created by  Will Coleda 
 # Please include the string:  [perl #39696]
 # in the subject line of all future correspondence about this issue. 
 # URL: https://rt.perl.org/rt3/Ticket/Display.html?id=39696 
 
 
 Currently the perl6 makefile is in the root parrot config dir:
 config/gen/makefiles/perl6.in
 
 It should move closer to home: languages/perl6/config/root.in

Feel free to make this switch.  :-)

Pm


Re: a smarter form of whitespace

2006-07-05 Thread Patrick R. Michaud
On Tue, Jul 04, 2006 at 12:57:16PM -0700, Allison Randal wrote:
 --
 
 token start { ^<emptyline>*$ }
 
 regex emptyline { ^^ $$ \n }
 
 token ws { [<sp> | \t]* }
 
 --

The above grammar doesn't have a grammar statement; as a result
the regexes are being installed into the '' namespace.

 If I match this against a string of 7 newlines, it returns 7 emptyline
 matches, and each match is a single newline. This is the behavior I want
 for newlines.

I tried it with a grammar statement and it seems to work:



$ cat ar.pg
grammar XYZ;

token start { ^<emptyline>*$ }

rule emptyline { ^^ $$ \n }

token ws { [<sp> | \t]* }

$ ./parrot compilers/pge/pgc.pir ar.pg ar.pir
$ cat xyz.pir
.sub main :main
load_bytecode 'PGE.pbc'
load_bytecode 'ar.pir'
load_bytecode 'dumper.pbc'
load_bytecode 'PGE/Dumper.pbc'

$P0 = find_global 'XYZ', 'start'
$P1 = $P0("\n\n\n\n\n\n\n", 'grammar' => 'XYZ')

'_dumper'($P1)
.end
$ ./parrot xyz.pir
"VAR1" => PMC 'XYZ' => "\n\n\n\n\n\n\n" @ 0 {
    <emptyline> => ResizablePMCArray (size:7) [
        PMC 'XYZ' => "\n" @ 0,
        PMC 'XYZ' => "\n" @ 1,
        PMC 'XYZ' => "\n" @ 2,
        PMC 'XYZ' => "\n" @ 3,
        PMC 'XYZ' => "\n" @ 4,
        PMC 'XYZ' => "\n" @ 5,
        PMC 'XYZ' => "\n" @ 6
    ]
}
$ 

-

Pm
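
The seven-empty-lines behaviour above can be reproduced by analogy in Python's re module, where re.MULTILINE makes ^ and $ act as line anchors, roughly like ^^ and $$ in the grammar. This is only an analogy under that assumption, not PGE itself; the names EMPTY_LINE and empty_lines are made up for the sketch.

```python
import re

# With re.MULTILINE, ^ and $ match at line boundaries, which is the role
# ^^ and $$ play in the emptyline rule. This is Python's engine, not PGE.
EMPTY_LINE = re.compile(r"^$\n", re.MULTILINE)


def empty_lines(text):
    """Return (offset, matched text) for every empty line found in text."""
    return [(m.start(), m.group()) for m in EMPTY_LINE.finditer(text)]


matches = empty_lines("\n" * 7)
for offset, matched in matches:
    print(offset, repr(matched))
```

Each match is a single newline at successive offsets, which mirrors the shape of the dumper output in the message.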


Re: a smarter form of whitespace

2006-07-06 Thread Patrick R. Michaud
On Thu, Jul 06, 2006 at 12:29:12AM -0700, Allison Randal wrote:
 $ cat xyz.pir
 .sub main :main
 load_bytecode 'PGE.pbc'
 load_bytecode 'ar.pir'
 load_bytecode 'dumper.pbc'
 load_bytecode 'PGE/Dumper.pbc'
 
 $P0 = find_global 'XYZ', 'start'
 $P1 = $P0("\n\n\n\n\n\n\n", 'grammar' => 'XYZ')
 
 What the original didn't have is the 'grammar' named argument when 
 calling the start rule. When I replace the previous line with:
 
   $P1 = $P0("\n\n\n\n\n\n\n")
 
 then your sample code exhibits the same problem. I assume this means 
 that the reason overriding ws wasn't working is because it was calling 
 the default version of ws in the root namespace. But, if it was 
 defaulting to the root namespace, why was it able to find any of the 
 rules? Shouldn't it have complained that it couldn't find emptyline?

At the moment (and this may be incorrect), PGE looks for named rules
via inheritance, and if not found that way it looks in the available
symbol tables using the find_name opcode.

So, the match was able to find the rules because they are in the 
current namespace, but when it came time to find the rule for <?ws> 
there was a ws method available (the default) and so that one
was used.

Again, this may not be the correct behavior; I've been using S12 as
the guide here, in that a method call first considers methods from
the class hierarchy and fails over to subroutine dispatch.

Pm
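
The lookup order described in this message (methods from the class hierarchy first, then the available symbol tables) can be modelled in a few lines of Python. Everything here is hypothetical: the classes Grammar and XYZ, the namespace dict, and find_rule are stand-ins for illustration, not PGE's actual code.

```python
class Grammar:
    """Stand-in for a base grammar class that supplies a default ws rule."""

    def ws(self):
        return "default ws"


class XYZ(Grammar):
    """Hypothetical match class that defines no rules of its own."""


# Hypothetical symbol table playing the role of the current namespace,
# where the user's rules were installed.
namespace = {
    "ws": lambda self: "user ws",
    "emptyline": lambda self: "user emptyline",
}


def find_rule(obj, name, symbols):
    """Prefer a method from the class hierarchy; otherwise use the symbol table."""
    method = getattr(type(obj), name, None)
    if method is not None:
        return method
    return symbols[name]


g = XYZ()
ws_result = find_rule(g, "ws", namespace)(g)
emptyline_result = find_rule(g, "emptyline", namespace)(g)
print(ws_result, "/", emptyline_result)
```

In this model the inherited ws shadows the user's ws, while emptyline is still found through the symbol table, which is the combination reported in the thread.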


Re: I'm pre-hackathoning at OSCON, not post-hackathoning

2006-07-10 Thread Patrick R. Michaud
On Fri, Jul 07, 2006 at 01:36:06PM -0700, Chip Salzenberg wrote:
 I'm unable to hang around Portland after Friday afternoon, I'm sorry to
 report, so Saturday hackathoning will miss me.  However, I will be arriving
 a day _early_ so I'll be in Portland all day Sunday.  I understood Patrick
 to be in a similar situation, so he might be there Sunday too.

Yes, I'm now targeting any hackathoning in Portland to occur on 
the Sunday before OSCON instead of the Saturday after.

Pm


Re: Ruby on Parrot

2006-07-10 Thread Patrick R. Michaud
On Fri, Jul 07, 2006 at 10:07:57AM -0600, Kevin Tew wrote:
 I based the initial PGE grammar for PRuby off of  
 svn://rubyforge.org/var/svn/rubygrammar/grammars/antlr-v3/trunk/ruby.g 
 which is incomplete.
 I'm looking for a BNF style description of the Ruby grammar.  Otherwise 
 I will have to dig into :pserver:[EMAIL PROTECTED]:/src/parse.y.

I'll be glad to provide any help that I can in building a PGE
version of the grammar -- just let me know where I can help.

Pm


Re: [perl #39776] [BUG] PGE core dump

2006-07-10 Thread Patrick R. Michaud
On Sun, Jul 09, 2006 at 07:15:07PM -0700, Kevin Tew wrote:
 ../../parrot ../../compilers/pge/pgc.pir 
 --output=lib/pruby_grammar_gen.pir lib/pruby.pg
 Method 'reduce' not found
 current instr.: 'parrot;PGE::Exp::Quant;reduce' pc 4358 
 (compilers/pge/PGE/Exp.pir:402)
 ...
 pruby.pg is available at http://tewk.com/pruby.pg

This is probably due to a syntax error in the pruby.pg grammar 
itself.  In particular, the line

token EXPONENT { ( e | E ) ( + | - )? <PRubyGrammar::DIGITS> }

should probably read

token EXPONENT { ( e | E ) ( \+ | - )? <PRubyGrammar::DIGITS> }

After making this change on my system the grammar appears to
compile correctly.

I totally agree that PGE probably needs to provide better syntax
error checking in situations such as this, thus I'm leaving this
ticket open, or will add a new more descriptive one soon.

Also, FWIW, I think that the grammar will read much more cleanly
if the PRubyGrammar:: qualifiers are taken out of the rules --
they aren't needed if the initial match call is coded correctly.
(Punie uses these in its grammar, and isn't a good model in this
respect.)

For example, I would write the above EXPONENT token as:

token EXPONENT { ( e | E ) ( \+ | - )? <DIGITS> }

or perhaps better is:

token EXPONENT { <[eE]> <[+\-]>? <DIGITS> }

Thanks,

Pm
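
The same class of mistake (a bare + with nothing in front of it to quantify) exists in other regex dialects. As an analogy only, here is how Python's re module treats the unescaped and escaped forms in verbose mode, plus a character-class variant; \d+ stands in for the DIGITS subrule, and compiles is a helper invented for this sketch.

```python
import re


def compiles(pattern):
    """Report whether Python's re module accepts the pattern in verbose mode."""
    try:
        re.compile(pattern, re.VERBOSE)
        return True
    except re.error:
        return False


# Unescaped: the + is read as a quantifier with nothing before it.
unescaped_ok = compiles(r"( e | E ) ( + | - )? \d+")
# Escaped: the + is a literal plus sign.
escaped_ok = compiles(r"( e | E ) ( \+ | - )? \d+")

# Character-class form, analogous to the last EXPONENT variant above.
EXPONENT = re.compile(r"[eE][+\-]?\d+")

print(unescaped_ok, escaped_ok, bool(EXPONENT.fullmatch("e+10")))
```

Python reports the unescaped form as an error at compile time, which is the kind of diagnostic the message suggests PGE should eventually give.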


Re: Java Script in Parrot

2006-07-10 Thread Patrick R. Michaud
On Sun, Jul 09, 2006 at 04:11:55PM -0700, chromatic wrote:
 On Sunday 09 July 2006 02:15, Vishal Soni wrote:
 
  I am not an expert on which approach is the way to go:
  1. Hack Mozilla's JavaScript execution engine to generate PIR.
 
 If there's a fairly direct correspondence between JS bytecode (if there is 
 such a thing; I have no idea -- whatever internal ops it uses to represent a 
 program to execute), this may be easiest to start.
 
  2. Use the Compiler Tool Chain developed by Parrot Wizards to implement
  JavaScript engine.
 
 This is probably the best long-term approach, at least if you find someone 
 good to write the grammar.  (I hate parsing.)

FWIW, I'm more than happy to help with the grammar, especially if
there's an existing definition to work from.

Pm


Re: HLL, perl6

2006-07-10 Thread Patrick R. Michaud
On Mon, Jul 10, 2006 at 12:10:37AM -0400, Will Coleda wrote:
 I am currently trying to add some PGE to tcl (for the [expr] command,  
 where the optok parsing will be very helpful).
 
 While debugging, I noticed that perl6 isn't using the .HLL directive:  
 I suspect the namespace lookup issues I'm having (and perl6 isn't)  
 might be due to this difference.
 
 Some intrepid coder want to try to switch to using .HLL instead of a  
 simple .namespace directive?

I tried this back in March, but namespace handling at the time
wasn't really up to the task.  But overall namespaces have been
vastly improved since then, so I'll probably make another attempt
at it soon (unless someone else wants to take a crack at it, in
which case I'll be glad to provide any help needed).

Pm


Re: HLL root globals and empty keys (was Re: test of get_namespace opcode)

2006-07-10 Thread Patrick R. Michaud
On Sat, Jul 08, 2006 at 01:57:58PM -0700, Chip Salzenberg wrote:
 Relative is the usual opposite to absolute, but we have a three-way
 logic here, so opposites don't really work.  I think that "hll" is the
 best I can think of, and given the existing .HLL directive, its meaning
 is immediately clear:
 
  .HLL 'perl5', perl5_group
  .namespace ['Foo']
 
  $P0 = get_global 'x'                  # ['perl5';'Foo';'x']
  $P0 = get_global ['Bar'], 'x'         # ['perl5';'Foo';'Bar';'x']
 
  $P0 = get_hll_global 'x'              # ['perl5';'x']
  $P0 = get_hll_global ['Corge'], 'x'   # ['perl5';'Corge';'x']
 
  $P0 = get_abs_global 'x'              # ['x']
  $P0 = get_abs_global ['parrot'], 'x'  # ['parrot';'x']

Pardon me for coming in late to the thread -- this past week I
was on a trip with limited network access and I'm just now catching
up.

What's the status on the above...has it been blessed/implemented yet?
This looks to me like exactly what is needed/desired for the various 
HLL's I'm working with.

Thanks,

Pm
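
The three starting points in the quoted proposal (current namespace, HLL root, absolute root) can be modelled with nested dicts. This is a sketch of the proposed semantics only: the ROOT tree, the stored values, and the Context class are invented for illustration and say nothing about how Parrot implements the opcodes.

```python
# Nested dicts stand in for the namespace tree; the contents are made up.
ROOT = {
    "x": "root x",
    "parrot": {"x": "parrot x"},
    "perl5": {
        "x": "perl5 x",
        "Corge": {"x": "corge x"},
        "Foo": {"x": "foo x", "Bar": {"x": "bar x"}},
    },
}


def lookup(base, key, name):
    """Walk base + key (lists of namespace names) from ROOT, then fetch name."""
    ns = ROOT
    for part in list(base) + list(key):
        ns = ns[part]
    return ns[name]


class Context:
    """Models the effect of .HLL and .namespace on the three lookups."""

    def __init__(self, hll, namespace):
        self.hll = hll
        self.namespace = list(namespace)

    def get_global(self, name, key=()):
        # Relative to the current namespace inside the current HLL.
        return lookup([self.hll] + self.namespace, key, name)

    def get_hll_global(self, name, key=()):
        # Relative to the HLL root namespace.
        return lookup([self.hll], key, name)

    def get_abs_global(self, name, key=()):
        # Relative to the true root.
        return lookup([], key, name)


ctx = Context("perl5", ["Foo"])
print(ctx.get_global("x", ["Bar"]))
print(ctx.get_hll_global("x", ["Corge"]))
print(ctx.get_abs_global("x", ["parrot"]))
```

The three calls resolve to the same paths as the comments in the quoted PIR: ['perl5';'Foo';'Bar';'x'], ['perl5';'Corge';'x'], and ['parrot';'x'].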


Re: Java Script in Parrot

2006-07-10 Thread Patrick R. Michaud
On Mon, Jul 10, 2006 at 09:19:14PM +0100, Norman Nunley, Jr wrote:
 There's a rules grammar in
 http://svn.openfoundry.org/pugs/misc/JavaScript-FrontEnd/Grammar.pm
 
 When I last attempted to compile it with PGE, it gave up the ghost in  
 the character class definitions.

Wow, thanks for the update.  PGE seems to be having trouble with
the <-xyz> rules, which are currently unimplemented.  But the
grammar is also using incorrect regex syntax -- the statements
like:

rule no_LineTerminator_here {
  [ <ws> & <-<LineTerminator>>*? ]
}

rule USP  { <<Zs>-<TAB>-<VT>-<FF>-<SP>-<NBSP>> }

need to eliminate the inner angles, as in:

rule no_LineTerminator_here {
  [ <ws> & <-LineTerminator>*? ]
}

rule USP  { <+Zs-TAB-VT-FF-SP-NBSP> }

But I think the no_LineTerminator_here rule probably needs
to be rewritten altogether to avoid the & conjunction.

At any rate, this is a very useful start; I think it could
be updated quite quickly.  Thanks!

Pm




Re: HLL root globals and empty keys (was Re: test of get_namespace opcode)

2006-07-10 Thread Patrick R. Michaud
On Mon, Jul 10, 2006 at 02:53:15PM -0700, Chip Salzenberg wrote:
 On Mon, Jul 10, 2006 at 03:23:56PM -0500, Patrick R. Michaud wrote:
  On Sat, Jul 08, 2006 at 01:57:58PM -0700, Chip Salzenberg wrote:
   Relative is the usual opposite to absolute, but we have a three-way
   logic here, so opposites don't really work.  I think that "hll" is the
   best I can think of, and given the existing .HLL directive, its meaning
   is immediately clear:
.HLL 'perl5', perl5_group
.namespace ['Foo']
$P0 = get_global ['Bar'], 'x'   # ['perl5';'Foo';'Bar';'x']
$P0 = get_hll_global ['Corge'], 'x'   # ['perl5';'Corge';'x']
$P0 = get_abs_global ['parrot'], 'x'  # ['parrot';'x']
  
  What's the status on the above...has it been blessed/implemented yet?
  This looks to me like exactly what is needed/desired for the various 
  HLL's I'm working with.
 
 Allison has blessed it except for the detail of the _hll_ in the HLL
 selection.  I haven't started implementing it yet, though nothing stands
 in my way technically.
 
 I've suggested that get_namespace follow exactly the same pattern, but
 so far she hasn't commented on that suggestion at all.

I really like both of these suggestions.  We also noted on #parrot that
get_hll_global would really simplify things for the Tcl folks, who
currently go through a macro to achieve the same effect.

 BTW I expect find_global to keep working for a good while.  The only thing
 that may change incompatibly in _any_ of this is the meaning of:
 
 PMC = get_namespace KEY,STR
 
 which currently starts from the HLL root but which I'm proposing should
 start at the current namespace.  *If* that additional proposal goes forward,
 any place you currently have the above, you would just change it to:
 
 PMC = get_hll_namespace KEY,STR

I'm not currently using get_namespace in any form, so I have no problem
with this switch.  

FWIW, a quick grep of the parrot tree seems to indicate that the only
places using get_namespace outside of the Parrot sources and tests
are in languages/dotnet (12 occurrences) and languages/tcl (2 occurrences).

Pm


OSCON hackathon

2006-07-10 Thread Patrick R. Michaud
For those who are interested in doing hackathoning at OSCON,
we're currently planning to do things on Sunday the 23rd.
I'll see if I can find a designated place for us to meet
and work.

However, for those who cannot make it on Sunday, I notice that
Monday and Tuesday at OSCON are primarily dedicated to
tutorial sessions, so people arriving after Sunday and/or
not attending or presenting tutorials can perhaps continue
hacking activities on those days...?

Pm


On Mon, Jul 10, 2006 at 03:59:23PM -0700, Darren Duncan wrote:
 At 2:24 PM -0500 7/10/06, Andy Lester wrote:
 On Jul 10, 2006, at 12:39 PM, Patrick R. Michaud wrote:
 
 Yes, I'm now targeting any hackathoning in Portland to occur on
 the Sunday before OSCON instead of the Saturday after.
 
 I'll be in Monday afternoon and leaving Friday afternoon so nyeah!
 
 If a virgin car-pool arrangement works out, I will be travelling to 
 OSCON (for just the no-cost hallway track) on Sunday the 23rd (so 
 maybe catch the end of a sunday event) and back on Friday the 28th. 
 I mainly expect to do my hackathoning Monday to Thursday, with 
 whoever's there. -- Darren Duncan


Re: Ruby on Parrot

2006-07-12 Thread Patrick R. Michaud
On Tue, Jul 11, 2006 at 02:41:19PM -0600, Kevin Tew wrote:
 It parses my simple puts.rb example, but parse time is really slow..  2 
 minutes.
 I'm sure I've made some dumb grammar mistakes that are slowing it down.

Well, the first thing to note is that subrule calls can be comparatively
slow, so I think you might get a huge improvement by eliminating
the <sp> subrule from 

token ws {[<sp>|[\t]]*}

resulting in

token ws { <[ \t]>* }

(Also, <sp> is a capturing subrule, so that means a separate Match 
object is being created and stored for every space encountered
in the source program.  In such cases <?sp> might be better.)

In a similar vein, I think that a rule such as

rule statement {
  <ALIAS> <fitem> <fitem>
  |<ALIAS> <global_variable> [<global_variable>|<back_reference>]
  |<UNDEF> <undef_list>
  |<statement2> [<IF> |<UNLESS> |<WHILE> |<UNTIL>] <expression_value>
  |<statement2> <RESCUE> <statement>
  |<BEGIN> \{ <compound_statement> \}
  |<END> \{ <compound_statement> \}
  |<command_call>
  |<statement2>
}

may be quite a bit slower than the more direct

rule statement {
  alias <fitem> <fitem>
  |alias <global_variable> [<global_variable>|<back_reference>]
  |undef <undef_list>
  |<statement2> [if|unless|while|until] <expression_value>
  |<statement2> rescue <statement>
  |begin \{ <compound_statement> \}
  |end \{ <compound_statement> \}
  |<command_call>
  |<statement2>
}

but I haven't tested this at all to know if the difference
in speed is significant.  I do know that the regex engine will
have more optimization possibilities with the second form than
with the first.  (If one stylistically prefers the keyword tokens 
not appear as barewords in the rule, then 'alias', 'undef',
etc. work equally well for constant literals.)

It's also probably worthwhile to avoid backtracking and re-parsing 
complex subrules such as statement2 above.  In the above, a plain
statement2 w/o if/unless/while/until/rescue ends up being parsed
three separate times before the rule succeeds.  Better might be:

rule statement {
  |alias <fitem> <fitem>
  |alias <global_variable> [<global_variable>|<back_reference>]
  |undef <undef_list>
  |begin \{ <compound_statement> \}
  |end \{ <compound_statement> \}
  |<statement2> [ [if|unless|while|until] <expression_value>
                | rescue <statement>
                ]?
  |<command_call>
}

(In fact, looking at the grammar I'm not sure that command_call
is really needed, since statement2 already covers that.  But I'm
not a Ruby expert.)

Anyway, let me know if any of the above suggestions make sense
or provide any form of improvement in parsing speed.

Thanks!

Pm
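
The re-parsing cost mentioned in this message can be made visible with a toy ordered-choice parser that counts how often statement2 is attempted. This is a simplified model, not PGE: the combinators seq, alt, opt, and literal are invented here, tokens are plain strings, and statement2 is pretended to match a single "stmt" token.

```python
calls = {"statement2": 0}


def statement2(tokens, pos):
    """Pretend statement2 matches exactly one token named 'stmt'."""
    calls["statement2"] += 1
    if pos < len(tokens) and tokens[pos] == "stmt":
        return pos + 1
    return None


def literal(word):
    def match(tokens, pos):
        if pos < len(tokens) and tokens[pos] == word:
            return pos + 1
        return None
    return match


def seq(*parts):
    def match(tokens, pos):
        for part in parts:
            pos = part(tokens, pos)
            if pos is None:
                return None
        return pos
    return match


def alt(*choices):
    def match(tokens, pos):
        for choice in choices:
            end = choice(tokens, pos)
            if end is not None:
                return end
        return None
    return match


def opt(part):
    def match(tokens, pos):
        end = part(tokens, pos)
        return pos if end is None else end
    return match


command_call = literal("call")

# Shape of the original rule: statement2 starts three separate alternatives.
naive = alt(
    seq(statement2, literal("if"), literal("expr")),
    seq(statement2, literal("rescue"), literal("stmt")),
    command_call,
    statement2,
)

# Shape of the suggested rule: statement2 once, then an optional tail.
factored = alt(
    seq(statement2, opt(alt(seq(literal("if"), literal("expr")),
                            seq(literal("rescue"), literal("stmt"))))),
    command_call,
)


def count_statement2_calls(rule, tokens):
    """Return (end position, number of statement2 attempts) for one parse."""
    calls["statement2"] = 0
    end = rule(tokens, 0)
    return end, calls["statement2"]


naive_result = count_statement2_calls(naive, ["stmt"])
factored_result = count_statement2_calls(factored, ["stmt"])
print(naive_result, factored_result)
```

For a plain statement with no modifier, the first shape attempts statement2 three times and the factored shape once, matching the count given in the message. With a real, expensive statement2 each extra attempt is a full re-parse.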



Re: [TODO] Implement .loadlib pragma in IMCC

2006-07-12 Thread Patrick R. Michaud
On Wed, Jul 12, 2006 at 10:25:43AM -0700, Allison Randal wrote:
 It occurs to me, after thinking about it overnight, that the .loadlib 
 directive shouldn't operate at :immediate time, but at :init time, 
 because it's more common to want a library to load when you run the code 
 than to load only when you compile the code.

This might not seem totally related (but it is somewhat related)...

The perl6 compiler has a custom string type, currently called 
Perl6Str.  What's the canonically correct mechanism for creating 
an object of that type?

$P0 = new 'Perl6Str'
$P0 = new .Perl6Str
$P0 = new [ 'Perl6Str' ]

At different stages of Parrot development I've seen different 
answers to this question, so it'd be helpful to know what's correct.

(Also, if the answer is somehow different for Parrot's built-in 
types, such as Undef or Integer, I'd like to know that.)

Pm


Re: Creating a New Object (was Re: [TODO] Implement .loadlib pragma in IMCC)

2006-07-12 Thread Patrick R. Michaud
On Wed, Jul 12, 2006 at 11:36:56AM -0700, chromatic wrote:
 On Wednesday 12 July 2006 11:27, Patrick R. Michaud wrote:
  The perl6 compiler has a custom string type, currently called
  Perl6Str.  What's the canonically correct mechanism for creating
  an object of that type?
 
  $P0 = new 'Perl6Str'
  $P0 = new .Perl6Str
  $P0 = new [ 'Perl6Str' ]
 
 I tend to use:
 
   .local int str_type
   str_type = find_type [ 'Perl6Str' ]
 
   .local pmc p6str
   p6str= new str_type


Along similar lines...

   - If another HLL wants to create a Perl6Str, how does it do it?
   - If another HLL wants to create a subclass of Perl6Str...?

Pm


Re: [TODO] Implement .loadlib pragma in IMCC

2006-07-12 Thread Patrick R. Michaud
On Wed, Jul 12, 2006 at 12:18:51PM -0700, Allison Randal wrote:
 Leopold Toetsch wrote:
 
 Well, there was already one very legitimate usage of compile time
 loadlib, which is now using C<.loadlib> for that:
 
 We certainly need both compile-time and runtime loading of libraries. 
 So, it's just a question of which syntax to use for which case.
 
 chromatic suggests .include for "load this library at compile and run 
 time". The .include directive is currently being used to mean "inline 
 the entire source code for this file here". But, I've always thought of 
 that as a hack we put in before we had library loading working. Any 
 thoughts?

I think I'm confused by or totally misunderstanding the proposal.  
I think we have two very different sorts of library at play here:  
dynamic libraries (with a .so extension on my system), and libraries
of parrot code (with .pir, .pbc, and .pasm extensions).

IIUC, the loadlib opcode (and the new .loadlib directive) are used
strictly for dynamic libraries -- on my system those are files with .so
extensions.  loadlib and .loadlib aren't used for .pbc files...  
that's the domain of the load_bytecode opcode.  load_bytecode can
be used for loading .pbc/.pir files at runtime, at load-time via :load,
or at compile-time with :immediate.

.include is currently compile-time only, and only works with .pir/.pasm
files (i.e., one cannot include a .pbc).  In addition, any .included
source honors the current .HLL and .namespace settings, which isn't true
for files (.pir/.pbc) that are obtained via the load_bytecode opcode.

So, if the proposal is that .include means load a .pbc/.pir library 
whenever the including file is compiled or loaded in a manner analogous
to load_bytecode, then I'm still wanting a way to get source files 
that are compile-only and honor any .namespace directives.

But as I said, I think I must be misunderstanding what is being said,
so feel free to re-explain or correct my misunderstanding.

Pm


Re: [TODO] Implement .loadlib pragma in IMCC

2006-07-12 Thread Patrick R. Michaud
On Wed, Jul 12, 2006 at 03:51:53PM -0400, Bob Rogers wrote:
From: Leopold Toetsch [EMAIL PROTECTED]
Date: Wed, 12 Jul 2006 21:15:44 +0200
 
On Wed, Jul 12, 2006 at 01:27:24PM -0500, Patrick R. Michaud wrote:
 The perl6 compiler has a custom string type, currently called 
 Perl6Str.  What's the canonically correct mechanism for creating 
 an object of that type?
 
 $P0 = new 'Perl6Str'
 $P0 = new .Perl6Str
 $P0 = new [ 'Perl6Str' ]
 
2) only works, *if* the lib, which defines that type is already
   loaded (via :immediate/loadlib or .loadlib), because it's
   translated to new_p_ic, i.e. the type name is converted to
   a type number at compile time, which speeds up run time
   object creation.
 
 So the type is bound to a number in the .pbc?  Isn't this dangerous for
 types that are not built in?  Couldn't this number mean something
 different if libraries happen to get loaded in a different order?

IIUC, the type is bound to a number in the .pbc only for the second 
form (.Perl6Str).  And yes, it is dangerous for the non-built-in types, 
which is why I think the note in DEPRECATED.pod is likewise dangerous:

=item Class name IDs
... will require a dot in front
  $P0 = new Integer   => $P0 = new .Integer

AFAICT, the only safe form for the non-builtin types is to use
a string, a key, or the separate find_type lookup...which is what
prompted my original question in this thread about which form 
is canonically (and operationally) correct.

Pm
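
The hazard discussed in this thread (a type number fixed at compile time meaning something else at run time) can be shown with a toy registry that hands out numbers in load order. This is a model under stated assumptions: TypeRegistry, the BUILTINS list, and the library name TclString are hypothetical and do not describe Parrot's real type tables.

```python
class TypeRegistry:
    """Assigns type numbers in registration order, after the built-ins."""

    def __init__(self, builtins):
        self.names = list(builtins)

    def register(self, name):
        self.names.append(name)
        return len(self.names) - 1

    def find_type(self, name):
        """Name-based lookup, the analogue of resolving the type at run time."""
        return self.names.index(name)

    def new(self, type_number):
        """Return the name of the type that would be instantiated."""
        return self.names[type_number]


BUILTINS = ["Undef", "Integer", "String"]

# At compile time only the Perl6Str library is loaded.
compile_time = TypeRegistry(BUILTINS)
compile_time.register("Perl6Str")
baked_number = compile_time.find_type("Perl6Str")  # frozen into the bytecode

# At run time a hypothetical other library happens to load first.
run_time = TypeRegistry(BUILTINS)
run_time.register("TclString")
run_time.register("Perl6Str")

by_number = run_time.new(baked_number)
by_name = run_time.new(run_time.find_type("Perl6Str"))
print(by_number, by_name)
```

In the model the frozen number picks up whichever type now occupies that slot, while the name-based lookup still finds Perl6Str. Built-in numbers stay stable because they are registered first in both runs.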


Re: [perl #39809] PGE crash on parrot;PGE::Exp::Quant;reduce

2006-07-12 Thread Patrick R. Michaud
On Wed, Jul 12, 2006 at 08:04:01PM -0700, Chris Dolan wrote:
 A simple token containing :i causes PGE to crash with an attempted
 method call on Undef.
 
 Steps to reproduce:
1) Create a grammar file called foo.pg that has one line:
   token foo { :i a }

As I read S05, a modifier has to occur at the *very* beginning
of a regex (or group) in order to work.  In other words, no whitespace
before modifiers in a regex (because whitespace may have some
other meta-syntactic meaning with :sigspace).  Thus

token foo {:i a }

works, while

token foo { :i a }

is an error, since the ':' acts as a cut operator that doesn't
have anything to cut.

But I admit that since we've gone to regex/token/rule, then perhaps
leading whitespace prior to a modifier should be ignored.  That
probably needs a ruling from p6l or @Larry.

Pm


suggestions for new pdd21

2006-07-14 Thread Patrick R. Michaud
Allison just updated pdd21, it looks great!  Here's a first
cut at some suggested changes and wordsmithing to the text.
Feel free to adopt, ignore, or discuss any of the suggestions
below as you see fit.


: A non-nested namespace may appear in Parrot source code as the string
: C<"a"> or the key C<["a"]>.
:
: A nested namespace "b" inside the namespace "a" will appear as the key
: C<["a"; "b"]>.

Reads better if "non-nested" is removed from the first statement.

: =head2 Naming Conventions
: 
: There are many different ways to implement a namespace and Parrot's target
: languages display a wide variety of them.  By implementing an API and standard
: conventions, it should be possible to allow interoperability while still
: allowing each one to choose the best internal representation.

Rephrase the first sentence to:

   Parrot's target languages have a wide variety of namespace models.

: Each HLL must store public items in a namespace named with the lowercased
: name of the HLL.  This is the HLL root namespace.  For instance, Tcl's
: user-created namespaces should live in the C<tcl> namespace.  This
: eliminates any accidental collisions between languages.
: 
: This namespace must be stored at the first level in parrot's namespace
: hierarchy.  ...

Change "This namespace" to "An HLL root namespace" (avoid ambiguity).

: Each HLL must store implementation internals (private items) in a namespace
: named with an underscore and the lowercased name of the HLL.  For instance,
: Tcl's implementation internals should live in the C<_tcl> namespace.

I think this should read

Each HLL must store implementation internals (private items) in an
HLL namespace named with an underscore and the lowercased name of 
the HLL.  For instance, Tcl's implementation internals should 
live in the C<_tcl> HLL namespace.

: =item get_global
: 
: $P1 = $P2.get_global($S3)
: 
: Retrieve a global variable $P1 from the namespace $P2, with the name
: $S3.

What's the meaning of "global" in this context?  Some part of me
wants this to be simply get_symbol.  Or are we contrasting global
with lexical or private?  See also the get*_global and set*_global
opcodes below, which I think should be get_symbol and set_symbol.

: =item get_name
: 
: $P1 = $P2.get_name()
: 
: Gets the name of the namespace $P2 as an array of strings.  For example,
: if the current language is Perl 5 and the current Perl 5 namespace is
: Some::Module (that's Perl 5 syntax), then get_name() on that namespace
: returns an array of "perl5", "Some", "Module". 

Perhaps better written as:

For example, if $P2 is the Perl 5 Some::Module namespace within the 
Perl 5 HLL, then get_name() on $P2 returns an array of "perl5", 
"Some", "Module".

: =item get_namespace
: 
: $P1 = $P2.get_namespace($P3)
:  
: Ask the compiler $P2 to find its namespace which is named by the
: elements of the array in $P3.  Returns a namespace PMC on success and a
: null PMC on failure.  A null PMC or an empty array retrieves the HLL's
: base namespace.

Swap the order of the last two sentences, thus:

Ask the compiler $P2 to find its namespace which is named by the
elements of the array in $P3.  If $P3 is a null PMC or an empty
array, retrieves the base namespace for the HLL.  Returns a namespace
PMC on success and a null PMC on failure.

Since this is a method, it would also be nice if $P3 could be
an optional parameter to obtain the base namespace for the HLL.

: =item load_library
: 
:   $P1.load_library($P2, $P3)
:
: Ask this compiler to load a library/module named by the elements of the array
: in $P2, with optional control information in $P3.
: [...]
: The meaning of $P3 is compiler-specific.  The only universal legal value is
: Null, which requests a "normal" load.  The meaning of "normal" varies, but
: the ideal would be to perform only the minimal actions required.

Since we have slurpy named parameters in Parrot, why not simply leave $P3
off and use (optional) named parameters here to specify options?

: =item get_global
: 
: $P1 = get_global $S2
: $P1 = get_hll_global $S2
: $P1 = get_root_global $S2
: 
: Retrieve the symbol named $S2 in the current namespace, HLL root
: namespace, or true root namespace.

Again, perhaps get_symbol might be more appropriate than
get_global, especially since the description itself says
"Retrieve the symbol".

Thanks,

Pm


Re: suggestions for new pdd21

2006-07-17 Thread Patrick R. Michaud
On Mon, Jul 17, 2006 at 09:52:35AM -0700, Allison Randal wrote:
 : =item get_global
 : 
 : $P1 = $P2.get_global($S3)
 : 
 : Retrieve a global variable $P1 from the namespace $P2, with the name
 : $S3.
 
 What's the meaning of "global" in this context?  Some part of me
 wants this to be simply get_symbol.  Or are we contrasting global
 with lexical or private?  See also the get*_global and set*_global
 opcodes below, which I think should be get_symbol and set_symbol.
 
 I was also leaning in that direction, but the problem is that "symbol" 
 can also be a lexical symbol. Here we're specifically accessing symbols 
 from the global symbol table (the global tree of namespaces), so 
 "global" is the simplest way to identify it.

Another possibility is to take a cue from the find_name opcode
(which searches lexically, in namespaces, and global) and use get_name
and set_name.  But now I think I'm bikeshedding this one, so I'll
be quiet.  

(It would be easier to avoid bikeshedding on opcode names if Parrot 
didn't already have so many naming systems to choose from.  :-)

 Since we have slurpy named parameters in Parrot, why not simply leave $P3
 off and use (optional) named parameters here to specify options?
 
 Chip/Leo, do the various named parameter passing techniques work on 
 low-level PMC's defined in C?

Oh, I had forgotten that little detail.  Well, never mind.  :-)

Pm


Re: FAQ Questions (WAS: ICU advantages? was Re: Problems Installing Parrot)

2006-07-21 Thread Patrick R. Michaud
On Fri, Jul 21, 2006 at 02:12:57PM -0400, Mr. Shawn H. Corey wrote:
 Chris Dolan wrote:
  1. Do I need root privileges to install Parrot? Do I need it for Cage
  Cleaners?
 
 You don't even need root at all.  You can build in a local directory and
 not install.

In fact, for those who are developing parrot the current
recommendation is to *not* install it, as it often produces later
collisions between installed and local copies of parrot.

Pm


Re: t/compilers/pge/p6regex/01-regex.t test 118 needs ICU?

2006-07-23 Thread Patrick R. Michaud
On Sat, Jul 22, 2006 at 08:54:36PM -0400, Bob Rogers wrote:
Content-Description: message body text
After building Parrot without ICU, 01-regex.t test 118 fails as
 follows:
   t/compilers/pge/p6regex/01-regex.
   # Failed test (t/compilers/pge/p6regex/01-regex.t at line 59)
   #  got: 'no ICU lib loaded
 The attached patch seems to take care of this.  Is this a reasonable way
 to do it?

Looks perfect.  Applied (r13457), thanks.

Pm


Re: [perl #39930] AutoReply: [BUG] concat unicode+iso-8859-1 doesn't work w/o ICU

2006-07-24 Thread Patrick R. Michaud
On Mon, Jul 24, 2006 at 03:57:25PM -0700, Pm wrote:
 Found this bug while doing stuff --without-icu today...
 
 Concatenation of a unicode string with an ASCII string
 works even if ICU isn't available.
 
 Concatenation of a unicode string with a Unicode string
 works even if ICU isn't available.
 
 Concatenation of a unicode string with an iso-8859-1 string
 fails with "no ICU lib loaded" if ICU isn't available.

On a possibly related note: for systems that *do* have ICU,
concatenating a unicode: string with an ascii:  or unicode: 
string appears to result in a different encoding than concatenating
with iso-8859-1.  Thus:

$S0 = unicode:"A"
$S1 = ascii:"B"
$S2 = concat $S0, $S1
print $S2            # outputs "AB"

$S0 = unicode:"A"
$S1 = unicode:"B"
$S2 = concat $S0, $S1
print $S2            # outputs "AB"

$S0 = unicode:"A"
$S1 = iso-8859-1:"B"
$S2 = concat $S0, $S1
print $S2            # outputs "A\x00B\x00"

This particular behavior isn't necessarily a bug, but it is
at least somewhat unexpected.

Pm
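
The A\x00B\x00 output has the byte shape of a two-byte-per-character little-endian encoding. As an observation about the bytes only (not a claim about what Parrot or ICU does internally), Python shows the same pattern when "AB" is encoded as UTF-16LE, while UTF-8 gives the plain two-byte form:

```python
# Same two characters, two different byte layouts.
utf8_bytes = "AB".encode("utf-8")
utf16le_bytes = "AB".encode("utf-16-le")

print(utf8_bytes)
print(utf16le_bytes)
```

So one plausible reading is that the third concat leaves the result in a wider encoding than the first two, and print writes those bytes out unchanged.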


Re: [perl #39926] :init attribute (was Re: Implement .loadlib pragma in IMCC)

2006-07-24 Thread Patrick R. Michaud
On Mon, Jul 24, 2006 at 01:03:41PM -0700, Chip Salzenberg wrote:
 On Wed, Jul 12, 2006 at 10:25:43AM -0700, Allison Randal wrote:
  It occurs to me, after thinking about it overnight, that the .loadlib 
  directive shouldn't operate at :immediate time, but at :init time, 
  because it's more common to want a library to load when you run the code 
  than to load only when you compile the code.
  
  Which leaves us with :immediate for the rare cases when you really want 
  to load a library at compile time.
 
 Oddly enough, while :init is obviously a good thing, it does not exist.

However, as discussed briefly at Sunday's hackathon, it would be
really nice if we had some sort of pragma (I propose :init) that 
indicates a subroutine is to be executed whenever the sub is loaded,
whether that occurs via a load_bytecode or because a module is 
being run directly from parrot.

Background:  Currently we have :main, :load, and :immediate pragmas.
A sub marked :main is executed when a .pir or .pbc file is called
directly from the parrot command line, but is not automatically
called when that file is obtained via load_bytecode.  A sub marked
:load is executed when a .pir or .pbc file is called via load_bytecode,
but not when the .pir/.pbc is loaded from the command line.

I'd like there to be an :init pragma to mark subs that are to be
executed anytime the file is loaded.  In the case of loading from
the command line, the :init subs should be executed prior to the
:main sub.

(Currently the .pir/.pbc files I write work around this by explicitly
calling the :load subs from the :main one.  While this is workable
when there's just one such sub, it requires a bit more work when the
.pir or .pbc is being produced from several (often generated) sources.)

Note that :init as I've proposed is not the same as writing 
:main :load, since only one :main sub is executed (whereas there could
be multiple :init subs).
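
The proposed behavior can be sketched as a small Python model. This is not Parrot code: the Sub class and both functions are hypothetical, and the relative order of :load and :init subs under load_bytecode (definition order here) is an assumption, since the proposal above only fixes the order of :init relative to :main.

```python
class Sub:
    """A named sub carrying a set of pragma flags such as 'main' or 'init'."""
    def __init__(self, name, *flags):
        self.name = name
        self.flags = set(flags)

def run_from_command_line(subs):
    """Names of the subs executed when the file is run directly."""
    order = [s.name for s in subs if "init" in s.flags]  # every :init sub first
    mains = [s.name for s in subs if "main" in s.flags]
    return order + mains[:1]                             # only one :main runs

def load_bytecode(subs):
    """Names of the subs executed when the file is loaded by another file."""
    return [s.name for s in subs if s.flags & {"load", "init"}]

subs = [
    Sub("setup_a", "init"),
    Sub("onload", "load"),
    Sub("main", "main"),
    Sub("setup_b", "init"),
]

print(run_from_command_line(subs))
print(load_bytecode(subs))
```

The point of the model is that several generated sources can each contribute their own :init sub without any of them needing to know about the :main sub.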

Pm


Re: PGE/TGE and the future.

2006-08-03 Thread Patrick R. Michaud
On Fri, Jul 28, 2006 at 08:46:50AM -0600, Kevin Tew wrote:
 I'm seeking information regarding TGE's design goals, aspirations, 
 future plans, etc.
 
 I see that Perl6 implements its own version of PAST and POST nodes.
 Is it possible to share basic PAST and POST nodes and extend them for
 particular HLL needs?
 I know that although different HLLs share a lot of the same semantics, they
 also have huge differences.

I'm in the process of unifying the various PAST implementations into a
common form that can (hopefully) be used by many languages.  The basics
will be available, but there will also be a way to add extensions or
override behaviors for various HLLs.

 If there is a common set of goals or objectives for compiler tools that 
 Cardinal can contribute to or utilize, I would like to do so.
 This is an offer to help build and test the compiler tools. :)
 
 I just want to start a discussion about compiler tools and learn what
 information I've missed.

Excellent!  At the moment I'm working on updating the compiler tools
draft into an official PDD for comments and suggestions, as well as
putting together some code that demonstrates the pieces.  Look for this in
the next five days or so!

Pm


Re: [perl #40002] TGE Refactor / Compiler Tools Object

2006-08-03 Thread Patrick R. Michaud
On Fri, Jul 28, 2006 at 08:43:27AM -0700, Kevin Tew wrote:
 
 What bullet items will the TGE refactor consist of?

Keeping in mind that the TGE refactor really also includes refactoring
PAST and POST, we have...

 * better command-line arg processor, like getopts, but returning a capture

Yes.

 * optimization levels based on level, group related optimizations which 
 may occur during different transform steps

Eventually this will happen, but I don't know if it'll be in the first
round of refactoring.

 * support for languages other than PIR
 * generic PAST/POST nodes for short-circuit ands and ors
 * basic conditional and case constructs, there exists a common semantic 
 for if/else, it should be represented in a common way in PAST
 * for and while loop generation

Yes, yes, yes, and yes.

 * label management.
 * scope management.

Scope management definitely in this first refactor; label management may
wait slightly (or I'll just invite someone else to do it :-).

 * #38761 [TODO] TGE, precompile more
   http://rt.perl.org/rt3/Ticket/Display.html?id=38761

I'll wait and see on this one.

 * #39831 TGE - Needs more diagnostics on failure.
   http://rt.perl.org/rt3/Ticket/Display.html?id=39831

Definitely.

 * #39854 [PATCH] adds preamble section to tge grammar to allow for
   includes and global defines
   http://rt.perl.org/rt3/Ticket/Display.html?id=39854

I'm working this one out.  There *will* be a way to set pragmas (e.g.,
so that the :language(...) modifier isn't specified on every transform).

 * #39897 [PATCH] TGE - add basic syntax error
   http://rt.perl.org/rt3/Ticket/Display.html?id=39897
 * #39905 [TODO] TGE - line number reporting.
   http://rt.perl.org/rt3/Ticket/Display.html?id=39905

We'll definitely add better line number reporting.

 * #39913 [BUG] TGE - Can't use } in the transform definitions.
   http://rt.perl.org/rt3/Ticket/Display.html?id=39913

In discussions with Allison at OSCON, I noted that we needed to reconsider
the syntax slightly.  We don't want TGE to have to know how to parse every
language, and it may not be reasonable to expect every compiler to expose
a parser.  So, if we're going to allow other languages in the transform
bodies, we may want a hereis or podly {{...}}, {{{...}}} syntax to
delimit the transform bodies.  At the moment I'm leaning towards the {{...}}
form, if only because PGE is already using it.
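
To make the appeal of the {{...}} form concrete, here is a rough Python sketch of pulling out a transform body by delimiter alone, with no knowledge of the language inside it. The function name extract_body is hypothetical and this is not how TGE is implemented; it only shows that a lone } inside the body stops being a problem once the delimiter is a run of braces.

```python
def extract_body(source, pos=0):
    """Return (body, next_pos) for a body opened by a run of 2+ '{' at pos.

    The body ends at the first run of '}' of the same length, so the text in
    between is never parsed and may be written in any language.
    """
    n = 0
    while pos + n < len(source) and source[pos + n] == "{":
        n += 1
    if n < 2:
        raise ValueError("expected an opening delimiter such as '{{'")
    closer = "}" * n
    end = source.find(closer, pos + n)
    if end < 0:
        raise ValueError("missing closing delimiter " + closer)
    return source[pos + n:end], end + n

two = "{{ if (x) { y } }}"
three = "{{{ a }} b }}}"
print(extract_body(two))
print(extract_body(three))
```

A body that itself contains }} needs the longer {{{...}}} form, which is exactly the case the second call exercises.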

Pm


Re: [perl #39905] [TODO] TGE - line number reporting.

2006-08-03 Thread Patrick R. Michaud
On Fri, Jul 21, 2006 at 03:03:07PM -0700, Will Coleda wrote:
 # New Ticket Created by  Will Coleda 
 # Please include the string:  [perl #39905]
 # in the subject line of all future correspondence about this issue. 
 # URL: http://rt.perl.org/rt3/Ticket/Display.html?id=39905 
 
 
 Once a .tg file is compiled to a .pir file, any errors in the  
 embedded PIR are reported against the line number
 of the generated PIR file.
 
 Instead, the line numbers should be reported against the original .tg  
 file.

Is there an imcc pragma for setting the line number to be reported
for an error?  Or what's the general approach to getting the generated
PIR file to report the correct line number?

Pm


Re: [perl #40002] TGE Refactor / Compiler Tools Object

2006-08-03 Thread Patrick R. Michaud
On Thu, Aug 03, 2006 at 11:21:57AM -0600, Kevin Tew wrote:
 Patrick R. Michaud via RT wrote:
 
 In discussions with Allison at OSCON, I noted that we needed to reconsider
 the syntax slightly.  We don't want TGE to have to know how to parse every
 language, and it may not be reasonable to expect every compiler to expose
 a parser.  So, if we're going to allow other languages in the transform
 bodies, we may want a hereis or podly {{...}}, {{{...}}} syntax to
 delimit the transform bodies.  At the moment I'm leaning towards the 
 {{...}}
 form, if only because PGE is already using it.
 
 Pm
 
   
 How about here doc style?
 This was mentioned on IRC by either Coke or Particle; I had the same
 idea.

Sorry, heredoc is what I meant by hereis above.  But yes, I'm thinking we'll
allow some form of heredoc.

Actually, {{, {{{, ... may end up simply being shortcuts that say "heredoc
with }}, }}}, ... as the end marker".

Pm


Re: [perl #40069] [PGE] value can't be used as a rule name.

2006-08-04 Thread Patrick R. Michaud
On Fri, Aug 04, 2006 at 01:50:19AM -0700, ambs @ cpan. org wrote:
 # New Ticket Created by  [EMAIL PROTECTED] 
 # Please include the string:  [perl #40069]
 # in the subject line of all future correspondence about this issue. 
 # URL: http://rt.perl.org/rt3/Ticket/Display.html?id=40069 
 
 If we define a rule or token named 'value', things do not work. The PGE
 compiler works, but then the grammar does not run.
 
 Probably this happens with other names as well. This should not happen.

I know what's happening, but I'm not sure how to work around it.
Perhaps MMD would resolve the problem.

The problem:  Match objects have methods defined on them such as
.from(), .to(), and .value().  As a result, defining a subrule
named 'value' in a grammar is probably overriding the method of the
same name defined for Match objects.

The quick conclusion is that at the moment it's difficult to have 
rules named 'value', 'from', 'to', or any of the other predefined 
methods on Match objects.  
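
The collision is the ordinary "subclass method hides base-class method" problem. A minimal Python analogy, with hypothetical class names and no claim about how PGE lays out its classes internally:

```python
class Match:
    """Stand-in for a match object with a built-in value() accessor."""
    def __init__(self, text):
        self.text = text

    def value(self):
        return self.text

class MyGrammar(Match):
    # A rule that happens to be named 'value'.  It takes the target string,
    # so it replaces Match.value on every MyGrammar match object.
    def value(self, target):
        return MyGrammar(target)

m = MyGrammar("42")
try:
    m.value()              # engine code expecting the built-in accessor
    outcome = "ok"
except TypeError:
    outcome = "shadowed"

print(outcome)
```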

However, I wonder if MMD can resolve this, because then the compiler 
would presumably dispatch to the correct method based on the argument 
types.

At any rate, this is definitely something we should address; it's
likely to wait for some other refactors to take place.

Thanks!

Pm


Re: Grammar question

2006-08-11 Thread Patrick R. Michaud
On Fri, Aug 11, 2006 at 04:43:55PM +0100, Alberto Simões wrote:
 Hi
 
 Today in #parrot a question was asked:
 
   rule foo { <bar>* }
 
 should be considered:
 
   rule foo { <?ws><bar>*<?ws> }
 
 or
 
   rule foo { <?ws>(<bar><?ws>)* }

In the past we've always gone with the former.

If <bar> is also a rule, then it presumably
eats its own whitespace.  If one wants
to grab the whitespace as well, as in the
latter example, it's not too difficult to write

rule foo { [<bar> ]* }
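
The difference between the two readings shows up as soon as the repeated item does not consume its own trailing whitespace. Here is an approximation using Python's re module, where \s* is only a rough stand-in for the whitespace subrule and a literal "bar" stands in for a token-like subrule:

```python
import re

WS = r"\s*"  # rough stand-in for the whitespace subrule

former = re.compile(WS + r"(?:bar)*" + WS)        # whitespace only at the ends
latter = re.compile(WS + r"(?:bar" + WS + r")*")  # whitespace after each item

former_ok = former.fullmatch("bar bar") is not None
latter_ok = latter.fullmatch("bar bar") is not None

print(former_ok, latter_ok)
```

With the former reading, the space between the two items has nowhere to go unless the item itself eats it, which is the assumption stated above for the rule case.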

Pm


Re: [perl #40187] [PATCH] PGE simple grammar test file

2006-08-17 Thread Patrick R. Michaud
On Thu, Aug 17, 2006 at 01:22:11AM -0700, Nuno Carvalho wrote:
 After some more discussion on #parrot I've rewritten a very simple
 test file to evaluate some very simple PGE grammars. [...]

Applied, thanks!

Pm



Re: [perl #40178] None Must Die

2006-08-17 Thread Patrick R. Michaud
On Thu, Aug 17, 2006 at 09:12:07PM +0200, Leopold Toetsch wrote:
 A related change:
 
   $S0 = hsh['no_such_key']
 
 used to return an empty STRING*, it'll soon return a NULL STRING*. 

Just a note (to copy from irc #parrot) -- this will cause a number
of things in PGE and perl6 to break, as I rely on the "return empty string"
behavior.  In particular, changing it to return NULL will mean
in many places that I will have to replace single-line key lookups 

$S0 = hsh['key']

with

$S0 = hsh['key']
unless null $S0 goto label
$S0 = ''
  label:


I don't mind if 

$P0 = hsh['no_such_key']   (get_pmc_keyed)

returns NULL, but it would be nice if 

$S0 = hsh['no_such_key']   (get_string_keyed)

could continue to return a string (''), the same way that 

$I0 = hsh['no_such_key']   (get_int_keyed)
$N0 = hsh['no_such_key']   (get_number_keyed)

still return ints (0) and nums (0.0).
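
The consistency being asked for can be modeled in a few lines of Python. The Hash class below is a toy, None plays the role of NULL, and the method names simply mirror the vtable names mentioned above:

```python
class Hash:
    """Toy hash whose typed getters always return a value of their own type."""
    def __init__(self, **items):
        self._items = dict(items)

    def get_pmc_keyed(self, key):
        return self._items.get(key)              # None stands in for NULL

    def get_int_keyed(self, key):
        return int(self._items.get(key, 0))

    def get_number_keyed(self, key):
        return float(self._items.get(key, 0.0))

    def get_string_keyed(self, key):
        return str(self._items.get(key, ""))

hsh = Hash(answer=42)
print(repr(hsh.get_string_keyed("no_such_key")))
print(hsh.get_int_keyed("no_such_key"))
```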

In the meantime, I'm checking the PGE code to see how many 
lookups will actually be affected.  If it's a small number,
I'll withdraw my objection.

Pm


Re: [perl #40178] None Must Die

2006-08-17 Thread Patrick R. Michaud
On Thu, Aug 17, 2006 at 12:55:54PM -0700, Chip Salzenberg wrote:
 {DESIGNER ALERT - Allison, what think?}
 
 On Thu, Aug 17, 2006 at 12:31:11PM -0700, Patrick R. Michaud via RT wrote:
  I rely on the return empty string behavior.  In particular, changing it
  to return NULL will mean in many places that I will have to replace
  single-line key lookups
  
  $S0 = hsh['key']
  
  with
  
  $S0 = hsh['key']
  unless null $S0 goto label
  $S0 = ''
label:
 
 Indeed.  I think we can reduce the pain of dealing with this to the point
 where you'll hardly feel it.  For example, I really like Python's lookup
 semantic where you can provide the default value on the call.

FWIW, I reviewed the PGE code and found only five cases where I think
the code will have to change if hsh['nokey'] returns NULL, so I have to
substantially reduce the weight of my previous objection.  (I still 
think it's nicer if get_string_keyed always returns a string, the same 
as get_int_keyed and get_number_keyed always return an int or num, but
it's not a major pain for me if it doesn't.)

 How about a 'default' opcode that provides a value instead of null?  It
 would work for strings and PMCs.  Something like:
 
  $S0 = default hsh['key'], ''

I can think of a lot of cases where this would be really useful to me.
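
For comparison, here is roughly what the proposed opcode would do, written as a Python helper next to the built-in dict form that Chip alludes to. The helper name default just mirrors the proposal, and None again stands in for NULL:

```python
def default(value, fallback):
    """Return fallback only when value is None (standing in for NULL)."""
    return fallback if value is None else value

hsh = {"key": "found"}

a = default(hsh.get("key"), "")     # key present: value is kept
b = default(hsh.get("nokey"), "")   # key missing: fallback is used
c = hsh.get("nokey", "")            # Python's built-in lookup-with-default

print(repr(a), repr(b), repr(c))
```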

Pm


Re: [perl #40178] None Must Die

2006-08-17 Thread Patrick R. Michaud
On Thu, Aug 17, 2006 at 01:11:00PM -0700, jerry gay wrote:
 On 8/17/06, Chip Salzenberg [EMAIL PROTECTED] wrote:
  $S0 = default hsh['key'], ''
 [...]
  $P0 = new .Undef
  $P1 = default hsh['key1'], $P0
  $P1 = default hsh['key2'], $P0
  ...
 
 It would work without the lookups too:
 
  $S0 = default $S0, ''    # if $S0 is null, assign it ''
 
 what say?
 
 
 default is ugly. err is sexy.
 
 ## if $S0 is null, assign it ''
 #pasm
 err $S0, ''
 err $S1, $S2, ''
 
 #pir
 $S0 //= ''
 $S0 = err ''
 $S0 = err $S1, ''

There might be a cognitive dissonance here with err, since
C<err> in pasm/pir is testing for null, while C<err> in
perl6 tests for definedness.  While it doesn't much matter
for strings in the examples above, it might make a difference for

$S0 = err hsh['key1'], 'foo'
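
The distinction matters when the hash holds a value that exists but is undefined. A small Python sketch, where the Undef class and both function names are hypothetical and None stands in for NULL:

```python
class Undef:
    """Stand-in for a value that is present in the hash but undefined."""

def err_null(value, fallback):
    # Null test: only a missing value triggers the fallback.
    return fallback if value is None else value

def err_defined(value, fallback):
    # Definedness test: a missing value or an Undef triggers the fallback.
    undefined = value is None or isinstance(value, Undef)
    return fallback if undefined else value

stored = Undef()
hsh = {"key1": stored}

r_null = err_null(hsh.get("key1"), "foo")
r_defined = err_defined(hsh.get("key1"), "foo")

print(type(r_null).__name__, r_defined)
```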

Pm


Re: [perl #40210] [TODO] Provide a way for PGE's dump to go to string

2006-08-22 Thread Patrick R. Michaud
On Mon, Aug 21, 2006 at 07:16:46AM -0700, Will Coleda wrote:
 While the primary use of dump is for immediate debug output (and  
 therefore puts is ok), being able to get at the string it generates  
 is *very* useful for testing.

I've refactored the existing 'dump' method into separate
'dump_str' and 'dump' methods in runtime/parrot/library/PGE/Dumper.pir.
Available as of r14306.

Thanks!

Pm


Re: [perl #40319] [PATCH] PGE test file written in PIR revisited

2006-09-11 Thread Patrick R. Michaud
On Mon, Sep 11, 2006 at 08:32:26AM -0700, Nuno Carvalho wrote:
 
 Attached to this message you can find a patch to
 't/compilers/pge/06-grammar.t'.

Many thanks for your excellent work on 06-grammar.t .  It's
a nice addition.

After applying the patch, I get 1 subtest UNEXPECTEDLY SUCCEEDED.
I'm presuming it's test #10, which for some reason is marked todo.
Is there a reason it's a todo test?  (Apologies if this has already
been covered somewhere and I missed it.)

Pm


Re: [perl #40319] [PATCH] PGE test file written in PIR revisited

2006-09-13 Thread Patrick R. Michaud
On Mon, Sep 11, 2006 at 08:32:26AM -0700, Nuno Carvalho wrote:
 Have done some cleaning in the file t/compilers/pge/06-grammar.t, also
 added one PMC to have an array of reasons to todo tests. Also added a
 new test grammar.
 
 Attached to this message you can find a patch to
 't/compilers/pge/06-grammar.t'.

Excellent!

Now applied, with a small fix to the todo logic (r14606), and 
fixing PGE's bug that was the reason for the todo in the first place
(r14607).

Thanks!

Pm


Re: Motivation for /alpha+/ set Array not Match?

2006-09-22 Thread Patrick R. Michaud
On Fri, Sep 22, 2006 at 10:22:52PM +0800, Audrey Tang wrote:
 Moreover:
 
/<foo> <bar> <bar> <foo>+/
 
 should set $<foo> to an Array with two Match elements, the first being a
 simple match, and the second having multiple positional submatches.
 
 The thinking behind the separate treatment is that in a contiguous
 quantified match, it does make sense to ask the .from and .to for the
 entire range, which is very hard to do if it's an Array (which can have
 0 elements, rendering $<foo>[-1].to dangerous).


Out of curiosity, why not:

/<foo> <bar> <bar> $<xyz>:=(<foo>+)/

and then one can easily look at $<xyz>.from and $<xyz>.to, as well
as get to the arrayed elements?  (There are other possibilities as
well.)
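
The same idea, a named capture wrapped around the quantified part, can be tried out in Python's re module. This is only an analogy: Python keeps just the last repetition of a repeated group, so the individual elements are recovered here by rescanning the captured text.

```python
import re

pattern = re.compile(r"foo bar bar (?P<xyz>(?:foo)+)")
m = pattern.search("foo bar bar foofoofoo")

span = (m.start("xyz"), m.end("xyz"))         # from/to of the whole range
pieces = re.findall(r"foo", m.group("xyz"))   # the individual repetitions

print(span, pieces)
```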

I'm not arguing in favor of or against the proposal, just pointing
out that there are ways in the existing scheme to get at what is
wanted.

Pm


Re: special named assertions

2006-09-27 Thread Patrick R. Michaud
On Wed, Sep 27, 2006 at 11:59:32AM -0700, David Brunton wrote:
 A quick scan of S05 reveals definitions for these seven special named 
 assertions:
   [...]

I don't think that <'...'> or <"..."> are really named assertions.

I think that <!xyz> (as well as <+xyz> and <-xyz>) are simply special forms
of the named assertion <xyz>.

I should probably compare your list to what PGE has implemented and see if
there are any differences -- will do that later tonight.

Pm



Re: special named assertions

2006-09-27 Thread Patrick R. Michaud
On Wed, Sep 27, 2006 at 09:12:02PM +, [EMAIL PROTECTED] wrote:
 The documentation should distinguish between those that are just
 pre-defined character classes (e.g., <alpha> and <digit>) and
 those that are special builtins (e.g., <before ...> and <commit>).
 The former are things that you should be freely allowed to redefine
 in a derived grammar, while the second type may want to be
 treated as reserved, or at least mention that redefining them may
 break things in surprising ways.

FWIW, thus far in development PGE doesn't treat <before ...>
and <commit> as special built-ins -- they're subrules, same
as <alpha> and <digit>, that can indeed be redefined by 
derived grammars.

And I think that one could argue that redefining <alpha> or
<digit> could equally break things in surprising ways.  

I'm not arguing against the idea of special builtins or saying it's
a bad idea -- designating some named assertions as special/non-derivable 
could enable some really nice optimizations and implementation shortcuts  
that until now I've avoided.  I'm just indicating that I haven't
come across anything yet in the regex implementation that absolutely
requires that certain named assertions receive special treatment
in the engine.

Thanks,

Pm



Re: FYI compiling PIR function calls

2006-09-28 Thread Patrick R. Michaud
On Thu, Sep 28, 2006 at 11:59:52AM -0700, chromatic wrote:
 On Thursday 28 September 2006 11:25, Allison Randal wrote:
   obj.{bar}()  # a string method name
   obj.{$S1}()

I'm not sure what's meant by a string method name above, but
I'd look at it as:

.local string abc

obj.'abc'()    # call 'abc' method of obj
obj.abc()      # always the same as above
obj.{abc}()    # call method indicated by abc symbol
obj.{S0}()     # call method indicated by S0
obj.$S0()      # call method indicated by $S0

Having obj.abc() always mean obj.'abc'() seems to me like it's
most in line with what PIR-authors expect.

As noted in the last instance, I don't know that we need a
separate obj.{$S0}() case since the dollar sign is sufficient
to distinguish exactly what was meant.  But there's also an
argument in favor of the consistency of {...} always meaning
"evaluate this" as opposed to "treat this as literal", which is
what the bareword and 'abc' forms give.
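
The literal-versus-evaluated distinction has a direct analogue in Python, which may help picture the proposal. The class and names below are purely illustrative:

```python
class Obj:
    def abc(self):
        return "abc called"

    def xyz(self):
        return "xyz called"

obj = Obj()
abc = "xyz"   # a variable that happens to share its name with a method

literal = obj.abc()              # the name as written, like obj.'abc'()
indirect = getattr(obj, abc)()   # the name held in the variable, like obj.{abc}()

print(literal, "/", indirect)
```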

 To push a little more the other direction, is it possible for the 
 compiler to detect symbol and method name conflicts?  
 It's only the collision that makes a case ambiguous, right?

I don't think that the compiler always knows at compile time
what method names are available for any given object, so detecting
collisions could be problematic.  

However, it could certainly detect when a bareword symbol has 
been used as a method name and warn about it, requiring the use of an
explicit obj.{symbol}() or obj.'symbol'() form to disambiguate it.

Pm


Re: Null PMC access while parsing javascript

2006-10-11 Thread Patrick R. Michaud
On Wed, Oct 11, 2006 at 10:56:39PM +0200, Mehmet Yavuz Selim Soyturk wrote:
 I have rewritten the grammar. There are some problems though.
 
 - I don't know how to express things like: an identifier is
 [a..zA..Z_$]*, but not a keyword. Something like: rule identifier
 {<!keyword>[a..zA..Z_$]*} seems not to allow identifiers that have
 keywords as prefix.

For now, try adding a \b at the ends of the keyword rule:

token keyword { \b [ if | else | for | while | ... ] \b }

Then <keyword> will match only the exact keyword.

However, the \b metacharacter is disappearing soon, to be replaced
by <?wb> and <!wb>.  In its place will be << and >>, making
the above:

token keyword { << [ if | else | for | while | ... ] >> }

I'll get << and >> added into PGE today/tomorrow.
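
The effect of the word boundary is easy to check with Python's re module, using a negative lookahead in place of the negated keyword assertion. The keyword list and identifier pattern are simplified assumptions, not the real JavaScript grammar:

```python
import re

KEYWORDS = r"(?:if|else|for|while)"
IDENT = r"[A-Za-z_$][A-Za-z0-9_$]*"

# Without a boundary, any identifier that merely starts with a keyword is rejected.
loose = re.compile(r"(?!" + KEYWORDS + r")" + IDENT)
# With a boundary, only the exact keywords are rejected.
strict = re.compile(r"(?!" + KEYWORDS + r"\b)" + IDENT)

def is_identifier(pattern, word):
    return pattern.fullmatch(word) is not None

for word in ("if", "iffy", "format"):
    print(word, is_identifier(loose, word), is_identifier(strict, word))
```

One caveat of this sketch: Python's \b treats $ as a non-word character, so a name such as while$ would be misclassified here.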

 - I couldn't make comments work.
 - I don't know how to handle unicode,
 - How to accomplish semicolon insertion?

I'll have to look at the grammar a bit and see what I can come up
with here.

Pm


Re: Null PMC access while parsing javascript

2006-10-12 Thread Patrick R. Michaud
On Wed, Oct 11, 2006 at 04:34:17PM -0500, Patrick R. Michaud wrote:
 On Wed, Oct 11, 2006 at 10:56:39PM +0200, Mehmet Yavuz Selim Soyturk wrote:
  I have rewritten the grammar. There are some problems though.
  
  - I don't know how to express things like: an identifier is
  [a..zA..Z_$]*, but not a keyword. Something like: rule identifier
  {<!keyword>[a..zA..Z_$]*} seems not to allow identifiers that have
  keywords as prefix.
 [...]
 However, the \b metacharacter is disappearing soon, to be replaced
 by <?wb> and <!wb>.  In its place will be << and >>, making
 the above:
 
 token keyword { << [ if | else | for | while | ... ] >> }
 
 I'll get << and >> added into PGE today/tomorrow.

OOPS!  I just looked at PGE, and apparently << and >>
(and their Unicode equivalents) have been implemented
since late July.  So, go ahead and use the above definition
for <keyword>.  :-)

  - I couldn't make comments work.
  - I don't know how to handle unicode,
  - How to accomplish semicolon insertion?
 
 I'll have to look at the grammar a bit and see what I can come up
 with here.

I'm still working on this one.

Pm


classnames and HLL namespaces -- help!

2006-10-19 Thread Patrick R. Michaud
First, my apologies to Chip for this message -- I know he's
probably already answered this question for me a couple of
times but I've either forgotten, I'm too dense, or I just
can't find the answers now that I need them.  So, with
appropriate contrition for asking yet again...

After the changes introduced by pdd21, I'm lost as to how 
to deal with classname conflicts when multiple HLL namespaces 
are involved.  I have a very real example from PGE, but bear 
with me as I present some background.  Also, note that I've
simplified a few details here for illustration, so if
you compare this to the actual PGE code you'll notice some
(insignificant) differences.

Background - before pdd21
-------------------------

When PGE was first implemented, everything in Parrot tended to
go in a global shared namespace (or at least that's all I knew about).
Therefore, to avoid namespace conflicts, I wrote PGE with PGE::
prefixes in all of its classnames.  Therefore we have classes like:

PGE::Match    - base class for Match objects
PGE::Grammar  - base class for Grammar objects
PGE::Exp      - base class for nodes in the regex AST

The PGE::Exp class is itself subclassed into different node
types representing literals, groups, anchors, quantifiers, closures,
etc.  The current PGE code has these expression node subclasses named
with a prefix of PGE::Exp::, but they probably should have been
named without the Exp:: portion, thus we have AST classes like:

PGE::Literal  - literal expressions
PGE::Group    - non-capturing group
PGE::CGroup   - capturing group
PGE::Subrule  - subrule call
PGE::Closure  - embedded closure

Here's code to create PGE::Exp and its subclasses:

.sub __onload :load
    $P0 = newclass 'PGE::Exp'
    $P0 = subclass 'PGE::Exp', 'PGE::Literal'
    $P0 = subclass 'PGE::Exp', 'PGE::Group'
    $P0 = subclass 'PGE::Exp', 'PGE::CGroup'
    $P0 = subclass 'PGE::Exp', 'PGE::Subrule'
    $P0 = subclass 'PGE::Exp', 'PGE::Closure'
    # ...
.end

Okay, so far so good.  Now let's move into the world
postulated by pdd21_namespaces...


After pdd21
-----------

According to pdd21, each HLL gets its own hll_namespace.
PGE is really a form of HLL compiler, so it should have
its own hll_namespace, instead of using parrot's hll namespace:

.HLL 'pge', ''

Now then, the 'PGE::' prefixes on the classnames were
just implementation artifacts of working in a globally
flat namespace -- as a high-level language PGE really
ought to be referring to its classes as 'Match',
'Exp', 'Literal', etc.  So, if we're in the PGE HLL,
we ought to be able to drop the 'PGE::' prefix from
our classnames and namespaces.

So, here's the revised version of the code to create
the classes:

.HLL 'pge', ''

.sub __onload :load
    $P0 = newclass 'Exp'
    $P0 = subclass 'Exp', 'Literal'
    $P0 = subclass 'Exp', 'Group'
    $P0 = subclass 'Exp', 'CGroup'
    $P0 = subclass 'Exp', 'Subrule'
    $P0 = subclass 'Exp', 'Closure'
    # ...
.end

This code fails when run from parrot, because Parrot seemingly
already has a class named 'Closure':

$ ./parrot ns.pir
Class Closure already registered!
current instr.: '__onload' pc 19 (ns.pir:9)
$

So, this brings me to my question:  What is the official 
best practice pattern for HLLs to create their own classes
such that we avoid naming conflicts with existing classes
in Parrot and other HLLs?
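
The failure above is what one would expect from a single flat table of class names. A Python sketch of the contrast, in which both registry classes are hypothetical models rather than Parrot internals:

```python
class FlatRegistry:
    """One global table of class names, shared by the core and every HLL."""
    def __init__(self):
        self.classes = {"Closure": "core"}   # pretend the core owns this name

    def newclass(self, name):
        if name in self.classes:
            raise KeyError("Class %s already registered!" % name)
        self.classes[name] = "user"

class KeyedRegistry:
    """Class names qualified by the HLL that owns them."""
    def __init__(self):
        self.classes = {("parrot", "Closure"): "core"}

    def newclass(self, hll, name):
        key = (hll, name)
        if key in self.classes:
            raise KeyError("Class %s;%s already registered!" % key)
        self.classes[key] = hll

flat = FlatRegistry()
try:
    flat.newclass("Closure")
    flat_result = "registered"
except KeyError:
    flat_result = "collision"

keyed = KeyedRegistry()
keyed.newclass("pge", "Closure")

print(flat_result, sorted(keyed.classes))
```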



Anticipating the answer that the classname arguments to C<subclass>
should be namespace keys instead of strings, as in:

$P0 = subclass [ 'pge'; 'Exp' ], [ 'pge'; 'Closure' ]

what namespace directive do we use to define the methods 
for the Closure class?

Thanks in advance, and apologies if I've overlooked the obvious.

Pm


Re: classnames and HLL namespaces -- help!

2006-10-19 Thread Patrick R. Michaud
On Thu, Oct 19, 2006 at 10:01:29PM -0400, Matt Diephouse wrote:
 Patrick R. Michaud [EMAIL PROTECTED] wrote:
  According to pdd21, each HLL gets its own hll_namespace.
 PGE is really a form of HLL compiler, so it should have
 its own hll_namespace, instead of using parrot's hll namespace:
 
 .HLL 'pge', ''
 
 I don't know that that's necessarily the case, but it's definitely an
 option. You can just as easily argue that it's a library.

Agreed, but I think my questions equally apply to something
like .HLL 'perl6'.  

In PGE's case, if we simply want to treat it as a library for now,
in the [ 'parrot'; 'PGE'; ... ] namespace, I think we could do that
for a while.  But with perl6 and other languages joining parrot
soon, I'm not sure it's something we should postpone for too much
longer.

 So, this brings me to my question:  What is the official
 best practice pattern for HLLs to create their own classes
 such that we avoid naming conflicts with existing classes
  in Parrot and other HLLs?
 
 This is unspecced. ATM, all classes go into the 'parrot' HLL. This is
 a relic of the past and I think it needs to change. I'm pretty sure
 that HLL classes will have to go into the HLL's root namespace (this
 needs to happen anyway to prevent namespace pollution). That leaves us
 with the question of how to differentiate core PMCs from HLL PMCs. I'm
 not sure how to handle that, but that's what a spec is for.

Why is the differentiation necessary -- wouldn't core PMCs simply
be part of the 'parrot' HLL?

 We discussed some of this briefly at the OSCON hackathon, when we
 talked about changing the class internals so that a Class isa
 Namespace. That discussion hasn't led to any changes yet as Chip has
 been kidnapped by his Real Life (tm).

I'm afraid I wasn't able to keep up with all of the details and
implications of that discussion at the hackathon.  I'll be glad
to chime in where I can, but I still don't understand some of
the details.

Thanks,

Pm


Re: Re: classnames and HLL namespaces -- help!

2006-10-19 Thread Patrick R. Michaud
On Thu, Oct 19, 2006 at 11:20:56PM -0400, Matt Diephouse wrote:
 Patrick R. Michaud [EMAIL PROTECTED] wrote:
  On Thu, Oct 19, 2006 at 10:01:29PM -0400, Matt Diephouse wrote:
  ATM, all classes go into the 'parrot' HLL. [...]  I'm pretty sure
  that HLL classes will have to go into the HLL's root namespace (this
  needs to happen anyway to prevent namespace pollution). That leaves us
  with the question of how to differentiate core PMCs from HLL PMCs. I'm
  not sure how to handle that, but that's what a spec is for.
 
 Why is the differentiation necessary -- wouldn't core PMCs simply
 be part of the 'parrot' HLL?
 
 That's the place to put them. But how do you make the core PMCs
 visible to the compiler and not to the user? 

Two off-the-top-of-my-head possibilities:

1.  Reference core PMCs by their .ClassName constants as opposed 
to their stringified names.  Then stringified names are _always_
hll classes.

2.  Provide an opcode that allows us to lookup class names 
in other hlls; i.e., allow the equivalent of things like

$I0 = find_type [ 'parrot'; 'String' ]
$P0 = new $I0

$I0 = find_type [ 'pge'; 'Match' ]
$P1 = new $I0

 Perhaps this will be clearer if I demonstrate with code.  I imagine
 that this Perl 6:
 
my $obj = Perl6Object.new()
 
 will translate to something like this PIR:
 
.lex '$obj', $P0
$P0 = new 'Perl6Object' # do Perl6 classes have sigils?
$P0.INIT()

(If Perl6 classes have sigils, it's probably '::', just like package names.)

Actually, now that you mention this, perhaps it would end up
being more along the lines of:

.lex '$obj', $P0                 # declare lexically scoped '$obj'
$P1 = find_name 'Perl6Object'    # find class for 'Perl6Object'
$P0 = $P1.'new'()                # send 'new' message to Perl6Object

and then the 'new' method of the 'Perl6Object' class (likely inherited
from a base 'Class' type in Perl6 hll-space) takes care of finding 
the correct Parrot object type, calling Parrot's C<new> opcode with
that type, invoking INIT, and returning the resulting object to be 
placed in $P0.  

 But that means if the user writes this Perl 6:
 
my $obj = ResizablePMCArray.new()
 
 this PIR will be generated:
 
.lex '$obj', $P0
$P0 = new 'ResizablePMCArray' # oh no! this isn't an actual Perl6
                              # class - it's namespace pollution!
$P0.INIT()

 We need to somehow differentiate between Perl6Object and
 ResizablePMCArray. Especially given the possibility that the user will
 write this:
 
class ResizablePMCArray { ... }

There aren't any barewords in Perl 6, so all bare classnames have
to be predeclared in order to get past the compiler, and then it's
fairly certain we're talking about a Perl 6 class and not a Parrot
class.  

I suspect that if the Perl 6 programmer really wants to be using 
the Parrot ResizablePMCArray, it will need to be imported into 
the perl6 hll_namespace somehow, or otherwise given enough details
so that perl6's 'ResizablePMCArray' class object knows that it's
the Parrot class and not the Perl6 one.

 This isn't too much different from using keyed class names like
 ['pge'; 'Closure'] like you guessed in your first email. But this
 places classes next to their namespaces, which is a good thing. But we
 probably do need keyed class names to support this:
 
class Foo::Bar { ... }

I'm expecting that both PGE and perl6 will be translating names
like Foo::Bar into an array of [ 'Foo'; 'Bar' ], and then looked
up relative to the current namespace.
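
A minimal sketch of that translation and lookup, using nested Python dicts as a stand-in for namespaces. The function names are hypothetical:

```python
def name_to_key(name):
    """Split a '::'-separated name into its namespace key."""
    return name.split("::")

def lookup(namespace, key):
    """Walk the key one element at a time, starting from the given namespace."""
    node = namespace
    for part in key:
        node = node[part]
    return node

# The current namespace, holding a nested Foo namespace that holds Bar.
current = {"Foo": {"Bar": "class Foo::Bar"}}

key = name_to_key("Foo::Bar")
print(key, lookup(current, key))
```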

All of which might seem to indicate that 'class is a namespace' is
the right approach, or at least that perl6 will be modeling it that way.

Thanks, Matt -- this is turning into a really helpful and useful
discussion, at least for me.

Pm


Re: classnames and HLL namespaces -- help!

2006-10-22 Thread Patrick R. Michaud
On Sat, Oct 21, 2006 at 07:10:21PM +0200, Leopold Toetsch wrote:
 Am Donnerstag, 19. Oktober 2006 23:19 schrieb Patrick R. Michaud:
      .HLL 'pge', ''
 
      .sub __onload :load
          $P0 = newclass 'Exp'
   [...]
          $P0 = subclass 'Exp', 'Closure'
          # ...
      .end
  [...]
  So, this brings me to my question:  What is the official
  best practice pattern for HLLs to create their own classes
  such that we avoid naming conflicts with existing classes
  in Parrot and other HLLs?
 
   .HLL 'pge', ''
 
 is implying the toplevel namespace ['pge']. The C<newclass> 'Exp' therefore is 
 created as ['pge';'Exp']. But you are subclassing that to an existing 
 (because unqualified) 'Closure' name.
 
 IMHO this should look like this:
 
   .HLL 'pge', ''
   ...
   cl = newclass 'Exp' # ['pge'; 'Exp']
   ...
   .namespace ['Exp']  # ['pge'; 'Exp']
   ...
   scl = subclass 'Exp', ['Exp'; 'Closure']  # ['pge'; 'Exp'; 'Closure']
   ...

I strongly disagree.  I don't think that a subclass should have to
be named as a sub-namespace of its parent class.

Put another way, if Num isa Object, and Int isa Num,
does that mean that I would have to do...?

.HLL 'perl6', ''

$P0 = newclass 'Object'
$P1 = subclass 'Object', ['Object'; 'Num']
$P2 = subclass ['Object'; 'Num'], ['Object'; 'Num'; 'Int']

Normally I would expect 'Object', 'Int', and 'Num' to have their
own top-level namespaces within the HLL namespace, and not require
classnames to always include the list of parent classes.

Pm


Re: classnames and HLL namespaces -- help!

2006-10-23 Thread Patrick R. Michaud
On Sun, Oct 22, 2006 at 11:38:10PM +0200, Leopold Toetsch wrote:
 Am Sonntag, 22. Oktober 2006 20:56 schrieb Patrick R. Michaud:
 
  I strongly disagree.  I don't think that a subclass should have to
  be named as a sub-namespace of its parent class.
 
 Namespace and classes are currently totally orthogonal. You are declaring a 
 subclass (not a sub-namespace) with all the implications for naming it.

Okay, I'll rephrase to avoid the classname/namespace confusion(*):

I don't think that a subclass' name should have to include the
names of its parent classes.  From your earlier message:

On Sat, Oct 21, 2006 at 07:10:21PM +0200, Leopold Toetsch wrote:
 IMHO this should look like this:

   .HLL 'pge', ''
   ...
   cl = newclass 'Exp' # ['pge'; 'Exp']
   ...
   .namespace ['Exp']  # ['pge'; 'Exp']
   ...
   scl = subclass 'Exp', ['Exp'; 'Closure']  # ['pge'; 'Exp'; 'Closure']
   ...

It's the ['Exp'; 'Closure'] that bothers me here -- I don't think
that a subclass should have to include the name of its parent in
the class name.  It should be:

scl = subclass 'Exp', 'Closure'    # ['pge'; 'Closure']

However, writing either this or

scl = subclass 'Exp', ['Closure']  # ['pge'; 'Closure']

gives me the "class Closure already registered" error that
started this thread.

-

(*):  AFAICT, it's also not true that classnames and namespaces 
are currently totally orthogonal, since the class' methods
have to be placed in a namespace that matches the classname.
So, a class named [ 'Exp'; 'Closure' ] must place its methods
in a [ 'Exp'; 'Closure' ] namespace.  

Pm


Re: OO Requirements [was Re: classnames and HLL namespaces -- help!]

2006-10-23 Thread Patrick R. Michaud
On Mon, Oct 23, 2006 at 05:49:08PM +0100, Jonathan Worthington wrote:
 Allison Randal wrote:
 I think the object model needs a thorough going over in general 
 Yup. It's on the list right after I/O, threads, and events.
 ...
 Ruby is a serious OO language, but it's not finished yet. For that 
 matter, Perl 6 is partially implemented. But, I entirely agree on the 
 core point that pushing these languages forward will help push Parrot 
 forward.
 
 And pushing Parrot's OO support forward will enable these languages to 
 be pushed forwards some more.  :-)
 
 Would it be a good idea to start collecting requirements together from 
 different language implementors so that when the time comes to work on 
 the OO PDD, there is already a good description of what it needs to do?  
 If so, I'm happy to make a start on a first cut and maintain it (e.g. 
 accept patches to it from anyone who wants to contribute but doesn't 
 have a commit bit).

I'll be very happy to see this and contribute where I can.

For my immediate/near-term future needs, I'm reasonably happy
with Parrot's existing implementation, with the exception that
classnames in HLLs seem to conflict with Parrot's pre-existing
classnames (and perhaps those of other HLLs).

Pm


Re: [perl #40443] Separate vtable functions from methods (using :vtable)

2006-10-26 Thread Patrick R. Michaud
On Wed, Oct 25, 2006 at 11:02:59PM -0700, Allison Randal wrote:
 [EMAIL PROTECTED] via RT wrote:
 On Sun Oct 01 16:22:10 2006, mdiep wrote:
 At the OSCON 2006 Hackathon, it was decided that we should separate  
 vtables from methods and add a new :vtable label for PIR subs to mark  
 them as vtable functions. 
 
 Just to check, that this is still meant to happen? Anyone feel it should
 be put off until the objects/namespaces stuff is sorted out, or shall I
 just dive right in?
 
 This is the main thing Chip and I talked about in our last face-to-face 
 meeting. We came up with 3 basic parameters: whether a method is a 
 vtable method, whether it has a vtable name distinct from the method 
 name, and whether it has a method name at all (or is anonymous, i.e. 
 only a vtable method). The interface I scrawled out over coffee is:
 
 # method name is the same as vtable name
 .sub get_string :method :vtable
 
 # accessible as either $obj.stringify() or vtable
 .sub stringify :method :vtable('get_string')
 
 # accessible only as vtable
 .sub get_string :method :anon :vtable
 .sub stringify :method :anon :vtable('get_string')
 ...

+1

Pm


Re: Anyone relying on objects stringifying to class names?

2006-10-28 Thread Patrick R. Michaud
On Sat, Oct 28, 2006 at 06:50:05PM +0100, Jonathan Worthington wrote:
 So, I want to get rid of this and allow this v-table method to just 
 dispatch to a user implementation or a fallback. But before I do that, I 
 wanted to check if anyone is relying on the behavior? I'd really rather 
 not break working code without giving folks a chance to fix it, but this 
 behavior needs to die. I'm amazed, nobody has killed it already.
 
 I propose this is removed in a week, please respond if you'd have an 
 issue with that or think that's too short.

I think it's too long.  :-)

Does anything fail if you eliminate it (e.g., via make tests)?  
If no, then I think it's okay to eliminate, and we'll see
if anyone carps about it.  But that's just my $0.02.

Pm


set_pmc_keyed_int delegates to set_pmc_keyed...?

2006-11-04 Thread Patrick R. Michaud
Yesterday and today I've been working on a Capture PMC type
for Parrot, and I'm running into all sorts of interesting issues
when dealing with subclassing.  (For those who aren't familiar with
Captures, a Capture is essentially a container object that has
both indexed (array) and keyed (hash) components, like a Match object.)

Here's the latest...

Currently src/pmc/default.pmc has lots of functions like the 
following:

/* Converts C<key> to a PMC key and calls C<set_integer_keyed()> with it
   and C<value>.  */

void set_integer_keyed_int (INTVAL key, INTVAL value) {
    PMC* r_key = INT2KEY(INTERP, key);
    DYNSELF.set_integer_keyed(r_key, value);
}

If I understand how this works, this means that any subclass of
ParrotObject that doesn't define its own set_integer_keyed_int vtable 
entry is automatically re-dispatched to set_integer_keyed instead.
The same is true for others -- e.g., set_pmc_keyed_int is forwarded to
set_pmc_keyed, etc.

Well, okay, but that doesn't work if a compiler uses PIR to create
a subclass of a PMC class that makes a distinction between
keyed access and integer access.  For example:

$P99 = subclass 'Capture', 'Match'
$P1 = new 'Match'
$P1['abc'] = 1 # store value in hash component
$P1[0] = 2 # store value in array component

Because 'Match' doesn't define its own set_integer_keyed_int
vtable entry, it ought to be inheriting the one from Capture.
But unfortunately, the default.pmc function above gets in the
way, and redispatches the keyed_int call as a keyed call, so that
the last instruction stores a 2 in the object's hash component instead
of its array component.
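
The effect can be modelled outside Parrot. The following Python sketch is only a toy (the class names mimic the Parrot ones, and the idea that the PIR-level subclass holds a Capture and forwards to it is an assumption made for illustration); it shows how a fallback in the default layer can shadow the parent's integer-keyed entry:

```python
# Toy model of the dispatch problem; none of this is real Parrot code.

class Default:
    """Stands in for default.pmc: keyed_int falls back to keyed."""
    def set_integer_keyed(self, key, value):
        raise NotImplementedError
    def set_integer_keyed_int(self, key, value):
        self.set_integer_keyed(str(key), value)   # INT2KEY, then redispatch

class Capture(Default):
    """Keeps separate array and hash components."""
    def __init__(self):
        self.array, self.hash = {}, {}
    def set_integer_keyed(self, key, value):
        self.hash[key] = value
    def set_integer_keyed_int(self, key, value):
        self.array[key] = value

class ParrotObjectOfCapture(Default):
    """A PIR-level subclass: it holds a Capture and forwards to it, but
    only for entries the default layer did not already answer."""
    def __init__(self):
        self.value = Capture()
    def set_integer_keyed(self, key, value):
        self.value.set_integer_keyed(key, value)
    # no set_integer_keyed_int here, so Default's version wins

m = ParrotObjectOfCapture()
m.set_integer_keyed('abc', 1)     # lands in the hash component
m.set_integer_keyed_int(0, 2)     # also lands in the hash component
```

In this model the array component of m.value stays empty, which mirrors the symptom described above.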

Here's a more complete example showing how inheritance isn't
working properly for a subclass:

$ cat z.pir
.sub main :main
$P0 = new .Capture # create Capture object
$P0['alpha'] = 1   # store value in hash component
$P0[0] = 2 # store value in array component
$I0 = elements $P0 # display size of array  (should be 1)
print $I0
print "\n"

# create a 'Match' subclass of Capture
$P99 = subclass 'Capture', 'Match'

$P1 = new 'Match'  # create Match object
$P1['alpha'] = 1   # store value in hash component
$P1[0] = 2 # store value in array component
$I1 = elements $P1 # display size of array (should be 1)
print $I1
print "\n"
.end

$ ./parrot z.pir
1
0

Any thoughts about how we should resolve this?

Pm


Re: set_pmc_keyed_int delegates to set_pmc_keyed...?

2006-11-05 Thread Patrick R. Michaud
On Sat, Nov 04, 2006 at 05:18:22PM +0100, Leopold Toetsch wrote:
 Am Samstag, 4. November 2006 16:17 schrieb Patrick R. Michaud:
  Because 'Match' doesn't define its own set_integer_keyed_int
  vtable entry, it ought to be inheriting the one from Capture.
  But unfortunately, the default.pmc function above gets in the
  way, and redispatches the keyed_int call as a keyed call,
 
 Class inheritance from PMCs is very static still (like PMC-only cases). I
 hope that the :vtable patches will provide the base for a better solution.
 For now, you can only implement the missing _integer_keyed cases in Match
 so that default isn't triggered. 

I don't think that's possible, is it?  Match is implemented as a
subclass of Capture, as in:

$P0 = subclass 'Capture', 'Match'

So, I can create the missing cases, but what do I put for the body
of the method to get to the corresponding method of Capture?

.namespace [ 'Match' ]
.sub set_integer_keyed_int :vtable
.param int key
.param int value

# ... how to do set_integer_keyed_int method of Capture?

.end


 We could of course remove the defaults too, but that 
 would need a very complete set of these keyed vtables on all PMCs.

How many of these would there be?  Doesn't this affect only those
classes that are built using ParrotObject?

Pm


Re: set_pmc_keyed_int delegates to set_pmc_keyed...?

2006-11-06 Thread Patrick R. Michaud
On Sun, Nov 05, 2006 at 05:41:12PM +0100, Leopold Toetsch wrote:
 Am Sonntag, 5. November 2006 15:22 schrieb Patrick R. Michaud:
  So, I can create the missing cases, but what do I put for the body
  of the method to get to the corresponding method of Capture?
 
      .namespace [ 'Match' ]
      .sub set_integer_keyed_int :vtable
          .param int key
          .param int value
 
          # ... how to do set_integer_keyed_int method of Capture?
 
      .end
 
 A subclass of a PMC delegates to that PMC (via deleg_pmc.pmc). The PMC is the 
 first attribute of that class named '__value'. Your code would look like:
 
   .local pmc capt
   capt = getattribute SELF, '__value'
   capt[key] = value
 
 But this is all clumsy, and might/should change.
 
 Therefore I've ci'ed in r15111 another workaround in parrotobject.pmc, which 
 checks, if the parent isa PMC and in that case calls the deleg_pmc method 
 instead of the default.

Alas, this seems to work only for immediate subclasses of a PMC.
If we have a sub-subclass, then we're apparently back to the
same problem as before:

$ cat zz.pir
.sub main :main
$P0 = new .Capture # create Capture object
$P0['alpha'] = 1   # store value in hash component
$P0[0] = 2 # store value in array component
$I0 = elements $P0 # display size of array  (should be 1)
print $I0
print "\n"

# create a 'Match' subclass of Capture
$P99 = subclass 'Capture', 'Match'

$P1 = new 'Match'  # create Match object
$P1['alpha'] = 1   # store value in hash component
$P1[0] = 2 # store value in array component
$I1 = elements $P1 # display size of array (should be 1)
print $I1
print "\n"

# create a 'Exp' subclass of Match
$P99 = subclass 'Match', 'Exp'

$P2 = new 'Exp'# create Exp object
$P2['alpha'] = 1   # store value in hash component
$P2[0] = 2 # store value in array component
$I2 = elements $P2 # display size of array (should be 1)
print $I2
print "\n"

.end

$ ./parrot zz.pir
1
1
0
$  

Looking at the above, it seems to me that the crux of the problem
(short of an overall saner design) is that deleg_pmc is occurring
after default.pmc.  That seems backwards.  Perhaps any deleg_pmc
methods should be taking place before falling back to the PMC defaults.

We also have a similar problem currently taking place with PMC
methods -- methods defined in a PMC aren't being properly inherited
or re-delegated in ParrotObject subclasses.  For capture.pmc I've
put some workarounds for this into Capture's 'get_array' and 'get_hash'
methods (r15129), but it again points to something fundamentally
wrong with the way that method inheritance/delegation is being 
handled in ParrotObjects.

Pm


Re: Anyone relying on objects stringifying to class names?

2006-11-06 Thread Patrick R. Michaud
On Sat, Nov 04, 2006 at 08:46:37PM +, Jonathan Worthington wrote:
 Jonathan Worthington wrote:
 At the moment, if you have some ParrotObject instance, say foo, and do 
 something like:
 
  $S0 = foo
 
 Then $S0 will contain the name of the class.
 
 =item C<STRING *name()>
 
 Erm, what the heck was I smoking when I wrote this...the name method 
 doesn't control what an object stringifies to at all. I managed to read 
 it as get_string. :-(
 
 Sorry 'bout that. And while this is marked as being bad in the comment, 
 I can't remove it since it's used (and I'm not even sure, how bad it is 
 now). PGE uses it for example.

Hmm, I can't recall where PGE might be using this -- could you
point to an example so I can make sure it's relatively sane?

Thanks,

Pm


Re: [perl #40626] [BUG] :vtable fails for subclasses of core classes

2006-11-07 Thread Patrick R. Michaud
On Wed, Nov 01, 2006 at 05:53:24PM -0800, Jonathan Worthington via RT wrote:
 (sorry for empty reply earlier)
 
 Patrick R.Michaud (via RT) wrote:
  The new :vtable pragma doesn't seem to work when used on methods
  of subclasses of core classes.  Here's a quick sample
  (I'm also adding this test to t/pmc/parrotobject.t):
 
  snip

 Thanks for the good test case, which has enabled me to get a fix to this 
 bug. :-)

Excellent!  Now could you get it to work in .pbc files as well?  ;-)

(You may have already addressed this in other threads regarding saved
properties of subroutines, I just wanted to provide another test case
to show the current item I'm blocking on.)

$ cat vt.pir
.sub main :main
$P99 = subclass 'Hash', 'Foo'
$P99 = subclass 'Hash', 'Bar'

$P1 = new 'Foo'
$S1 = $P1
say $S1

$P1 = new 'Bar'
$S1 = $P1
say $S1

.end

.namespace [ 'Foo' ]

.sub '__get_string' :method
.return('Hello world')
.end


.namespace [ 'Bar' ]

.sub 'get_string' :method :vtable
.return('Hello world')
.end

$ ./parrot vt.pir
Hello world
Hello world
$ ./parrot -o vt.pbc --output-pbc vt.pir
$ ./parrot vt.pbc
Hello world
Hash[0x7e6be0]
$

Thanks!

Pm


How do I associate methods with a compiler?

2006-11-08 Thread Patrick R. Michaud
Historically Parrot has considered a compiler to be an
invokable subroutine, such that the canonical sequence for
compiling something is:

.local string perl6_source
.local pmc perl6_compiler
perl6_compiler = compreg 'Perl6'
$P0 = perl6_compiler(perl6_source)

However, pdd21_namespaces.pod says that compilers have
methods such as 'parse_name', 'get_namespace', and 'load_library'
(see the section titled "Compiler PMC API").

Recognizing that much of the details about compilers are still to
be specced, the naive version of my question is:  How do we get 
those API methods (and possibly other compiler-specific methods)
attached to the compiler sub?  

Or, in claiming that compilers have an API, should we instead
say that the canonical compilation sequence is to use compreg
to obtain a compiler object (not an invokable sub), and then
compile the source via a 'compile' method on the compiler object?
For example:

perl6_compiler = compreg 'Perl6'
$P0 = perl6_compiler.'compile'(perl6_source)

In asking the above questions I'm purposely avoiding, because Parrot
doesn't seem to support it yet, the possibility that the object
obtained via compreg is both invokable in its own right (like a sub)
and has methods attached to it.  I'm simply curious as to how we
conceptually model compilers in Parrot -- are they really like
subroutines or are they more traditional 'objects' that provide
a method-based interface for invoking a compilation.

Opinions welcome.  Personally I think I favor the "a compiler is
an object with a 'compile' method" model, and that C<compreg> gives
us back a compiler object as opposed to a subroutine-like thing.

Thanks in advance,

Pm


Re: How do I associate methods with a compiler?

2006-11-09 Thread Patrick R. Michaud
On Thu, Nov 09, 2006 at 09:55:05AM -0200, Adriano Rodrigues wrote:
 On 11/9/06, Patrick R. Michaud [EMAIL PROTECTED] wrote:
 Opinions welcome.  Personally I think I favor the "a compiler is
 an object with a 'compile' method" model, and that C<compreg> gives
 us back a compiler object as opposed to a subroutine-like thing.
 
 Would it not be possible to support both?  

Sure, it's possible to support both -- we can even handle both
types within the existing Parrot framework.  I think I'm basically 
asking which model will be considered the Parrot standard?

 The usage patterns could be something like:
 
   .local string perl6_source
   .local pmc perl6_compiler
   perl6_compiler = compreg 'Perl6'
   $P0 = perl6_compiler(perl6_source)
 
 (Yes, the same as Patrick's first snippet.) And then
 
   perl6_compiler = compreg 'Perl6', OBJ # OBJ is some constant
   $P0 = perl6_compiler.'compile'(perl6_source)

We could do this now without requiring an additional parameter
to C<compreg>, by adding suffixes to the compiler name.  For example:

$P0 = compreg 'Perl6_sub'    # get subroutine view of compiler
$P0 = compreg 'Perl6_obj'    # get object

But I find the suffix (or the use of an extra parameter) a bit
overblown.  I'd rather come up with a standard API for compilers
that supports all of this, and then individual compiler subs can
deviate from that standard if it's really appropriate.

I've also just written a HLLCompiler base class (I would've called
it 'Compiler', but that name is currently taken in Parrot) that 
makes it easy to wrap a sub into a compiler object.  Thus 
registering a new compiler becomes:

load_bytecode 'Parrot/HLLCompiler.pbc'
   
.local pmc compile_object, compile_sub
compile_object = new [ 'HLLCompiler' ]
compile_sub = get_global 'name_of_compile_sub'
compile_object.'compsub'(compile_sub)
compreg 'MyCompiler', compile_object

Using the compiler is then:

.local pmc mycompiler
mycompiler = compreg 'MyCompiler'
$P0 = mycompiler.'compile'(source)
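
For readers who want the shape of this without Parrot at hand, here is a small Python model of the same idea. It is a sketch under assumptions: compreg is modelled as a plain dictionary lookup, the HLLCompiler class below only imitates the 'compsub' and 'compile' methods named above, and my_compile_sub is a made-up compile sub.

```python
# Toy model only: 'compreg' here is a plain dict, not the Parrot opcode.

_compilers = {}

def compreg(name, compiler=None):
    """One-argument form looks a compiler up; two-argument form registers it."""
    if compiler is None:
        return _compilers[name]
    _compilers[name] = compiler

class HLLCompiler:
    """Wraps a compile sub so callers can use a 'compile' method."""
    def __init__(self):
        self._sub = None
    def compsub(self, sub):
        self._sub = sub
    def compile(self, source, **options):
        return self._sub(source, **options)

def my_compile_sub(source, **options):
    # Hypothetical compile sub: the "compiled code" just upper-cases the source.
    return lambda: source.upper()

obj = HLLCompiler()
obj.compsub(my_compile_sub)
compreg('MyCompiler', obj)

code = compreg('MyCompiler').compile('say hello')
```

The caller never touches the compile sub directly, so extra methods can be added to the object later without changing how compilers are looked up.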

Thanks for the excellent comments!

Pm


Re: [perl #40806] [BUG] IMCC - embedded source locations (#line nnn file.pir)

2006-11-10 Thread Patrick R. Michaud
On Fri, Nov 10, 2006 at 08:23:56PM -0800, Chip Salzenberg via RT wrote:
 This *may* be a non-bug resulting from the conflation of PIR source
 file/line and HLL source file/line.
 
 Or it may indicate the need for separate setfile/setline [HLL line] and
 #line num file  [PIR line].

I think it's likely the latter, but I'm not familiar enough with
this particular point to know for sure.

Here's the use case I'm trying to address:  The tgc and pgc compilers
currently translate source input files to PIR; and these sources
may contain PIR fragments which are to be copied directly into the
PIR output.  However, when imcc encounters an error in one of these
fragments, it reports the error as occurring on the line in the
PIR output instead of the line of the source input.

As an example, here's what an input file to tgc might look like:

transform past (Perl6::Grammar) :language('PIR') {
.local pmc cnode, cpast
cnode = node['statement_list']
cpast = tree.'get'('past', cnode, 'Perl6::Grammar::statement_list');
.return (cpast)
}

A programmer (or Makefile) compiles this into PIR with

$ parrot compilers/tge/tgc.pir xyz.tg xyz_gen.pir

and then compiling the xyz_gen.pir file results in

$ parrot xyz_gen.pir
error:imcc:syntax error, unexpected ';', expecting '\n'
in file 'xyz_gen.pir' line 8
$

Thus, it reports the error in the generated PIR file and
not the original source file.  The programmer then has to
look at the generated PIR file to find the offending line,
and then mentally back-translate into the original source
to correct it.

What I'd like to see instead is

$ parrot xyz_gen.pir
error:imcc:syntax error, unexpected ';', expecting '\n'
in file 'xyz.tg' line 4

so that I can immediately find the problem in the original source
input file.
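
One way to picture the bookkeeping this needs is the Python sketch below. It is illustrative only: the directive form (#line, a number, then a quoted file name) is an assumption, since the exact syntax is not settled in this thread, and map_line is a hypothetical helper rather than anything in imcc.

```python
import re

# Assumed directive form: #line <num> "<file>"
LINE_RE = re.compile(r'#line\s+(\d+)\s+"([^"]+)"')

def map_line(generated_lines, gen_name, gen_lineno):
    """Return (file, line) for 1-based line gen_lineno of the generated file."""
    cur_file, offset = gen_name, 0
    # Only directives that appear before the requested line can affect it.
    for i, text in enumerate(generated_lines[:gen_lineno - 1], start=1):
        m = LINE_RE.match(text)
        if m:
            # The line *after* the directive is line <num> of <file>.
            cur_file = m.group(2)
            offset = int(m.group(1)) - (i + 1)
    return cur_file, gen_lineno + offset

generated = [
    '.sub _anon',
    '#line 2 "xyz.tg"',
    '.local pmc cnode, cpast',
    "cnode = node['statement_list']",
    "cpast = tree.'get'('past', cnode, 'Perl6::Grammar::statement_list');",
]
```

With the directive in place, an error on generated line 5 maps back to line 4 of xyz.tg, which is the report asked for above.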

Pm


Re: [svn:parrot] r15517 - in trunk: . src

2006-11-13 Thread Patrick R. Michaud
On Mon, Nov 13, 2006 at 07:33:18PM -0800, [EMAIL PROTECTED] wrote:
 
 Log:
 Fix size mismatch errors, at least on Linux/PPC.  If this breaks 
 other platforms, there's a deeper bug somewhere and we need to 
 rethink t/tools/pbc_merge.t for the release.

Alas, it seems to break Linux/x86_64 -- output of 'make' is below.

Pm

=

[EMAIL PROTECTED]:~/parrot/trunk make
Compiling with:
xx.c
cc -I./include -D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS -DDEBUGGING -pipe 
-Wdeclaration-after-statement -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -fPIC 
-g -W -Wall -Wstrict-prototypes -Wmissing-prototypes -Winline -Wshadow 
-Wpointer-arith -Wcast-qual -Wwrite-strings -Waggregate-return -Winline 
-Wno-unused -Wsign-compare -falign-functions=16 -Wformat-nonliteral 
-Wformat-security -Wpacked -Wdisabled-optimization 
-mno-accumulate-outgoing-args -Wno-shadow -DHAVE_COMPUTED_GOTO -fPIC -I. -o 
xx.o -c xx.c
./parrot -o runtime/parrot/include/parrotlib.pbc 
runtime/parrot/library/parrotlib.pir
PackFile_unpack: Illegal BYTECODE_runtime/parrot/library/parrotlib.pir table 
segment size 316 (must be multiple of 8)!
directory_pack segment 'BYTECODE_runtime/parrot/library/parrotlib.pir' used 
size 316 but reported 316

make: *** [runtime/parrot/include/parrotlib.pbc] Error 1
[EMAIL PROTECTED]:~/parrot/trunk



Re: [svn:parrot] r15517 - in trunk: . src

2006-11-14 Thread Patrick R. Michaud
On Mon, Nov 13, 2006 at 10:27:19PM -0800, chromatic wrote:
 On Monday 13 November 2006 21:49, Patrick R. Michaud wrote:
  On Mon, Nov 13, 2006 at 07:33:18PM -0800, [EMAIL PROTECTED] wrote:
   Log:
   Fix size mismatch errors, at least on Linux/PPC.  If this breaks
   other platforms, there's a deeper bug somewhere and we need to
   rethink t/tools/pbc_merge.t for the release.
 
  Alas, it seems to break Linux/x86_64 -- output of 'make' is below.
 
 Too bad; I was aiming for 32-bit x86.
 
 Does this patch do anything for you?  I sort of really hate it, but I'm 
 curious about the results.

With the patch everything seems to work okay, but I agree
it's not pretty and that we're probably just masking something
deeper.

Pm



Re: How do I associate methods with a compiler?

2006-11-14 Thread Patrick R. Michaud
On 11/9/06, Patrick R. Michaud [EMAIL PROTECTED] wrote:
 Opinions welcome.  Personally I think I favor the "a compiler is
 an object with a 'compile' method" model, and that C<compreg> gives
 us back a compiler object as opposed to a subroutine-like thing.

For the record, it was decided (Allison++) during today's 
#parrotsketch meeting that the convention would be to have 
the 'compreg' opcode return an object with a 'compile' method, 
as opposed to returning an invokable sub.

Of course, individual language implementors may choose to
defy convention, and for the foreseeable future the PIR 
and PASM compilers that come with Parrot will continue 
to be subroutine-like.  All of this just means that
callers to compreg need to know what they are getting back
or else figure it out at runtime.

I'm hoping to formalize some of the details for creating
compilers into a compilers pdd at some not-too-distant date.
Comments and suggestions welcomed.  At the moment we have
the following:

  - There's a HLLCompiler class (load_bytecode 'Parrot/HLLCompiler.pbc')
that can be used to quickly create compiler objects.

  - To register a new compiler using HLLCompiler:

load_bytecode 'Parrot/HLLCompiler.pbc'
.local pmc compile_sub, mycompiler

##   get the compilation subroutine
compile_sub = get_global 'compile'

##   create a new compiler object
mycompiler = new [ 'HLLCompiler' ]

##   register the language and compiler subroutine
mycompiler.'register'('MyCompiler', compile_sub)

  - To perform a compile:

mycompiler = compreg 'MyCompiler'
$P0 = mycompiler.'compile'('...source code...')

In addition, HLLCompiler provides a 'command_line' method
for acting as a standalone compiler; thus a .pir/.pbc can
simply have its :main sub delegate control directly
to HLLCompiler.  For example:

.sub main :main
.param pmc args

.local pmc mycompiler
mycompiler = compreg 'MyCompiler'
mycompiler.'command_line'(args)
.end

When invoked in this manner, the compiler object will compile 
and execute any source file given on the command line, or 
enter an interactive mode if no source file is given.  
The result of the compilation can be controlled by the
--target command-line option (assuming the compilers involved
support this):

--target=parse   # output parse tree
--target=past    # output ast (PAST)
--target=post    # output opcode tree (POST)
--target=pir     # output PIR

HLLCompiler also understands '--encoding' (for languages that
have specific character encoding requirements), and '--output'
to specify a file where the output should be placed.
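
A plausible way to think about --target is as a pipeline that stops early. The Python sketch below assumes the stages run strictly in the order listed above; that ordering, the run_stages helper, and the stand-in stage functions are assumptions for illustration, not HLLCompiler internals.

```python
# Stage names are taken from the --target list; everything else is made up.
STAGES = ['parse', 'past', 'post', 'pir']

def run_stages(source, stage_funcs, target='pir'):
    """Feed source through each stage in order, stopping after target."""
    if target not in STAGES:
        raise ValueError('unknown target: ' + target)
    result = source
    for name in STAGES:
        result = stage_funcs[name](result)
        if name == target:
            break
    return result

# Stand-in stages that just record which steps ran.
demo = {name: (lambda r, n=name: r + [n]) for name in STAGES}
```

Calling run_stages([], demo, target='past') runs only the first two stand-in stages, which is the behaviour one would expect from --target=past.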

Comments and other feedback are greatly appreciated.

Thanks!

Pm


Re: How do I associate methods with a compiler?

2006-11-14 Thread Patrick R. Michaud
On Tue, Nov 14, 2006 at 08:52:47PM -0800, Allison Randal wrote:
 
 Also for the record from the weekly meeting (which was actually today, 
 just a very long today): Yes, compilers are objects and compilation is a 
 method call. The compiler for TGE tree grammars is implemented this way, 
 and it's a very usable interface.

Our messages crossed in the mail.

 We might want to resurrect the 'compile' opcode as an indirect syntax 
 for making the 'compile' method call.

Maybe, but I can't see that this is worthy of a special opcode
(and presumably a vtable slot?).  There's just not a lot of
difference between:

$P0 = compile mycompiler, code      # compile opcode
$P0 = mycompiler.'compile'(code)    # Parrot convention

Another advantage of using (true) method calls is that
it's easy to pass options and additional arguments to the
compiler:

$P0 = mycompiler.'compile'(code, 'target'=>'parse')

Pm


Re: [perl #40968] [BUG] :multi doesn't seem to work right

2006-11-23 Thread Patrick R. Michaud
On Wed, Nov 22, 2006 at 11:20:58PM +0100, Leopold Toetsch wrote:
 Am Mittwoch, 22. November 2006 21:03 schrieb Leopold Toetsch:
  Am Mittwoch, 22. November 2006 18:03 schrieb Patrick R.Michaud:
   Is this a bug (I think it is), or does the underscore in
  
   :multi mean something other than any argument?
 
  The meaning is 'any PMC' [1], and it of course can't be a bug as there are
  no specs ;)
 
 Implementing '_' as 'any type' wouldn't be that hard. 

I'll take the position that implementing '_' as 'any type' is
the semantic we really want here.  We already have a way to
say 'any PMC':

.sub foo :multi(pmc)

Here's a more detailed use case of why the current semantics
aren't useful.  For subroutine calls, PAST-pm tries to pass
constants directly to subroutines (when it can) rather than 
creating temporary PMCs and passing those.  For example, a HLL 
expression such as

3 .op. 4

gets turned into PIR like

$P0 = "infix:.op."(3, 4)

instead of

$P0 = new .Integer
assign $P0, 3
$P1 = new .Integer
assign $P1, 4
$P2 = "infix:.op."($P0, $P1)

The present case I've run into is the Perl 6 smart match
operator (infix:~~), where the operation to be performed depends
on the types of the arguments.  For example, a pattern match
is defined as:

.sub "infix:~~"  :multi(_, Sub)

For an expression such as $x ~~ /abc/, this works okay:

find_lex $P0, '$x' # get $x
find_name $P1, 'abc_sub'   # get sub for /abc/
$P2 = "infix:~~"($P0, $P1)  # call infix:~~

But for something like 'abc' ~~ /abc/, we end up with

find_name $P1, 'abc_sub'   # get sub for /abc/
$P2 = "infix:~~"('abc', $P1)  # call infix:~~

and it won't find the :multi(_, Sub) given above.  So far I've
only found two workarounds -- one is to convert all arguments
to PMCs before calling any sub (this is what PAST-pm is
doing now), the other is to create additional 
:multi(string|int|num, ...) subs that redispatch to the '_' 
version:

.sub "infix:~~" :multi(int, Sub)
.param pmc x
.param pmc y
.return "infix:~~"(x, y)
.end

.sub "infix:~~" :multi(string, Sub)
.param pmc x
.param pmc y
.return "infix:~~"(x, y)
.end

# etc.

The first approach basically means we build a PMC for
every constant value, the second means that we end up
with three additional subs for every underscore in :multi.
And from a PIR-programmer perspective I don't see a lot
of downside to having '_' mean 'any value', but I can't
speak to the difficulty of implementing this in imcc.
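
The difference between the two readings of '_' can be shown with a toy matcher. The Python below is not Parrot's multi-dispatch code: register types are modelled as the strings 'int', 'num' and 'string', PMC types as class names, and inheritance between PMC classes is ignored to keep the sketch short.

```python
# Toy dispatcher, not Parrot's MMD.
NATIVE = {'int', 'num', 'string'}

def matches(sig, arg_types, underscore_any_type):
    """True if every argument type is accepted by the signature."""
    if len(sig) != len(arg_types):
        return False
    for want, got in zip(sig, arg_types):
        if want == '_':
            if got in NATIVE and not underscore_any_type:
                return False      # current rule: '_' means 'any PMC'
        elif want != got:
            return False
    return True

sig = ('_', 'Sub')                # models :multi(_, Sub)
```

Under the current rule a call with a string constant in the first position finds no candidate, while under the 'any type' rule the same signature matches without any extra redispatching subs.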

Pm


Re: [perl #40646] [TODO] PGE - add tests for alpha+[_], alpha-[Jj], etc.

2006-11-24 Thread Patrick R. Michaud
On Thu, Nov 23, 2006 at 03:09:11PM -0800, Nuno Carvalho via RT wrote:
 I've tried to add some tests to the rx_subrule with some extra
 sensitive cases, but I'm failing two tests that I think should
 pass. I have attached a patch that adds new tests. The ones requiring
 attention are the ones that fail. 

Excellent, thanks!

The two tests you contributed that were failing likely weren't testing
what you thought they were testing.  The original tests were:

Pattern           Target   Match (y/n)
<-[ab]+[cd]>+     caad     n
<+alpha-[Jj]>+    aJc      n
Since the patterns are unanchored, the first test simply matches
any sequence of 'c' and 'd' characters, while the second matches
any sequence of alphabetic characters other than 'J' or 'j'.

So, being unconstrained, the first pattern matches the 'caad'
target by matching the initial 'c', and the second pattern
likewise matches the initial 'a'.

I've applied the patch (thanks!) and changed these two failing
tests to be anchored patterns, as in:

Pattern            Target   Match (y/n)
^<-[ab]+[cd]>+$    caad     n
^<+alpha-[Jj]>+$   aJc      n

They now work as expected.
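
The same anchoring effect is easy to reproduce with any regex engine. The Python sketch below uses rough equivalents of the two patterns; the translations into Python character classes are approximations chosen for this example (ASCII letters only), not PGE output.

```python
import re

# Approximate Python stand-ins for the two character classes:
#   first pattern:  any character other than 'a' or 'b'
#   second pattern: any ASCII letter other than 'J' or 'j'
tests = [
    (r'[^ab]+',          r'^[^ab]+$',          'caad'),
    (r'[A-IK-Za-ik-z]+', r'^[A-IK-Za-ik-z]+$', 'aJc'),
]

results = []
for loose, anchored, target in tests:
    m = re.search(loose, target)                      # unanchored search
    results.append((m.group(0), re.search(anchored, target) is not None))
```

Each unanchored pattern succeeds by matching just the first character of its target, and each anchored version fails, which is why only the anchored tests can expect 'n'.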

Thanks again!

Pm


Re: Initial feedback on PAST-pm, or Partridge

2006-11-26 Thread Patrick R. Michaud
On Sun, Nov 26, 2006 at 08:30:32PM -0800, Allison Randal wrote:
 I had to poke into the guts of HLLCompiler, the new PAST, and the new 
 POST a fair bit in the process of getting Punie to work with them, so my 
 comments here are a mixture of user experience and implementation 
 details. I've grouped my comments into general categories.

Excellent.  Just as a general overall response -- PAST-pm is by
no means finished, so many of the items that seem to be missing
are simply cases of "I haven't gotten to them yet" so they aren't
implemented yet.

 Available node types:
 
 - There's no PAST::Stmt node type? I only see PAST::Stmts and PAST::Op. 
 But statements are composed of multiple ops.  So, everything is an op?

At present there's no PAST::Stmt node type, but one can be easily
added.  I thought about putting one in based on Punie's use of
PAST::Stmt, but I hadn't quite figured out exactly _why_ it's 
important so I thought I'd leave it out until I actually needed it
somewhere.  In many ways ops are already composed of multiple ops,
so a statement can be considered just another op.  (But I do see
why someone would want a PAST::Stmt abstraction -- on the other
hand, I didn't see how it changed the resulting POST/PIR output.)

 - There's no PAST::Label node type? How do you represent labels in the 
 HLL source?

I just haven't gotten to this part yet.

 - Is there no way to indicate what type of variable a PAST::Var is? 
 Scalar/Array/Hash? (high-level types, not low-level types)

Sure, that's what 'vtype' is -- it indicates the type of value
that the variable ought to hold.

My plan has been to follow the Perl6 concept of implementation types
and value types within PAST.  Thus far I've only put in the support
for the value types, as the vtype attribute (and vtype can be any
high-level type the language happens to support).  I'm expecting
to add an itype attribute at some point when we're a bit farther
along; I'm still working out the details.

 ---
 Meaningful naming: (Be kind to your compiler writers.)

I totally agree, and I'm not yet wedded to any particular naming
scheme.

 - In the PAST nodes, I grok 'name' as the operator/function name of a 
 PAST::Op and as the HLL variable name of a PAST::Var, but making it the 
 value of a PAST::Val is going too far. It was 'value' in the old PAST, 
 which makes more sense. You're passing named parameters into 'init', so 
 I can't see a reason not to use a more meaningful name for the attribute.

I don't have a problem with switching it to 'value'; I went with
'name' primarily because every PAST::Node has a name and so it just
made sense to use it there.  But let me make another weak argument
in favor of 'name'.  If a HLL programmer writes

$a = 1.23456789E1;

then the rhs becomes a PAST::Val node.  How should we represent the 
value?  The parse-to-past translation could evaluate the contents of 
1.23456789E1 and store the result in 'value' as (.Float) 12.3456789, 
but unfortunately when we convert that .Float back into a string for 
use as PIR code it comes out as 12.3457 -- i.e., the code looks
like:

$P0 = new .Float
$P0 = 12.3457
set_global '$a', $P0

I decided that in a number of cases like this, what we really want
to retain in PAST::Val is a precise string representation of the
value that goes in the resulting output, and not a native
representation that may lose precision in translation through
POST/PIR.  So, what we're really storing is the value's name
and not its value.  (I did say it was a weak argument.)
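
The precision loss is easy to reproduce outside Parrot.  Here is a
minimal Python sketch (not PIR, and not taken from PAST-pm), assuming
the float-to-string step keeps six significant digits, which is what
the 12.3457 above suggests.  The variable names are illustrative only.

```python
# The literal exactly as the HLL programmer wrote it.
literal = "1.23456789E1"

# Evaluate it to a native float, then stringify with six significant
# digits (an assumption about how the .Float was being stringified).
evaluated = float(literal)
lossy = "%g" % evaluated

# The stringified float has lost digits; the source text has not.
print(lossy)
print(literal)
```

Carrying the source text through to code generation sidesteps the
problem, at the cost of the node holding a string rather than a number.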

Anyway, we can switch to 'value' if that's ultimately better;
I was just thinking that 'name' might be equally appropriate.

 - In PAST nodes, the attribute 'ctype' isn't actually storing a C 
 language type. Better name?

It really stands for "constant type", and is one of 'i', 'n', or
's' depending on whether it can be treated as an int, num, or
string when being handled as a constant in PIR.

 - The attribute 'vtype' is both variable type in POST::Var and value 
 type in POST::Val. Handy generalization, but it's not clear from the 
 name that 'vtype' is either of those things.

I think you meant PAST::Var/PAST::Val here, as there isn't a POST::Var
or POST::Val.  But 'vtype' really stands for value type in both
cases -- it's the type of value returned by either a PAST::Var
or PAST::Val node.

 - The values for both 'ctype' and 'vtype' are obscure. Better to 
 establish a general system for representing types, than to include raw 
 Parrot types or 1-letter codes in the AST.

Ultimately I expect that the types that appear in 'vtype' will
be the types defined by the HLL itself.  For example, in perl6
one would see 'vtype'=>'Str' to indicate a Perl 6 string constant.
Unfortunately it's been difficult to illustrate this in real code 
because of the HLL classname conflicts that I've been reporting 
in other contexts.

I agree the values and name for 'ctype' are a bit obscure, and
will gladly accept any suggestions for improving it.  The 'ctype' 

Re: Initial feedback on PAST-pm, or Partridge

2006-11-26 Thread Patrick R. Michaud
On Sun, Nov 26, 2006 at 08:30:32PM -0800, Allison Randal wrote:
 Overall, the POST implementation is usable and I really like the new HLL 
 compiler module. I've got Punie working with the new toolchain to the 
 point that it's generating valid PIR code for many low-level constructs, 
 but some of the high-level constructs that worked under the previous 
 toolchain I still don't have working. 

Also, out of curiosity, which high-level constructs in punie aren't
working?

Pm


Re: Initial feedback on PAST-pm, or Partridge

2006-11-27 Thread Patrick R. Michaud
On Mon, Nov 27, 2006 at 10:52:13AM -0800, Allison Randal wrote:
 Patrick R. Michaud wrote:
 
 Also, out of curiosity, which high-level constructs in punie aren't
 working?
 
 What I've found so far are:
 
 - The top-level AST structure is off: my temporary hack to replace 
 PAST::Stmt and PAST::Exp with PAST::Stmts is producing extra temporary 
 variables in the PIR output. I need to refactor the top few tiers of 
 transformation rules, and maybe refactor the Punie parser grammar.

I'll gladly add PAST::Stmt and PAST::Exp nodes if that's at all
useful.  Just because they're there doesn't mean a compiler has to
use them.  :-)

 - Comma lists are also handled completely differently.

PAST itself doesn't know anything about comma lists -- it just
thinks of comma as being an operator like any other operator.
In perl6 the infix:<,> operator has 'list' associativity, so that
it ends up with a variable arity.  However, I recognize that some 
languages might need to keep the notion that commas are left-associative
with arity 2, so perhaps we need some form of 'list' pasttype
that would combine the operands together somehow?

 So, it's not a matter of missing features (aside from PAST::Label), it's 
 just a matter of adapting the code to a different way of thinking. I'll 
 work through these in the next few days and let you know what I find as 
 I go.

That'd be great.  I'm working on some refactors of HLLCompiler and
PAST right now; I don't think any of these will break existing code.

Pm


Re: Initial feedback on PAST-pm, or Partridge

2006-11-27 Thread Patrick R. Michaud
On Mon, Nov 27, 2006 at 01:13:52AM -0800, Allison Randal wrote:
 .sub '__onload' :load :init
 # load your modules
 $P1 = new [ 'HLLCompiler' ]
 $P1.'init'('language'=>'punie', 'parse_grammar'=>'Punie::Parser', 
 'ast_grammar'=>'Punie::AST::Grammar')
 .end
 .sub 'main' :main
 .param pmc args
 $P0 = compreg 'punie'
 $P1 = $P0.'command_line'(args)
 .return ($P1)
 .end

 [...]
 
 Standardized infrastructure code good. Make Ogg-itect happy. :)

We definitely want Ogg-itect to remain happy.  :-)

Now implemented in r15882 as shown above, sans the helper 'init' 
method (which I'll add later tonight).  Examples are in 
languages/perl6/ and languages/abc/ .

Time permitting tonight I will also refactor the monolithic
'command_line' method of HLLCompiler into separate shorter methods.

Pm


Re: Initial feedback on PAST-pm, or Partridge

2006-11-27 Thread Patrick R. Michaud
On Mon, Nov 27, 2006 at 05:28:59PM -0800, Allison Randal wrote:
 Patrick R. Michaud wrote:
 
 I'll gladly add PAST::Stmt and PAST::Exp nodes if that's at all
 useful.  Just because they're there doesn't mean a compiler has to
 use them.  :-)
 
 Well, I came to the conclusion that PAST::Exp was useless a while ago. 
 (Its entire point of existence was as a dummy node to be factored out at 
 the PAST-to-POST stage.) I do think PAST::Stmt is useful, but I want to 
 take a stab at refactoring it out first.

Excellent.  Let me know when/if you want PAST::Stmt added in, and any
attributes you want it to have.

 Oh, I should have mentioned that the patch I sent in to remove the dummy 
 'root' rule from the POST::Grammar was part of what was making Punie 
 work (because Punie's top-level node isn't a PAST::Block, it's a 
 PAST::Stmts). I can refactor that out, but in this case it seemed to 
 make more sense to refactor the compiler tool (since the other languages 
 still worked with the change).

POST really needs to have a POST::Sub at the top of the tree,
so the purpose of the 'root' rule in POST::Grammar is (going to be) 
to create a POST::Sub for the tree if the lower transformations
don't happen to return one.  I'll add that code shortly, and then
things should work properly even if the top-level node in PAST
isn't a PAST::Block.

Pm


Re: Initial feedback on PAST-pm, or Partridge

2006-11-27 Thread Patrick R. Michaud
On Mon, Nov 27, 2006 at 09:20:08PM -0800, Allison Randal wrote:
 Patrick R. Michaud wrote:
 
 Now implemented in r15882 as shown above, sans the helper 'init' 
 method (which I'll add later tonight).  Examples are in 
 languages/perl6/ and languages/abc/ .
 
 So, with a thumbs up on that modification, I've attached a patch that 
 does two things: a) keeps strict functionality boundaries so the 
 controller object does the controlling, and the compiler objects for 
 PAST and POST do only compiling; and b) makes it possible to override 
 the grammar used for the PAST-to-POST transformation. ABC passes all its 
 tests, and Perl6 doesn't fail any more tests than it was failing before. 
 (I made it a patch because it's a refactor that's easy to show but 
 convoluted to explain.)
 
 chromatic's suggestion is to replace the series of manual calls in 
 HLLCompiler's 'compile' method with an iterator over an array of 
 compiler tasks. 

I very much agree with chromatic -- indeed, this is mainly why I didn't
go with putting ostgrammar methods into the HLLCompiler object
before.  Having HLLCompiler effectively hardcode a sequence
of parser-astgrammar-ostgrammar feels a bit heavy-handed to me,
almost saying that we really expect you to always have exactly
the sequence source-parse-ast-ost-pir-bytecode, and you're
definitely using TGE for the intermediate steps.

I guess if we expect a lot of compilers to be making language-specific
derivations or replacements of the ast-ost stage then putting the
ost specifications into HLLCompiler makes some sense, but I
totally agree with chromatic that a more generic approach is
needed here.  And what I had been aiming for in terms of an "array
of compiler tasks" was something like an "array of compiler stages",
where each compiler stage is itself a compiler (in the compreg
and HLL compiler sense) that does the transformation to the
next item in the list.  And each compiler stage knows the
details of how it performs its transformation, whether that's using
TGE or some other method.  Putting transformation details like
the ostbuilder and apply steps into HLLCompiler still feels wrong
to me somehow, although I did come around to agreeing with the
idea that the commonly repeated details for source-parse and
parse-ast belong in the default 'compile' method for compiler
objects.

Part of me really wishes that each compiler task would end
up being a standardized 'apply' or 'compile' subroutine
or method of each stage.  In other words, to have compilation
effectively become a sequence like:

.local pmc code
# source to parse tree
$P0 = get_hll_global ['Perl6::Grammar'], 'apply'
code = $P0(code, adverbs :flat :named)

# parse tree to ast
$P0 = get_hll_global ['Perl6::PAST::Grammar'], 'apply'
code = $P0(code, adverbs :flat :named)

# ast to ost
$P0 = get_hll_global ['POST::Grammar'], 'apply'
code = $P0(code, adverbs :flat :named)

# ost to result
$P0 = get_hll_global ['POST::Compiler'], 'apply'
code = $P0(code, adverbs :flat :named)

Here the 'apply' functions in Perl6::PAST::Grammar and
POST::Grammar are simply imported from TGE and do the steps
of creating the builder object and then applying the grammar.
The 'apply' function in Perl6::Grammar would just be a
standardized start rule for the parser grammar (and can
be directly specified as such in the .pg file).

If we could standardize at this level, then a compiler simply
specifies the sequence of things to be applied, and the above
instructions could be implemented with a simple iterator over
the sequence.  This is _really_ what I was attempting to get at 
by having separate compiler objects for PAST, POST, and friends, 
except that instead of calling the standard function 'apply' 
I was using 'compile'.  Part of me thinks that 'apply' and
'compile' are pretty much the same thing, in the sense that 
both refer to using some sort of transformer thing to
change from a source representation into an equivalent target.
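
As a rough illustration of "a simple iterator over the sequence", here
is a Python sketch (not PIR).  The stage functions, the STAGES list and
the early-exit 'target' argument are all hypothetical stand-ins; the
only point is that the driver needs no knowledge of how each stage
performs its transformation.

```python
# Each stage is any callable that takes the current representation
# plus adverbs and returns the next representation.  These bodies are
# placeholders that just tag their input.
def parse(source, **adverbs):
    return ("parse", source)

def to_ast(tree, **adverbs):
    return ("ast", tree)

def to_ost(ast, **adverbs):
    return ("ost", ast)

def to_pir(ost, **adverbs):
    return ("pir", ost)

def compile_with(stages, code, target=None, **adverbs):
    """Run code through each (name, stage) pair in order, stopping
    early once the stage named by target has produced its result."""
    for name, stage in stages:
        code = stage(code, **adverbs)
        if name == target:
            break
    return code

# A compiler is then just a declaration of its sequence of stages.
STAGES = [("parse", parse), ("ast", to_ast), ("ost", to_ost), ("pir", to_pir)]

result = compile_with(STAGES, "print 2;")
partial = compile_with(STAGES, "print 2;", target="ast")
```

A language that wants a different ast-to-ost step would replace one
entry in the list rather than override a method on the driver.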

-

At any rate, even if we go with the approach outlined in the
patch, I have to say that I'm not at all keen on the method
names 'astcompile', 'ostcompile', etc. in the patch. 
When I read 'astcompile' it sounds to me like it's a method 
to compile an ast into something else, when in fact the method 
in the patch is compiling some source into an ast.  (By analogy, 
we speak of a "Perl 6 compiler" and a "PIR compiler" as being 
things that consume Perl 6 and PIR, not things that 
produce Perl 6 or PIR.)

So at the very least I'd prefer to have those methods called
'get_ast' or 'make_ast' or something much less likely to
cause confusion.  Indeed, the reason why I went with simple
'parse' and 'ast' method names in the original is because the 
method name tells me what it is that I'm getting back, much like 
an accessor.

Pm


Re: Initial feedback on PAST-pm, or Partridge

2006-11-27 Thread Patrick R. Michaud
On Mon, Nov 27, 2006 at 10:13:21PM -0800, Allison Randal wrote:
 This fragment of a reply is the random bits that didn't make it into
 other topic-centered replies.

...and some quick responses before turning in for the night...

 Currently Parrot uses '__init' as the method for initializing
 new objects, thus I think 'init' is at least consistent with Parrot.
 
 Where it's inconsistent is in the arguments each takes, so you can't use
 the current 'init' methods as :vtable('init') methods. I'm half-way
 inclined to see that as a limitation in Parrot that needs to be fixed
 rather than a problem with these classes.

Having dealt with this in both PGE and at least two PAST
implementations, I certainly see it as a Parrot limitation.
Ultimately I want to have a method that can accept variable
arguments so that I can initialize a newly created object.
I chose 'init' because it seemed like the natural/obvious
name for such a method, but if there's a better name I'll
gladly switch.  I haven't found the Parrot :vtable('init')
to be all that useful, since there's not a parameterized
version of it beyond passing a single PMC.  And getting
arguments into a single PMC isn't all that fun or useful.

But come to think of it, if we had something like Capture PMCs
available as a standard type (and an easy way to generate
them in PIR), then the existing :vtable('init') would be
quite sufficient.  To steal from Perl 6's \(...) 
capture syntax:

$P0 = new 'Foo::Bar', \(param1, param2, 'abc'=>param3)

.sub 'init' :vtable
.param pmc args
# initialize self based on array/hash components of args pmc
# ...
.end


 I've also thought about doing 'push' as a :vtable entry, and we can
 still easily do that, but there are at least two items in favor of 
 keeping a method-based approach:  
 (2) when we get a high-level transformation language into TGE, it's
 very likely that the operations on nodes will be method-based
 and not opcode-based.
 
 Well, the operations will be in a middle-level-language syntax. Whether
 the MLL uses a methody syntax or a procedural syntax doesn't matter,
 since either can be translated to either syntax in PIR.

My point is simply that it's far easier to go from a MLL
(whatever syntax) to PIR method calls than to generate specific
Parrot opcodes, because method calls have a very regular
syntax that Parrot opcodes don't.

 - One more comment in this department: move PIR generation out of the
 POST node objects. A tree-grammar that outputs PIR code strings isn't a
 final solution, but it's a more maintainable intermediate step than
 mingled syntax tree representation and code generation (remember P6C?).

I never really dealt with P6C.  :-)  Still, I can see about
moving the code generation out of the POST node objects; I may
do it as a lower priority though, since I don't think that
aspect is driving many design or implementation decisions for
us at this point.

 - In PGE grammars, what is the { ... } at the end of every proto 
 declaration supposed to do? 
 [...]
 But in the end, I didn't allow simple semicolon terminators
 simply because it wasn't valid Perl 6 syntax, and in many cases
 I think that having subtle differences isn't ideal as people
 may get confused about what is allowed where.  But I don't have
 a large objection to modifying the PGE::Grammar compiler to
 represent empty declarations with semicolons as well as
 yada-yada-yada blocks.
 
 Excellent.

Excellent as in ...? 
   [ ]  Go ahead and allow semicolons, since you don't have
a large objection.
   [ ]  Your explanation is excellent, stick with the yadas
 to avoid the subtle contrasts to Perl 6.

Pm


Re: [perl #40999] Latest version of parrot doesn't make test

2006-11-28 Thread Patrick R. Michaud
On Mon, Nov 27, 2006 at 02:59:45PM -0800, Bob Wilkinson wrote:
 I have recently installed parrot from svn, but get errors during
 make. I see the same errors on a x86 running unstable Debian, and
 a sparc running Gentoo.
 
 typing perl Configure.pl && make ends with:
 [...]
 gmake -C compilers/past-pm
 gmake[1]: Entering directory
 `/home/bob/src/parrot/compilers/past-pm'
 /usr/bin/perl5.8.8 -MExtUtils::Command -e rm_rf PAST-pm.pbc
 ../../runtime/parrot/library/PAST-pm.pbc
 No root path(s) specified at
 /usr/lib/perl5/5.8.8/ExtUtils/Command.pm line 105
 ../../parrot ../../compilers/tge/tgc.pir
 --output=POST/Grammar_gen.pir POST/Grammar.tg
 PackFile_unpack: Bytecode not valid for this interpreter:
 fingerprint mismatch
 Parrot VM: Can't unpack packfile
 ...

The "Bytecode not valid for this interpreter" message means that parrot
is finding a .pbc file somewhere that is from an older
(and incompatible) version of Parrot.  So, I'd say that either:

1.  you need to run "make realclean" before running Configure.pl
and starting the build, to get rid of any old .pbc files, or

2.  there's an earlier version of Parrot installed somewhere
(e.g., from a previous make install), and the build process 
is inadvertently picking up the files from that previous
installation.

FWIW, most of us who are working with svn head recommend against
ever performing a 'make install', just to avoid Parrot getting
confused by multiple incompatible versions.

Pm


Re: [perl #41000] Can't compile simple parrot example with latest stable parrot

2006-11-28 Thread Patrick R. Michaud
On Mon, Nov 27, 2006 at 03:15:17PM -0800, Bob Wilkinson wrote:
 # New Ticket Created by  Bob Wilkinson 
 # Please include the string:  [perl #41000]
 # in the subject line of all future correspondence about this issue. 
 # URL: http://rt.perl.org/rt3/Ticket/Display.html?id=41000 
 
 My second 10 line parrot program didn't compile, so I looked at the
 examples at http://www.parrotcode.org/examples/pasm.html, and pasted
 the following into a file.
 
 [EMAIL PROTECTED]:~/src/parrotcode$ cat ex1.pasm
 new P0, .PerlInt
 set P0, 123
 new P1, .PerlInt
 set P1, 321
 add P1, P1, P0
 print P1
 print "\n"
 end
 [EMAIL PROTECTED]:~/src/parrotcode$ ../parrot-0.4.7/parrot ex1.pasm
 error:imcc:syntax error, unexpected DOT
 in file 'ex1.pasm' line 1
 
 I am not sure how to proceed.

I think the example is out of date.  IIRC, the .PerlInt type
was removed some time ago, and so the code above should now
use .Integer instead of .PerlInt.

Pm



Re: [perl #41014] [PATCH] Autobox Native Types for MultiSubs

2006-11-29 Thread Patrick R. Michaud
On Wed, Nov 29, 2006 at 08:49:27PM +0100, Leopold Toetsch wrote:
 Am Mittwoch, 29. November 2006 05:50 schrieb Matt Diephouse:
  It also means that string, int, and float no longer work as MMD  
  types -- you can't distinguish between native types and PMCs. I think  
  this is the right way to go now that we have autoboxing; I don't see  
  any reason to differentiate.
 
 I don't think this is the best strategy. It seriously prevents 
 all native type optimizations. While 'Integer' should be 
 MMD-distancewise close to 'int', it should not be the same.

Just to repeat my comments from #parrot, I agree with Leo that
treating int as always (and only) being identical to the 
autobox type feels very wrong somehow.

However, if we're short-term restricted to choosing between the 
existing implementation (where '_' doesn't match native types) 
and this patch (where we can't differentiate native types), 
I definitely want the patch.  It's the lesser of the two evils.

IWBNI the native types were considered in both their native and 
autoboxed aspects for purposes of selecting candidate MMD subs, 
with a match on native types resulting in a shorter MMD distance 
than those involving autoboxing.
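
A toy model of that selection rule, written in Python rather than PIR.
The AUTOBOX table, the distance values 0/1/2 and the candidate names
are assumptions made up for this sketch; it ignores subclassing and is
not how Parrot's MMD is actually implemented.

```python
# Hypothetical model: each native type autoboxes to one PMC type.
AUTOBOX = {"int": "Integer", "num": "Float", "string": "String"}

def distance(param, arg):
    """Distance for one parameter/argument pair, or None on mismatch."""
    if param == arg:
        return 0            # exact match, native or PMC
    if AUTOBOX.get(arg) == param:
        return 1            # native argument matched via autoboxing
    if param == "_":
        return 2            # 'any type', including native types
    return None

def select(candidates, args):
    """Pick the candidate name with the smallest total distance."""
    best = None
    for name, sig in candidates:
        if len(sig) != len(args):
            continue
        dists = [distance(p, a) for p, a in zip(sig, args)]
        if None in dists:
            continue
        total = sum(dists)
        if best is None or total < best[0]:
            best = (total, name)
    return best[1] if best else None

candidates = [
    ("foo_native", ("int",)),
    ("foo_boxed", ("Integer",)),
    ("foo_any", ("_",)),
]
```

With all three candidates present, a native int argument selects
foo_native; remove that candidate and the same call falls back to
foo_boxed, then to foo_any.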

Pm


Re: Re: [perl #41014] [PATCH] Autobox Native Types for MultiSubs

2006-11-29 Thread Patrick R. Michaud
On Wed, Nov 29, 2006 at 04:43:59PM -0500, Matt Diephouse wrote:
 Leopold Toetsch [EMAIL PROTECTED] wrote:
 Am Mittwoch, 29. November 2006 05:50 schrieb Matt Diephouse:
  It also means that string, int, and float no longer work as MMD
  types -- you can't distinguish between native types and PMCs. I think
  this is the right way to go now that we have autoboxing; I don't see
  any reason to differentiate.
 
 I don't think this is the best strategy. It seriously prevents all native 
 type optimizations. While 'Integer' should be MMD-distancewise 
 close to 'int', it should not be the same.
 
 What native type optimizations? Using S, I, and N registers? If using
 an I register is faster, wouldn't you want to unbox an Integer PMC and
 use an I register anyway?

Sure, but Parrot is unboxing for us already, without us having
to do anything special:

.sub 'foo' :multi(_, String)
.param int abc
.param pmc xyz
...
.end


foo(1, 'xyz') # boxes 'xyz', leaves 1 alone
foo($P0, $P1) # unboxes $P0 to an int, leaves $P1 alone

If I understand things correctly, specifying :multi(_, String)
doesn't actually do any form of coercion, it simply says that
this sub is called only if the second argument is compatible
with the String type.

This is true even if we have a :multi with a native type:

.sub 'bar' :multi(int)
.param pmc abc
##  abc is an autoboxed int for us here, even though this 
##  sub can only be reached if the (single) argument
##  is an integer register or integer constant
...
.end

.sub 'baz' :multi(pmc)
.param string def
##  Baz can only be called with a pmc argument, and that
##  pmc (whatever it is) is auto-unboxed into a string register.
...
.end


I'm not at all an expert on the topic of multis, but it sounds to 
me as though :multi is being somehow conflated with when to 
auto[un]?box.  I think :multi should limit itself to being a way 
of selecting which sub(s) to call, while autoboxing should be
based solely on the arguments of the caller and parameters of the
called sub (once that sub has been chosen by :multi).

Now then, for purposes of selecting the sub(s) to call, :multi 
can take into account the fact that native arguments can autobox,
and call a sub that specifies the autoboxed type in :multi
(but preferring a sub with the native type in :multi, if one
exists).

And I don't think :multi should go the other way -- i.e., to
assume that a boxed type will match a native type in :multi.
With what I just described there's already a way to get that
semantic, namely:

.sub 'foo' :multi(Integer)   
.param int xyz
...
.end

With this, a foo(1) or foo($I0) call will still find the sub,
but won't do any boxing or unboxing.  If foo is called with an
Integer pmc argument, :multi will find the sub, and the Parrot
calling conventions will end up autounboxing the argument into 
an 'int', which is what we wanted.  And if foo(...) is called 
with something that isn't compatible with Integer (e.g., a subclass), 
then :multi won't select this sub at all.

But as I said, I'm no expert -- this is just my best stab
at how things ought to work,  at least in the short term 
until a more sophisticated Parrot object model is in place.
And as I also indicated, I don't have nearly as strong feelings
about this as I do about the fact that we need a way to
specify 'any type' (including native types) in the :multi
pragma.

Thanks,

Pm


Re: Initial feedback on PAST-pm, or Partridge

2006-12-07 Thread Patrick R. Michaud
On Wed, Dec 06, 2006 at 10:33:45PM -0800, Allison Randal wrote:
 - In PGE grammars, what is the { ... } at the end of every
 proto declaration supposed to do?
 [...]
 But in the end, I didn't allow simple semicolon terminators 
 simply because it wasn't valid Perl 6 syntax, and in many cases I
 think that having subtle differences isn't ideal as people may
 get confused about what is allowed where.  But I don't have a
 large objection to modifying the PGE::Grammar compiler to 
 represent empty declarations with semicolons as well as 
 yada-yada-yada blocks.
 Excellent.
 
 Excellent as in ...? 
[ ]  Go ahead and allow semicolons, since you don't have
 a large objection.
[ ]  Your explanation is excellent, stick with the yadas
  to avoid the subtle contrasts to Perl 6.
 
 I prefer option (A), allowing semicolons. The tricky thing is that we're
 adopting syntax from one use case into another use case. The yadas make
 perfect sense in the context of a Perl 6 program (where the yada means
 that the code body will later be filled in), but they make no sense as
 part of a Parrot parser (where the yada can't be filled in, and is just
 an artifact).

IIUC, PGE's use of yada is actually the same use case as Perl 6.  
The yadas in Perl 6 can be stubs to be filled in later, but S03
and S06 indicate that yadas are also used as the body in 
function prototypes, i.e., where the function is actually to be
defined somewhere else.  To me that feels exactly like what we have
here -- the grammar file is prototyping operator functions 
that are defined somewhere else.  (And, for several of the existing 
compilers, they really *are* function prototypes, in that the function 
body comes from a PIR function.)

 Not an immediate priority, though. And, maybe Perl 6 will change and 
 solve the problem for us before we get there. ;)

Sounds good to me.  It's an easy switch to allow the semicolons
when/if we decide to do that.

Pm


p6 variable binding in Parrot

2006-12-08 Thread Patrick R. Michaud
Does anyone have any suggestions about what sort of PIR
code and/or PMCs we need to be able to make the following 
Perl 6 code work...?

my @a;
@a[4] = 'Hello';

my $b := @a[4];
say $b;    # says Hello

@a[4] = [1, 2];
say $b;    # says 1 2


Here are the pieces I can fill in:

 my @a; 
new $P0, .Perl6List
.lex '@a', $P0

 @a[4] = 'Hello';
find_lex $P1, '@a'
# (in general case we autovivify @a here if needed)
set $P1[4], 'Hello'

 @a[4] = [1, 2]
$P2 = 'list'(1, 2) # create a list
find_lex $P3, '@a'
# (in general case we autovivify @a here if needed)
set $P3[4], $P2

But what's a good approach for handling $b, which is bound
to @a[4]?  Do we need a reference type (similar to .Ref) that 
can keep track of a container+key pair?

Pm


Re: p6 variable binding in Parrot

2006-12-08 Thread Patrick R. Michaud
On Fri, Dec 08, 2006 at 05:05:00PM -0500, Matt Diephouse wrote:
 Patrick R. Michaud [EMAIL PROTECTED] wrote:
 Does anyone have any suggestions about what sort of PIR
 code and/or PMCs we need to be able to make the following
 Perl 6 code work...?
 
 Sure. I think Tcl handles this pretty nicely at the moment (although
 Leo disagrees - he likes the Ref PMC route). The main idea is that
 aliasing/binding enters the same PMC under a different name and that
 assignment morphs the PMC.

Does this basically assume that every PMC knows how to morph into
any other type?  (In the example I gave the PMC would need to be able
to morph from an integer to a list, but in the general case it could
be converting to any type.)

 With this scheme, you'd have to use assign in this last case instead
 of set (with a morph to really make it safe) because you need to reuse
 the same PMC:
 
  @a[4] = [1, 2]
 $P2 = 'list'(1, 2)
 find_lex $P3, '@a'
 $P3 = $P3[4]
 morph $P3, .Undef
 assign $P3, $P2
 
 If you're only assigning your own PMCs, you can drop the morph (which
 isn't technically safe anyway).

I don't think I can assume I'm only assigning my own PMCs.  (This is
being handled in PAST-pm, and so it probably needs to work with PMCs
in general.)  And I know that morphing isn't safe, which is why I've 
been avoiding it.

Hmm... perhaps what we really need is an opcode or sequence of
opcodes that convert a PMC into a value-based copy (clone?) of 
another PMC, but keeping the first PMC as the same PMC so that 
other references to it will see the new value and type.
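
The semantics being asked for can be modelled in a few lines of Python
(not PIR).  The Cell class and its assign method are hypothetical
stand-ins for a PMC and an in-place assignment; nothing here is a real
Parrot API.

```python
class Cell:
    """Stand-in for a PMC: identity stays fixed, type and value may change."""
    def __init__(self, value=None):
        self.value = value

    def assign(self, other):
        # Copy the value (a shallow copy for lists) into this same
        # object, so every alias sees the new value and type.
        self.value = list(other) if isinstance(other, list) else other

a = [Cell() for _ in range(5)]
a[4].assign("Hello")
b = a[4]                  # binding: b is the same object, not a copy
first = b.value

a[4].assign([1, 2])       # in-place update, visible through b
second = b.value

# By contrast, storing a new object in the slot breaks the alias.
a[4] = Cell("replaced")
third = b.value
```

The last step is the analogue of storing a fresh PMC into the array
slot: the slot changes, but $b keeps pointing at the old one.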

Pm


Re: Re: p6 variable binding in Parrot

2006-12-09 Thread Patrick R. Michaud
On Sat, Dec 09, 2006 at 12:59:35AM -0500, Matt Diephouse wrote:
 Patrick R. Michaud [EMAIL PROTECTED] wrote:
 On Fri, Dec 08, 2006 at 05:05:00PM -0500, Matt Diephouse wrote:
  Sure. I think Tcl handles this pretty nicely at the moment (although
  Leo disagrees - he likes the Ref PMC route). The main idea is that
  aliasing/binding enters the same PMC under a different name and that
  assignment morphs the PMC.
 
 Does this basically assume that every PMC knows how to morph into
 any other type?  (In the example I gave the PMC would need to be able
 to morph from an integer to a list, but in the general case it could
 be converting to any type.)
 
 No, it assumes that every PMC knows how to morph into an Undef. Once
 you have an Undef, you can safely use assign. [...]

Ah, I get it.  Yes, this sounds good to me.  In fact, it's
pretty much what I asked for -- a sequence of opcodes that convert
a PMC into a value-based copy of another PMC.

Many thanks, I'll go with that for now.

Pm


Re: Past-pm printing the return value of the main routine

2006-12-12 Thread Patrick R. Michaud
On Tue, Dec 12, 2006 at 09:47:16AM -0800, Allison Randal wrote:
 In Punie or Perl 6, when I execute a simple statement:
 
   print 2;
 
 It prints 21. This is because a) the return value of a successful 
 print is 1, b) the main routine is returning the value of the last 
 statement (note this is correct for Perl, but isn't correct for all 
 languages), and c) HLLCompiler is printing out the return value of the 
 eval'd code here:
 
 376   save_output_1:
 377 print ofh, result
 378 close ofh
 
 Commenting out line 377 gives the correct behavior of just printing 2. 
 My question is, why is HLL compiler printing out the return value of the 
 main routine?

What revision number are you working with?  I think this was fixed in
a later revision of HLLCompiler.

(It outputs the return value from the compilation phase when 
--target=pir is specified.  The previous version was a bit
overeager about outputting the result.)

Pm


Re: Past-pm basic string types

2006-12-12 Thread Patrick R. Michaud
On Tue, Dec 12, 2006 at 09:43:39AM -0800, Allison Randal wrote:
 Patrick, what's the best way to pass-through string types from a 
 compiler to Parrot without doing full string processing? To pass the 
 current tests, Punie only needs Parrot's single- and double-quoted 
 strings, but Past-pm is escaping them. 

PAST-pm expects it to be pretty rare that a HLL's string literal
format will exactly match what works as a string literal in PIR, so 
PAST::Val nodes expect the HLL to have already decoded the string
constant according to whatever rules the HLL uses.  Then PAST-pm
can re-encode the string into a form that is guaranteed to work
in Parrot (even handling things such as placing unicode: in
front of PIR string literals if the string has characters that
fall outside of the ASCII range.)
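
A sketch of what such a re-encoding step might look like, in Python
rather than PIR.  The function name, the particular set of escapes and
the \x{...} form are assumptions for illustration and are not copied
from PAST-pm; the input is the already-decoded string value.

```python
def encode_for_pir(decoded):
    """Return a double-quoted literal for an already-decoded string.

    The escaping rules and the 'unicode:' prefix handling here are
    assumptions for illustration, not PAST-pm's actual code.
    """
    escapes = {"\\": "\\\\", '"': '\\"', "\n": "\\n", "\t": "\\t"}
    out = []
    non_ascii = False
    for ch in decoded:
        if ch in escapes:
            out.append(escapes[ch])
        elif ord(ch) > 127:
            # Outside ASCII: emit a hex escape and remember to add
            # the charset prefix.
            non_ascii = True
            out.append("\\x{%x}" % ord(ch))
        else:
            out.append(ch)
    prefix = "unicode:" if non_ascii else ""
    return prefix + '"' + "".join(out) + '"'

plain = encode_for_pir('a"b')
newline = encode_for_pir("x\n")
accented = encode_for_pir("caf\u00e9")
```

Because the input is a decoded value, the same routine serves every
HLL regardless of that language's own quoting rules.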

I can modify PAST-pm to provide a "send exactly this string to PIR" 
option for PAST::Val.  More generally useful would seem to be to 
provide a generic function or opcode that can decode single/double 
quoted strings according to PIR's encoding rules, and then use
that to get the string into PAST::Val.

PGE::Text could provide such a feature as part of its library-- i.e., 
subrules like:

 PGE::Text::pir_quoted_string:  
' PGE::Text::pir_quoted_string: '  '

could parse a valid pir string literal and provide the
decoded value as the result object.

 (I will add full string processing to Punie later, but since other 
 compilers will also need basic Parrot string types, it makes sense to 
 figure it out now.)

I think that the various languages have enough differences in
string literal handling that each compiler will end up writing 
its own string literal decoder.  (Or we need a semi-powerful library
to handle the many differences.)  In the meantime having an 
easy-to-access subrule for "just pretend it's a quoted literal 
according to PIR conventions" might be a good way for someone
wanting to bootstrap a compiler, without placing Parrot-specific
encodings into PAST-pm.

Lastly, I'm still working out the handling of HLL to Parrot
type mappings -- it's also possible that some of this will
fall out as a result of that.

Pm


Re: Past-pm basic string types

2006-12-13 Thread Patrick R. Michaud
On Tue, Dec 12, 2006 at 01:57:20PM -0800, Allison Randal wrote:
 Patrick R. Michaud wrote:
 I can modify PAST-pm to provide a "send exactly this string to PIR" 
 option for PAST::Val.  
 
 Yes, good idea for the simple case.

After sleeping on it overnight, I realized that PAST-pm already
has this feature.

Currently PAST-pm checks the PAST::Val node's ctype
attribute to decide whether to encode the literal value 
in Parrot's form -- if the node doesn't have a ctype that indicates
a string constant, then PAST-pm just uses the literal value
directly in the output.

So, just don't set ctype, and whatever the node has as
its name attribute will go directly into the PIR output.

Here's an example:

$ cat x.pir
.sub main :main
    load_bytecode 'PAST-pm.pbc'
    .local pmc valnode, blocknode, pir

    ##  $S0 is the string we want to appear in the output
    $S0 = '"\n"'
    valnode = new 'PAST::Val'
    valnode.'init'('vtype'=>'.String', 'name'=>$S0)
    blocknode = valnode.'new'('PAST::Block', valnode, 'name'=>'anon')

    ##  compile the tree to PIR and print the result
    $P99 = compreg 'PAST'
    pir = $P99.'compile'(blocknode, 'target'=>'pir')
    print pir
.end

$ ./parrot x.pir

.sub anon
new $P10, .String
assign $P10, "\n"
.return ($P10)
.end

Eventually the handling of ctype is going to change -- first,
the name will change to be more descriptive (but I'll leave a 'ctype'
accessor in place to give compilers time to switch); second, any
ctype specifications will be held in a HLL class mapping table 
instead of in each PAST::Val node.

There is a good chance that PAST-pm will treat PAST::Val nodes
of type .String as needing their values to be encoded for Parrot,
but to protect against this punie (and other compilers) can
use .Undef:

$S0 = '"\n"'
valnode = new 'PAST::Val'
valnode.'init'('vtype'=>'.Undef', 'name'=>$S0)

Since the node isn't a string type, PAST-pm will use the name
value ($S0) directly in the output PIR without performing any
encoding on the literal value, and the generated PIR from the node
would look like

new $P10, .Undef
assign $P10, "\n"

And this does exactly what you want.  :-)

Pm


Re: [perl #39997] [PATCH] PGE P5 Test Cleanup

2006-12-16 Thread Patrick R. Michaud
On Sat, Dec 16, 2006 at 11:37:48AM -0800, Paul Cochrane via RT wrote:
 Did you get around to opening the tickets you mentioned here?  If so, I 
 think we can close this ticket.  If not, do you want to sketch out the 
 ideas for the tickets you want opened?  I can then go through the 
 donkey work of opening them for you if you want.

My $0.02-

AFAICT, all of the 'todo'/'skip' markers have been factored out of
the p5 and p6 regex test files themselves, so I think it's safe to
close this ticket.  Any further improvements to be made to the
tests probably deserve their own tickets, without holding this ticket
open for them.

Pm



 On Fri Jul 28 12:20:30 2006, particle wrote:
  i forgot to mention something here... patrick has included the todo /
  skip markers in the perl6 regex test file, and i don't think they
  belong there, either. these should be factored out into the harness,
  as well. i have other plans for those tests, like splitting them up
  into separate files, but that's another story.
  
  i think the best way for me to document all this is to write up some
  tickets, so i'll be doing that shortly.


Re: [perl #40361] [PATCH] #40278 [CAGE] perl coding standards coda. (cont.)

2006-12-19 Thread Patrick R. Michaud
On Tue, Dec 19, 2006 at 05:20:06PM +0800, Lee Duhem wrote:
 Allison wrote:
 My vote is on removing all emacs and vim settings from our source code
 files.
 and so you can get really bad code appearance.

I'm curious, why is that?  We're already discouraging (if not
disallowing) hard tabs in the Parrot source tree -- are there
other items that make the code appear really bad?

I totally agree with Allison -- let's get rid of the editor-specific
settings in the source code files.

Pm


Re: More Undef vs. Null...

2006-12-20 Thread Patrick R. Michaud
On Wed, Dec 20, 2006 at 10:59:34PM +, Jonathan Worthington wrote:
 Leopold Toetsch wrote:
 Am Mittwoch, 20. Dezember 2006 05:59 schrieb Will Coleda:
   
 Are Hash and Array supposed to have different results on unset keys?
 
 
 The .Undefs returned by Arrays are IMHO an unfortunate leftover of the 
 early PerlArrays. We've managed to change the return result of hashes to 
 .Null, so this should be possible with arrays too.
   
 Changing the PMCs ain't hard. It's changing all the code that is getting 
 an undef back and testing for it that's the issue.

And once again, PGE, PAST-pm, perl6, and others may have a lot of
places that expect to get Undef back instead of PMCNULL.  We
can certainly see about updating them to check for PMCNULL instead
of undef, but they certainly exist.

I asked about this particular discrepancy a couple of months ago
(on IRC, I think), and IIRC the conclusion I was given then was 
something to the effect of "since ResizablePMCArray is auto-extendable 
it should return undefs for non-existent values."  Leo is correct 
that this is probably coming from an original perl5-ish
interpretation of how things work.

I'm not strongly in favor or opposed to a particular design --
just pointing out that changing RPA to return NULL will probably
impact a fair bit of PGE, PAST-pm, perl6, abc, etc.  But if there's
going to be a change, it'll be easier to fix it now rather than
later.
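
For anyone wondering why the change would ripple so far, here is a toy Python model of the situation (not Parrot code). `Undef` stands in for an Undef PMC, Python's `None` stands in for PMCNULL, and `SparseArray` is a hypothetical auto-extending array whose "missing slot" result is configurable. All the names are made up for illustration.

```python
class Undef:
    """Stand-in for an Undef PMC: a real object that reports itself undefined."""

    def defined(self):
        return False


class SparseArray:
    """Auto-extending array whose result for a missing slot is configurable."""

    def __init__(self, missing):
        self._slots = {}
        self._missing = missing  # zero-argument callable producing the result

    def __setitem__(self, index, value):
        self._slots[index] = value

    def __getitem__(self, index):
        if index in self._slots:
            return self._slots[index]
        return self._missing()


def slot_is_empty_undef_style(array, index):
    """How existing callers test: they expect an Undef object back."""
    return isinstance(array[index], Undef)


def slot_is_empty_null_style(array, index):
    """How callers would have to test after the change."""
    return array[index] is None


undef_array = SparseArray(missing=Undef)         # current array behaviour
null_array = SparseArray(missing=lambda: None)   # proposed behaviour
```

With `undef_array` the old-style check works; with `null_array` it quietly returns False for an empty slot, and that quiet failure is the kind of thing that would need hunting down in each caller.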

Pm


Re: Punie ported to PAST-pm

2007-01-02 Thread Patrick R. Michaud
On Tue, Jan 02, 2007 at 12:01:54AM -0800, Allison Randal wrote:
 - I ran into one bit of strangeness with the assignment operator on 
 simple strings (it was generating an 'assign' opcode with 3 arguments 
 for the source code $x = 'test'). I solved it by setting 'pasttype' to 
 'assign', but now the generated code is unnecessarily calling the 
 'clone' opcode (e.g. clone $P10, $P10). I'll come back and look at it 
 later.

PAST-pm's handling of assignment is about to be refactored a bit
in order to support Perl 6's binding operator (:=).  Yes, the
generated code sometimes calls a clone when it doesn't need to --
this is going to be handled by having PAST-pm keep track of which
PMCs are temporaries and thus available for re-use instead of
requiring cloning.

 - The old Punie was loading a library of builtin functions in the 'main' 
 routine of every generated Perl 1 script. I haven't figured out how to 
 do that yet in the new PAST, which means that I can only run Perl 1 
 scripts interpreted. They fail when pre-compiled to .pir files because 
 the builtins aren't loaded.

I'll look into this one a bit.  One of the next items that will be
implemented in perl6 (and may make it into PAST-pm) will be to
support BEGIN/CHECK/INIT/END blocks.

Thanks!

Pm


Re: [perl #41214] [CAGE]: files from 'make languages-test' survive 'make clean'

2007-01-08 Thread Patrick R. Michaud
On Mon, Jan 08, 2007 at 08:49:56PM +, Nicholas Clark wrote:
 On Mon, Jan 08, 2007 at 12:46:09PM -0800, Patrick R. Michaud wrote:
 
  (Perhaps more better would be for the test program(s) to clean up 
  the temporary files when the test is finished.  :-)
 
 Although you can't be sure that test programs won't crash horribly.
 Not that the perl 5 core tests are robust against SEGVing the interpreter
 or hitting abort()

Agreed, they still need to be picked up by 'make clean'.

Pm


should we eliminate examples/japh (RT #37068)?

2007-01-13 Thread Patrick R. Michaud
As part of bugday Bernhard fixed up a couple of japh tests
(examples/japh/) that were using the now obsolete 'pack' 
opcode.  This helps with RT #37068, but still doesn't 
resolve it entirely as there are other japh examples that
don't work.

In looking through the remaining japh examples, most of them
seem to have been obsoleted by changes to Parrot -- especially
w.r.t. calling conventions and namespaces.  Could we perhaps
eliminate these obsolete examples as well?

Here's the state of the programs in examples/japh/ as of r16588:

japh1.pasm: deleted
japh2.pasm: deleted
japh3.pasm: ok (uses substr hackery)

japh4.pasm:
Fails with 'attempt to access code outside of current code segment'.
This particular example creates a __get_string vtable method
for a 'Japh' class, but does so using .pcc_sub .  The 'todo'
marker on the test indicates that the 'namespace has changed'
(I presume this means that namespace implementations are different
now.)

japh5.pasm: ok (overloads __set_string_keyed)
japh6.pasm: ok (overloads __get_string_keyed)
japh7.pasm: ok (traps an error and fiddles with opcode number)

japh8.pasm: Fails on 64-bit platforms.

japh9.pasm:
Fails because it relies on old Parrot calling conventions
whereby the return continuation would be stored in P1.
(The todo message reads "P1 is no longer special".)

japh10.pasm:
Fails with "Method 'thread3' not found".

japh11.pasm:
Fails with "no label offset for '_init'".  I'm guessing
this is because the _init label comes after an empty 
.namespace directive and .namespace handling has changed 
somewhat.

japh12.pasm: deleted in r16588

japh13.pasm:
Fails with 'invoke opcode not found'.

japh14.pasm:
Fails with 'invoke_p opcode not found'.

japh15.pasm:
Fails with 'invokecc opcode not found'.

japh16.pasm:
Uses the deprecated 'compile' opcode, as well as an external
shared object file from examples/compilers/ (which also doesn't
compile on my system).

japh17.pasm:
Fails with 'invoke_p opcode not found'.


Pm


Re: [perl #41237] [TODO] PMC Class name IDs will require a dot in front

2007-01-15 Thread Patrick R. Michaud
On Sun, Jan 14, 2007 at 11:58:10PM -0500, Matt Diephouse wrote:
 Allison Randal via RT wrote:
  PMC Class name IDs ... will require a dot in front
 
 My preference is to eliminate the dot in classname IDs. Lodge your
 objections now, before it's made fact in 0.4.9.
 
 Allison
 
 I actually prefer the dot. I don't like the possible ambiguity between
 types and local variables:
 
.local string MyClass
MyClass = '...'
$P0 = new MyClass # is this a type or a string?

Just to add my vote, I prefer the dot as well.

Pm


Re: Major bullet biting on | vs || within regex

2007-01-16 Thread Patrick R. Michaud
On Tue, Jan 16, 2007 at 10:41:03AM -0800, Larry Wall wrote:
 Note, in case you don't read synopsis checkins: the previous checkin
 majorly changes the semantics of | within regex to support required
 longest-token matching semantics rather than left-to-right matching.
 This is nearly on the same philosophical level as requiring the
 tail-recursion optimization.  It will enable us to write parsers
 more consistently, and it also opens up normal regexes to better
 optimization via tries and such.  You can now use || for the old |
 semantics, which is majorly consistent with how | and || work outside
 of regexen.

Do we leave C<&> alone (as opposed to introducing a corresponding C<&&>
operator)?  I can see arguments both ways.
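
For readers who want to see the difference between the two alternation semantics, here is a deliberately simplified Python sketch. It is an assumption-laden toy: real longest-token matching in the synopsis is defined in terms of more than raw match length, and the helper names `first_match` and `longest_match` are hypothetical. Ties in `longest_match` go to the earlier alternative, which is also just a choice made for the sketch.

```python
import re


def first_match(alternatives, text):
    """Ordered alternation: try each branch left to right, take the first hit."""
    for alt in alternatives:
        m = re.match(alt, text)
        if m:
            return m.group(0)
    return None


def longest_match(alternatives, text):
    """Longest-token alternation: try every branch, keep the longest hit."""
    best = None
    for alt in alternatives:
        m = re.match(alt, text)
        if m and (best is None or len(m.group(0)) > len(best)):
            best = m.group(0)
    return best
```

Given the branches `foo` and `foobar` against the text `foobar`, the ordered form stops at `foo` while the longest-token form picks `foobar`, regardless of the order the branches were written in.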

Pm


Re: [perl #41364] [PATCH] Fixed object vtable method overrides in PIR

2007-01-28 Thread Patrick R. Michaud
On Sat, Jan 27, 2007 at 11:39:16AM -0800, Alek Storm wrote:
 Also, though this is more of a language design question,
 shouldn't we deprecate the double-underscore method of overriding, since we
 now have the :vtable flag?

Just a note that we cannot deprecate the double-underscore method
of overriding vtable methods until RT #40626 is resolved.

RT #40626 notes that the :vtable pragma isn't currently working
for PIR code compiled into .pbc files -- see the sample program
in RT #40626.  And, just for completeness, I tried the test program
after applying the patch in #41364 and got the same results
(this is not unexpected, I'm guessing that #41364 is addressing
a different issue).

Thanks!

Pm



Re: Porting parrot on PDA -- work in progress

2007-02-13 Thread Patrick R. Michaud
On Tue, Feb 13, 2007 at 08:28:38PM +0100, Aldo Calpini wrote:
 I've managed to build parrot for the PocketPC. yes, really. 

 I would appreciate any feedback :-)

Feedback:  Truly amazing, and terrific work.  Aldo++

Pm


Re: PAST-pm: only PAST::Block allowed at root of PAST

2007-02-14 Thread Patrick R. Michaud
On Wed, Feb 14, 2007 at 11:49:54AM +0100, Klaas-Jan Stol wrote:
 hello,
 
 It was discussed before, but I'm not sure what was the result; PAST-pm 
 only allows a PAST::Block node to be returned from transform (ROOT). 
 However, in languages/PIR, the top level construct is a compilation 
 unit, which may be an include statement. An include statement should not 
 be enclosed by a subroutine.
 
 Will PAST-pm be able to handle this?

Yes, I'm expecting that PAST-pm either will have a PAST::CompUnit
node type for compilation units, or the 'blocktype' attribute
on PAST::Block will have a 'compunit' setting.

Pm


Re: in PIR, a BigInt is turning into a string against my will -- what am I doing wrong?

2007-02-17 Thread Patrick R. Michaud
On Sat, Feb 17, 2007 at 09:21:45AM -0800, Eric Hanchrow wrote:
 (This is with parrot built from the subversion trunk, revision 16999)
 Here's a bit of PIR that demonstrates my problem:
 
 .sub 'main' :main
 load_bytecode 'dumper.pir'
 .local ResizablePMCArray fields
 split fields, ",", "hey,you"

AFAIK, symbols like fields can't be typed beyond int, num, string, or
pmc.  So the .local statement above should read:

.local pmc fields

The 'split' opcode always returns a ResizableStringArray.
It might be easier to think of it as

fields = split ",", "hey,you"

(You can actually write it this way -- it's the same thing.)

So, when the BigInt is unshifted into the ResizableStringArray,
it's morphed into a string.
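
Here is a toy Python analogy of that behaviour (not Parrot code; the class `StringArray` and its `unshift` method are hypothetical stand-ins). A container that is declared to hold strings converts whatever is stored in it, while a generic container keeps the value's own type.

```python
class StringArray:
    """Toy analogue of a string-only array: every stored value becomes a str."""

    def __init__(self, items=()):
        self._items = [str(item) for item in items]

    def unshift(self, value):
        # The container decides the storage type, not the value.
        self._items.insert(0, str(value))

    def __getitem__(self, index):
        return self._items[index]

    def __len__(self):
        return len(self._items)


big = 2 ** 70                       # a number too large for a machine int

fields = StringArray("hey,you".split(","))
fields.unshift(big)                 # stored as text

generic = "hey,you".split(",")      # a plain list keeps whatever it is given
generic.insert(0, big)              # stored as a number
```

In the sketch, `fields[0]` comes back as a string while `generic[0]` is still a number; copying the split result into a generic PMC array before unshifting would be the analogous way to keep the BigInt intact, assuming that is what the program needs.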

Hope this helps.  :-)

Pm


Monthly release moved to Wednesday

2007-02-19 Thread Patrick R. Michaud
Just wanted to make sure everyone was aware of something that
was briefly mentioned during bug day on Saturday...

A last-minute scheduling change means that I'm having to
make a trip out of town tomorrow (Tuesday), which means
that I'll be cutting the Parrot release on Wednesday instead
of tomorrow.

Pm


Re: [perl #41386] MANIFEST must die.

2007-02-19 Thread Patrick R. Michaud
On Sun, Feb 18, 2007 at 08:38:17PM -0800, jerry gay wrote:
  For the moment, disabling configure's manicheck by default would be a
  good start. 

 i don't think manifest checking should be disabled until we have a
 replacement solution in place that makes sure the manifest is checked
 before every release. fortunately, the solution is simple. updating
 release instructions with a procedure to check manifest is sufficient,
 and --nomanifest can be enabled on Configure.pl by default.

Except that for someone who downloads a tarball, the default
should probably be to check the manifest.

Or, perhaps we could use the shiny new Makefile.PL script (chromatic++)
to differentiate.  Running Configure.pl defaults to --nomanifest,
running perl Makefile.PL checks the manifest.
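
For reference, the check itself is conceptually tiny. Here is a Python sketch of the comparison, under stated assumptions: the function name `check_manifest` is hypothetical, the first whitespace-separated field of each MANIFEST line is taken to be the path, and lines starting with `#` are treated as comments. It is not the logic Configure.pl actually uses.

```python
def check_manifest(manifest_text, files_on_disk):
    """Compare a MANIFEST listing against the files actually present.

    Returns (missing, unlisted): paths named in the manifest but absent
    on disk, and paths on disk that the manifest does not mention.
    """
    listed = set()
    for line in manifest_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue                     # skip blanks and comments
        listed.add(line.split()[0])      # assumption: path is the first field
    on_disk = set(files_on_disk)
    missing = sorted(listed - on_disk)
    unlisted = sorted(on_disk - listed)
    return missing, unlisted
```

A tarball user mostly cares about the first list (files the release forgot to ship), while a developer working from a checkout mostly trips over the second, which is one reason different defaults for the two entry points seem reasonable.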

Pm


Re: Q on: #37542: [TODO] core - document behavior of multiple :load subpragmas in same compilation unit

2007-02-19 Thread Patrick R. Michaud
On Mon, Feb 19, 2007 at 05:39:01PM +0100, Klaas-Jan Stol wrote:
 Ticket:
 #37542: [TODO] core - document behavior of multiple :load subpragmas in 
 same compilation unit
 : the behavior of multiple subroutines marked with the ':load' subpragma
 : in the same compilation unit is currently undefined. PGE currently
 : uses a workaround for this limitation, as seen in
 : compilers/pge/PGE.pir.
 
 However, this behavior *is* defined, according to 
 http://www.parrotcode.org/docs/imcc/calling_conventions.html:
 ...
 Does this mean this ticket can be closed?

I agree that the ticket can probably be closed.  

First, PGE no longer uses the workaround, and relies on the 
fact that :load works as documented in calling_conventions.html .

But I also think that with things like this we want to make sure that
not only is the documentation updated (as given by the TODO), but
also that there is test coverage for the behavior described by
the new documentation.

Fortunately, in this case it looks to me as though t/pmc/sub.t 
does have tests that check for proper execution of multiple 
:load subs.  So I think this ticket can be safely closed.

Thanks,

Pm


Preliminary notes for 0.4.9 release

2007-02-21 Thread Patrick R. Michaud
Hello, all-

I've been working on the 0.4.9 release; so far things seem to be
going reasonably well.  Many thanks to Jerry Gay and others who
have come before me for cleaning up the release process and making
sure the various NEWS/STATUS docs are up to date!  It's really quite
straightforward now.

I do have a couple of questions, however, especially for the
people who have been working on release management in the past.
I want to make sure I understand them a bit more before cutting
the release.  (Also, for some reason I've been extremely fatigued
all day today and thus making small mistakes... so I think I prefer
to take one night's rest and get it right in the morning, than to
force it out this evening and perhaps have to clean up a lot of
small mistakes.)

1.  The t/library/pg.t tests require libpq.so to be installed
in order to run -- should I be testing ('make fulltest') with
this library installed?

More generally, is there a specific set of platforms I should
be performing 'make fulltest' on prior to release?  And do I
need to be maximizing test coverage by making sure certain
libraries or capabilities are available on my test platform(s)?
libpq.so is one example... but what about things such as
ICU, readline, and the like?

(Again, it's no problem for me to install the libraries -- I'm
just curious about the correct procedure so I can document it
for later release managers.)


2.  In r17137, I'm getting one test failure from 'make fulltest':

Failed Test           Stat Wstat Total Fail  List of Failed
-----------------------------------------------------------
t/pmc/pmethod_test.t     1   256     2    1  2

Anyone know anything more about this failure, and should it just
be marked 'TODO' or shall I see about fixing it?


3.  The release instructions don't make any mention of verifying
MANIFEST or running 'make manitest' -- should this be a required
step in creating a release?  Or is it happening somewhere that
I'm not seeing?


4.  Anyone have a good name for the release?  I'm satisfied with
leaving 0.4.9 unnamed unless there's a sense that we really
need to name it (in which case I'll come up with one or
accept suggestions from others :-).


Any answers will be greatly appreciated -- I'll update 
RELEASE_INSTRUCTIONS with whatever we come up with, and then
publish 0.4.9!

Thanks!

Pm

