Re: FYI compiling PIR function calls

2006-09-27 Thread Allison Randal

Leopold Toetsch wrote:
There seems to be the impression that generating PIR calls from a compiler is 
hard because it may look like:


  $S0 = obj.'_meth'(arg1, arg2)

but this also works:

.pcc_begin
.arg "hello"
.arg "\n"
.invocant obj
.meth_call _meth
.result $S0
.pcc_end

There's a similar construct for return values.


The basic problem is inconsistency. For hand-written code the current 
PIR method call syntactic sugar is mildly annoying. (It'd be nice to 
safely get rid of the quotes around the method name.) For generated code 
it's a completely different syntax tree node. Opcode syntax like so:


  opcode $S0, obj, arg1, arg2

can be represented with a simple parent node with a sequence of child nodes.

The syntax tree for methods (whether you use the syntactic sugar or the 
long form) is more like a parent node (a method), with one invocant 
child and two composite children for the argument list and return list, 
respectively. To generate the PIR, you need to unroll the composite 
children in a way that's contextually dependent on other nodes further 
up the tree.
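
To make the shape difference concrete, here is a language-agnostic sketch (Python, since the point is about code generation in general; the node layout and function names are invented for illustration, not any Parrot API):

```python
# Hypothetical code-generator helpers: an opcode call is one flat
# parent-plus-children node, while a method call has an invocant child
# and two composite children (argument list, result list) that must be
# unrolled into separate directives.

def emit_opcode(op, result, args):
    # Flat node: flattens into a single line of PIR.
    return ["%s %s" % (op, ", ".join([result] + args))]

def emit_method_call(invocant, meth, args, results):
    # Composite node: each child list becomes its own run of directives.
    lines = [".pcc_begin"]
    lines += [".arg %s" % a for a in args]
    lines.append(".invocant %s" % invocant)
    lines.append(".meth_call %s" % meth)
    lines += [".result %s" % r for r in results]
    lines.append(".pcc_end")
    return lines

print("\n".join(emit_opcode("opcode", "$S0", ["obj", "arg1", "arg2"])))
print("\n".join(emit_method_call("obj", "'_meth'", ["arg1", "arg2"], ["$S0"])))
```

The opcode emitter is a single join; the method-call emitter has to walk two composite children and interleave directives around the invocant.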


And when some common language constructs are opcodes and some are method 
calls, the burden of deciding which kind of syntax a particular 
construct should use falls to the compiler writer. There are various 
ways to implement it: a lookup table, a chunk of hard-coded PIR, etc. 
But they all boil down to added complexity.
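
A minimal sketch of the lookup-table option (Python; the table contents are invented, e.g. 'gcd' having no opcode is just an assumption for the example):

```python
# Invented table mapping language constructs to the PIR form the
# compiler should emit for them.
CONSTRUCT_MAP = {
    "add": ("opcode", "add"),
    "gcd": ("method", "gcd"),   # assume no gcd opcode exists
}

def emit(construct, result, operands):
    kind, name = CONSTRUCT_MAP[construct]
    if kind == "opcode":
        # flat opcode syntax
        return "%s %s, %s" % (name, result, ", ".join(operands))
    # method-call syntactic sugar: first operand becomes the invocant
    return "%s = %s.'%s'(%s)" % (result, operands[0], name,
                                 ", ".join(operands[1:]))

print(emit("add", "$I0", ["$I1", "$I2"]))   # add $I0, $I1, $I2
print(emit("gcd", "$I0", ["obj", "$I2"]))   # $I0 = obj.'gcd'($I2)
```

The table itself is trivial; the added complexity is that someone has to build and maintain it for every construct of the language.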


This isn't to say that method calls are bad, but they are more complex 
to work with. You want more features to be implemented only as methods 
and to eliminate the opcodes. But there is real value in keeping the 
opcode syntax for the common cases.


Allison


Re: PDD 22 - I/O release candidate 1

2006-09-27 Thread Tim Bunce
On Tue, Sep 26, 2006 at 04:44:53PM -0700, Allison Randal wrote:
 I've committed an updated I/O PDD. I'm close to pronouncing this ready 
 to implement, so get in your comments now.
 
 One piece that is currently missing is a discussion of which lightweight 
 concurrency model we're going to use for the asynchronous operations. 
 I've had ongoing back-channel conversations with various people, but I 
 need to congeal them. Pitch in your own 2 cents.
 
 Also, any reactions to the distinction that async ops return status 
 objects while sync ops return integer error codes? Sync opcodes could 
 have 2 signatures, one with an integer return type (int error code) and 
 one with a PMC return type (status object).

What's the relative cost of creating a PMC vs passing one in?
I assume passing one in is significantly faster.

If so, then perhaps speed-sensitive ops that are likely to be used in
loops can be given the PMC to (re)use.
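
A rough sketch of the reuse idea (Python; the Status class and the read_op signature are hypothetical stand-ins, not Parrot's actual API):

```python
class Status:
    # Stand-in for a status PMC that an op can fill in.
    def __init__(self):
        self.code = 0

def read_op(data, status=None):
    # Hypothetical sync read op: allocate a fresh status object only
    # when the caller didn't pass one in for (re)use.
    if status is None:
        status = Status()   # the per-call allocation a loop would pay for
    status.code = 0         # 0 = success
    return status

reusable = Status()
for _ in range(1000):
    st = read_op(b"chunk", status=reusable)
    assert st is reusable   # no new status object per iteration
```

The loop pays for one allocation total instead of one per call.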

Tim.


Re: FYI compiling PIR function calls

2006-09-27 Thread Leopold Toetsch
On Wednesday, 27 September 2006 at 09:12, Allison Randal wrote:

 The basic problem is inconsistency. For hand-written code the current
 PIR method call syntactic sugar is mildly annoying. (It'd be nice to
 safely get rid of the quotes around the method name.) 

Not easily:

  obj.'foo'()  # a methodname constant
  .local string bar
  bar = get_some_meth()   # or bar = 'get_some_meth'()
  obj.bar()    # a method variable

But:

  obj.foo()    # still a methodname constant 
               # unless there's a variable 'foo'

To be on the safe side, method (and function) names *should* be quoted. I 
don't think that this is inconsistent.

 And when some common language constructs are opcodes and some are method
 calls, the burden of deciding which kind of syntax a particular
 construct should use falls to the compiler writer. There are various
 ways to implement it: a lookup table, a chunk of hard-coded PIR, etc.
 But they all boil down to added complexity.

Well, when I write x86 or ppc JIT code, the burden of deciding which syntax 
to use falls on the compiler writer, that is, me. While we could provide an 
introspection API for available opcodes (and their args), it still doesn't help much. 
The compiler (writer) has to know what a particular opcode does.

When I write code which calls some library function, I have to look up the 
documentation and check the function name and the arguments it takes. And I have 
to decide whether I have an opcode available and can use it, or whether I have to 
call a library function.

This is just the usual job when creating a compiler. Introspection would have 
little benefit IMHO:

  interp = getinterp
  info = interp.get_op_info()   # hypothetical
  $I0 = exists info['gcd']  # do we have a 'gcd' opcode

  cl = getclass 'Integer'
  $I0 = can cl, 'gcd'

While a compiler could use introspection or tables or whatever, these still 
wouldn't help to figure out the semantics of a particular operation.

 Allison

leo


Re: PDD 22 - I/O release candidate 1

2006-09-27 Thread Leopold Toetsch
On Wednesday, 27 September 2006 at 01:44, Allison Randal wrote:
 I've committed an updated I/O PDD. I'm close to pronouncing this ready
 to implement, so get in your comments now.

   I/O Stream Opcodes

I really don't like opcodes when dealing with I/O.

1) opcodes are needed for native int or float - these are nicely JITtable

2) opcodes with PMCs usually call a vtable function that provides necessary 
virtualization/abstraction:

  set S0, P0[10]# VTABLE_get_string_keyed_int

If a particular PMC doesn't support the vtable function an exception is 
thrown.

Now compare this with an I/O opcode:

  read S0, P0, 10   # PIO_reads(... P0 ...) 

If P0 isn't a ParrotIO PMC, this segfaults. See t/pmc/io_1.pir. While we 
could of course check what type P0 is, such a check would be needed for 
every I/O opcode. (And see below)

3) opcodes don't work with inheritance, unless a vtable (or method) is 
provided.

  subclass P1, 'Array', 'MyArray'
  new P0, 'MyArray'

  set S0, P0[10]# VTABLE_get_string_keyed_int

The vtable still works, and possibly calls a '__get_string_keyed_int' method, 
if the 'MyArray' class provides one.

But:

  subclass P1, 'ParrotIO', ['HTTP'; 'Daemon'; 'ClientConn']
  ...
  read S0, P0, 10   # P0 is a ClientConn now

Given this, we would have to modify the mentioned check to also deal with 
subclasses of 'ParrotIO'. And worse, the I/O opcode would need some knowledge 
about the subclassed ParrotIO to extract the raw PIO structure and carry on 
with the PIO_reads function call. This either violates encapsulation or 
would need yet another interface to deal with it.
  
4) So should we call a method from the opcode instead?

  op read(out STR, invar PMC, in INT) {
     STRING *meth = Parrot_find_method_with_cache(...);
     if (!meth) {
        /* error handling */
     }
     $1 = Parrot_run_meth_fromc_args(...);

     /* or alternatively:
      * emulate set_args
      * emulate get_results */
     pc = VTABLE_invoke(...);
     ... 
  }

That is, we'd re-implement the 'callmethodcc' opcode for every I/O opcode, 
without any further improvement. Just the opposite: we would add code 
duplication, and we'd possibly add a Continuation barrier by adding this 
indirection.

5) I/O layers

When we look at ParrotIO as an object with methods, the layering boils down to 
simple inheritance. Adding a layer is just a subclass operation, where some 
methods are overridden. All the extra code that deals with layers is likely 
unnecessary then. Or, IOW, the I/O layer model mimics some class 
functionality, which would better be done by existing class code.
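
Sketched abstractly, the claim is that a layer is just an overridden method (Python stand-ins for the PMCs; the class and method names are illustrative only):

```python
class ParrotIO:
    # Base "class PMC": raw, unbuffered reads.
    def read(self, n):
        return "raw:%d" % n

class BufferedIO(ParrotIO):
    # "Adding a layer is just a subclass operation": only the methods
    # the layer changes get overridden; everything else is inherited.
    def read(self, n):
        return "buffered(%s)" % super().read(n)

io = BufferedIO()
print(io.read(10))   # buffered(raw:10)
```

Stacking another layer would be one more subclass, with no dedicated layer machinery at all.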

6) Sockets

Currently the socket API is part of the PIO virtual function table. This is 
suboptimal, as the socket API is exposed to non-socket PIOs as well.

  ParrotSocket isa ParrotIO
  BufferedIO   isa ParrotIO
  ...

or some such would be a better choice. The Perl 5 equivalents are probably a 
good model for this.

 Allison

my 2c,
leo


Re: PDD 22 - I/O release candidate 1

2006-09-27 Thread Joshua Hoblitt
On Tue, Sep 26, 2006 at 04:44:53PM -0700, Allison Randal wrote:
 One piece that is currently missing is a discussion of which lightweight 
 concurrency model we're going to use for the asynchronous operations. 
 I've had ongoing back-channel conversations with various people, but I 
 need to congeal them. Pitch in your own 2 cents.

Are you referring to the Parrot-side API, the actual implementation, or
both?  As for the implementation side, my gut feeling is that this is
going to need to be highly platform-specific.  For example, on Linux,
using POSIX aio (assuming a new enough kernel) is probably going to
incur much less overhead than either clone(2) or POSIX threads.  As a
point of reference, I believe Linux::AIO uses clone(2), but it predates
the POSIX aio layer.

-J

--




Re: FYI compiling PIR function calls

2006-09-27 Thread Jonathan Scott Duff
On Wed, Sep 27, 2006 at 11:38:10AM +0200, Leopold Toetsch wrote:
 On Wednesday, 27 September 2006 at 09:12, Allison Randal wrote:
 
  The basic problem is inconsistency. For hand-written code the current
  PIR method call syntactic sugar is mildly annoying. (It'd be nice to
  safely get rid of the quotes around the method name.) 
 
 Not easily:
 
   obj.'foo'()  # a methodname constant
   .local string bar
   bar = get_some_meth()   # or bar = 'get_some_meth'()
   obj.bar()# a method variable
 
 But:
 
   obj.foo()# still a methodname constant 
# unless there's a variable 'foo'
 
 To be on the safe side, method (and function) names *should* be quoted. I 
 don't think that this is inconsistent.

Is there a reason that you would want to conflate method names and
variables used as a method name? If not, why not change the syntax
slightly so that method names in a variable are uniquely identified?
Here's a suggestion:

obj.foo()   # a methodname constant
.local string bar
bar = get_some_meth()
obj.$bar()  # a method variable

The same could be used for ordinary subroutines:

.local string mysub
mysub = foo
$mysub()# calls the foo subroutine

-Scott
-- 
Jonathan Scott Duff [EMAIL PROTECTED]


Re: Common Serialization Interface

2006-09-27 Thread Aaron Sherman

Larry Wall wrote:

On Mon, Sep 25, 2006 at 09:02:56PM -0500, Mark Stosberg wrote:
: 
: eval($yaml, :lang<yaml>);
: 
: Still, these options may not substitute for the kind of role-based
: solution you have in mind.

I'm not sure it's wise to overload eval this way.  Seems like a
great way to defeat MMD.  Plus I really want a file interface so I
don't have to slurp a 37M string, which in turn requires a 400M stack
allocation currently.  Seems to want even more heap after that, and
then it really starts thrashing...


Not everything has to be done via MMD, of course, and more to the point, 
many things can play nice with MMD.


multi sub eval(Str $data, Str :$lang) {
    given $lang {
        when 'perl'  { perl_eval($data)  }
        when 'yaml'  { yaml_eval($data)  }
        when 'perl5' { perl5_eval($data) }
        when 'pir'   { pir_eval($data)   }
        when 'pbc'   { pbc_eval($data)   }
        when defined { call } # Back to MMD!
        default      { perl_eval($data)  }
    }
}


BTW: for the above, it would be nice to be able to say:

when m:i/^perl$/ {...}

without all the noise. That is, it would be nice to have something like:

when 'perl':i {...}

Dunno if that makes any sense or not.


[perl #40419] 2 PDD 07s

2006-09-27 Thread via RT
# New Ticket Created by  Will Coleda 
# Please include the string:  [perl #40419]
# in the subject line of all future correspondence about this issue. 
# URL: http://rt.perl.org/rt3/Ticket/Display.html?id=40419 


There are currently two PDD 07's in the repository:

docs/pdds/clip/pdd07_codingstd.pod  [main]doc
docs/pdds/pdd07_codingstd.pod   [main]doc

--
Will Coke Coleda
[EMAIL PROTECTED]




Re: Common Serialization Interface

2006-09-27 Thread Luke Palmer

On 9/27/06, Aaron Sherman [EMAIL PROTECTED] wrote:

BTW: for the above, it would be nice to be able to say:

when m:i/^perl$/ {...}

without all the noise. That is, it would be nice to have something like:

when 'perl':i {...}


Well, there are a few ways to do that:

   given lc $lang {...}

   when { lc eq 'perl' } {...}

   when insensitive('perl') {...}

Where the last one is a user-defined function which can be written:

   sub insensitive($str) { /:i ^ $str $/ }

Such pattern functions can be useful in a variety of contexts.  That
is, write functions which are designed explicitly to be used in the
conditions of when statements.

Luke


Re: PDD 22 - I/O release candidate 1

2006-09-27 Thread chromatic
On Wednesday 27 September 2006 03:40, Leopold Toetsch wrote:

 Now compare this with an I/O opcode:

   read S0, P0, 10   # PIO_reads(... P0 ...)

 If P0 isn't a ParrotIO PMC, this segfaults. See t/pmc/io_1.pir. While we
 could of course check what type P0 is, such a check would be needed for
 every I/O opcode. (And see below)

I don't buy this argument.  If the cost for checking the type of P0 is greater 
than the cost of doing IO, that's a big problem and not with the interface.

-- c


[svn:perl6-synopsis] r12466 - doc/trunk/design/syn

2006-09-27 Thread larry
Author: larry
Date: Wed Sep 27 10:27:18 2006
New Revision: 12466

Modified:
   doc/trunk/design/syn/S05.pod

Log:
Made directly called tokens and rules auto-anchor for readability.


Modified: doc/trunk/design/syn/S05.pod
==
--- doc/trunk/design/syn/S05.pod(original)
+++ doc/trunk/design/syn/S05.podWed Sep 27 10:27:18 2006
@@ -14,9 +14,9 @@
Maintainer: Patrick Michaud [EMAIL PROTECTED] and
Larry Wall [EMAIL PROTECTED]
Date: 24 Jun 2002
-   Last Modified: 21 Aug 2006
+   Last Modified: 27 Sept 2006
Number: 5
-   Version: 33
+   Version: 34
 
 This document summarizes Apocalypse 5, which is about the new regex
syntax.  We now try to call them I<regex> rather than regular
@@ -332,6 +332,23 @@
to imply a C<:> after every construct that could backtrack, including
bare C<*>, C<+>, and C<?> quantifiers, as well as alternations.
 
+The C<:ratchet> modifier also implies that the anchoring on either
+end is controlled by context.  When a ratcheted regex is called as
+a subrule, the front is anchored to the current position (as with
+C<:p>), while the end is not anchored, since the calling context
+will likely wish to continue parsing.  However, when a ratcheted
+regex is called directly, it is automatically anchored on both ends.
+(You may override this with an explicit C<:p> or C<:c>.)  Thus,
+you can do direct pattern matching using a token or rule:
+
+$string ~~ token { \d+ }
+$string ~~ rule { \d+ }
+
+and these are equivalent to
+
+$string ~~ m/^ \d+: $/;
+$string ~~ m/^ <?ws> \d+: <?ws> $/;
+
 =item *
 
The new C<:panic> modifier causes this regex and all invoked subrules


Re: PDD 22 - I/O release candidate 1

2006-09-27 Thread Leopold Toetsch
On Wednesday, 27 September 2006 at 19:08, chromatic wrote:
  While we

  could of course check what type P0 is, such a check would be needed for
  every I/O opcode. (And see below)

 I don't buy this argument.  If the cost for checking the type of P0 is
 greater than the cost of doing IO, that's a big problem and not with the
 interface.

I wasn't talking about costs at all. I'm talking about unneeded code 
duplication, sanity, and inheritance. In fact, the proposed method interface 
will (currently) be slower - I really don't care.

leo


Re: Common Serialization Interface

2006-09-27 Thread Larry Wall
On Wed, Sep 27, 2006 at 10:43:00AM -0600, Luke Palmer wrote:
: Well, there are a few ways to do that:
: 
:given lc $lang {...}
: 
:when { lc eq 'perl' } {...}
: 
:when insensitive('perl') {...}

With the latest change to S05 that auto-anchors direct token calls,
you can now also write:

when token { :i perl } {...}

By the way, your 0-ary lc needs to be written .lc these days.
In Chicago we outlawed most of the 0-or-1-ary functions since we now
have a 1-character means of specifying $_ as invocant.

: Where the last one is a user-defined function which can be written:
: 
:sub insensitive($str) { /:i ^ $str $/ }
: 
: Such pattern functions can be useful in a variety of contexts.  That
: is, write functions which are designed explicitly to be used in the
: conditions of when statements.

I guess that can also now be written:

my insensitive ::= token ($str) :i { $str };

or maybe even

my insensitive ::= token :i { $^str };

Larry


RFC: multi assertions/prototypes: a step toward programming by contract

2006-09-27 Thread Aaron Sherman

Executive summary:

I suggest a signature prototype that all multis defined in or exported 
to the current namespace must match (they match if the proto would allow 
the same argument list as the multi, though the multi may be more 
specific). Prototypes are exportable. Documentation tie-ins are also 
suggested, ultimately allowing for documentation-only interface modules 
which collect and re-export the interfaces of implementation modules 
while providing high-level documentation and constraints.


Details:

Larry has said that programming by contract is one of the many paradigms 
that he'd like Perl 6 to handle. To that end, I'd like to suggest a way 
to assert that there will be multi subs defined that match the 
following signature criteria in order to better manage and document the 
assumptions of the language now that methods can export themselves as 
multi wrappers. Let me explain why.


In the continuing evolution of the API documents and S29, we are moving 
away from documentation like:


our Scalar multi max(Array @list) {...}
our Scalar multi method Array::max(Array @array:) {...}

toward exported methods:

our Scalar multi method Array::max(Array @array:)
is export {...}

C<is export> forces this to be exported as a function that operates on
its invocant, wrapping the method call. OK, that's fine, but Array isn't
the only place that will happen, and the various exported max functions
should probably have some unifying interface declared. I'm thinking of
something like:

our proto max(@array, *%adverbs) {...}

This suggests that any max subroutine defined as multi in--or exported 
to--this scope that does not conform to this prototype is invalid. Perl 
will throw an error at compile-time if it sees this subsequently:


our Any multi method Array::max(Array @array: $x)
is export {...}

However, this would be fine:

our Any multi method Array::max(Array @array: :$x)
is export {...}

because the prototype allows for any number of named parameters.
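
A toy sketch of the compile-time conformance check being proposed (Python; the signature encoding is invented and ignores types, which the real check would also have to compare):

```python
# Each parameter is (name, kind), with kind "pos" or "named".
def conforms(proto, multi):
    # Positional parameters must line up one-to-one with the proto's;
    # anything extra in the multi must be named, since the proto's
    # *%adverbs slurpy soaks up arbitrary named arguments.
    proto_pos = [p for p, kind in proto if kind == "pos"]
    multi_pos = [p for p, kind in multi if kind == "pos"]
    return len(multi_pos) == len(proto_pos)

proto = [("@array", "pos"), ("%adverbs", "named")]
ok    = [("@array", "pos"), ("$x", "named")]   # ...(@array: :$x) -> fine
bad   = [("@array", "pos"), ("$x", "pos")]     # ...(@array: $x)  -> rejected

assert conforms(proto, ok)
assert not conforms(proto, bad)
print("conformance check ok")
```

The compiler would run this kind of check for every multi declared in, or exported to, the scope that carries the proto.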

The default behavior would be to assume a prototype of:

our proto max([EMAIL PROTECTED], *%namedargs) {...}

Which allows for any signature.

Any types used will constrain multis to explicitly matching those types 
or compatible types, so:


our Int proto max(Seq @seq, *%adverbs) {...}

Would not allow for a max multi that returned a string (probably not a 
good idea).


The goal here is to allow us to centrally assert that Perl provides 
this subroutine without defining its types or behavior just yet. 
Documentation/code could be written for the prototype:


=item max

=inline our proto max(@array, *%adverbs) is export {...}

C<max> takes an input sequence or array (C<@array>) and
returns the maximum value from the sequence.
Specific implementations of C<max> may be defined which
allow comparators or other adverbs (C<%adverbs>) to
be defined.

=cut

I've invented the =inline POD keyword here as an arm-wave to 
programming by contract (both Perl and POD read its argument). If it's 
not liked, the proto could be duplicated both inside and outside of the 
documentation as we do now. Kwid, when it comes to pass, could provide 
similar mechanisms. Given this, an entire interface-only module could 
exist as POD/Kwid-only, which isn't a bad thing given that pre-processed 
bytecode will be what most people are loading anyway, and thus not 
parsing the POD every time as in Perl 5.


There's also another interesting thing that we might or might not decide 
to tack onto protos, which is that the C<is export> tag on one could 
cause the exporter mechanism to automatically export any C<is export> 
tagged subroutines from the current namespace that match this prototype, 
even if they came from a different namespace. Essentially defining one 
proto allows you to re-export any multis that you imported by that name. 
This seems to me to be a better mechanism than a simple :REEXPORT tag or 
the like on the use, as it more explicitly targets the interfaces that 
your module defines its own prototype for.


This produces a generic set of documentation for a module that might 
only act as a re-exporter for other modules. e.g. the above might appear 
in a module called CORE which is used by the runtime automatically, 
and uses various other modules like Math::Basic and List without any 
explicit export tags, thus providing the minimal interfaces that Perl 
promises. S29 could eventually be adapted as the documentation for the 
prototypes in that module without having to actually document the 
individual APIs of the rest of the Perl runtime.


In Perl 6, therefore, perldoc perlfunc would become perldoc CORE or 
whatever we call that module.


This is only a first step to programming by contract, which has many 
more elements than simply blending signatures into documentation 
(assertions and 

special named assertions

2006-09-27 Thread David Brunton
From an IRC conversation earlier today:

A quick scan of S05 reveals definitions for these seven special named 
assertions:
  <before pattern>
  <after pattern>
  <sp>
  <ws>
  <null>
  <'...'>
  <at($pos)>

Twenty-four more are listed in docs/Perl6/Overview/Rule.pod (some of which are 
used in S05, but I don't think there are definitions).
  <...>
  dot
  lt
  gt
  prior
  commit
  cut
  fail
  null
  ident
  self
  alnum
  alpha
  ascii
  blank
  cntrl
  digit
  graph
  lower
  print
  space
  upper
  word
  xdigit
  <!XXX>  # not sure if this counts

Additionally, in t/regex/from_perl6_rules/stdrules.t there is one I didn't 
notice elsewhere, but appears to be implemented in Pugs:
  punct

As far as I can tell, this yields a total of 31 or 32 special named assertions. 
 I'm sure if I have missed any obvious ones, someone will speak up.  Some have 
passing tests, some have failing tests, and some have no tests.

Does it make sense to have a single place in S05 where all the builtin special 
named assertions are defined?  It would make it easier to link the tests, and 
to tell the difference between examples like <moose> and builtins like <ident>.

Last, but not least, should any of these be crossed off the list?

Best,
David.





Re: Motivation for /<alpha>+/ set Array not Match?

2006-09-27 Thread Carl Mäsak

Audrey ():

Indeed... Though what I'm wondering is, is there a hidden implementation
cost or design cost of making /<foo>+/ always behave such that
$<foo>.from
returns something, compared to the current treatment with the workaround
you suggested?


Has this been settled or addressed off-list? Because from my
perspective as one who has never used P6 rules for anything in
particular, but who in the future most likely will, the proposed
semantics seems a lot saner and more useful. It'd be sad to let pass
this opportunity to fix (what from my perspective appears to be) a
shortcoming of the rule semantics.

Kindly,
--
masak


Re: RFC: multi assertions/prototypes: a step toward programming by contract

2006-09-27 Thread Trey Harris

In a message dated Wed, 27 Sep 2006, Aaron Sherman writes:

Any thoughts?


I'm still thinking about the practical implications of this... but what 
immediately occurs to me:


The point of multiple, as opposed to single, dispatch (well, one of the 
points, and the only point that matters when we're talking about multis of 
a single invocant) is that arguments are not bound to a single type. So at 
first gloss, having a single prototype in the core for all same-named 
multis as in your proposal seems to defeat that use, because it does 
constrain arguments to a single type.


I would hate for Perl 6 to start using C<Any> or C<Whatever> in the sort 
of ways that many languages abuse Object to get around the restrictions 
of their type systems.  I think that, as a rule, any prototype 
encompassing all variants of a multi should not only specify types big 
enough to include all possible arguments, but also specify types small 
enough to exclude impossible arguments.


In other words, to use your proposal, our proto moose (Moose $x:) should 
assert not just that all calls to the multi moose will have an invocant 
that does Moose, but also that all objects of type Moose will work with a 
call to the multi moose.  That may have been implicit in your proposal, 
but I wanted to make it explicit.


In practice, the ability to use junctive types, subsets, and roles like 
any other type makes the concept of single type a much less restrictive 
one in Perl 6 than in most languages.  For example, if you wanted C<max> 
to work on both arrays and hashes, you could have


  our proto max (Array|Hash $container)

Or you could define an C<Indexed> role that both Array and Hash do and 
have:


  our proto max (Indexed $container)

So maybe this is a reasonable constraint.  But it seems odd to me that 
Perl might then not allow me to write a C<max> that takes, say, Bags or 
Herds or whatever.  And as I said before, I think a prototype of


  our proto max (Whatever $container)

is incorrect too.  What I really want is for max to be callable on 
anything that can do max, and not on anything that can't.  Following that 
observation to its logical conclusion, at some point we get to the core 
containing prototypes like:


  our proto max(Maxable $container)
  our proto sort(Sortable $container)
  our proto keys(Keyable $container)

which (I think) results in no better support for contracts, but merely 
requires gratuitous typing (in both senses of the word): where before we 
could just write our routine C<multi max...>, now we need to write both 
C<multi max...> and remember to add C<does Maxable> so Perl will let us 
compile it.


My apologies if I'm attacking a strawman here; perhaps there's a saner way 
to allow the flexibility for users to define novel implementations of 
global multis while still having the prototypes well-typed.


All that said, the globalness of multis does concern me because of the 
possibility of name collision, especially in big systems involving multis 
from many sources.  Your proposal would at least make an attempt to define 
a multi not type-conformant with a core prototype throw a compile-time 
error, rather than mysterious behavior at runtime when an unexpected multi 
gets dispatched.


Trey




[svn:parrot-pdd] r14774 - in trunk: . docs/pdds/clip

2006-09-27 Thread leo
Author: leo
Date: Wed Sep 27 12:57:22 2006
New Revision: 14774

Added:
   trunk/docs/pdds/clip/pddXX_cstruct.pod   (contents, props changed)
   trunk/docs/pdds/clip/pddXX_pmc.pod   (contents, props changed)

Changes in other areas also in this revision:
Modified:
   trunk/MANIFEST

Log:
add 2 new design docs - see also mail

Added: trunk/docs/pdds/clip/pddXX_cstruct.pod
==
--- (empty file)
+++ trunk/docs/pdds/clip/pddXX_cstruct.pod  Wed Sep 27 12:57:22 2006
@@ -0,0 +1,324 @@
+=head1 TITLE
+
+C Structure Class
+
+=head1 STATUS
+
+Proposal.
+
+=head1 AUTHOR
+
+Leopold Toetsch
+
+=head1 ABSTRACT
+
+The ParrotClass PMC is the default implementation (and the meta class)
+of parrot's HLL classes. It provides attribute access and (TODO)
+introspection of attribute names. It also handles method
+dispatch and inheritance.
+
+C structures used all over in parrot (PMCs) and user-visible C
+structures provided by the C<{Un,}ManagedStruct> PMCs don't have this
+flexibility.
+
+The proposed C<CStruct> PMC tries to bridge this gap.
+
+=head1 DESCRIPTION
+
+The C<CStruct> PMC is the class PMC of classes which are not
+based on PMC-only attributes but on the general case of a C structure.
+That is, C<CStruct> is actually the parent class of
+C<ParrotClass>, which is a PMC-only special case. And it is the
+theoretical ancestor class of all PMCs (including itself :).
+
+The relationship of C<CStruct> to other PMCs is like this:
+
+                 PASM/PIR code    C code
+  Class          ParrotClass      CStruct
+  Object         ParrotObject     *ManagedStruct
+  (other PMCs)
+
+That is, it is the missing piece of the already existing PMCs. The current
+*ManagedStruct PMCs provide the class and object functionality in
+one and the same PMC (as, BTW, all other existing PMCs do). But
+this totally prevents proper inheritance and reusability of such PMCs.
+
+The C<CStruct> class provides the necessary abstract backing to get
+rid of the current limitations.
+
+=head1 SYNTAX BITS
+
+=head2 Constructing a CStruct
+
+A typical C structure:
+
+  struct foo {
+int a;
+char b;
+  };
+
+could be created in PIR with:
+
+  cs = subclass 'CStruct', 'foo'   # or maybe  cs = new_c_class 'foo'
+  addattribute cs, 'a'
+  addattribute cs, 'b'
+
+The semantics of a C structure are the same as those of a Parrot class.
+But we need the types of the attributes too:
+
+Handwavingly TBD 1)
+
+with ad-hoc existing syntax:
+
+  .include 'datatypes.pasm'
+  cs['a'] = .DATATYPE_INT
+  cs['b'] = .DATATYPE_CHAR
+
+Handwavingly TBD 2)
+
+with new variants of the C<addattribute> opcode:
+  
+  addattribute cs, 'a', .DATATYPE_INT
+  addattribute cs, 'b', .DATATYPE_CHAR
+
+Probably desired and with not much effort TBD 3):
+
+  addattribute(s) cs, <<'DEF'
+int a;
+char b;
+  DEF   
+
+The possible plural in the opcode name would match semantics, but it is not
+necessary. The syntax is just using Parrot's here documents to define
+all the attributes and types.
+
+The generalization of arbitrary attribute names would of course be
+possible too, but isn't likely needed.
+
+=head2 Syntax variant
+
+  cs = subclass 'CStruct', <<'DEF'
+struct foo {
+  int a;
+  char b;
+};
+  DEF
+
+I.e. create all in one big step.
+
+=head2 Object creation and attribute usage
+
+This is straightforward and conforms to current ParrotObjects:
+
+  o = new 'foo' # a ManagedStruct instance
+  setattribute o, 'a', 4711
+  setattribute o, 'b', 22
+  ...
+
+The only needed extension would be C<{get,set}attribute> variants with
+natural types.
+
+Even (with nice to have IMCC syntax sugar):
+
+  o.a = 4711# setattribute
+  o.b = 22
+  $I0 = o.a # getattribute
+
+=head2 Nested Structures
+
+  foo_cs = subclass 'CStruct', 'foo'
+  addattribute(s) foo_cs, <<'DEF'
+int a;
+char b;
+  DEF   
+  bar_cs = subclass 'CStruct', 'bar'
+  addattribute(s) bar_cs, <<'DEF'
+double x;
+foo foo;# the foo class is already defined
+foo *fptr;
+  DEF   
+  o = new 'bar'
+  setattribute o, 'x', 3.14
+  setattribute o, ['foo'; 'a'], 4711 # o.foo.a = 4711
+  setattribute o, ['fptr'; 'b'], 255
+
+Attribute access is similar to current *ManagedStruct's hash syntax
+but with a syntax matching ParrotObjects.
+
+=head2 Array Structures Elements
+
+  foo_cs = subclass 'CStruct', 'foo'
+  addattribute(s) foo_cs, <<'DEF'
+int a;
+char b[100];
+  DEF   
+
+=head2 Possible future extensions
+
+  cs = subclass 'CStruct', 'todo'
+  addattribute(s) foo_cs, <<'DEF'
+union {  # union keyword
+  int a;
+  double b;
+} u;  
+char b[100]  :ro;# attributes like r/o 
+  DEF   
+
+=head2 Managed vs. Unmanaged Structs
+
+The term managed in current structure usage defines the owner of the
+structure memory. C<ManagedStruct> 

Two new pdds

2006-09-27 Thread Leopold Toetsch
Hi folks,

There are 2 new docs in docs/pdds/clip now (r14774):

1) pddXX_pmc.pod

2) pddXX_cstruct.pod

I'll start with 2) first: it'll be the metaclass of all (publicly 
accessible, C-derived) structures used in Parrot, and it'll therefore be the 
class PMC of PMC-based objects. As a PMC C<isa> CStruct, and CStruct is 
implemented as a PMC, it's the metaclass of itself.

The other document 1) describes a more general structure of PMC internal 
layout:
- fixed (non-resizable) but:
- differently-sized structures
as currently already used by Buffer, String, and Bufferlike objects.

As all changes regarding this should be non-intrusive to any existing code, 
I'll start implementing a CStruct PMC with the new layout.

Here are some (preliminary) steps:

[1] prelims
- send note about encapsulation within PMCs
- rearrange PMC fields to new conforming layout
- redefine PMC_* macros to use OPMC fields
- implement GC stuff re var-sized PMCs

[2] start CStruct
- use new_from_string vtable to construct a CStruct
- implement or delegate C struct {} parser.
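The "implement or delegate C struct {} parser" step could start as small as a declaration-list parser. A toy Python sketch of the idea (the function name and output shape are invented for illustration; arrays, unions, and full C declarator syntax are deliberately not handled):

```python
import re

def parse_struct_body(body):
    """Parse a C-like 'type name;' declaration list into (name, type) pairs.

    A minimal sketch of what the CStruct PMC's parser would need to do.
    """
    fields = []
    for decl in body.split(";"):
        decl = decl.strip()
        if not decl:
            continue
        # one type word, an optional '*', then the attribute name
        m = re.match(r"(\w+)\s+(\*?)(\w+)$", decl)
        if not m:
            raise ValueError("cannot parse declaration: %r" % decl)
        ctype, ptr, name = m.groups()
        fields.append((name, ctype + ("*" if ptr else "")))
    return fields

print(parse_struct_body("int a; char b;"))   # → [('a', 'int'), ('b', 'char')]
```

Errors in the declaration string surface only when this runs, which is exactly the runtime-vs-compile-time concern raised later in the thread.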

That's it so far,
comments are of course always welcome,

leo


Re: special named assertions

2006-09-27 Thread Patrick R. Michaud
On Wed, Sep 27, 2006 at 11:59:32AM -0700, David Brunton wrote:
 A quick scan of S05 reveals definitions for these seven special named 
 assertions:
   [...]

I don't think that '...' or ... are really named assertions.

I think that !xyz (as well as +xyz and -xyz) are simply special forms
of the named assertion xyz.

I should probably compare your list to what PGE has implemented and see if
there are any differences -- will do that later tonight.

Pm



FYI: $job

2006-09-27 Thread Leopold Toetsch
Hi folks,

After a long period of fulltime parrot addiction, I have to reduce my parrot 
domestication time in favor of a day $job.

I'll try to follow & continue parrot development as time permits. Reduced dev 
time also implies that I will not use much time for reviewing or committing 
patches that anyone else[1] could handle as well, sorry.

leo

[1] @RESPONSIBLE_PARTIES et al.


Re: FYI: $job

2006-09-27 Thread Allison Randal
Congratulations! Many thanks for all the work you've done, and the work 
still to come. :)


Allison

Leopold Toetsch wrote:

Hi folks,

After a long period of fulltime parrot addiction, I have to reduce my parrot 
domestication time in favor of a day $job.


I'll try to follow & continue parrot development as time permits. Reduced dev 
time also implies that I will not use much time for reviewing or committing 
patches that anyone else[1] could handle as well, sorry.


leo

[1] @RESPONSIBLE_PARTIES et al.


Re: special named assertions

2006-09-27 Thread mark . a . biggar
The documentation should distinguish between those that are just pre-defined 
character classes (e.g., alpha and digit) and those that are special 
builtins (e.g., before ... and commit).  The former are things that you 
should be freely allowed to redefine in a derived grammar, while the 
second type may want to be treated as reserved, or at least mention that 
redefining them may break things in surprising ways.

--
Mark Biggar
[EMAIL PROTECTED]
[EMAIL PROTECTED]
[EMAIL PROTECTED]

 -- Original message --
From: Patrick R. Michaud [EMAIL PROTECTED]
 On Wed, Sep 27, 2006 at 11:59:32AM -0700, David Brunton wrote:
  A quick scan of S05 reveals definitions for these seven special named 
 assertions:
[...]
 
 I don't think that '...' or ... are really named assertions.
 
 I think that !xyz (as well as +xyz and -xyz) are simply special forms
 of the named assertion xyz.
 
 I should probably compare your list to what PGE has implemented and see if
 there are any differences -- will do that later tonight.
 
 Pm
 




Re: special named assertions

2006-09-27 Thread Patrick R. Michaud
On Wed, Sep 27, 2006 at 09:12:02PM +, [EMAIL PROTECTED] wrote:
 The documentation should distinguish between those that are just 
 pre-defined character classes (e.g., alpha and digit) and 
 those that are special builtins (e.g., before ... and commit).  
 The former are things that you should be freely allowed to redefine 
 in a derived grammar, while the second type may want to be 
 treated as reserved, or at least mention that redefining them may 
 break things in surprising ways.

FWIW, thus far in development PGE doesn't treat before ...
and commit as special built-ins -- they're subrules, same
as alpha and digit, that can indeed be redefined by 
derived grammars.

And I think that one could argue that redefining alpha or
digit could equally break things in surprising ways.  

I'm not arguing against the idea of special builtins or saying it's
a bad idea -- designating some named assertions as special/non-derivable 
could enable some really nice optimizations and implementation shortcuts  
that until now I've avoided.  I'm just indicating that I haven't
come across anything yet in the regex implementation that absolutely
requires that certain named assertions receive special treatment
in the engine.
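The "subrules, same as alpha and digit" behaviour Patrick describes can be sketched in Python: named assertions live in a per-grammar table, and a derived grammar overrides an entry simply by replacing it. The table layout and names are illustrative, not PGE's actual internals:

```python
# Each grammar is a table of named assertions; matching looks a rule up
# by name at match time, so built-ins and user rules are handled uniformly.
base_grammar = {
    "alpha": lambda s, pos: pos < len(s) and s[pos].isalpha(),
    "digit": lambda s, pos: pos < len(s) and s[pos].isdigit(),
}

# A derived grammar redefines <alpha> to also accept underscores.
derived_grammar = dict(base_grammar)
derived_grammar["alpha"] = lambda s, pos: (
    pos < len(s) and (s[pos].isalpha() or s[pos] == "_"))

def match_assertion(grammar, name, s, pos=0):
    """Dispatch a named assertion through the grammar's rule table."""
    return grammar[name](s, pos)

print(match_assertion(base_grammar, "alpha", "_x"))     # → False
print(match_assertion(derived_grammar, "alpha", "_x"))  # → True
```

Making some names non-derivable would mean bypassing this table lookup for them, which is the optimization opportunity Patrick mentions.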

Thanks,

Pm

  -- Original message --
 From: Patrick R. Michaud [EMAIL PROTECTED]
  On Wed, Sep 27, 2006 at 11:59:32AM -0700, David Brunton wrote:
   A quick scan of S05 reveals definitions for these seven special named 
  assertions:
 [...]
  
  I don't think that '...' or ... are really named assertions.
  
  I think that !xyz (as well as +xyz and -xyz) are simply special forms
  of the named assertion xyz.
  
  I should probably compare your list to what PGE has implemented and see if
  there are any differences -- will do that later tonight.
  
  Pm
  
 
 
 


Re: [svn:parrot-pdd] r14774 - in trunk: . docs/pdds/clip

2006-09-27 Thread Jonathan Worthington

Hi,

Some first thoughts that come to mind after reading leo's two proposals.


+A typical C structure:
+
+  struct foo {
+int a;
+char b;
+  };
+
+could be created in PIR with:
+
+  cs = subclass 'CStruct', 'foo'   # or maybe  cs = new_c_class 'foo'
+  addattribute cs, 'a'
+  addattribute cs, 'b'
+
+The semantics of a C structure are the same as those of a Parrot Class.
+But we need the types of the attributes too:
+
+Handwavingly TBD 1)
+
+with ad-hoc existing syntax:
+
+  .include datatypes.pasm
+  cs['a'] = .DATATYPE_INT
+  cs['b'] = .DATATYPE_CHAR
+
  
This (and the addattribute for native types) is one thing that would 
certainly simplify code generation for the .Net translator by 
eliminating various boxing and unboxing code that I emit now. I imagine 
it will help with other languages too.



+Handwavingly TBD 2)
+
+with new variants of the C<addattribute> opcode:
+  
+  addattribute cs, 'a', .DATATYPE_INT

+  addattribute cs, 'b', .DATATYPE_CHAR
  

Certainly preferable to syntax 1.


+Probably desired and with not much effort TBD 3):
+
+  addattribute(s) cs, 'DEF'
+int a;
+char b;
+  DEF   
  
I'm not so keen on this part of the proposal. It means the CStruct PMC 
needs to parse the above syntax (though at least that also means no 
additions to PIR parsing to support it; then again, the previous two 
suggestions didn't require any either).


I think if we could magically have the .DATATYPE_INT constants 
existing without needing to .include them, the previous syntax (number 2) 
would be preferable. It compiles down to just a sequence of bytecode 
instructions then, rather than a constants table entry for the string. 
But more importantly, all syntax checking is done at PIR compile time, 
whereas the string describing the struct elements and types would not be 
parsed until runtime, so typos in type names or general syntax errors 
aren't detected until then.



+The generalization of arbitrary attribute names would of course be
+possible too, but isn't likely needed.
  

Unsure what this means - please clarify this a bit.


+=head2 Syntax variant
+
+  cs = subclass 'CStruct', 'DEF
+struct foo {
+  int a;
+  char b;
+};
+  DEF
+
+I.e. create all in one big step.
  

Same issues as above.


+=head2 Object creation and attribute usage
+
+This is straightforward and conforms to current ParrotObjects:
+
+  o = new 'foo' # a ManagedStruct instance
+  setattribute o, 'a', 4711
+  setattribute o, 'b', 22
+  ...
+
+The only needed extension would be C<{get,set}attribute> variants with
+natural types.
  
This is the real place, of course, where the .Net translator (and I 
think other compilers) will save on spitting out box/unbox code.



+=head2 Nested Structures
+
+  foo_cs = subclass 'CStruct', 'foo'
+  addattribute(s) foo_cs, 'DEF'
+int a;
+char b;
+  DEF   
+  bar_cs = subclass 'CStruct', 'bar'

+  addattribute(s) bar_cs, 'DEF'
+double x;
+foo foo;# the foo class is already defined
  
May I suggest changing the second foo there to something else? I know it's the 
attribute name, but it made me scratch my head checking that something odd 
wasn't going on.

+foo *fptr;
+  DEF   
+  o = new 'bar'

+  setattribute o, 'x', 3.14
+  setattribute o, ['foo'; 'a'], 4711 # o.foo.a = 4711
+  setattribute o, ['fptr'; 'b'], 255
  
Can you describe the semantics of foo vs. *foo (or *fptr as it appears 
in the above code) more clearly? I guess it's just that in one case 
the foo structure is a part of the bar one, and in the other case it's a 
pointer to it, like in C? But please don't rely too much on knowledge of 
C semantics when describing Parrot ones.



+=head2 Array Structure Elements
+
+  foo_cs = subclass 'CStruct', 'foo'
+  addattribute(s) foo_cs, 'DEF'
+int a;
+char b[100];
+  DEF   
  

With bounds checking on accesses to b, right?


+=head2 Managed vs. Unmanaged Structs
+
+The term managed in current structure usage defines the owner of the
+structure memory. C<ManagedStruct> means that parrot is the owner of
+the memory and that GC will eventually free the structure memory. This
+is typically used when C structures are created in parrot and passed
+into external C code.
+
+C<UnManagedStruct> means that there's some external owner of the
+structure memory. Such structures are typically return results of 
+external code.

+
  
I think for safety reasons we will later want some way of only 
letting approved code run that uses UnManagedStructs, since with them 
anyone can segfault the VM in no time at all...but that's for a security 
PDD or something.



+This proposal alone doesn't solve all inheritance problems. It is also
+needed that the memory layout of PMCs and ParrotObjects deriving from
+PMCs is the same. E.g.
+
+  cl = subclass 'Integer', 'MyInt'
+
...
+
+With the abstraction of a C<CStruct> describing the C<Integer> PMC and
+with differently sized PMCs, we can create an object layout, where the
+C<int_val> attribute of C<Integer> and C<MyInt> are at 

Re: How to pass a ref from a language with no refs

2006-09-27 Thread Mark Stosberg
Mark Stosberg wrote:
 
 When Perl 5 has references and Perl 6 doesn't, I don't know what to
 expect to when I need to pass a hash reference to a Perl 5 routine.
 
 Such details make no appearance currently in the Perl 6 spec, but I'm
 trying to gather them on the wiki if you have anything to add:
 
 http://rakudo.org/perl6/index.cgi?using_perl_5_embedding

I saw there have been some commits lately to Perl5 embedding, so I tried
some experiments with pugs to figure out if I could determine reliable
ways to pass hashes and arrays to Perl5, so that they are received as
hashes, hashrefs, arrays, or arrayrefs, as appropriate.

I came up with the following test. As you can see, with arrays I was
able to pass them as a reference or not. However, when attempting to
pass a hash, it always came through as a hash, never flattened. Have I
missed something?

my $p5_dumper =
  eval('sub {use Data::Dumper; print Dumper(@_); }', :langperl5);

my @a = <b c d>;
$p5_dumper.(@a);  # received as array
$p5_dumper.([EMAIL PROTECTED]); # received as arrayref
$p5_dumper.(VAR @a);  # received as arrayref

my %h = ( a => 1 );
$p5_dumper.(@%h); # received as hashref
$p5_dumper.([,] %h);  # received as hashref
$p5_dumper.(|%h); # received as hashref
$p5_dumper.(%h);  # received as hashref
$p5_dumper.(\%h); # received as hashref
$p5_dumper.(VAR %h);  # received as hashref


Re: RFC: multi assertions/prototypes: a step toward programming by contract

2006-09-27 Thread Jonathan Lang

Minor nitpick:


Any types used will constrain multis to explicitly matching those types
or compatible types, so:

our Int proto max(Seq @seq, *%adverbs) {...}

Would not allow for a max multi that returned a string (probably not a
good idea).


IIRC, Perl 6 doesn't pay attention to the leading Int here except when
dealing with the actual code block attached to this - that is, Int
isn't part of the signature.  If you want Int to be part of the
signature, say:

   our proto max(Seq @seq, *%adverbs --> Int) {...}

More to the point, I _could_ see the use of type parameters here
(apologies in advance if I get the syntax wrong; I'm going by memory):

   our proto max(Seq of ::T [EMAIL PROTECTED], *%adverbs --> ::T) {...}

This would restrict you to methods where the return type matches the
list item type.
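Python's generics express the same constraint: a single type parameter ties the element type of the sequence to the return type. A small sketch (the function name is invented for illustration):

```python
from typing import Sequence, TypeVar

T = TypeVar("T")

# 'Seq of ::T ... --> ::T' in Python terms: whatever type the elements
# have, the result has that same type.
def typed_max(seq: Sequence[T]) -> T:
    return max(seq)

print(typed_max([3, 1, 2]))        # → 3 (int in, int out)
print(typed_max(["a", "c", "b"]))  # → c (str in, str out)
```

A static checker would flag any call site that assumed a return type different from the sequence's element type, which is the restriction Jonathan describes.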

--
Jonathan Dataweaver Lang


Re: RFC: multi assertions/prototypes: a step toward programming by contract

2006-09-27 Thread Aaron Sherman

Trey Harris wrote:

In a message dated Wed, 27 Sep 2006, Aaron Sherman writes:

Any thoughts?


I'm still thinking about the practical implications of this... but 
what immediately occurs to me:


The point of multiple, as opposed to single, dispatch (well, one of 
the points, and the only point that matters when we're talking about 
multis of a single invocant) is that arguments are not bound to a 
single type. So at first gloss, having a single prototype in the core 
for all same-named multis as in your proposal seems to defeat that 
use, because it does constrain arguments to a single type.


I certainly hope not, as I agree with you! That's not the goal at all, 
and in fact if that were a side effect, I would not want this to be 
implemented. The idea of having types AT ALL for protos was something 
that I threw in because it seemed to make sense at the end. The really 
interesting thing is to match signature shapes, not types. That is, max 
doesn't take two positional arguments, and a max that does is probably 
doing something that users of max will be shocked by. To this end, a 
programmer of a library *can* issue an assertion: all implementations of 
max will take one (no type specified) positional parameter and any 
number of adverbial named parameters (again, no type specified).
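The "signature shape, not types" assertion Aaron describes can be sketched in Python with `inspect.signature`: a proto records only the shape (one positional parameter plus a named-adverb slurpy), and each candidate multi is checked against that shape. Names and the shape encoding are invented for illustration:

```python
import inspect

def proto_shape(func):
    """Return (number of positional params, accepts-named-slurpy) for func."""
    params = list(inspect.signature(func).parameters.values())
    positional = [p for p in params
                  if p.kind in (p.POSITIONAL_ONLY, p.POSITIONAL_OR_KEYWORD)]
    has_adverbs = any(p.kind == p.VAR_KEYWORD for p in params)
    return (len(positional), has_adverbs)

def conforming_max(seq, **adverbs):   # one positional + named adverbs: ok
    return max(seq)

def bad_max(a, b, **adverbs):         # two positionals: shape mismatch
    return a if a > b else b

# The shape asserted by a hypothetical 'proto max($seq, *%adverbs)'.
proto = (1, True)
print(proto_shape(conforming_max) == proto)  # → True
print(proto_shape(bad_max) == proto)         # → False
```

No parameter types are consulted anywhere, matching Aaron's point that the check happens once at declaration time and plays no part in MMD dispatch.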


Notice that I keep saying no type specified, when in reality, we Perl 
6 programmers know that parameters default to type Any (it is Any now, 
right?) I don't see value in protos taking this into account. If there 
is value, then I'll bow to superior Perl 6 mojo on the part of whoever 
can point it out.


Remember that this is NOT part of the MMD system. Once a multi is 
declared, and passes any existing protos, the proto no longer has any 
relevance, and is never consulted for any MMD dispatch. It is forgotten 
(unless a new multi is defined later).


Does that help to remove any concerns? Adding in types is fine, and I 
have no problem with it, but adding in types should probably not be 
something done in core modules without heavy thought.


In other words, to use your proposal, our proto moose (Moose $x:) 
should assert not just that all calls to the multi moose will have an 
invocant that does Moose, but also that all objects of type Moose will 
work with a call to the multi moose.  That may have been implicit in 
your proposal, but I wanted to make it explicit.


If you specify such types, OK, that seems fair. Side point: the multi 
moose is a pretty darned funny turn of phrase ;)


All that said, the globalness of multis does concern me because of the 
possibility of name collision, especially in big systems involving 
multis from many sources.  Your proposal would at least make an 
attempt to define a multi not type-conformant with a core prototype 
throw a compile-time error, rather than mysterious behavior at runtime 
when an unexpected multi gets dispatched.


Say signature-conformant there, and I'm in full agreement.



Synopses on the smoke server are a bit out-of-date

2006-09-27 Thread Agent Zhang

Hi~

lanny noticed yesterday that the Synopses on the smoke server were
different from the ones on feather. Because I am maintaining the
feather ones, I know the synopses there are being resync'd every hour
as expected.

As of this writing, the feather synopses are at r12466 while the ones
on the smoke server are at r12432. I'm just wondering what the resync
cycle of the cron setting for the latter is. Hopefully it's still
working.

Maybe iblech++ and malon++ could look into this issue?

Regards,
Agent