T and L parameter types for NCI

2006-01-02 Thread Dan Sugalski
I just went, after ages, and sync'd up with a current parrot for my 
work project. Fixing things to work with the changes has been... 
interesting.


The big hang up has been the removal of the T and L parameter types 
for NCI calls. T was a string pointer array and L was a long array. 
They're still documented in call_list.txt, there are still references 
to them in parts of the library, and there are fragments of the code 
for them in nativecall.pl.
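For anyone trying to picture what went away, here's a minimal sketch -- purely illustrative, not from postgres or any real library -- of the kind of C signature the removed types marshalled to:

    /* Illustrative only -- not from any real library.  A native function
     * whose parameters the old "T" (char **) and "L" (long *) NCI types
     * would have covered. */
    #include <stdio.h>

    void example_native_call(char **strings, long *longs, int count)
    {
        for (int i = 0; i < count; i++)
            printf("%s => %ld\n", strings[i], longs[i]);
    }

    int main(void)
    {
        char *strings[] = { "foo", "bar" };
        long  longs[]   = { 1, 2 };
        example_native_call(strings, longs, 2);
        return 0;
    }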


Change 9260 did this back in September (yes, it has been ages, I'm 
just syncing up now). This breaks the postgres.pir interface code -- 
making calls into postgres now isn't possible, as the interpreter 
pukes and dies when you try.


Are there alternatives? The documentation for this stuff is worse now 
than when I wrote it originally, and it's not clear what needs to be 
done to duplicate the functionality of the removed call types.

--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: Ordered Hashes -- more thoughts

2005-06-09 Thread Dan Sugalski

At 4:05 PM -0400 6/8/05, Tolkin, Steve wrote:
Summary: An ordered hash that does not support deletes could cause a 
user visible bug.  At a minimum it should support the special case 
of delete that is supported by the Perl each() operator.


Details: This Week in Perl 6, May 25, 2005-May 31, 2005 
http://www.perl.com/pub/a/2005/06/p6pdigest/20050602.html 
has a brief discussion of Ordered Hashes with this link 
http://groups-beta.google.com/group/perl.perl6.internals/browse_frm/thread/86466b906c8e6e10/24a935c5c2c71aa8#24a935c5c2c71aa8 
where Dan Sugalski says: "I'd just pitch an exception if code 
deletes an entry" ...


Perhaps this is OK, because this code is intended for internal use 
only.  But people like to reuse code, and if anyone writes an 
ordered hash module on top of this code it will have a bug.


Which is why it ought not get reused.

The whole point of the original ordered hash was to support lexical 
pads as fast as possible while still allowing by-name lookup for 
introspective code. Doing anything that compromises fast array-based 
lookup would be ill-advised for that. If it makes subclassing tough, 
well... subclassing continuations is likely going to be problematic 
too, but that's fine.


Reuse is good, but not everything needs to be reusable. Special-purpose 
data structures are just fine too.

--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: What the heck is... wrong with Parrot development?

2005-06-07 Thread Dan Sugalski
I didn't realize I was still on this list (which I'm going to take 
care of in a bit) but since I'm seeing this...


At 5:07 PM -0700 6/6/05, Edward Peschko wrote:
But it still strikes me as odd. I saw some tension in the perl6 
mailing lists, but nothing that would have suggested that *this* 
would happen.


There was a *lot* of tension. It didn't go anywhere because I always 
insisted that the internals list stay professional and polite, and 
that applied to me as well. (And when I didn't, well... I've a 
collection of mail taking me to task about it)


No, things weren't particularly happy in parrot land. And no, you 
generally didn't see it. And no, it has nothing to do with Larry. And 
no, I'm not going to go into it here -- this isn't the place for it.

--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: Missing MMD default functions?

2005-06-04 Thread Dan Sugalski

At 8:14 PM -0400 6/3/05, Chip Salzenberg wrote:

On Fri, Jun 03, 2005 at 02:55:52PM -0400, Dan Sugalski wrote:

 Dan was expecting sane defaults, that is when I do addition with two
 PMCs that haven't otherwise said they behave specially that the
 floating point values of the two PMCs are retrieved and added
 together.


Is deriving from Float a hardship?
(This is not a rhetorical question.)


Mildly, yes. But... I'm not going to argue any more. It isn't worth 
it. Do whatever you think is best, and if there's any followup you 
think I should care about it'd be best to cc me, since I'm not on the 
list any more.

--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: Missing MMD default functions?

2005-06-03 Thread Dan Sugalski

At 9:23 AM +0200 6/3/05, Leopold Toetsch wrote:

Dan Sugalski [EMAIL PROTECTED] wrote:

 I sync'd up with subversion this afternoon, and I'm finding that a
 *lot* of things that used to work for me are now breaking really
 badly. Specifically where there used to be sane fallbacks for pretty
 much all of the MMD functions now we've got nothing and I'm having to
 install a lot of crud I never used to have to.


You are not very verbose about what actually fails, but I presume that
you are speaking of ParrotObjects, which happened to call the mmd
fallback functions in the absence of a C<mmdvtregister> overload.


 I assume that we didn't throw away all the default functions on
 purpose, since that'd be more than a little foolish.


Well, whether this is sane or foolish is probably a matter of taste or usage.
I think that arbitrary objects shouldn't have floating point
mathematical semantics.


Then they don't implement the floating point accessor vtable functions.


All the fallback functions were just duplicating code from (mostly)
float.pmc or integer.pmc.


Right, so to reduce code duplication you remove stuff that's working 
so people have to go reimplement the code. That makes *perfect* sense.

--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: Missing MMD default functions?

2005-06-03 Thread Dan Sugalski

At 2:50 PM +0200 6/3/05, Leopold Toetsch wrote:

Dan Sugalski wrote:

Right, so to reduce code duplication you remove stuff that's 
working so people have to go reimplement the code. That makes 
*perfect* sense.


I've announced and summarized all these changes, e.g.
http://xrl.us/gayp  on Apr. 8th


It's been a long time since I sync'd up. I *assumed* that you 
wouldn't break stuff. I don't know why.



And, what is wrong about:

  cl = subclass Float, MyFloat


Why should I have to subclass anything to get basic functionality?

Don't bother answering that one. Having to deal with this sort of 
crap is the single biggest reason I bailed. I'm happy to not have to 
do so, and I'm going to keep on being bailed. Do whatever you want, 
you're someone else's problem now.

--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: Missing MMD default functions?

2005-06-03 Thread Dan Sugalski

At 2:21 PM -0400 6/3/05, Chip Salzenberg wrote:

One could argue that by providing __get_integer, Foo class is
automatically implying that it would serve where an Integer would.
This is obviously what Dan was expecting.  :-,


Dan was expecting sane defaults, that is when I do addition with two 
PMCs that haven't otherwise said they behave specially that the 
floating point values of the two PMCs are retrieved and added 
together.


Y'know, like people would generally expect from all the languages in 
the core set parrot cares about.

--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Missing MMD default functions?

2005-06-02 Thread Dan Sugalski
I sync'd up with subversion this afternoon, and I'm finding that a 
*lot* of things that used to work for me are now breaking really 
badly. Specifically where there used to be sane fallbacks for pretty 
much all of the MMD functions now we've got nothing and I'm having to 
install a lot of crud I never used to have to.


I assume that we didn't throw away all the default functions on 
purpose, since that'd be more than a little foolish. Is this stuff 
being worked on, or shall I take some time to throw the default code 
back into the MMD subsystem?

--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


re: Keys

2005-06-01 Thread Dan Sugalski

At 7:10 PM -0700 5/31/05, TOGoS wrote:

  The 'used as' type indicates whether this key is to be used to do a
 by-integer-index (array) access or by-string-index (hash) access.


Why not extend this to properties, too?


Because properties (and attributes) are metadata. At the moment 
properties aren't ordered, but they could be I suppose. (well... 
maybe) Attributes certainly are. You may want to access them by name 
or by offset, in which case you wouldn't want an as-property flag in 
the key, but rather use a key structure to access properties and 
attributes.

--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: [PATCH]Loop Improvements

2005-05-31 Thread Dan Sugalski

At 10:19 AM +0200 5/31/05, Leopold Toetsch wrote:

Curtis Rawls (via RT) wrote:


This patch makes improvements to the loop struct.


Thanks, applied - r8219

BTW: would you like to take a look at the register allocator?

It works but consumes enormous amounts of resources for e.g. Dan's 
Evil Subs[1]. I've here an IIRC slightly modified version of Bill's 
original patch, which I could sync to current Parrot.


When patches that improve assembly time of Big Evil code go in, feel 
free to ping me and I'll give things a whirl. I'm not generally 
keeping up to date with parrot builds on the system I'm doing my dev 
work on, so it can be a while before I notice. ("A while" here being a 
month or more, sometimes.)


Believe me, speeding up compile times would make Dan a Happy Camper(tm) :)
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Keys

2005-05-31 Thread Dan Sugalski

Since this is coming back up, and a ref's in order...

The way keyed access is supposed to work is this.

A key structure is an array of three things:

   1) A key
   2) A key type
   3) A 'used as' type

The key can be an integer, string, or PMC, and is, well, the key. The 
thing we use to go looking up in an aggregate.


The key type notes the type of a key -- whether it's an integer, a 
string, or a PMC. Nothing at *all* fancy, basically a note of what 
thing is in the key union slot.


The 'used as' type indicates whether this key is to be used to do a 
by-integer-index (array) access or by-string-index (hash) access.


So, code like:

$a{'foo'}

would generate a key struct that looks like:

 'foo'
 string
 as-hash

while $a['foo'] generates:

 'foo'
 string
 as-integer

and $a{$b} generates

 $b
 PMC
 as-hash

and $a{3} generates

 3
 integer
 as-hash

*If* a PMC supports it, it may complain if the basic type is 
incorrect (that is, you pass in a string for array access) but 
generally that's a bad thing -- the PMC should just convert. (All the 
languages we care about do this, as far as I know)


Keys are multidimensional -- that is, we support real 
multidimensional arrays, hashes, arrays of hashes, hashes of arrays, 
and so on. That means BASIC code like:


A[1,2,3]

generates a key that looks like:

 1
 integer
 as-integer
 2
 integer
 as-integer
 3
 integer
 as-integer


and perl code like:

$a{'a'}[2]{$b}

*should* get you a structure like:

'a'
string
as-hash
2
integer
as-integer
$b
pmc
as-hash

It is *perfectly* valid to pass in a multidimensional key to a 
one-dimensional aggregate, or an n+m dimensional key to an 
n-dimensional aggregate. In that case the aggregate *must* consume 
as much of the key as it can, use that to fetch out a PMC, and then 
make a keyed access call to the resulting PMC with the remainder of 
the key structure. (For those of you old-timers who remember, this is 
why the key stuff was done as a singly-linked list once upon a time.)
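As a rough sketch of what one component of such a key could look like in C -- the names and layout here are illustrative, not Parrot's actual key structures:

    /* Illustrative sketch only -- not Parrot's real key code. */
    typedef enum { KEY_INTEGER, KEY_STRING, KEY_PMC } key_type;   /* what's in the slot   */
    typedef enum { USE_AS_INTEGER, USE_AS_HASH } key_used_as;     /* array vs hash access */

    typedef struct key_component {
        union {                      /* 1) the key itself       */
            long        int_key;
            const char *str_key;     /* stand-in for a STRING * */
            void       *pmc_key;     /* stand-in for a PMC *    */
        } u;
        key_type    type;            /* 2) the key type         */
        key_used_as used_as;         /* 3) the 'used as' type   */
    } key_component;

    /* A multidimensional key like A[1,2,3] is then just an array of these
     * components; an aggregate consumes as many as it can, fetches a PMC,
     * and passes the remainder of the array along to that PMC. */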


This scheme is fairly straightforward and has the advantage of 
allowing true multidimensional data structures (which we really want 
for space efficiency) and still handling the more legacy 
one-dimensional aggregates of references scheme that, say, perl 5 
uses.

--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: refcounts and DOD

2005-05-26 Thread Dan Sugalski

At 6:08 PM -0400 5/25/05, Michal Wallace wrote:

So I'm still thinking about a generic wrapper for python modules. I 
would like to be able to recompile the python standard library (and 
other libraries) to run on parrot with only a few minor patches.


If you're doing this to make the python library parrot extensions, 
then just go and make the inc/dec macros noops, since they're not 
necessary what with us tracking the pointers and all. (The only time 
they'll be needed is if the pointers to the refcounted things are in 
places parrot can't find 'em)
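A minimal sketch of that no-op approach, assuming the CPython macro names (the shim itself is hypothetical, not part of Parrot):

    /* Hypothetical shim illustrating "make the inc/dec macros noops" for
     * python C code being rebuilt as parrot extensions.  Assumes the
     * CPython macro names. */
    #ifdef Py_INCREF
    #  undef Py_INCREF
    #  undef Py_DECREF
    #endif

    /* Parrot's GC tracks the pointers itself, so the refcount twiddling
     * simply disappears; (void)(op) keeps the argument "used" without
     * side effects. */
    #define Py_INCREF(op) ((void)(op))
    #define Py_DECREF(op) ((void)(op))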

--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: ordered hash thoughts

2005-05-25 Thread Dan Sugalski

At 3:34 PM +0200 5/25/05, Leopold Toetsch wrote:
The OrderedHash PMC provides indexed access by a (string) key as 
well as indexed access by insertion order. It's currently 
implemented as a hash holding the index value into the data array.
The problem is of course deleting items (and adding items w/o string 
key). The former is done by storing a new Undef item, the latter 
just messes up the whole thing, IIRC.


Given the usage this is supposed to be for, that is:


E.g.:
- lexicals
- PMC/class types
- object attributes
- constant string table
- ...


I'd just pitch an exception if code deletes an entry or adds one with 
no string key. Or if you're going to make it work maybe it'd be worth 
renaming the current PMC class into something else and creating a new 
class. (Assuming the changes would slow down or add complexity to the 
current class, since we'd really like it fast and simple enough to be 
reasonably auditable)

--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: Parrot as an extension language

2005-05-20 Thread Dan Sugalski
At 1:15 AM +0800 5/21/05, Autrijus Tang wrote:
On Sat, May 21, 2005 at 12:53:15AM +0800, Autrijus Tang wrote:
 Yeah, I bumped against that too.  You need to look at the strstart
 field in the ParrotString struct.
 In Haskell I use:
 peekCString = #{peek STRING, strstart} s5
Actually, never mind; string_to_cstring is the way to go.
Well, mostly. string->cstring conversion is potentially lossy, if for 
no other reason than embedded nulls will get in your way. I see we're 
not exposing anything to do that, though, which we ought to fix.
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: Parrot as an extension language

2005-05-20 Thread Dan Sugalski
At 8:10 PM +0100 5/20/05, Colin Paul Adams wrote:
  "Leopold" == Leopold Toetsch [EMAIL PROTECTED] writes:
Leopold> Colin Paul Adams [EMAIL PROTECTED] wrote:
 I have a problem with this - namely that the function is
 variadic, and the interface generator can't cope with this.
Leopold> Have a look at src/inter_run.c e.g.
Leopold> void * Parrot_runops_fromc_arglist(Parrot_Interp
Leopold>     interpreter, PMC *sub, const char *sig, va_list args)
Despite what I said before, this is actually worse than
Parrot_call_sub.
The interface generator ignores it completely, rather than generating
an incorrect signature (I suppose that's an improvement really, except
then it can't even be used for subroutines which accept zero
arguments).
Then the question is: What'd work? There are limits to what's going 
to be useful if you're not in a position to do full variadic function 
calls, and I'm not sure that we should put too many special cases in. 
On the other hand, I know how much of a pain this can be, since I did 
the first implementation of parrot's NCI interface, which has the 
same sorts of issues to deal with.

So, I see four real options:
1) Someone fixes the Eiffel interface generator to understand C 
variadic functions.
2) We provide a function and method call interface that assumes 
you've already pre-filled in the registers according to parrot's 
calling conventions
3) We build some sort of really simple call interface that takes an 
array PMC with the parameters stuffed into it
4) Parrot provides some sort of facility to autogenerate shim 
functions based on a passed-in signature

None of them are particularly good, and they'll all potentially cause 
you problems with an interface generator. 1's the best (and not just 
because it means we don't have to do anything :) but that's probably 
untenable. #4 is probably the next easiest thing, but I'm not sure 
that you could use that without at least some tweaks to the interface 
generator, and it'd be a bit dodgy on systems we don't have JIT 
capabilities on. #s 2 & 3 are sub-optimal, and either require some work 
on the caller's part (or changes to the interface generator) or kinda 
limit what you can do.

Teaching the interface generator about variadic functions is probably 
the easiest thing -- in principle it's not that tough (Hey, I did 
one, I get to say that :) though that does depend on what the code in 
the interface generator looks like.
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: Parrot as an extension language

2005-05-20 Thread Dan Sugalski
At 4:35 PM -0400 5/20/05, C. Scott Ananian wrote:
On Fri, 20 May 2005, Dan Sugalski wrote:
Well, mostly. string->cstring conversion is potentially lossy, if 
for no other reason than embedded nulls will get in your way. I see 
we're not exposing anything to do that, though, which we ought to 
fix.
pascal-style strings (ie, char* and length) are the canonical way to
fix this.  C code generally doesn't have too much trouble replacing
strcpy with memcpy, etc...
That's what parrot strings are. (Well, a bit more than that, since we 
carry around encoding and charset information as well) It's only when 
you drop down to C-style blob o' memory with an in-band EOS marker 
that you run into trouble.

There are interfaces in the extension system to get a void * and 
length back from a PMC when fetching string data out, but I see we 
don't have that for plain strings. I'll probably fix that this 
weekend if someone doesn't beat me to it.
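Presumably the fix would be something shaped like this -- a buffer-plus-length accessor so embedded nulls survive; the name and types below are hypothetical, not an existing Parrot call:

    /* Hypothetical shape of a buffer+length accessor for plain strings.
     * The caller gets the byte buffer and its length separately, so an
     * embedded NUL can't truncate anything.  Name and types are made up. */
    #include <stddef.h>

    typedef struct parrot_string parrot_string;   /* opaque stand-in for STRING */

    const void *example_string_byte_buffer(const parrot_string *s, size_t *len_out);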
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: Useful task -- Character properties

2005-05-04 Thread Dan Sugalski
At 10:21 AM -0500 5/4/05, Patrick R. Michaud wrote:
On Tue, May 03, 2005 at 09:22:11PM +0100, Nicholas Clark wrote:
 Whilst I confess that it's unlikely to be me here, if anyone has the time
 to contribute some help, do you have a list of useful self-contained tasks
 that people might be able to take on?
Actually, overnight I realized there's a relatively good-sized
project that needs figuring out -- identifying character properties
such as isalpha, islower, isprint, etc.  Here I'll briefly sketch
how I'd like it to work, and maybe someone enterprising can take
things from
I'd planned on everything else going into constructed character 
classes. I'd figured the named classes would correspond to the major 
regex classes (things represented by \X sequences) while the 
constructed classes would handle everything else and more or less 
correspond to [] style sequences.

I thought I'd put in some docs to that effect, but apparently not. :(
--
Dan
--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: parrot and refcounting semantics

2005-05-02 Thread Dan Sugalski
At 8:59 AM +0100 5/1/05, sphillips wrote:
/lurk
I have been enjoying the recent discussion of GC vs refcounting. Thanks.
While you're rehashing/justifying sensible design decisions made 
years ago ;-) I was wondering why you decided to roll your own GC 
rather than use an established one, e.g. Hans Boehm's.
I did look at the Boehm collector when we started this. It's a good 
piece of software and works really well. The reason I didn't go with 
it was basically one of speed.

The Boehm collector is a conservative collector -- that is, it 
doesn't make any assumptions about what is and isn't a pointer to a 
collectable thing, and errs on the side of caution. For the uses 
people put it to, that's a good thing indeed. That caution brings two 
issues along with it, though. The first is that it's relatively slow, 
since it can't make any assumptions about what is and isn't a 
pointer, and therefore has to treat everything as if it might be a 
pointer and test it accordingly. The second issue is that since it 
can't assume things really are pointers or not you can have random 
memory mis-identified as a pointer to something and therefore have an 
object left as live when it really isn't. Neither of these problems 
is a bug per se in the Boehm collector -- it's one of those nature of 
the beast things.
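A tiny standalone illustration of that mis-identification (nothing to do with Parrot's sources): once some data happens to carry the same bit pattern as a heap address, a conservative scan has to assume it might be a pointer.

    /* Standalone illustration of the conservative-GC problem. */
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int main(void)
    {
        long *obj = malloc(sizeof *obj);         /* a heap-allocated "collectable" */
        uintptr_t addr = (uintptr_t)obj;

        double disguised;                        /* a float that, by bad luck,     */
        memcpy(&disguised, &addr, sizeof addr);  /* carries the same bit pattern   */

        /* A conservative collector scanning this frame sees the bits of
         * 'disguised', can't prove they aren't a pointer, and must keep
         * 'obj' alive.  An exact collector knows the slot holds a double
         * and ignores it. */
        printf("double bits happen to equal heap address %p\n", (void *)obj);

        free(obj);
        return 0;
    }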

Parrot, though, because of what it is, doesn't need a conservative 
collector. We can know exactly where pointers to collectable things 
live, which means that we can both be faster (no pointer tests, no 
spending time on memory that can't be pointers) and exact. That means 
collection runs don't leave objects alive when they're not because an 
IEEE float or bitvector was mis-identified as pointing to something 
real, and it means that collection runs are faster since we check 
fewer places for pointers, the pointer checks are faster (no checking 
to see if it's a real pointer), and there are on average fewer 
objects alive at any one time. The only place this falls down is when 
we check the system stack, but it's generally only a few dozen words, 
so the extra overhead's OK, and it certainly beats the alternative. 
(Which is crashing)

Basically if you're in a position to build an exact collector you'll 
get a nice speed win over using a conservative one. If you can reduce 
the uncertainty you get a speed boost. A lot of programs aren't in a 
position to do that, which is fine. Parrot, because of what it is, 
*is* in a position to do so, so we did.
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: parrot and refcounting semantics

2005-04-30 Thread Dan Sugalski
At 11:12 PM -0400 4/29/05, Bob Rogers wrote:
   From: Dan Sugalski [EMAIL PROTECTED]
   Date: Fri, 29 Apr 2005 15:23:47 -0400
   At 10:55 PM -0400 4/28/05, Bob Rogers wrote:
   From: Robin Redeker [EMAIL PROTECTED]
   I'm astounded.  Do neither of you ever design data structures with
   symmetrical parent-child pointers?  No trees with parents?  No
   doubly-linked lists?  In my (probably skewed) experience, circular
   references are used frequently in languages like C or Lisp that don't
   penalize them.
   I responded to Uri on this, but note that I said neither are
   terribly common, and they aren't. Relative to the total number of
   GC-able things, objects in circular structures are a very small
   minority.
I can think of many programs I've written or otherwise hacked on where
this is not the case.  In some cases, the majority of objects are
directly circular (i.e. part of a cycle as opposed to being referenced
from an object in a cycle).  But I suppose that just means that we've
worked on very different apps.  Before Perl5, I used to use parent
pointers at the drop of a hat.  But, 'nuff said, I guess.
Actually I think you'll find that isn't the case, though it's easy to 
overlook, and is probably mostly my sloppy terminology. Programs, 
when they run, tend to create a lot of what I've been calling 
objects, but for parrot they'd be PMCs, and not necessarily actually 
real objects as such.

Languages parrot's going for -- perl, python, ruby, php, tcl, and 
their ilk -- do chew through a lot of temps. For the GC's purposes, and 
mine, they count. When you add those in, the plain GCable thing count 
tends to skyrocket. Yeah, the *important* data structures in a 
program may be partly, or mostly, in circular structures, but the 
hundred thousand temps that get created to hold intermediate values 
are all uncircular things. This is definitely not the case for 
languages like C++ where you're a lot closer to the metal, but we're 
a bit further away from it here. (It'd be interesting, since parrot 
*can* keep stats, to run some complex programs and see what the GC 
numbers look like. I ought to turn it on for some of the 
longer-running reports I have and see what there is to see, though 
the results may be as much a condemnation of my compiler as anything 
else. :)
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: parrot and refcounting semantics

2005-04-30 Thread Dan Sugalski
At 7:50 PM +0200 4/30/05, Robin Redeker wrote:
Hi!
Just a small question:
On Thu, Apr 28, 2005 at 04:37:21PM -0400, Dan Sugalski wrote:
 If you don't have the destroy, and don't tag the object as needing
 expedited cleanup, then the finalizer *will* still be called. You
 just don't have any control over when its called.
Will there be destructors on imcc or language level?
Yes. Or no, depending on your terminology. We're calling them 
finalizers since that's what they really are, as there's no memory to 
destroy. There's a vtable method that's called by the GC system when 
an object is no longer reachable from the root set.

 And if so, what
would the purpose of them be?
It's there so that any sort of cleanup that needs doing -- closing DB 
handles or files, destroying windows, updating stats -- can be done. 
You don't have to free memory unless you've managed to allocate it in 
a way that parrot's not tracking.
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: parrot and refcounting semantics

2005-04-30 Thread Dan Sugalski
At 9:19 AM +0200 4/30/05, Leopold Toetsch wrote:
Dan Sugalski [EMAIL PROTECTED] wrote:
  ... We should probably make it 'safe' by forcing the
 destroyed PMC to be an Undef after destruction, in case something was
 still referring to it.
That sounds sane. Or maybe: convert to an Undef and put a Null PMC
vtable into it which would catch any further access to that PMC.
BTW shouldn't we really separate C<destroy> and C<finalize>? The latter
would be overridable by user code, the former frees allocated memory.
Nope, I don't think so. There's really only one action -- "You are 
dead. Go clean up after yourself" -- that PMCs should be getting. 
There's no need to clean up memory since we do that for you 
automatically, and if you have to release memory back to a 
third-party library it's part of the cleaning up after yourself bit.

I can't really think of a reason to have two cleanup actions. Maybe 
I'm missing something here.
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: parrot and refcounting semantics

2005-04-29 Thread Dan Sugalski
At 3:05 PM +0200 4/29/05, Leopold Toetsch wrote:
Gerald Butler [EMAIL PROTECTED] wrote:
 Isn't there something like:

{
my $s does LEAVE { destroy $s } = new CoolClass;
# ... do stuff that may throw ...
}

 Or something like that?
Not currently. There used to be a C<destroy> opcode, but I've deleted
it, because I thought it was too dangerous.
We really need to put it back in -- I knew it was dangerous, but it 
was necessary. We should probably make it 'safe' by forcing the 
destroyed PMC to be an Undef after destruction, in case something was 
still referring to it.

--
Dan
--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: parrot and refcounting semantics

2005-04-29 Thread Dan Sugalski
At 12:37 AM -0400 4/29/05, Uri Guttman wrote:
  RR == Robin Redeker [EMAIL PROTECTED] writes:
  RR I don't think circular references are used that much. This is
  RR maybe something a programmer still has to think a little bit
  RR about.  And if it means, that timely destruction maybe becomes
  RR slow only for the sake of collecting circular references... don't
  RR know if thats a big feature.
ever do any callback stuff? an object needs to be called back from a
service (say a i/o handler). it must pass itself to the service to be
stored there. the service returns a handle which needs to be stored in
the object so it can be used to manage it (start/stop/abort/etc.). there
is a quick circular ref.
Oh, I realize that, along with a number of other useful uses of 
circular references. I can't speak for Robin, but when I said 
circular refs weren't that common I was talking about the overall 
number of things. The large majority of objects are dead-stupid things 
that have no finalizers and no references to anything; the second 
largest (and definitely smaller) group is objects that refer to other 
objects in a non-circular fashion, then circular objects, then 
objects that need timely destruction.

Anyway, that usage pattern argues for efficiently handling simple PMCs 
first, reference/aggregate PMCs second, and ones with timely 
destructors last, which is how parrot's DOD/GC system's set up.
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: parrot and refcounting semantics

2005-04-29 Thread Dan Sugalski
At 10:55 PM -0400 4/28/05, Bob Rogers wrote:
   From: Robin Redeker [EMAIL PROTECTED]
   Date: Thu, 28 Apr 2005 00:12:50 +0200
   Refcounting does this with a little overhead, but in a fast and
   deterministic O(1) way.
This is the first half of an apples-to-oranges comparison, and so is
misleading even if partly true.  Refcounting may be proportional
(approximately) to the amount of reference manipulation, but GC is
proportional (though even more approximately, and with a different
constant) to the amount of memory allocated [1].
Actually it's proportional to the number of live objects.
A refcounting scheme has to touch *all* objects at least twice, while 
a tracing scheme generally has to touch only the objects that are 
live at trace time. For the most part, refcount O(n) time is 
proportional to the total number of objects created, while tracing 
O(n) time is proportional to the number of live objects.

It's definitely possible to work up degenerate examples for both 
refcount and tracing systems that show them in a horribly bad light 
relative to the other, but in the general case the tracing schemes 
are significantly less expensive.

   From: Dan Sugalski [EMAIL PROTECTED]
   Date: Thu, 28 Apr 2005 13:10:00 -0400
   . . .
   I don't think circular references are used that much. This is maybe
   something a programmer still has to think a little bit about.
   And if it means, that timely destruction maybe becomes slow only for the
   sake of collecting circular references... don't know if thats a big
   feature.
   Circular references are far more common than objects that truly need
   timely destruction, yes, and the larger your programs get the more of
   an issue it is. Neither are terribly common, though.
I'm astounded.  Do neither of you ever design data structures with
symmetrical parent-child pointers?  No trees with parents?  No
doubly-linked lists?  In my (probably skewed) experience, circular
references are used frequently in languages like C or Lisp that don't
penalize them.
I responded to Uri on this, but note that I said neither are 
terribly common, and they aren't. Relative to the total number of 
GC-able things, objects in circular structures are a very small 
minority. Which, of course, doesn't help much as an app designer when 
you have to deal with these things, but it is important to know when 
doing the design of the back end, since relative usage of features 
needs to be kept in mind when making design tradeoffs. One of those 
annoying engineering things. (Just once I'd love to have my cake 
*and* eat it too, dammit! :)
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: parrot and refcounting semantics

2005-04-28 Thread Dan Sugalski
At 5:57 PM +0200 4/28/05, Robin Redeker wrote:
On Wed, Apr 27, 2005 at 03:43:32PM -0400, Dan Sugalski wrote:
 At 5:40 PM +0200 4/27/05, Robin Redeker wrote:
 Just for the curious me: What was the design decision behind the GC
 solution? Was refcounting that bad? Refcounting gives a more global
speed hit indeed, but it's more deterministic and you won't run into
 (probably) long halts during GC. (Java programs often suffer from this,
 and it results in bad latency).
 I'll answer this one, since I'm the one responsible for it.
 Refcounting has three big issues:
 1) It's expensive
 2) It's error-prone
 3) It misses circular garbage
[...]
 The expense is non-trivial as well. Yeah, it's all little tiny bits
 of time, but that adds up. It's all overhead, and useless overhead
 for the most part.
Yes, but do we know whether refcounting is really slower than a garbage
collector in the end?
Yes, we do. This is a well-researched topic and one that's been gone 
over pretty thoroughly for the past twenty years or so. There's a 
lot of literature on this -- it's worth a run through citeseer for 
some of the papers or, if you've a copy handy in a local library, the 
book "Garbage Collection" by Jones and Lins, which is a good summary 
of most of the techniques, their costs, drawbacks, and implementation 
details.

  The circular garbage thing's also an issue. Yes, there are
 interesting hacks around it (python has one -- clever, but definitely
 a hack) that essentially involve writing a separate mark & sweep
 garbage collector.
I don't think circular references are used that much. This is maybe
something a programmer still has to think a little bit about.
And if it means, that timely destruction maybe becomes slow only for the
sake of collecting circular references... don't know if thats a big
feature.
Circular references are far more common than objects that truly need 
timely destruction, yes, and the larger your programs get the more of 
an issue it is. Neither are terribly common, though.

Are circular references such a big issue in perl5? I heard that the
builtin GC in perl5 only runs at program end and captures the circular
references, which sometimes causes a segfault (as a friend of mine 
experienced).
Perl 5, like all other refcounting GC systems, is essentially 
incremental and things get collected as the program runs. There's a 
final global destruction sweep that clears up anything still alive 
when the program exits, but this sweep doesn't guarantee ordering 
(and it really can't in the general case when there's circular 
garbage) and can in some cases cause segfaults if you've got 
finalizers written in C that don't properly handle last-gasp out of 
order cleanup.

  The thing is, there aren't any languages we care about that require
 true instant destruction -- the ones that care (or, rather, perl.
 Python and Ruby don't care) only guarantee block boundary collection,
 and in most cases (heck, in a near-overwhelming number of cases)
 timely destruction is utterly irrelevant to the code being run. Yeah,
 you *want* timely destruction, but you neither need nor notice it in
 most program runs, since there's nothing that will notice.
In many program runs you won't notice the overhead of refcounting either.
And in scripts that only run up to (at most) a minute, you won't even
notice if the memory isn't managed at all.
We're building a general purpose engine, remember. It needs to handle 
programs with 10 objects that run in 10ms as well as ones with 10M 
objects that run for 10 weeks.

That argument actually favors non-refcount GC -- there's a minor 
speed win to a non-refcount system at the short end of program runs 
and a significant one at the large end of program runs.

And timely destruction is still a feature that's used much more than
collection of circular references would be (IMHO).
In this case, YHO would turn out to be incorrect. Don't get me wrong, 
it's a sensible thing to think, it's just that it doesn't hold up on 
closer inspection.

  Having been too deep into the guts of perl, and having written more
 extensions in C than I care to admit to, I wanted refcounting *dead*.
 It's just spread across far too much code, tracking down errors is a
 massive pain, and, well, yech. Yes, non-refcounting GC systems are a
 bit more complex, but the complexity is well-contained and
 manageable. (I wrote the first cut of the GC system in an afternoon
 sitting in my local public library) There's also the added bonus that
 you can swap in all sorts of different GC schemes without disturbing
 99% of the code base.
Just because refcounting is error-prone it doesn't mean that a garbage
collector is better (and less error-prone).
I agree, the code is more localized. But I guess that memory leaks
(and resource leaks) caused by a bug in a garbage collector aren't
that easy to find and fix either.
Actually they are, significantly. Bugs in a centralized GC system 
show up reasonably quickly and usually very fatally.

Re: parrot and refcounting semantics

2005-04-28 Thread Dan Sugalski
At 12:12 AM +0200 4/28/05, Robin Redeker wrote:
On Wed, Apr 27, 2005 at 12:33:30PM -0600, Luke Palmer wrote:
 Dan Sugalski writes:
  Also, with all this stuff, people are going to find timely destruction
  is less useful than they might want, what with threads and
  continuations, which'll screw *everything* up, as they are wont to do.
  I know I've been making heavy use of continuations myself, and this is
  for a compiler for a language that doesn't even have subroutines in
  it. Continuations screw everything up, which is always fun, for
  unusual values of 'fun'.
 When I programmed in C++ with proper use of the STL and other such
 abstractions, almost the only time I needed destructors were for
 block-exit actions.  Perl 6 has block-exit hooks, so you don't need to
 use destructors to fudge those anymore.  And there's a lot less explicit
 memory management in languages with GC (that's the point), so
 destructors become even less useful.
 I agree with Dan completely here.  People make such a big fuss over
 timely destruction when they don't realize that they don't really need
 it.  (But they want it).   I think, more importantly, they don't
 understand what they're getting in return for giving it up.
Could you point out what I get?
I use TD to handle resources: filehandles, database handles, GUI
windows and all that.
Sure, most people do. Heck, I do. And... I don't need timely 
destruction for them. Not in the "at scope exit this will be cleaned 
up" sense, at least.

"Timely" may be getting in the way here. "Timely" means "as soon as 
possible", but with very few exceptions it isn't necessary. (There are 
a couple of perl idioms that really demand it at the block level, but 
that's about it) A number of languages use it as an alternative to 
providing block exit actions, though those aren't really a great 
place for forcing finalizers, since you can't be sure that a thing is 
actually in a state to be finalized.

You've really got three classes of things that the GC system needs to 
know about. There are the things that need cleaning eventually 
(basically the no-finalizer objects), things that need cleaning up 
reasonably quickly (most things with finalizers, including most file 
and DB handles) where reasonably quickly's in the 100-200ms range, 
and things that need cleaning as soon as they die.

We don't provide any good support for instant destruction. People are 
just going to have to live with that, but honestly I don't think 
anyone's going to notice. (Though compilers can force a call to a 
PMC's finalizer if they want) For block-level cleanup there's 
conditional garbage collection and tagging PMCs as needing GC 
attention, which provides the support that most everyone needs. How 
expensive a sweep is depends entirely on how many PMCs you have, but 
the tracing part of the collector's darned fast, especially if you're 
using the aggregate PMCs that we provide, since they're special-cased 
in the collector.

Refcounting does this with a little overhead, but
in a fast and deterministic O(1) way.
Deterministic, yes. Fast, no. Refcounting, in general, is about twice 
as expensive over the life of a program as other GC schemes.

And how do you want to implement glib objects with parrot?
They are refcounted.
They'll be wrapped, so it's no big deal. When the PMC wrapping them 
dies, its finalizer decrements the refcount of the wrapped glib thing 
and leaves it to die on its own.
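A hedged sketch of that wrapping idea -- the struct and finalizer here are hypothetical, not Parrot's actual glib binding; the only real API used is glib's g_object_unref():

    /* Hypothetical sketch of wrapping a refcounted glib object in a PMC. */
    #include <glib-object.h>

    typedef struct {
        GObject *wrapped;        /* the refcounted glib object we hold */
    } glib_wrapper_data;

    /* Imagine the GC calling this when the wrapping PMC is no longer
     * reachable: drop our reference and let glib's own refcounting
     * dispose of the object on its schedule. */
    static void glib_wrapper_finalize(glib_wrapper_data *data)
    {
        if (data && data->wrapped) {
            g_object_unref(data->wrapped);
            data->wrapped = NULL;
        }
    }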
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: parrot and refcounting semantics

2005-04-28 Thread Dan Sugalski
At 11:48 AM -0600 4/28/05, Luke Palmer wrote:
Robin Redeker writes:
 This should actually be, to prevent the resource from leaking:
 {
   my $s = new CoolClass;
   eval {
 ... do stuff that may throws ...
   };
   destroy $s;
 }
Or, with the block hooks that I keep claiming makes timely destruction
almost never needed, it is:
{
my $s = new CoolClass;
# ... do stuff that may throw ...
LEAVE { destroy $s }
}
This destroys properly and even propagates the exception.
Actually it destroys improperly, since it destroys unconditionally, 
which is likely wrong. The right thing is to have the constructor for 
CoolClass tag its constructed object as needing expedited destruction 
and have the language compiler emit conditional sweep ops as part of 
its block finishing code.

Explicit destruction like that will clean up the object even if there 
are outstanding references, which is likely the wrong thing to do.
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: parrot and refcounting semantics

2005-04-28 Thread Dan Sugalski
At 7:24 PM +0200 4/28/05, Robin Redeker wrote:
I just wanted to correct my small example:
On Thu, Apr 28, 2005 at 05:00:53PM +0200, Robin Redeker wrote:
  Robin Redeker writes:
 And with explicit resource handling (without timely destruction) it may be:
{
  my $s = new CoolClass;
  ...
  destroy $s;
}
This should actually be, to prevent the resource from leaking:
{
  my $s = new CoolClass;
  eval {
... do stuff that may throws ...
  };
  destroy $s;
}
That's not necessary to prevent the resource from leaking. It's only 
necessary if you want to unconditionally destroy the object when 
control is leaving that block for the first time. You probably don't 
want to, since that means any outstanding references will now refer 
to a dead object.

If you don't have the destroy, and don't tag the object as needing 
expedited cleanup, then the finalizer *will* still be called. You 
just don't have any control over when its called.

  Not that big difference. And this is what we have with
  refcounting/timely destruction:
{
  my $s  = new CoolClass;
  ...
}
The latter example will destruct nicely if something throws.
Regardless of GC method, yes. The only question is when.
It's *really* important to note that I explained how parrot does GC 
-- that wasn't opening a discussion on redesigning the feature. 
Parrot doesn't have, and isn't going to have, a refcounting GC 
system. That's just not an option.

Parrot's got a tracing GC system triggered on demand by failure to 
allocate resources, explicitly by user code, or at regular intervals. 
Switching to a different tracing system than what's in now is doable. 
Switching away from a tracing system *isn't*. That'd require changing 
the entire source base, and just isn't feasible.
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: parrot and refcounting semantics

2005-04-27 Thread Dan Sugalski
 cut down on the pauses with generational 
collectors and other such interesting things, which can be plugged in 
mostly seamlessly. Right now we're usually in stop-the-world mode but 
heck we compile parrot with *no* C compiler optimizations too, given 
that things are all in development as it is.

Also, with all this stuff, people are going to find timely 
destruction is less useful than they might want, what with threads 
and continuations, which'll screw *everything* up, as they are wont 
to do. I know I've been making heavy use of continuations myself, and 
this is for a compiler for a language that doesn't even have 
subroutines in it. Continuations screw everything up, which is always 
fun, for unusual values of 'fun'.

I really need to go profile perl 5 some time to get some real stats, 
but I think it's likely that most programs (well, most programs I run 
at least) have less than 0.1% of the variables with destructors, and 
maybe one or two variables *total* that have a need for timely 
destruction. (And most of the time they get cleaned up by global 
destruction, which makes 'em not actually timely cleaned up)

You said that most languages will have refcount semantics; it just sounds
funny to me to implement a GC then.
Actually most languages won't have refcount semantics. Perl 5's the 
only one that really guarantees that sort of thing now, though I 
think it's in there for perl 6. I doubt the python, ruby, Lisp, or 
Tcl compilers will emit the cleanup-at-block-boundary sweep code.
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


One more MMD -- assignment?

2005-04-22 Thread Dan Sugalski
I'm not *100%* sure this would be a win, but I'm finding that I'm 
currently writing an awful lot of code in my set_pmc vtable methods 
that is suspiciously MMD-like. As such, I think adding assignment to 
the list of MMD functions might not be a bad idea.
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: [RFC] some doubtable MMDs?

2005-04-20 Thread Dan Sugalski
At 2:38 PM +0200 4/15/05, Leopold Toetsch wrote:
I'm not quite sure, but it seems that some of the MMD functions may 
better be vtable methods:

- bitwise_sh[rl]        *shift by anything other than int?
- bitwise_lsr           is missing generally
or even just a plain opcode only:
- logical_{or,and,xor}  return a PMC depending on the boolean value
What are HLLs expecting of these infix operations?
These were in because I fully expected people would want to override 
<<, >> and their ilk. Basically any overridable operation got an 
entry in the MMD table.
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: A sketch of the security model

2005-04-14 Thread Dan Sugalski
At 10:03 PM -0400 4/13/05, Michael Walter wrote:
Dan,
On 4/13/05, Dan Sugalski [EMAIL PROTECTED] wrote:
 All security is done on a per-interpreter basis. (really on a
 per-thread basis, but since we're one-thread per interpreter it's
 essentially the same thing)
Just to get me back on track: Does this mean that when you spawn a
thread, a separate interpreter runs in/manages that thread, or
something else?
We'd decided that each thread has its own interpreter. Parrot doesn't 
get any lighter-weight than an interpreter, since trying to have 
multiple threads of control share an interpreter seems to be a good 
way to die a horrible death.

  Each running thread has two sets of privileges -- the active
 privileges and the enableable privileges. Active privs are what's
 actually in force at the moment, and can be dropped at any time. The
 enableable privs are ones that code can turn on. It's possible to
 have an active priv that's not in the enableable set, in which case
 the current running code is allowed to do something but as soon as
 the privilege is dropped it can't be re-enabled.
How can dropping a privilege for the duration of a (dynamic) scope be
implemented? Does this need to be implemented via a parrot intrinsic,
such as:
  without_privs(list_of_privs, code_to_be_run_without_these_privs);
..or is it possible to do so with the primitives you sketched out above?
When a priv is dropped it stays dropped until it's reinstated. If 
code drops a priv that it can't re-enable then the priv is gone. 
(There are going to be issues with privileges attached to 
continuations, since this could potentially mean that dropped privs 
get un-dropped when you invoke a return continuation, though dropping 
a privilege could ripple up the return continuation chain)
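A minimal sketch of the two-set idea in C -- the bitmask type and helpers are made up for illustration, not part of any Parrot header:

    /* Illustrative only: active vs. enableable privilege sets per interpreter. */
    #include <stdint.h>

    typedef struct {
        uint64_t active;      /* privileges in force right now         */
        uint64_t enableable;  /* privileges code is allowed to turn on */
    } priv_state;

    /* Dropping always works; if the priv isn't in the enableable set,
     * it's gone for good. */
    static void priv_drop(priv_state *p, uint64_t priv)
    {
        p->active &= ~priv;
    }

    /* Enabling only succeeds for privileges in the enableable set. */
    static int priv_enable(priv_state *p, uint64_t priv)
    {
        if ((p->enableable & priv) != priv)
            return 0;
        p->active |= priv;
        return 1;
    }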

  Additionally, subroutines may be marked as having privileges, which
 means that as long as control is inside the sub the priv in question
 is enabled. This allows for code that has elevated privs, generally
 system-level code.
Does the code marking a subroutine need to have any other privilege than
the one it is marking the subroutine with?
Dunno, that's something we'll need to work out. It's possible that 
sub marking needs to be done externally -- that is, it's bytecode 
metadata or something like that which requires system privileges of 
some sort to set. (Though there are issues with that) Marking code as 
privileged is really a system administration task, though we've not 
really put much thought into administering a parrot system yet.

  ... Non-continuation
 invokables (subs and methods) maintain the current set of privs, plus
 possibly adding the sub-specific privs.
Same for closures?
Yeah, I think so.
--
Dan
--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: More registers

2005-04-14 Thread Dan Sugalski
At 2:05 PM -0400 4/13/05, Dan Sugalski wrote:
At 12:05 PM +0200 4/13/05, Leopold Toetsch wrote:
As of rev 7824 Parrot *should* run with NUM_REGISTERS defined as 64 
too. Only some stack tests are failing that do half frame push and 
pop tests.

imcc/t/reg/spill_2 just spills 4 registers instead of 36.
Dan, could you please try that with one of your big subroutines and 
report compile times and functionality.
Sure. I'll sync up and give it a shot.
Okay, after doing a CVS syncup (I don't have subversion set up yet) I 
find there's no difference to speak of -- still takes about 390 
minutes on the big form. (If the CVS repository's not up to date I 
can see about getting subversion installed and working)
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: More registers

2005-04-14 Thread Dan Sugalski
At 3:53 PM +0200 4/14/05, Jens Rieks wrote:
On Thursday 14 April 2005 15:33, Dan Sugalski wrote:
 (If the CVS repository's not up to date I
 can see about getting subversion installed and working)
Yes, the CVS repository is not updated anymore.
Swell -- I thought when we were switching over to subversion we were 
going to make sure that the CVS repository was kept up to date.

I guess I'll see about digging up a subversion client and see where I 
go from there.
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: A sketch of the security model

2005-04-14 Thread Dan Sugalski
At 10:44 AM -0400 4/14/05, Aaron Sherman wrote:
On Thu, 2005-04-14 at 09:11, Dan Sugalski wrote:
 At 10:03 PM -0400 4/13/05, Michael Walter wrote:

 On 4/13/05, Dan Sugalski [EMAIL PROTECTED] wrote:
   All security is done on a per-interpreter basis. (really on a
   per-thread basis, but since we're one-thread per interpreter it's
   essentially the same thing)

 Just to get me back on track: Does this mean that when you spawn a
 thread, a separate interpreter runs in/manages that thread, or
 something else?
 We'd decided that each thread has its own interpreter. Parrot doesn't
 get any lighter-weight than an interpreter, since trying to have
 multiple threads of control share an interpreter seems to be a good
 way to die a horrible death.
So to follow up on Michael's question: does this mean that you spawn a
new thread, instance an interpreter, and then begin executing shared
code?
Yes.
 What about data?
Data needs to be explicitly shared.
--
Dan
--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: A sketch of the security model

2005-04-14 Thread Dan Sugalski
At 5:51 PM -0400 4/13/05, Aaron Sherman wrote:
On Wed, 2005-04-13 at 17:01, Dan Sugalski wrote:
 So here's what I was thinking of for Parrot's security and quota
 model. (Note that none of this is actually *implemented* yet...)
[...]
 It's actually pretty straightforward, the hard part being the whole
 "don't screw up when implementing" thing, along with designing the
 base set of privs. Personally I think taking the VMS priv and quota
 system as a base is a good way to go -- it's well-respected and
 well-tested, and so far as I know theoretically sound. Unix's priv
 model's a lot more primitive, and I don't think it's the one to take.
 (We could invent our own, but history shows that people who invent
 their own security system invent ones that suck, so that looks like
 something worth avoiding)
VMS at least *is* a priv-based security model, but VMS privs are not
appropriate for parrot on the whole.
Right. The privileges themselves are generally inappropriate for our 
use, which is fine. It's the model that I'm interested in, as it's 
the model that gets screwed up so badly, or so I'm told.

Anyway, a number of people I deeply respect (and who do this sort of 
thing for a living, at deep levels) have told me flat-out that we're 
better not having a security system than we are trying to roll our 
own, and the common response to We're lifting VMS' has been Good. 
Do that.

I think it would be easier to start from scratch, personally. I
understand your concerns, but I don't think you run any less risk by
creating a new VM security model out of an OS security model than you do
by creating a new one. They both create many opportunities to make a
mistake.
That's not been the general consensus I've seen from people doing 
security research and implementation. This is an area that I've no 
real experience doing any sort of design in, and the people who have 
the experience say not to, so I think it best to take them at their 
word.

If you really want to reduce the chances that you'll make a mistake,
swipe the security model from JVM or CLR and start with that. At least
those have been tested in the large, and map closer to what Parrot wants
to do than VMS.
The problem is twofold with those. First, there's some indications 
that they're busted, and second (and more importantly) they're both 
very coarse-grained, and that leads to excessive privs being handed 
out, which increases your exposure to damage. If a library routine 
needs to potentially exceed memory quotas we'd rather not give it the 
equivalent of root privileges.

Don't get me wrong. I loved VMS back in the day. It was a pain in the
ass at times, but what isn't. It's just that it's not a VM trying to
execute byte-code... it's an operating system which directly manages
hardware.
Yeah, but don't forget that for all intents and purposes parrot is an 
OS trying to execute bytecode, especially when you look at the 
environments that the features will get used in.
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: A sketch of the security model

2005-04-14 Thread Dan Sugalski
At 9:51 AM -0700 4/14/05, Dave Whipp wrote:
Dan Sugalski wrote:
All security is done on a per-interpreter basis. (really on a 
per-thread basis, but since we're one-thread per interpreter it's 
essentially the same thing)
...
   * Number of open files
   * IO operations/sec
   * IO operations total
...
Can an application get more resources simply by spawning threads? 
If the answer is "no, parent and child must divide/share their 
quotas" then there is a load-balancing problem. If the answer is 
"yes", then there's no real protection at all. A threads-per-second 
limit isn't an answer here, either (a malicious app could sit around 
for a few hours, launching threads at a low intensity, until it has 
enough to bring down the system).
Spawning threads may require a privilege, and there should be a quota 
for it, so the number of spawned threads could be managed.

I can see sharing quotas across multiple threads, though there are 
issues there (like sharing CPU time) as well.

Is a thread really the right thing to apply these limits to? It 
seems to me that there needs to be some sort of token (cf. cash; cf 
capability) that an application can obtain/spend/refresh to do 
these ops. An application could share its token(s) with any threads 
it creates. It could probably even loan its token to a backgroud 
thread that does some operation on behalf of many other threads.
Well, an interpreter is as fine-grained as we can get, and you're 
right, we may want to get a bit broader and share quotas amongst 
multiple threads. I don't know that we want to get much fancier than 
grouping threads together, though -- while I can see it being useful 
in a few cases, I'm not sure in practice that anyone'd actually want 
to do that.
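
For what it's worth, here's a rough C sketch of what a quota pool shared 
by a group of threads might look like. Every name in it is invented -- 
none of this is existing Parrot code, it's just the shape of the idea:

  #include <stdio.h>

  /* Hypothetical shared quota pool: one pool, many interpreters/threads. */
  typedef struct QuotaPool {
      int  refcount;    /* how many threads share this pool           */
      long mem_limit;   /* bytes of memory the whole group may use    */
      long mem_used;    /* bytes currently charged against the pool   */
  } QuotaPool;

  /* Charge an allocation against the pool; 1 on success, 0 if over quota. */
  static int quota_charge(QuotaPool *pool, long bytes) {
      if (pool->mem_used + bytes > pool->mem_limit)
          return 0;                 /* would exceed the group's limit */
      pool->mem_used += bytes;
      return 1;
  }

  /* A newly spawned thread joins its parent's pool rather than getting
     a fresh one, so spawning threads doesn't multiply the quota. */
  static QuotaPool *quota_share(QuotaPool *pool) {
      pool->refcount++;
      return pool;
  }

  int main(void) {
      QuotaPool pool = { 1, 1024, 0 };
      QuotaPool *kid = quota_share(&pool);
      printf("parent alloc ok: %d\n", quota_charge(&pool, 800));  /* 1 */
      printf("child alloc ok:  %d\n", quota_charge(kid, 800));    /* 0 */
      return 0;
  }

The point being that the child draws from the same pool, so quietly 
spawning threads doesn't buy an application any extra headroom.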
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: More registers

2005-04-14 Thread Dan Sugalski
At 4:42 PM +0200 4/14/05, Leopold Toetsch wrote:
Dan Sugalski [EMAIL PROTECTED] wrote:
 At 3:53 PM +0200 4/14/05, Jens Rieks wrote:
Yes, the CVS repository is not updated anymore.

 Swell
You need just this part:
 Date: Wed Apr 13 03:04:41 2005
 New Revision: 7824

 Modified:
 trunk/imcc/reg_alloc.c
Ah, OK. With that part in, the time's cut from 390 minutes to 295 
minutes. A 25% cutback, not bad.
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: Parrot and the web (PHP?)

2005-04-13 Thread Dan Sugalski
At 8:42 AM -0500 4/13/05, Timm Murray wrote:
On Wednesday 13 April 2005 08:38 am, BÁRTHÁZI András wrote:

 I think that web development will be very important in the life of
 Parrot and Perl 6. One of the most important features of PHP (at least 
 for a server administrator) is that you can lock programs into a 
 directory by defining open_basedir. If an application tries to open a 
 file from a directory not listed there, an exception is raised. It's 
 very useful for a hosting company, since two clients' programs cannot 
 read each other's files.

I think Parrot is the wrong place to solve this problem.  It's better to be
handled by the languages themselves.
Nope, parrot's the right place to solve this 
problem, otherwise the problem's not solved. 
Security needs to be implemented by the platform 
(which, in this case, would be parrot) if you 
want it to work.
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: More registers

2005-04-13 Thread Dan Sugalski
At 12:05 PM +0200 4/13/05, Leopold Toetsch wrote:
As of rev 7824 Parrot *should* run with NUM_REGISTERS defined as 64 
too. Only some stack tests are failing that do half frame push and 
pop tests.

imcc/t/reg/spill_2 just spills 4 registers instead of 36.
Dan, could you please try that with one of your big subroutines and 
report compile times and functionality.
Sure. I'll sync up and give it a shot.
--
Dan
--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: Parrot and the web (PHP?)

2005-04-13 Thread Dan Sugalski
At 8:25 PM +0200 4/13/05, BÁRTHÁZI András wrote:
Another question is: how can you tell the 
platform to limit these features? Maybe 
non-modifiable environment variables and command 
line parameters could be the way to do it.
For that you need a full-blown quota and 
privilege system. Luckily there are plans for 
one. :)
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: Parrot and the web (PHP?)

2005-04-13 Thread Dan Sugalski
At 9:49 PM +0200 4/13/05, BÁRTHÁZI András wrote:
Hi,
Another question is: how can you tell 
the platform to limit these features? 
Maybe non-modifiable environment variables 
and command line parameters could be the 
way to do it.
For that you need a full-blown quota and 
privilege system. Luckily there are plans for 
one. :)
As far as boxing a VM into a sub-directory, 
etc. UNIX (chroot) and VMS make this a breeze 
since
the mechanisms are builtin to the OS, it is 
Windows where all the work has to be done.
I'm not a UNIX guru, but I don't know an easily 
installable solution for the problem. I would 
like to run just one Apache, and would like to 
run Perl as an Apache module. Chroot I think is 
not a solution for it. Running the script as CGI 
or running as many Apaches as you have clients 
is not a solution for me or for a lot of 
other people. PHP offers an easy way to solve this 
problem.
It's important here to note that when I said 
platform I meant Parrot. (That was in there, 
but it's worth being clear about) That is, the 
platform is Parrot, not the OS parrot is running 
on, and Parrot is responsible for any security 
guarantees it makes. Now, it may make them by 
using facilities the OS provides (which makes the 
job easier) but it doesn't have to -- it can and 
will do it with no OS help if need be.
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


A sketch of the security model

2005-04-13 Thread Dan Sugalski
So here's what I was thinking of for Parrot's security and quota 
model. (Note that none of this is actually *implemented* yet...)

All security is done on a per-interpreter basis. (really on a 
per-thread basis, but since we're one-thread per interpreter it's 
essentially the same thing)

QUOTAs are limits on the number of resources or operations that an 
interpreter can allocate or perform, either in absolute terms (i.e. 
allocate no more than 10M of memory) or relative terms (i.e. can do 
only 10 IO operations per second). Quotas are tracked by parrot, and 
cover:

   * Number of open files
   * IO operations/sec
   * IO operations total
   * Memory allocated
   * CPU time consumed
   * Threads spawned
   * Sub-processes spawned total
   * Simultaneous sub-processes
PRIVILEGEs are permissions to do certain things. Parrot will have a 
number of privileges it checks before doing dangerous operations, and 
user code may also assign and check privileges.

Normally parrot runs with no quotas and no privilege checking. This 
is the fastest way to run. Code may at any time enable privilege 
and/or quota checking. Once enabled code must have proper privileges 
to disable it again.

Each running thread has two sets of privileges -- the active 
privileges and the enableable privileges. Active privs are what's 
actually in force at the moment, and can be dropped at any time. The 
enableable privs are ones that code can turn on. It's possible to 
have an active priv that's not in the enableable set, in which case 
the current running code is allowed to do something but as soon as 
the privilege is dropped it can't be re-enabled.

Additionally, subroutines may be marked as having privileges, which 
means that as long as control is inside the sub the priv in question 
is enabled. This allows for code that has elevated privs, generally 
system-level code.

Continuations, when taken, capture the current set of active and 
enableable privs, and when invoked those privs are put into place. 
(This is a spot that will require some thought, since there's a 
potential for privilege leaks which worries me here) Non-continuation 
invokables (subs and methods) maintain the current set of privs, plus 
possibly adding the sub-specific privs.

It's actually pretty straightforward, the hard part being the whole 
"don't screw up when implementing" thing, along with designing the 
base set of privs. Personally I think taking the VMS priv and quota 
system as a base is a good way to go -- it's well-respected and 
well-tested, and so far as I know theoretically sound. Unix's priv 
model's a lot more primitive, and I don't think it's the one to take. 
(We could invent our own, but history shows that people who invent 
their own security system invent ones that suck, so that looks like 
something worth avoiding)
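
To make the active/enableable split concrete, here's a rough C sketch of 
it done with bitmasks. The privilege names, the struct, and the functions 
are all made up -- the real set of privs is exactly the part that still 
needs designing:

  #include <stdio.h>

  /* Invented privilege bits -- placeholders only. */
  typedef unsigned int privset_t;
  #define PRIV_SPAWN_THREAD  (1u << 0)
  #define PRIV_OPEN_FILE     (1u << 1)

  typedef struct SecurityContext {
      privset_t active;       /* privileges in force right now         */
      privset_t enableable;   /* privileges code is allowed to turn on */
  } SecurityContext;

  /* Dropping an active priv is always allowed. */
  static void priv_drop(SecurityContext *ctx, privset_t p) {
      ctx->active &= ~p;
  }

  /* Re-enabling only works if the priv is in the enableable set. */
  static int priv_enable(SecurityContext *ctx, privset_t p) {
      if ((ctx->enableable & p) != p)
          return 0;             /* dropped for good */
      ctx->active |= p;
      return 1;
  }

  /* Checked before each dangerous operation. */
  static int priv_check(const SecurityContext *ctx, privset_t p) {
      return (ctx->active & p) == p;
  }

  int main(void) {
      /* Active but not enableable: usable until dropped, then gone. */
      SecurityContext ctx = { PRIV_OPEN_FILE | PRIV_SPAWN_THREAD,
                              PRIV_OPEN_FILE };
      printf("spawn allowed? %d\n", priv_check(&ctx, PRIV_SPAWN_THREAD));
      priv_drop(&ctx, PRIV_SPAWN_THREAD);
      printf("re-enable spawn? %d\n", priv_enable(&ctx, PRIV_SPAWN_THREAD));
      return 0;
  }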

--
Dan
--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: Passing on the hat

2005-03-22 Thread Dan Sugalski
At 12:27 PM -0500 3/22/05, MrJoltCola wrote:
At 06:55 PM 3/21/2005, Chip Salzenberg wrote:
According to Dan Sugalski:
 As such, I'd like to say a big thanks to Chip Salzenburg who's agreed
 to take the hat.
I thank you for your kind words, and for giving me the opportunity
again to work long hours and explain difficult and arbitrary design
decisions to enthusiastic contributors.  :-)
And *all* of us thank you for the last four and a half years.  I'm
glad you'll still be around; your work certainly will be.

Congrats Chip, I can't actually follow what Dan said since I did not
receive the original message from Dan, all I see is replies.
Was this a reply to a private message or is the mailing
server doing funny things again? I'm just curious about how Dan is
doing since I haven't talked to him in a while. :)
The original did go to the list, so it should be out there somewhere. 
(OTOH, my local mailserver's configured such that some systems are 
unhappy with it. Not that it should matter, since the message got 
distributed by the mailing list, but you never know with these 
things. Darn that shub-internet and its evil minions...)
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: Parrot_Exec_OS_Command interface ?

2005-03-21 Thread Dan Sugalski
At 12:02 PM +0100 3/15/05, François PERRAD wrote:
When I analyse the failure of t/pmc/sys.t with MinGW32,
I see that this script generates a command depending on the OS:
on MSWin32, cmd = .\parrot temp.imc
on *nix, cmd = ./parrot temp.imc
(So with MinGW, the generation of the Makefile needs /, and the execution needs \)
The bug can be fixed in ConfigureLand or in the Parrot_Exec_OS_Command function.
And the dilemma is:
is the interface of Parrot_Exec_OS_Command an OS-native command, as now,
or is it OS-independent, like open dirname/filename?
This is a low-level interface and is designed to 
be OS-dependent. An OS-independent layer on top 
of it wouldn't be bad, but I'm not sure it'd be 
too useful as you get really really system 
dependent really fast when spawning off 
processes. (There's so much more to it than quick 
filename munging that I'm not sure it's worth it, 
really)

Anyway, any sort of OS-independence should live 
on top of the low-level interface, and would be a 
reasonable thing to put in a library.
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: Namespaces

2005-03-21 Thread Dan Sugalski
At 11:07 AM +0100 3/15/05, Leopold Toetsch wrote:
Leopold Toetsch [EMAIL PROTECTED] wrote:
  t/pmc/namespace.t

 Please have a look at the supported syntax constructs. Are these
 sufficient for HLL writers?
Some more thoughts WRT namespaces.
We can define a namespace, where a function or method is stored:
  .namespace ["Foo"]
  .sub bar
  ...
This does effectively (during bytecode loading)
  store_global "Foo", "bar", pmc_of_sub_bar


But what happens, if the PIR file contains just a plain
  bar()
Since PIR behaviour is ours to define, it can do whatever we want. 
I'm inclined to rule that a plain

   bar()
is equivalent to:
   current_namespace::bar()
and as such the PIR compiler should look up the function by name in 
the current namespace and then invoke it.

The implementation looks into the current namespace and then in other
namespaces present in the compiled source file, which seems to be quite
wrong to me.
Yeah, this isn't the right thing.
Since invoking a sub in the current file's also useful and something 
we do (I'm making heavy use of it. It's nice :) I can see having 
alternate syntax for it. Perhaps:

   :bar()
looks for the bar sub in the current namespace in the current file 
and invokes it, skipping any sort of by-name lookup or runtime sub 
overriding.

(And yeah, I suggest this even though it'd mean rewriting some of my 
compiler's codegen code)

Any semantics past these (simple lookup by name, and simple 
compiletime lookup) should be delegated to the individual language 
compilers, and they can emit code to do whatever oddball things they 
may need to do, though I'm not sure there's a whole lot more that'd 
need to be done for most languages. (Especially since namespaces are 
supposed to be lexically and dynamically overridable, as well as 
layered, but that's all a separate thing)
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: [CVS ci] builtins

2005-03-21 Thread Dan Sugalski
At 5:19 PM +0100 3/19/05, Leopold Toetsch wrote:
1) builtin methods are living in a class namespace e.g.
  Float.cos
  ParrotIO.open  # unimplemented
I'm way out of the loop and may have been dealt with in prior mail, 
but are we doing real method calls for cos() and suchlike things? 
That seems... sub-optimal for speed reasons. (Open I can see, that 
makes sense, though I'm not sure I'd want it for any other file 
operations, again for speed reasons)

I've been thinking it may be worth pulling out some groups of 
semi-commonly used functions that should be fast but still 
PMC-class-specific, and thus not methods, into sub-tables hanging off 
the vtable. Most of the semi-high-level string functions (basically 
everything that may be delegated to the charset/encoding layers) 
would be a candidate for this. Possibly the standard trig functions 
too.

This'd cost us a single pointer per sub-table per pmc class, and one 
table of generic functions per sub-table, so it'd not be that 
expensive, yet still allow classes to override the default functions 
on a per-class basis without the overhead of full method dispatch.
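
Roughly, in C, the sub-table idea looks like the sketch below -- one extra 
pointer per sub-table per class, each pointing at a shared table of plain 
functions. The names are invented and this isn't the real vtable layout, 
just an illustration:

  /* Invented layout: a per-class vtable carrying pointers to sub-tables
     of plain functions, so common operations skip full method dispatch. */
  typedef struct PMC PMC;
  typedef struct Interp Interp;

  typedef struct TrigFuncs {        /* one shared default table...        */
      double (*cos)(Interp *, PMC *self);
      double (*sin)(Interp *, PMC *self);
  } TrigFuncs;

  typedef struct StringFuncs {      /* ...or an overriding per-class one  */
      int  (*length)(Interp *, PMC *self);
      PMC *(*substr)(Interp *, PMC *self, int offset, int count);
  } StringFuncs;

  typedef struct VTable {
      /* ... the usual vtable entries ... */
      const TrigFuncs   *trig;      /* one pointer per sub-table per class */
      const StringFuncs *string;
  } VTable;

  /* Dispatch is a couple of pointer hops, not a full method lookup. */
  static double pmc_cos(Interp *interp, const VTable *vt, PMC *self) {
      return vt->trig->cos(interp, self);
  }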
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: [PROPOSAL] MMD: multi sub syntax

2005-03-21 Thread Dan Sugalski
At 8:10 AM +0100 3/16/05, Leopold Toetsch wrote:
Leopold Toetsch [EMAIL PROTECTED] wrote:
 Leopold Toetsch [EMAIL PROTECTED] wrote:

 Syntax proposal:

.sub foo @MULTI
  .invocant Integer a
  .invocant Float b
  .param pmc c
  ...

 Alternate syntax:

   .sub foo multi(Integer, Float)
 .param pmc a
 .param pmc b
 .param pmc c
And another one:
  .multi sub foo
.sub foo__Int_Num_Str   Integer,Float:String
.sub foo__Num_Int_Any   Float,Integer,pmc
  .endmulti
Since the sub PMC's going to have to be installed in the MMD tables 
at load time with some amount of force (since there's going to have 
to be type lookups, amongst other things, that need to be done) I 
think we might as well go for something easy to parse. If I had my 
choice, I think I'd go with:

.sub foo @MULTI(Integer, -, Float)
where the @MULTI() carries the signature, with a dash denoting 
positions whose types are ignored for purposes of MMD lookup.
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Passing on the hat

2005-03-21 Thread Dan Sugalski
As everyone's more than aware, I've not been around much at all the 
past few months. Real Life, alas, managed to get a good hold on me 
and doesn't look to be letting go any time soon. This isn't at all 
good for Parrot development -- a designer who's absent is definitely 
a Bad Thing, and it's really getting in the way.

So, since there's just no way I'm going to be able to do what the 
position requires, it's time to pass on the responsibility to someone 
who actually has the time. (Though I'll be wading through some of the 
recent backlog anyway, since I can't help but meddle :)

As such, I'd like to say a big thanks to Chip Salzenberg who's agreed 
to take the hat. The perl folks on the list will recognize Chip as 
the perl 5.004 pumpking and the guy who took the first shot at Perl: 
The Next Generation (aka Topaz). Chip's a darned sharp guy, 
desperately over-qualified, and one of the few people I know who can 
do off-the-cuff MST-ing of modern cinema.

We'll be getting Chip up to speed pretty quickly, and I've no doubt 
parrot will be in capable hands.

And, to forestall some of the wave of questions and off-list 
grumbling: The FAQ!

Q: Why isn't Leo getting the hat?
A: He didn't want it.
Q: So you're going missing?
A: No, not as such. Oddly I'll probably have more time for Parrot. Go figure.
Q: What's going on?
A: Non-parrot stuff, which makes it all off-topic for the list :)
Q: So what's the plan?
A: Chip takes the hat and I kibitz
Q: No more parrot for you?
A: Far from it. Parrot's still integral to my work project (and doing 
quite well, I should add) and I'll be throwing in patches and 
commenting on what I had planned.

Q: Who wins the arguments?
A: Chip, of course, since he has the hat. I'm now just a guy who 
*used* to have the hat

Q: Sad to see it go?
A: Sort of. On the other hand, it's been four and a half years (not 
counting the past few months of inactivity) and I'll admit, I'm just 
*tired*. Definitely a sign it's time to pass the hat and get out of 
the way.
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: Passing on the hat

2005-03-21 Thread Dan Sugalski
At 12:50 PM -0800 3/21/05, chromatic wrote:
On Mon, 2005-03-21 at 15:39 -0500, Dan Sugalski wrote:
 And, to forestall some of the wave of questions and off-list
 grumbling: The FAQ!
Q: Is there any way to talk you into continuing to design, or at least
describing, the long-awaited security model?
A: (Chip is a fine choice.)
D'oh! Why yes, yes you can. I think I shall go do that.
--
Dan
--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: Calling conventions, invocations, and suchlike things

2005-01-28 Thread Dan Sugalski
At 5:04 PM -0500 1/18/05, Sam Ruby wrote:
Dan Sugalski wrote:
Hi folks.
Welcome back!
Parrot's got the interesting, and somewhat unfortunate, requirement 
of having to allow all subroutines behave as methods and all 
methods behave as subroutines. (This is a perl 5 thing, but we have 
to make it work) That is, an invokable PMC may be invoked as a 
method call and passed in an object, or as a plain subroutine and 
not have an object passed in. As far as perl 5 is concerned the 
object is the first parameter in the argument list, but for 
everyone else the object is a very distinct and separate thing.
Python essentially has the same requirement, with a few twists. 
Specifically, methods come in static, class, and regular flavors.

But first, a simple example.  Strings in python have a "find" 
method, so one can do the following:

f = "Parrot".find
print f("r")
Note that I referenced the method as an attribute, and then called 
it as a function.
Mmm, syntax! :) Luckily it makes no difference to us at the parrot 
level. What that should translate to is something like:

$P0 = find_method Parrot_string, "find"
 # Elided check for failed lookup and fallback to attribute fetch
$P1 = make_bound_method(Parrot_string, $P0)
$P1("r")
Furthermore, the function remembers what object it is bound to. 
This is accomplished by VTABLE_find_method creating a new 
PyBoundMeth PMC which contains two references, one to the object, 
and one to the method.
While a good idea, I think it's not the right way to handle this. 
Binding objects to methods to create invokable subs is going to be 
something we're going to need for a lot of the languages, so I think 
we'd be better served providing a general facility to do it rather 
than leaving it to each individual language designer to do it. Should 
save some work all around too.
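
As a toy C model of that general bind-an-object-to-a-method facility: a 
bound method is just the (object, method) pair, and invoking it supplies 
the object out-of-band while leaving the arguments alone. The PMC struct 
and the helpers here are stand-ins, not real Parrot API:

  #include <stdio.h>

  typedef struct PMC { const char *name; } PMC;   /* stand-in PMC */

  static PMC *current_object;             /* stand-in for the out-of-band
                                             invocant slot                */

  static void invoke_pmc(PMC *invokable)  /* stand-in for sub invocation  */
  {
      printf("invoking %s on %s\n", invokable->name,
             current_object ? current_object->name : "(no object)");
  }

  typedef struct BoundMethod {
      PMC *object;   /* invocant captured when the method was fetched */
      PMC *method;   /* the unbound method returned by find_method    */
  } BoundMethod;

  static void invoke_bound(BoundMethod *bm) {
      current_object = bm->object;   /* object goes out-of-band...      */
      invoke_pmc(bm->method);        /* ...arguments are left untouched */
  }

  int main(void) {
      PMC str = { "Parrot_string" }, find = { "find" };
      BoundMethod bm = { &str, &find };
      invoke_bound(&bm);     /* prints: invoking find on Parrot_string */
      return 0;
  }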

Static methods differ in that the object is not passed.
How is this different from a subroutine, then?
Class methods differ in that the object passed is actually the class 
of the object in question.
I'm assuming this is different from just a method on the class somehow?
Note: all this is determined by the callee.  It is all transparent 
to the caller.
This is the part I'm not so sure about. It looks like, rather than 
having two sides (caller and calle) we have three, caller, callee, 
and the code that fetches the invokable in the first place.

I fully agree that the caller shouldn't know about this stuff, since 
it may well have been handed the invokable thing as part of a 
function call or pulled it out of a variable or something.

I don't think the callee should have to know anything special here, 
though -- it doesn't seem at all unreasonable to have the callee 
*not* have to do anything special, nor play any magic games. (And I 
think I'd be a bit peeved if I was writing code which passed in 
object A as the object being invoked on, but the method decided it 
wanted to use object B instead) This is especially true in a 
mixed-language environment when you've got a class with methods 
written in different languages -- setting up any conventions that'll 
actually be followed seems like an exercise in futility. :)

That leaves the code that actually fetches the invokable thing in the 
first place, and that seems like the right place for this to happen. 
The language the code is written in knows what should happen based on 
what it gets back when querying the object, so as long as we provide 
a standard means to do all the binding stuff, we should be fine.

First, observe that I don't have any control over the exception that 
is raised when a method is not found (fix: raise the exception 
within find_method).
Right. There's going to be one generic method-not-found exception -- 
there really has to be only one, otherwise we're going to run into 
all sorts of cross-language problems. Exception unification (and, 
more likely, aliasing) is going to be one of the tricky issues.

My one minor request here is P2 be made available on entry to the 
invoked method.  This would remove some special case logic for me 
requiring the use of interpinfo.  I don't expect any guarantees that 
this is preserved or restored across sub calls.
The one thing that leaving it in the interpreter structure and not 
explicitly passing it in gets us is we get notice if its actually 
extracted and used. Which  is going to be fairly common, so I'm not 
sure what it buys us. I think we'll leave things as-is, but I'm not 
sure for how much longer.

Not having objects handle their own method dispatch is less 
clear-cut, but I do have some reasons, so here they are.
Just be aware that in order to preserve Python semantics, 
find_method will need to return a bound method.
That can't happen. find_method has to return an unbound method, since 
there are just too many cases where that's what we need. If the 
method then needs to be bound then the fetching code can do the 
binding.

 This involves creating an object on the heap

Re: Name of parrot executable

2005-01-20 Thread Dan Sugalski
At 1:50 PM -0500 1/19/05, Matt Diephouse wrote:
On Wed, 19 Jan 2005 11:09:19 -0500, Dan Sugalski [EMAIL PROTECTED] wrote:
 Good point--we should. That'd mean we'd want to have three sets of
 data: the invoked full/base name, the 'program' full/base name, and
 the interpreter full/base name.
Then we can use this to have parrot look for .include's and dynclasses
from the root parrot directory? (See #32178)
I think so, but I'll admit the prospect makes the sysadmin bit of me 
profoundly nervous. I'm thinking Parrot should have some pretty 
draconian, paranoid defaults about where things it wants live, with a 
mechanism to lift them if need be. I'd feel a lot more comfortable 
about that -- we can't stop people from compromising their accounts 
and systems, but I'm thinking we ought not make it too easy.
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: Name of parrot executable

2005-01-19 Thread Dan Sugalski
At 10:14 AM -0800 1/18/05, Will Coleda wrote:
To implement tcl's
[info nameofexecutable]
I need to get the name of the executable parrot was invoked with. I would
have expected it to live in interpinfo, but don't see it there.
Anyone have a pointer to where this is? (If it's not in yet, I'll
add a TODO)
Let's add it as a TODO. Two new interpinfo options -- fullname (which 
gives the entire executable name as we got it) and basename which 
gives the name without the path and suffix information.

As an example, if we were invoked as:
~/src/parrot/parrot foo.pbc
then fullname would be ~/src/parrot/parrot and basename would be 
parrot. If, on the other hand, we were invoked as:

 parrot foo.pbc
then both fullname and basename would be parrot. Unix hashbang (and 
Windows file association) invocation may give us something different 
-- if the user did:

   ~/src/foo.pasm
and you'd either associated .pasm with parrot, or foo.pasm started 
#! /usr/bin/parrot (which is legal :) then you'd get a fullname of 
~/src/foo.pasm and a basename of foo.

Clear and sensible?
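
For the curious, here's a little C sketch of deriving the basename from 
the fullname -- strip the directory part and one trailing suffix. Just 
the shape of the computation, not actual Parrot code, and Unix-style 
separators only:

  #include <stdio.h>
  #include <string.h>

  /* "~/src/foo.pasm" -> "foo", "~/src/parrot/parrot" -> "parrot" */
  static void base_name(const char *fullname, char *out, size_t outsize) {
      const char *start = strrchr(fullname, '/');
      const char *dot;
      size_t len;

      start = start ? start + 1 : fullname;   /* drop the directory part  */
      dot   = strrchr(start, '.');            /* drop one trailing suffix */
      len   = dot ? (size_t)(dot - start) : strlen(start);
      if (len >= outsize)
          len = outsize - 1;
      memcpy(out, start, len);
      out[len] = '\0';
  }

  int main(void) {
      char buf[64];
      base_name("~/src/parrot/parrot", buf, sizeof buf);
      printf("%s\n", buf);                    /* parrot */
      base_name("~/src/foo.pasm", buf, sizeof buf);
      printf("%s\n", buf);                    /* foo    */
      return 0;
  }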
--
Dan
--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: Name of parrot executable

2005-01-19 Thread Dan Sugalski
At 4:02 PM + 1/19/05, Nicholas Clark wrote:
On Wed, Jan 19, 2005 at 10:54:53AM -0500, Dan Sugalski wrote:
 parrot. If, on the other hand, we were invoked as:
  parrot foo.pbc
 then both fullname and basename would be parrot. Unix hashbang (and
 Windows file association) invocation may give us something different
 -- if the user did:
~/src/foo.pasm
 and you'd either associated .pasm with parrot, or foo.pasm started
 #! /usr/bin/parrot (which is legal :) then you'd get a fullname of
 ~/src/foo.pasm and a basename of foo.
 Clear and sensible?
Perl 5 makes the distinction between $^X (the interpreter name) and $0
(the script name)
Perl 5 also puts some effort into seeing if it can get a fully qualified
path for the interpreter from the OS. Certainly this is do-able on Solaris,
on Linux given /proc, and on FreeBSD given /proc and a following wind
(at least on FreeBSD 4 where there is a bug). I think it's do-able on Win32
too.
Would we want to try to do this?
Good point--we should. That'd mean we'd want to have three sets of 
data: the invoked full/base name, the 'program' full/base name, and 
the interpreter full/base name. (With the invoked full/base being the 
same as either the program or interpreter full/base, but which way it 
went would depend on how things were fired off, so we might as well 
have them all separate)
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Calling conventions, invocations, and suchlike things

2005-01-18 Thread Dan Sugalski
Hi folks.
Sorry I've been gone so long. Non-p6i stuff's been well past 
monopolizing my time. Not much of an excuse, I know, but the Real 
World intrudes at the most inconvenient times. Things are, I hope, 
easing up a little, though I apologize in advance if I get a little 
cranky while I get back into things.

Having (very lightly) skimmed the past month of list mail, I'm 
thinking the best place to start is with the things that've come up 
about objects and method calls. I want to explain why things are 
designed the way they are so (hopefully) everyone's on the same page. 
(And, hopefully, to forestall grumbling when I say things aren't 
going to change :)

The setup, for those following along at home, is that when we make a 
method call the object is passed out-of-band (that is, not as part of 
the regular parameter list), and that objects don't actually handle 
method dispatch -- we split it into a two step affair where we 
request an invokable method PMC for a named method from an object, 
and then invoke it as a separate step.

The easy one first -- why the object is out-of-band, rather than one 
of the parameters. (Something that I doubt anyone's that worked up 
over, and I think everyone's OK with things as they stand, but here 
are the reasons anyway)

Parrot's got the interesting, and somewhat unfortunate, requirement 
of having to allow all subroutines behave as methods and all methods 
behave as subroutines. (This is a perl 5 thing, but we have to make 
it work) That is, an invokable PMC may be invoked as a method call 
and passed in an object, or as a plain subroutine and not have an 
object passed in. As far as perl 5 is concerned the object is the 
first parameter in the argument list, but for everyone else the 
object is a very distinct and separate thing. Regardless invokable 
things need to know whether they were called as a method or a sub. We 
*could* set a flag and have them check, then have some convention 
where the first parameter is an object if the "I'm a method" flag is 
set, but... yech. Having the object be separate and standalone seems 
cleaner, while still giving us a way to distinguish method/sub 
invocation. (You check to see if there's an object)

This does make things a little trickier for the perl 5 code generator, 
but not that much trickier and, let's face it, we're below the layer 
where things are easy. This *also* makes building signature checking 
into parrot a lot simpler (something we should do), since the 
signature checking stuff doesn't have to deal with possible parameter 
shifting based on whether we've a sub or method invocation.

Not having objects handle their own method dispatch is less 
clear-cut, but I do have some reasons, so here they are.

First off, one of the things I'm very much concerned about is C stack 
usage, both because we don't have all that much we can count on (joys 
of threads -- we'll be lucky to scrape together 10k some places) and 
because continuations can't cross C stack level boundaries. We're 
pretty careful about that one (it's the big reason for the limitation 
that continuations taken from within vtable functions can't escape).

I realize we can continue to be careful with it, mandating that the 
invoke_method vtable function behaves the same as the plain invoke 
does (that is, returning the address to jump to) but that brings up a 
separate problem -- transfer of control is a relatively heavyweight 
thing for us. Method and sub calls can potentially cross bytecode and 
security boundaries. Doing that right requires (potentially) a fair 
amount of screwing around inside the interpreter, as well as the 
invokable thing carrying around enough metadata to properly do the 
transfer. I'd really prefer to limit the number of PMCs that have 
that amount of intimate knowledge. Since all methods and subs have 
the appropriate bits attached to them, I'd as soon just use them.
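
Boiled down to a C sketch, the "returning the address to jump to" 
convention looks roughly like this -- the signatures are simplified 
stand-ins, not the real vtable declarations:

  /* Simplified stand-ins: invoke doesn't run the body via the C stack,
     it sets the interpreter up and hands back the next address to run. */
  typedef long opcode_t;
  typedef struct Interp Interp;

  typedef struct PMC {
      opcode_t *(*invoke)(Interp *interp, struct PMC *self, opcode_t *next);
  } PMC;

  /* The runloop keeps dispatching at whatever address comes back, so
     deep call chains never pile up C stack frames. */
  static opcode_t *call_invokable(Interp *interp, PMC *thing, opcode_t *next) {
      return thing->invoke(interp, thing, next);
  }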

There's also the potential issue of curried methods, where we need to 
create a new invokable thing and bind some parameters to it. We can 
certainly do that now with the current scheme so adding an 
invoke_method to the mix won't get in the way as such, but it does 
mean we have two near-identical ways of doing the same thing 
(find_method  invoke, and invoke_method) and since we can't toss the 
find_method way, it doesn't feel like adding invoke_method to the mix 
will get us anywhere.

Anyway, there we go. (I fully expect to find that both topics are 
dead about an hour after this goes out, but there you go :)
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: [perl #33129] N registers get whacked in odd circumstances

2004-12-21 Thread Dan Sugalski
At 10:56 AM +0100 12/21/04, Leopold Toetsch wrote:
Dan Sugalski (via RT) wrote:
You'll note that N5 is set to 22253 when the returncc's done, but 
after the return the value is -21814.6. Looks like something's 
stomping the N registers.
The program below shows exactly the same behavior WRT 
__set_number_native. The call comes from inside of the mmd_fallback 
function, so I presume that in the part before the shown trace you 
are having some kind of mathematical operation that leads to the 
change in N5.
Yep, that's it. Going back a page or so I see this is triggered by 
some math, and I assume that something's not preserving the N 
registers as part of the dispatch in there somewhere. I'll see about 
poking around the code and tracking it down tomorrow unless someone 
beats me to it.
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: auxiliary variables

2004-12-20 Thread Dan Sugalski
At 12:00 AM +0100 12/20/04, [EMAIL PROTECTED] wrote:
Please:
let's have two scalar variables in Perl and some operation on
them, like an addition.
x = a + b
I would like to know which auxiliary variables are created
in the inline code in Parrot --
something like T = a + b
   x = T ???
For simple expressions there's no need for temps. x = a + b 
translates to add x, a, b. If you have more complex expressions and 
need temps, then generally the compiler will choose the correct temp 
type, since it's normally language dependent. (Though Parrot's Undef 
is generally clever enough to be a good generic destination, as it 
morphs to most destination types on assign)
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: [perl #33032] Parameter fillin problem

2004-12-15 Thread Dan Sugalski
At 9:31 AM + 12/15/04, Leopold Toetsch via RT wrote:
Dan Sugalski wrote:
 Or not. (I've got too many versions of parrot around at the moment) I
 see this bug happening against yesterday morning's parrot.
 imcc/CVS/Entries shows a date of Mon Dec 13 12:19:33 2004 for reg_alloc.c.
I still can't reproduce it. CVS fetches either to P16 or even P3 for the
first registers until P3 is used to hold overflow.
From what I can see it's inconsistent--I've only tracked down one 
function call where this happens, though I've not dug in too deeply 
yet. (I was hoping there was something in the code that'd trigger a 
D'oh! and easy fix :)

I'll do a compile-to-pasm and grovel through the code some more to 
see if I can't find some marker or other that might show what's going 
on.
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: Objects, classes, metaclasses, and other things that go bump in the night

2004-12-14 Thread Dan Sugalski
At 11:13 AM +0100 12/14/04, Leopold Toetsch wrote:
Dan Sugalski [EMAIL PROTECTED] wrote:
  subclass - To create a subclass of a class object
Is existing and used.
Right. I was listing the things we need in the protocol. Some of them 
we've got, some we don't, and some of the stuff we have we probably 
need to toss out or redo.

   add_parent - To add a parent to the class this is invoked on
  become_parent - Called on the class passed as a parameter to add_parent
What is the latter used for?
To give the newly added parent class a chance to do some setup in the 
child class, if there's a need for it. There probably won't be in 
most cases, but when mixing in classes of different families I think 
we're going to need this.

   class_type - returns some unique ID or other so all classes in one
   class family have the same ID
What is a class family?
The metaclass's class. I think. This is meant to be an identifier 
that says what kind of class a class is, so code can make some 
assumptions about internal structure and such. Everything that's 
based on a ParrotClass PMC, for example, would have the same 
class_type ID.

   add_method - called to add a method to the class
   remove_method - called to remove a method from a class
These are the other two parts of the fetchmethod vtable I presume. When
during packfile loading a Sub PMC constant is encountered and it's a
method, add_method should be called instead of Parrot_store_global?
Yeah, I think so. I'm not too happy about it, but I think that's the 
way things will end up. Most classes will then go and stuff the 
methods into the namespace, but I think it'll have to be up to the 
classes whether they use the namespaces for methods or not.

   namespace_name - returns the name of this class' namespace
Should we really separate the namespace name from the class name?
Yes. We're going to have cases where we've got multiple classes with 
the same name but different namespaces. (Generally classes 
masquerading as other classes, but I can see some other cases)

   get_anonymous_subclass - to put the object into a singleton anonymous
   subclass
How is the singleton object created in the first place?
For now, I think singletons will all be objects of a normal class 
that get pulled into a singleton class, most likely because code's 
added or changed methods on the object rather than to the class.
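
To pull the protocol entries above together, here they are sketched as a 
C table of function pointers. Purely a shape sketch -- these aren't the 
actual ParrotClass vtable slots and the signatures are guesses:

  /* Shape sketch only: a function-pointer table for the class protocol. */
  typedef struct PMC PMC;
  typedef struct Interp Interp;

  typedef struct ClassProtocol {
      PMC  *(*subclass)(Interp *, PMC *self, PMC *name);
      void  (*add_parent)(Interp *, PMC *self, PMC *parent);
      void  (*become_parent)(Interp *, PMC *self, PMC *child);
      int   (*class_type)(Interp *, PMC *self);       /* family ID       */
      void  (*add_method)(Interp *, PMC *self, PMC *name, PMC *method);
      void  (*remove_method)(Interp *, PMC *self, PMC *name);
      PMC  *(*namespace_name)(Interp *, PMC *self);   /* may differ from
                                                         the class name  */
      PMC  *(*get_anonymous_subclass)(Interp *, PMC *object);
  } ClassProtocol;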
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: overloaded operator calling conventions

2004-12-13 Thread Dan Sugalski
At 7:45 AM +0100 12/11/04, Leopold Toetsch wrote:
Thinking more about that it seems that we don't have much chance to keep
the current scheme that the destination is passed in.
(This is probably out of order -- I've a lot of mail I'm backed up on 
unfortunately, but since it was CC'd directly to me I'll take it)

The note here is that Parrot's MMD function signature for binary ops 
doesn't match what Python needs. Parrot is:

void binary_mmd_op(pmc left, pmc right, pmc dest)
where Python is:
pmc dest = left.add(pmc right)
And, as you can see, the difference is more than just python creating 
a destination where we require one to be passed in -- it's a method 
call as well.

I fully expected this to be an issue. Perl 5 and perl 6 are going to 
have different conventions, (and I think there may well be at least 
two separate ones for perl 6, but I may be misremembering) Ruby 
doesn't match, and neither do any of the other languages. I think 
there was some discussion back when this was first batted around, but 
there might not have been.

The short answer here is to cope: that is, when installing an MMD 
function, including one of the default MMD functions for a class, a 
language needs to generate the appropriate wrapper function if 
necessary to translate between what parrot provides and what the 
language itself wants.

Nothing much for it -- no matter what we choose it's going to be 
wrong for someone, so the sensible thing to do is choose the scheme 
that works best for the underlying model (which we have) and leave it 
to compilers and class libraries to translate to their own preferred 
form. I think we're likely to find that the scheme we have catches on 
reasonably well once we've hit release and start seeing widespread 
use.
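
The wrapper idea, as a small C sketch -- the types and the Python-ish add 
are stand-ins, the point is just the calling-convention adaptation:

  #include <stdio.h>

  typedef struct PMC { double value; } PMC;        /* stand-in PMC */

  /* What Parrot's MMD table expects: the destination is passed in. */
  typedef void (*parrot_mmd_fn)(PMC *left, PMC *right, PMC *dest);

  /* What a Python-ish add looks like: it produces the result itself. */
  static PMC python_add(PMC *self, PMC *other) {
      PMC result = { self->value + other->value };
      return result;
  }

  /* The wrapper a language installs in the MMD table: adapt one
     convention to the other by copying the result into dest. */
  static void python_add_wrapper(PMC *left, PMC *right, PMC *dest) {
      *dest = python_add(left, right);
  }

  int main(void) {
      PMC a = { 2.0 }, b = { 3.0 }, dest = { 0.0 };
      parrot_mmd_fn fn = python_add_wrapper;
      fn(&a, &b, &dest);
      printf("%g\n", dest.value);                  /* 5 */
      return 0;
  }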
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: Q: scope exit (was: Exceptions, sub cleanup, and scope exit)

2004-12-13 Thread Dan Sugalski
At 8:07 AM +0100 12/10/04, Leopold Toetsch wrote:
Dan Sugalski [EMAIL PROTECTED] wrote:
 ... A scope exit
 action is put in place on the control stack with:

 pushaction Psub
* What is the intended usage of the action handler?
* Specifically is this also meant for lazy DOD runs?
* How is the relationship to the pop_pad opcode?
The action handler is there to provide the languages with a way to do 
something when scopes are left. It's a generic 'out' for stuff that 
we've not thought about. Most scope exit stuff is cleanup which we'd 
rather be done via the DOD/GC system (otherwise things go Horribly 
Wrong in the face of continuations) but there may well be things that 
need doing.

The one thing that I figure *will* be done is that languages will 
push a sweep or collect op in their scope cleanup in those cases 
where the language knows that there's potentially easy temp cleanup 
that needs doing, for example filehandles that should be closed when 
the scope's exited if there are no outstanding references to them. 
(And I know we've got the aggressive GC system for things like that, 
but in most cases languages can use something a bit less aggressive 
-- do a full sweep to clean up anything that's actually dead, and 
anything that escapes scope can be picked up later, since its 
lifetime's probably gone nondeterministic)

As far as pop_pad goes, I think maybe we need to revisit the control 
stack ops to see if some of them can go. (There are a fair number of 
rough draft things in the pad handling design that need editing) 
Possibly push_pad too, which could be handled with pushaction, or 
come up with a lighter-weight scheme to do something similar. Or it 
may be that there are few enough things that it's worth keeping 
push/pop_pad around. Not quite sure right now, but we should nail 
that one down.
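
A toy C model of the action stack, for illustration: handlers pushed on 
the control stack get run, newest first, when the scope is left. None of 
this is real Parrot code, it's just the shape of the mechanism:

  #include <stdio.h>

  typedef void (*action_fn)(void);

  static action_fn action_stack[16];       /* toy control stack           */
  static int       action_top = 0;

  static void pushaction(action_fn fn) { action_stack[action_top++] = fn; }

  /* On scope exit, pop and run everything pushed inside that scope. */
  static void run_actions(int saved_top) {
      while (action_top > saved_top)
          action_stack[--action_top]();
  }

  static void close_scratch_file(void) { printf("closing scratch file\n"); }
  static void lazy_sweep(void)         { printf("sweep 0 (lazy DOD)\n");   }

  int main(void) {
      int mark = action_top;               /* entering a scope            */
      pushaction(close_scratch_file);      /* cleanup registered up front */
      pushaction(lazy_sweep);
      /* ... body of the scope ... */
      run_actions(mark);                   /* leaving the scope runs both */
      return 0;
  }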
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: Q: scope exit

2004-12-13 Thread Dan Sugalski
At 10:19 AM +0100 12/14/04, Leopold Toetsch wrote:
Dan Sugalski [EMAIL PROTECTED] wrote:
 At 8:07 AM +0100 12/10/04, Leopold Toetsch wrote:

* What is the intended usage of the action handler?
* Specifically is this also meant for lazy DOD runs?
* How is the relationship to the pop_pad opcode?

 The one thing that I figure *will* be done is that languages will
 push a sweep or collect op in their scope cleanup in those cases
 where the language knows that there's potentially easy temp cleanup
 that needs doing, for example filehandles that should be closed when
 the scope's exited if there are no outstanding references to them.
That'll not be really easy:
  [ subroutine frame ]
  |
  |
[ cleanup handler subroutine frame ]
Which does argue that it ought not be a sub, I suppose, but something 
simpler. A plain bsr sort of thing.

If the cleanup handler is a plain subroutine, the previous one, which
should be cleaned, is alive. A lazy DOD will find the filehandle in the
lexicals and likely in registers.
Well, maybe. Subs are going to have multiple scopes in them (which 
argues for a faster-than-sub cleanup handler dispatch) so it's more 
than just a 'cleanup before leaving the sub' sort of thing. There's 
an awful lot of code around that looks like:

sub foo {
# Insert code here
foreach (@some_array) {
}
{
   # Some code that needs its own block
}
if (foo) {
} else {
}
   }
Besides cleaning up on sub exit, there's also a potential cleanup 
when the foreach is left, the bare block is left, and each of the 
legs of the if are left. (Potentially once on each foreach iteration, 
I suppose)

Also, since the compilers are in control of when things get 
established when entering a sub, if they've noted that there's a 
reason for a cleanup handler they can push one before establishing 
the lexical pad it needs to clean up after so that pad'll be disposed 
of before the cleanup handler runs. (Or, I suppose, two can be 
pushed, one for before and one for after, if it's noted that's needed)
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: [perl #33032] Parameter fillin problem

2004-12-13 Thread Dan Sugalski
At 9:08 AM + 12/14/04, Leopold Toetsch via RT wrote:
Dan Sugalski [EMAIL PROTECTED] wrote:
 IMCC's doing odd things when moving PMCs into the appropriate spot
 when calling into functions with a large number of parameters. Here's
 a snip from a trace of one of the programs running. Note the lines
 from bytecode offset 78123, 78126, and 78130. P9 is set to P28 (which
 is right) then P9 is set to spill 64, which is then moved to register
 P10 (leaving the same PMC in P9 and P10, which isn't correct)
I tried to reproduce it but failed. I've added more tests that all spill
correctly.
Are you using CVS head?
Damn. No, I'm using a build from Nov 30th. Syncing up with CVS head 
breaks this code in other interesting ways. I'll go close this bug 
and track down the current problems.
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: [perl #33032] Parameter fillin problem

2004-12-13 Thread Dan Sugalski
At 8:48 AM -0500 12/14/04, Dan Sugalski wrote:
At 9:08 AM + 12/14/04, Leopold Toetsch via RT wrote:
Dan Sugalski [EMAIL PROTECTED] wrote:
 IMCC's doing odd things when moving PMCs into the appropriate spot
 when calling into functions with a large number of parameters. Here's
 a snip from a trace of one of the programs running. Note the lines
 from bytecode offset 78123, 78126, and 78130. P9 is set to P28 (which
 is right) then P9 is set to spill 64, which is then moved to register
 P10 (leaving the same PMC in P9 and P10, which isn't correct)
I tried to reproduce it but failed. I've added more tests that all spill
correctly.
Are you using CVS head?
Damn. No, I'm using a build from Nov 30th. Syncing up with CVS head 
breaks this code in other interesting ways. I'll go close this bug 
and track down the current problems.
Or not. (I've got too many versions of parrot around at the moment) I 
see this bug happening against yesterday morning's parrot. 
imcc/CVS/Entries shows a date of Mon Dec 13 12:19:33 2004 for 
reg_alloc.c.
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: Q: scope exit

2004-12-13 Thread Dan Sugalski
At 3:31 PM +0100 12/14/04, Leopold Toetsch wrote:
Dan Sugalski [EMAIL PROTECTED] wrote:
 At 10:19 AM +0100 12/14/04, Leopold Toetsch wrote:

 Which does argue that it ought not be a sub, I suppose, but something
 simpler. A plain bsr sort of thing.
A bsr doesn't change anything. It has to return to the caller. That
thing, where it's returning to, is alive.
I'm only concerned about overhead here. Live-ness is a separate 
issue. (An important one, but separate. That's dealt with below)

 If the cleanup handler is a plain subroutine, the previous one, which
should be cleaned, is alive. A lazy DOD will find the filehandle in the
lexicals and likely in registers.

 Well, maybe. Subs are going to have multiple scopes in them (which
 argues for a faster-than-sub cleanup handler dispatch) so it's more
 than just a 'cleanup before leaving the sub' sort of thing. There's
 an awful lot of code around that looks like:

  sub foo {
  # Insert code here
  foreach (@some_array) {
  }

  {
 # Some code that needs its own block
  }
  if (foo) {
  } else {
  }
 }

 Besides cleaning up on sub exit, there's also a potential cleanup
 when the foreach is left, the bare block is left, and each of the
 legs of the if are left. (Potentially once on each foreach iteration,
 I suppose)
Yes. I'll presume that the first Perl6 compiler will just emit closures
for each block.
Ah, I hope not. I *really* hope not. (Paying attention Patrick? :) 
That'd be rather slower than necessary in most cases.

[Snippage]
But that still doesn't solve the problem that a file-handle (after
cleaning lexicals) is still in a PMC register, when the sweep 0
opcode is run.
True but, and this is the good part, that's not our problem. It is, I 
think, safe to assume that language compilers that want timely 
destruction will make sure to clean up after themselves sufficiently 
to make that timely destruction possible. It's our job to provide the 
mechanisms they need, and leave it to them to use them as needed.

In other words, we punt it to someone else. :)
--
Dan
--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Still out of touch...

2004-12-13 Thread Dan Sugalski
Hey folks.
I'm really sorry that I've been missing of late -- been mugged by 
work. (Which you might've figured, seeing the trickle of bug reports 
:) I'm still a week or more behind in p6i mail, though if I'm lucky 
I'll be able to mostly catch up in the next few days.

I don't think my time shortage is going to get any better in the 
immediate future, so I'd like to try and focus things as best as I 
can.

First order of business is to get the object stuff that's in the air 
finished. Second is to get the new string stuff reintegrated to the 
main branch. (Or, if someone's got a few minutes and wants to play 
CVS games and see if it Just Works, then give it a shot and we can do 
it simultaneously)

Anything else, with the exception of bugs or existing issues that 
require some architectural design decision, I'm going to ignore -- 
Leo's darned capable and can handle that stuff without any problem. 
For the things that *do* require some design input, please CC me 
personally. (While my inbox can be a massive black hole of mail, if 
it doesn't get lost it'll get caught earlier)

Sorry 'bout all this. Hopefully things'll clear up soon and we can 
start juggling more balls.
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: PDD 03 Issue: keyword arguments

2004-12-01 Thread Dan Sugalski
At 10:29 PM -0500 11/30/04, Sam Ruby wrote:
Python provides the ability for any function to be called with 
either positional or keyword [1] arguments.  Here is a particularly 
brutal example:
Oh, it's even more brutal than that. Perl 6 goes one step further, 
such that you can't tell whether a name/value pair is a keyword or 
positional parameter at the time you make the call, and can 
potentially be either with MMD. (There's a reason I've been hiding 
from this one) Each potential keyword parameter is a name/value pair 
(of type Pair), but whether it's to be taken as a keyword fillin or a 
positional parameter depends on the called sub. For example:

   foo(b => 1, a => 2)
is a keyword call if foo's prototype is:
   sub foo($a, $b)
or
   sub foo(Integer $a, Integer $b)
but is a positional call if the prototype is:
   sub foo(Pair $a, Pair $b)
and if the prototype is
   sub foo(Pair $a, Integer $b)
I think I get to smack someone with a stick.
Anyway, this is... complex, or at least hurts my brain, but it's time 
to deal with it.

My first thought is this scheme:
All potential keyword parameters (that is, name/value pairs) are 
passed in as PMCs of type Pair. (Pairs will need to have some sort of 
funky vtable to allow them to act mostly like their value, I think, 
to make this work right) The called sub, if it cares to do named arg 
parsing, will then go process the passed in parameters and Do The 
Right Thing with them, presumably stuffing values into the right 
spots and yelling if things aren't right.

It feels like something of a cheat, but I'm OK with cheating with 
this if we set up the right protocol and provide support for it. 
(Like some sort of library routine or opcode to handle shuffling the 
keyword parameters into the right spots, and we probably ought to 
have it provide optional typechecking while we're at it)
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: continuation enhanced arcs

2004-12-01 Thread Dan Sugalski
At 12:06 AM -0800 12/1/04, Jeff Clites wrote:
On Nov 30, 2004, at 11:45 AM, Dan Sugalski wrote:
In this example:
% cat continuation6.ruby
def strange
callcc {|continuation| $saved = continuation}
end
def outer
a = 0
strange()
a = a + 1
print "a = ", a, "\n"
end
Through the joys of reference types, a will continue to increase 
forevermore, assuming the compiler hasn't incorrectly put a in an 
int register. (Which'd be wrong)
Separate question, but then what would happen for languages which 
_do_ use primitive types? (Presumably, Perl6 would do that in the 
my int case.) If proper behavior requires basically never using 
the primitive I/N types, that seems like a waste.
Two potential options. One, they have a backing PMC for the lexicals 
(which they'll probably need anyway) and flush/restore at spots where 
things could get lost. Two, they don't actually get a low-level type 
but the compiler cheats and acts as if it was in spots where it's 
safe. (I admit, I'd always planned on having my int $foo use a PMC. 
The win would be for my int @foo which would also get a PMC, but 
one that had optimized backing store for the values)

The contents can change over and over without the register itself 
ever changing.
But in this Ruby case, a = a + 1 actually creates a new Fixnum 
instance, so a ends up holding a different instance each time--you 
can verify that by printing out a.id in the print statement.
This is where the magic of objects comes in, for particularly loose 
values of magic. Ruby uses a double-reference system for objects 
the same way that perl 5 does -- that is, the underlying structure 
for a doesn't hold the object, rather it holds a pointer to another 
structure that holds the object. So it looks like:

 (name a) -> (a struct, addr 0x04) -> (object struct, addr 0x08)
and after the store it looks like:
 (name a) -> (a struct, addr 0x04) -> (object struct, addr 0x0C)
The PMC register would hold the 0x04 struct pointer, basically a 
Reference PMC, and assignments to it just switch the thing it refers 
to.

Essentially things like I and N registers are value types, PMC and 
strings are (for us) reference types, and many objects (including 
ruby and perl's objects) are double-reference types.

Which, yeah, means we've been ignoring some important object things. 
Time to deal with that, too.
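
The double-reference layout above, rendered as a tiny C sketch 
(hypothetical structs, obviously):

  #include <stdio.h>

  typedef struct ObjectStruct { int id; } ObjectStruct;

  typedef struct Reference {      /* what the PMC register actually holds */
      ObjectStruct *referent;
  } Reference;

  int main(void) {
      ObjectStruct obj1 = { 0x08 }, obj2 = { 0x0C };
      Reference a    = { &obj1 };   /* (name a) -> (a struct) -> obj1 */
      Reference *reg = &a;          /* the register never changes...  */

      a.referent = &obj2;           /* ...assignment just re-points the
                                       middle struct at a new object  */
      printf("register still holds %p, object id is now 0x%02x\n",
             (void *)reg, reg->referent->id);
      return 0;
  }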
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: continuation enhanced arcs

2004-12-01 Thread Dan Sugalski
At 10:12 PM -0800 11/30/04, Bill Coffman wrote:
On Tue, 30 Nov 2004 14:45:39 -0500, Dan Sugalski [EMAIL PROTECTED] wrote:
 At 11:20 AM -0800 11/30/04, Jeff Clites wrote:
 % cat continuation6.ruby
 def strange
  callcc {|continuation| $saved = continuation}
 end
 
 def outer
  a = 0
  strange()
  a = a + 1
  print "a = ", a, "\n"
 end
 Through the joys of reference types, a will continue to increase
 forevermore, assuming the compiler hasn't incorrectly put a in an int
 register. (Which'd be wrong)
I can see that there is true magic in the power of using references in
this way.  Nonetheless, how can the compiler figure out that it can't
use an integer here?
Generally it can't. Unfortunately our target languages are painfully 
difficult (and in the general case, nearly impossible) to optimize. :(
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: Parrot Strong typing

2004-12-01 Thread Dan Sugalski
At 4:53 PM +1000 12/1/04, Cameron Zemek wrote:
[Yeah, I snipped the first question. It's early, and I've not had 
enough coffee :)]

Also could the Parrot VM be used effectively with strong typing languages.
Absolutely. At least some of the languages we're interested in, 
specifically perl 5 and perl 6, (I'm less sure about python, ruby, 
and tcl, though I'm pretty sure they are also strongly typed, at 
least in some circumstances) are very strongly typed.

Parrot, on the other hand, is not going to be well-suited at all to 
*static* typing, though since that's generally more a compiler thing 
than a runtime thing it's less of an issue. (Though if you're going 
to play well with others and the library it may be a bit of a 
headache, since many of the types will be I dunno, and we're likely 
not going to do much to keep code from swapping in  subs and methods 
at runtime with prototypes and type guarantees that are different 
from what they're replacing)

For those folks playing along at home, I'll take a bit to talk about 
the difference between strong/weak typing and static/dynamic typing. 
(I think I blathered on about this on my blog, but I can't be 
bothered to go look it up right now :)

A *strong* type system is one that doesn't allow you to violate the 
constraints you put on variables, while a *weak* type system does 
allow you to do so -- it is, basically, a measure of how badly you 
can lie to the compiler about the type of a variable. This is a 
continuum, so you'll find languages are strong-ish or weak-ish.

C, for example, is weakly typed. That is, while you tell the system 
that a variable is one thing or another (an int, or a float), you're 
perfectly welcome to treat it as another type. This is *especially* 
true of values you get to via pointers. For example, this snippet 
(and yes, it's a bit more explicit than it needs to be. Cope, you 
pedants :):

char foo[4] = "abcd";
printf("%i", *(int *)&foo[0]);
tells the C compiler that foo is a 4 character string with the value 
"abcd", but in the next statement we get a pointer to the start of 
the string, tell the compiler "No, really, this is a pointer to an 
int. Really!" and then dereference it as if the string "abcd" really 
*was* an integer. If C were strongly typed you couldn't do that.

Perl, on the other hand, is strongly typed. If you try this:
$foo = "abcd";
$bar = \$foo; # Get a reference to $foo
print $bar->[10]; # Treat $bar as if it were a reference to an array
Perl will yell at you, telling you that $bar isn't an array 
reference. If perl were weakly typed (like C is) it'd let you, but it 
doesn't.

And yeah, I'm throwing perl in here because it has what is reasonably 
considered a bizarre type system (it doesn't have integers or strings 
as types. It has "singular thing", "aggregate thing accessed via 
integer offset", and "aggregate thing accessed by name" as types, 
with a lot of autoconversion and context sensitive behaviour to give 
people's brains a twist) but it's still a strong one -- you're just 
not allowed to violate it.

Static vs dynamic typing, on the other hand, refers to how much 
knowledge you have at compile time about the types of your variables. 
For example, with Ruby (since I have a manual handy and wouldn't want 
to embarrass myself in python with Sam around :) code like:

s = object.bar(1,2,3)
is just fine -- this is the first time we use s, it's never been 
declared, and the compiler may well have no clue as to what 
object.bar returns (heck, it may not even exist at compiletime). We 
can only know the type of s at a point in time. C, on the other hand, 
is statically typed -- you must give the compiler a type for each 
variable. (Even if you lie about it later)

Strong/weak and static/dynamic typing can mix just fine. You can have 
a strong dynamically typed language (this'd be one where you don't 
know the type of a variable at compiletime, but once it gets a type 
it *keeps* that type) or a weakly statically typed language, like C, 
where you *must* give types at compile time to everything but can 
lie left and right about it at runtime.

(And thus endeth the rant :)
Anyway, Parrot'll do strong typing if you want it to, no big deal. 
PMCs are in complete control on assignment, so you can have all the 
strong types check to see what they're handed and pitch a fit at 
runtime if it's wrong.
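
A rough sketch of what I mean, in C -- every name and structure here 
is made up for illustration, it's not parrot's actual vtable layout, 
just the shape of the idea: the destination PMC's assignment vtable 
entry gets to look at what it's being handed and complain at runtime.

  #include <stdio.h>
  #include <stdlib.h>

  typedef struct PMC PMC;
  typedef struct {
      const char *type_name;
      void (*set_pmc)(PMC *dest, PMC *src);  /* runs on "dest = src" */
  } VTable;

  struct PMC {
      const VTable *vtable;
      long          int_val;
  };

  /* a "strict" integer type that refuses assignment from anything else */
  static void strict_int_set_pmc(PMC *dest, PMC *src) {
      if (src->vtable != dest->vtable) {
          fprintf(stderr, "can't assign %s to %s\n",
                  src->vtable->type_name, dest->vtable->type_name);
          exit(1);
      }
      dest->int_val = src->int_val;
  }

  static const VTable strict_int_vtable =
      { "StrictInteger", strict_int_set_pmc };

  int main(void) {
      PMC a = { &strict_int_vtable, 0 };
      PMC b = { &strict_int_vtable, 42 };
      a.vtable->set_pmc(&a, &b);   /* same type, so this one's fine */
      printf("a = %ld\n", a.int_val);
      return 0;
  }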
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: Parrot Strong typing

2004-12-01 Thread Dan Sugalski
At 2:47 PM + 12/1/04, Richard Jolly wrote:
On 1 Dec 2004, at 14:33, Matt Fowles wrote:
Strong typing can be more clearly seen in something like haskell
Will there be haskell on parrot? How easy/hard would that be?
Dunno if there will (though I'd love it) and it shouldn't be too 
hard. That'd be an interesting thing. (I've pondered, more than once, 
Prolog for parrot :)
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


What is and isn't up for grabs

2004-12-01 Thread Dan Sugalski
To help us stay focused on the ultimate goal, I'm going to pause and 
take a minute to clarify what is and isn't up for current discussion. 
"Current discussion" here meaning until we're functionally complete 
and pass a comprehensive test suite.

*) Unclear parts of the existing architecture are up for discussion.
*) Unspecified architecture parts are up for discussion.
*) Changes to the existing programmer-facing architecture aren't up 
for current discussion. This includes the parts of parrot's virtual 
architecture (things like register count, which stacks exist, whether 
PMCs have vtables, whether strings can hold both binary data and 
text, what ops exist) which are already specified.

*) Changes to the internal implementation for performance reasons 
aren't up for current discussion or re-implementation. "Performs 
slowly" isn't enough of a reason to rework parts of the internals now 
(though it will be later).

*) Bug fixes to the internal implementation are up for discussion and 
implementation.

*) Reworking parts of the implementation because newly specified 
features make the current implementation untenable is okay.

Basically, if it exists, leave it until we're functionally complete. 
When we get there we'll have a re-evaluation phase and rework things. 
It'll be clear when we get to that point. In the mean time, let's 
stick to buggy parts of the implementation, or parts that won't work 
with new architecture.

Discussion of things which aren't up for discussion will get a 
simple, good-humored reply: http://insert the perl.org archive link 
of this message.

The design isn't perfect. No design ever is. But we'll have a much 
better idea of what our real weaknesses are, and how to address them, 
once we're functionally complete. Getting functionally complete 
should be our current overriding goal.

(And for those who've noted that this is somewhat different than my 
normal postings, well... this is what happens when you get someone 
competent in the use of english language involved :)
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: continuation enhanced arcs

2004-11-30 Thread Dan Sugalski
At 1:45 AM -0800 11/29/04, Jeff Clites wrote:
On Nov 28, 2004, at 2:48 AM, Piers Cawley wrote:
I just thought of a heuristic that might help with register
preservation:
A variable/register should be preserved over a function call if either of the
following is true:
1. The variable is referred to again (lexically) after the function has
   returned.
2. The variable is used as the argument of a function call within the
   current compilation unit.
That doesn't solve it, though you'd think it would. Here's the 
counter-example:

x = 1
foo()
print x
y = 2
return y
You'd think that x and y could use the same memory location 
(register, variable--whatever), since ostensibly their lifetimes 
don't overlap. But continuation re-invocation can cause foo() to 
return multiple times, and each time it should print 1, but it 
won't if x and y use the same slot (it would print 2 each time 
after the first). In truth, their lifetimes do overlap, due to the 
hidden (potential) loops created by continuations.
Except... we've already declared that return continuations are 
special, and preserve the registers in the 16-31 range. So when we 
return from foo, regardless of how or how many times, the pointer to 
x's PMC will be in a register if it was in there before the call to 
foo, if it's in the preserved range. So in this case there's no 
problem. Things'll look like:

  x = 1 # new P16, .Integer; P16 = 1 # P16 has pointer value 0x04
  foo() # foo invocation
  print x # P16 still has pointer value 0x04
  y = 2 # new P16, .Integer; P16 = 2 # P16 now has pointer value 0x08
  return y # Passes back 0x08
With more or less clarity.
--
Dan
--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: continuation enhanced arcs

2004-11-30 Thread Dan Sugalski
At 2:51 PM +0100 11/29/04, Leopold Toetsch wrote:
Luke Palmer [EMAIL PROTECTED] wrote:
 It seems to me that there is no good solution to this problem without
 annotating the register set or killing the register allocator.
I think I've proposed a reasonable solution: putting lexicals in
registers.
I'm reading these threads a bit out of order so I think I'm missing 
some context here, but lexicals in registers works out fine as long 
as the backing pad is kept up to date. (And since PMCs are all 
reference types that's pretty easy) Lexicals *exclusively* in 
registers won't work -- the languages we're doing require pads.

If the big issue is spilling properly, lexicals could be removed 
entirely from the spill picture with proper PIR notation and some 
fleshing out of how we're handling pads. For example it'd be fine to 
do something like:

 .sub foo
   .lexicals x, y, z
   %1 = new Integer
   %2 = new String
   %3 = new Hash
   %1 = 12
   %2 = "foo"
   %3[%2] = %1
 .end
%nnn could work the same as $Pnnn notation but refer to slots in the 
current pad. Since these are always PMCs, and always have a 
functioning backing store, they can be dropped as need be and 
reloaded.

Originally they were going to be tied in heavily to the full array 
notation so lexicals wouldn't be required to take up registers -- 
code like:

   a =  b + c
could've been expressed by a single op:
  add lexbase[1], lexbase[2], lexbase[3]
and take a lot of pressure off the register allocator, but...
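
To make that concrete, here's a toy C rendering of the idea -- the 
types and names are invented, not parrot internals: the pad is just 
an array of PMC pointers, and an op like "add lexbase[1], lexbase[2], 
lexbase[3]" indexes it directly instead of tying up three registers.

  #include <stdio.h>

  typedef struct { long value; } PMC;   /* stand-in for a real PMC */

  typedef struct {
      PMC *slot[8];                     /* the lexical pad */
  } Pad;

  /* "add lexbase[d], lexbase[l], lexbase[r]" straight off the pad */
  static void op_add_lex(Pad *pad, int d, int l, int r) {
      pad->slot[d]->value = pad->slot[l]->value + pad->slot[r]->value;
  }

  int main(void) {
      PMC a = {0}, b = {2}, c = {3};
      Pad pad = { { NULL, &a, &b, &c } };
      op_add_lex(&pad, 1, 2, 3);        /* a = b + c */
      printf("%ld\n", pad.slot[1]->value);
      return 0;
  }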
--
Dan
--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: Namespace-sub invocation syntax?

2004-11-30 Thread Dan Sugalski
At 11:48 AM +0100 11/24/04, Leopold Toetsch wrote:
Luke Palmer [EMAIL PROTECTED] wrote:
 Should there be one for invoking a sub out of a namespace, say:

  .namespace [ "Baz" ]

 .sub quux
  [ "Foo", "bar" ]()
Looks a bit strange.
I think for this being explicit is fine:
$P1 = global ["Foo"], "bar"
$P1()
Class methods already have their namespaces. For subs we could do:
  .local pmc ns, ns_foo
  ns = interpinfo .CURRENT_NAMESPACE   # or .TOPLEVEL_NAMESPACE
  ns_foo = ns["Foo"]
  ns_foo.bar()
The namespace PMC provides the find_method() that's actually behind that
call. With the additional benefit that it's using the method cache too.
Hrm. No, I don't think this is the right way to go for this, and I 
don't think it ought to use the method cache. It'll certainly screw 
up code that does the sensible thing and looks to see if an object 
was passed to see if it was invoked as a method or sub.

Subs aren't methods, and shouldn't be invoked as such. They really 
*are* subs, and at best the invocation should be:

ns_foo["bar"]()
except we don't do that any more.
This should be a two step thing, doing a fetch and then sub invoke.
  ... If the former, how do we name our classes?
 Do we have to mangle those ourselves, or is there a way to put a class
  in a namespace?
This is turning out to be a more complex issue. Namespaces might not 
be the right answer here.
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: continuation enhanced arcs

2004-11-30 Thread Dan Sugalski
At 9:36 AM -0500 11/30/04, Matt Fowles wrote:
Dan~
On Tue, 30 Nov 2004 08:28:35 -0500, Dan Sugalski [EMAIL PROTECTED] wrote:
 At 1:45 AM -0800 11/29/04, Jeff Clites wrote:
 On Nov 28, 2004, at 2:48 AM, Piers Cawley wrote:
 
 I just thought of a heuristic that might help with register
 preservation:
 
 A variable/register should be preserved over a function call if 
either of the
 following is true:
 
 1. The variable is referred to again (lexically) after the function has
 returned.
 2. The variable is used as the argument of a function call within the
 current compilation unit.
 
 That doesn't solve it, though you'd think it would. Here's the
 counter-example:
 
x = 1
foo()
print x
y = 2
return y
 
 You'd think that x and y could use the same memory location
 (register, variable--whatever), since ostensibly their lifetimes
 don't overlap. But continuation re-invocation can cause foo() to
 return multiple times, and each time it should print 1, but it
 won't if x and y use the same slot (it would print 2 each time
 after the first). In truth, their lifetimes do overlap, due to the
 hidden (potential) loops created by continuations.

 Except... we've already declared that return continuations are
 special, and preserve the registers in the 16-31 range. So when we
 return from foo, regardless of how or how many times, the pointer to
 x's PMC will be in a register if it was in there before the call to
 foo, if it's in the preserved range. So in this case there's no
 problem. Things'll look like:
x = 1 # new P16, .Integer; P16 = 1 # P16 has pointer value 0x04
foo() # foo invocation
print x # P16 still has pointer value 0x04
y = 2 # new P16, .Integer; P16 = 2 # P16 now has pointer value 0x08
return y # Passes back 0x08
 With more or less clarity.
I think that the concern is for the circumstance where foo() promotes
its return continuation to a full continuation.  Then, that guarantee
is no longer provided (I think), and repeated invocation could leave y
in P16 rather than x.
Nope. The guarantee's still there. Promotion will force things up the 
call chain to get marked as un-recyclable, but registers still get 
restored on invocation.
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: Lexicals, continuations, and register allocation

2004-11-30 Thread Dan Sugalski
At 9:15 PM +0100 11/23/04, Leopold Toetsch wrote:
Below inline/attached are some thoughts WRT the subject.
leo
Lexicals, continuations, and register allocation
1) Recent discussions have shown that we obviously can't handle all
the side effects of continuations correctly. Reusing preserved
(non-volatile) registers after a call isn't possible any more, due to
loops in the CFG a continuation might create.
I admit, I've been watching the discussion and I just don't 
understand why there's a problem.

So far as I can tell there are two cases here:
1) A return continuation
2) A regular continuation
We've already established that a return continuation preserves the 
top half of the registers (though I think that's not been clear) so 
there's just no problem there -- at the point a return continuation 
returns, [IPSN]16-31 will be as they were when the return 
continuation was created. Which, in practice, makes the I and N 
registers useless for variables like loop counters, though fine for 
constants (both real and effective) So continuations, as such, don't 
make any difference here. A return is a return, and if it works right 
the first time it should work right all the times.

The second case is where code takes an arbitrary continuation that 
returns to location X, wherever X is. I'm missing the problem there, 
too. Assuming there's a way to note the destination to the register 
allocator (with those points being places where all registers must be 
assumed to be trash) I'm not seeing the problem either. There are 
only two cases here, where the destination is marked and where it 
isn't. If it's marked, the register allocator assumes things are 
dirty and loads everything, which is fine. If it's unmarked, the code 
has essentially shot itself and everything quietly goes to hell since 
you've lied to the register allocator and you shouldn't do that. 
Which is fine too. Don't Do That.

N.B. that this is an issue that only affects the PIR register 
allocator -- I'm not seeing a case where this can be an issue for 
anything else, including plain assembly. If I'm missing something 
this would be a good time to point out the missing bits.

There are two proposed, accepted but undesirable work arounds:
a) don't reuse registers - drawback spilling
b) refetch all from lexicals - drawback execution time
Before I comment on this one, I want to double-check -- you're 
proposing tossing the pads and going with a variable-sized register 
frame, yes?
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: continuation enhanced arcs

2004-11-30 Thread Dan Sugalski
At 10:10 AM -0500 11/30/04, Matt Fowles wrote:
Dan~
On Tue, 30 Nov 2004 09:49:54 -0500, Dan Sugalski [EMAIL PROTECTED] wrote:
 At 9:36 AM -0500 11/30/04, Matt Fowles wrote:
 Dan~
 
 
 On Tue, 30 Nov 2004 08:28:35 -0500, Dan Sugalski [EMAIL PROTECTED] wrote:
   At 1:45 AM -0800 11/29/04, Jeff Clites wrote:
 
 
   On Nov 28, 2004, at 2:48 AM, Piers Cawley wrote:
   
   I just thought of a heuristic that might help with register
   preservation:
   
   A variable/register should be preserved over a function call if
 either of the
   following is true:
   
   1. The variable is referred to again (lexically) after the 
function has
   returned.
   2. The variable is used as the argument of a function call within the
   current compilation unit.
   
   That doesn't solve it, though you'd think it would. Here's the
   counter-example:
   
  x = 1
  foo()
  print x
  y = 2
  return y
   
   You'd think that x and y could use the same memory location
   (register, variable--whatever), since ostensibly their lifetimes
   don't overlap. But continuation re-invocation can cause foo() to
   return multiple times, and each time it should print 1, but it
   won't if x and y use the same slot (it would print 2 each time
   after the first). In truth, their lifetimes do overlap, due to the
   hidden (potential) loops created by continuations.
 
   Except... we've already declared that return continuations are
   special, and preserve the registers in the 16-31 range. So when we
   return from foo, regardless of how or how many times, the pointer to
   x's PMC will be in a register if it was in there before the call to
   foo, if it's in the preserved range. So in this case there's no
   problem. Things'll look like:
 
  x = 1 # new P16, .Integer; P16 = 1 # P16 has pointer value 0x04
  foo() # foo invocation
  print x # P16 still has pointer value 0x04
  y = 2 # new P16, .Integer; P16 = 2 # P16 now has 
pointer value 0x08
  return y # Passes back 0x08
 
   With more or less clarity.
 
 I think that the concern is for the circumstance where foo() promotes
 it return continuation to a full continuation.  Then, that guarantee
 is no longer provided (I think), and repeated invocation could leave y
 in P16 rather than x.

 Nope. The guarantee's still there. Promotion will force things up the
 call chain to get marked as un-recyclable, but registers still get
 restored on invocation.
In that case, I am confused.  When does the guarantee NOT apply?
When a general-purpose continuation's taken. Return continuations are 
special case, designed to make sub calls and returns work as people 
expect in the face of the underlying CPS mechanism we have. (They 
are, if you like, a cheat) Exception continuations are also 
special-case continuations. As are sub and method PMCs when you get 
right down to it.

A general purpose continuation has two things attached to it. The 
first is a destination address -- where in the code to go when the 
continuation is invoked. The second thing attached is an environment 
-- the lexical pad pointer, namespace pointer, opcode library 
pointer, security information, top stack frame pointers, and so on. 
Basically everything in the interpreter structure outside of the 
internal bookkeeping and interpreter global information. (The context 
structure's misnamed. It's not the context that continuations care 
about)

When a continuation (or, really, anything) is invoked, parts of the 
interpreter struct are overwritten with information from the 
continuation, parts are left alone, and parts are effectively nulled 
out. Which bits do what depends entirely on the 
continuation/invokable, as each type saves/ignores/nulls different 
things.
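
In rough C terms -- all of these names are invented for the sketch, 
this isn't the real interpreter struct -- a general-purpose 
continuation is about this much: a destination plus a captured 
environment, and invoking it copies the saved environment back over 
the live interpreter state.

  #include <stdio.h>

  typedef struct {
      void *lexical_pad;     /* pad pointer */
      void *namespace_ptr;   /* current namespace */
      void *stack_top;       /* top stack frame pointers */
      /* ... opcode library, security info, and so on */
  } Environment;

  typedef struct {
      long        destination;  /* where in the bytecode to resume */
      Environment env;          /* snapshot taken at capture time */
  } Continuation;

  typedef struct {
      long        pc;           /* current opcode offset */
      Environment env;          /* the live environment */
  } Interp;

  static void invoke_continuation(Interp *interp, const Continuation *c) {
      interp->env = c->env;          /* restore the captured environment */
      interp->pc  = c->destination;  /* jump to the saved destination */
  }

  int main(void) {
      Interp interp = { 0, { NULL, NULL, NULL } };
      Continuation c = { 42, interp.env };   /* capture */
      invoke_continuation(&interp, &c);      /* resume at offset 42 */
      printf("resumed at pc = %ld\n", interp.pc);
      return 0;
  }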
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: Lexicals, continuations, and register allocation

2004-11-30 Thread Dan Sugalski
At 5:30 PM +0100 11/30/04, Leopold Toetsch wrote:
Dan Sugalski [EMAIL PROTECTED] wrote:
 At 9:15 PM +0100 11/23/04, Leopold Toetsch wrote:
Below inline/attached are some thoughts WRT the subject.
leo
Lexicals, continuations, and register allocation
1) Recent discussions have shown that we obviously can't handle all
the side effects of continuations correctly. Reusing preserved
(non-volatile) registers after a call isn't possible any more, due to
loops in the CFG a continuation might create.

 I admit, I've been watching the discussion and I just don't
 understand why there's a problem.

 So far as I can tell there are two cases here:

 1) A return continuation

 2) A regular continuation

 We've already established that a return continuation preserves the
 top half of the registers (though I think that's not been clear) so
 there's just no problem there --
That term "preserves" needs some more clarification. Some months ago we
used the savetop / restoretop opcodes. They made a copy of half of
the register frame and restored it on return. The caller and the
subroutine executed in the registers inside the interpreter
structure. We decided to give up this scheme in favour of the indirect
register addressing.
No, we didn't. We mandated that call/return preserved the top 16 
registers of all the types automatically. That's *all* we did. The 
rest is implementation detail.

  ... at the point a return continuation
 returns, [IPSN]16-31 will be as they were when the return
 continuation was created. Which, in practice, makes the I and N
 registers useless for variables like loop counters, though fine for
 constants (both real and effective)
Huh? You are now back with register copying?
No. I'm still on register preserving, which is where we started. 
Quoth PDD 03 (which, I'll note, you agreed with)

  The return continuation PMC type, used to create return
  continuations used for call/return style programming, guarantees
  that registers 16-31 for all types (P, N, S, and I) will be set
  such that the contents of those registers are identical to the
  content of the registers when the return continuation was
  *created*.
It's pretty clear, I think.
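
Crudely, in C (invented types, not parrot's actual register frame), 
that guarantee amounts to this: creating a return continuation 
snapshots the registers, and invoking it copies the 16-31 half of the 
snapshot back, no matter how often or from where it's invoked.

  #include <stdio.h>
  #include <string.h>

  typedef struct { void *P[32]; long I[32]; } Regs;  /* trimmed to P and I */

  typedef struct { Regs saved; } RetContinuation;

  static void capture(RetContinuation *rc, const Regs *live) {
      rc->saved = *live;                 /* snapshot at creation time */
  }

  static void invoke(const RetContinuation *rc, Regs *live) {
      /* restore only the preserved half, registers 16..31 */
      memcpy(&live->P[16], &rc->saved.P[16], 16 * sizeof live->P[0]);
      memcpy(&live->I[16], &rc->saved.I[16], 16 * sizeof live->I[0]);
  }

  int main(void) {
      Regs regs = { {0}, {0} };
      RetContinuation rc;
      regs.I[16] = 4;              /* "x" lives in a preserved register */
      capture(&rc, &regs);         /* taken at the point of the call */
      regs.I[16] = 8;              /* the callee scribbles on it */
      invoke(&rc, &regs);          /* the return puts the half back */
      printf("%ld\n", regs.I[16]); /* 4 again */
      return 0;
  }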
  N.B. that this is an issue that only affects the PIR register
 allocator
Yes. That is what we are discussing since weeks.
Just checking. The answer is no of course, but you knew that when 
you started this discussion. Architecture changes aren't an option 
we're entertaining until after we're functionally complete. *That*, 
I'm quite sure, I was clear on.
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: PIC again (was: Too many opcodes)

2004-11-30 Thread Dan Sugalski
[Snip]
This is interesting. After we're functionally complete we can revisit it.
--
Dan
--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: continuation enhanced arcs

2004-11-30 Thread Dan Sugalski
At 10:15 AM -0800 11/30/04, Jeff Clites wrote:
On Nov 30, 2004, at 5:28 AM, Dan Sugalski wrote:
At 1:45 AM -0800 11/29/04, Jeff Clites wrote:
On Nov 28, 2004, at 2:48 AM, Piers Cawley wrote:
I just thought of a heuristic that might help with register
preservation:
A variable/register should be preserved over a function call if 
either of the
following is true:

1. The variable is referred to again (lexically) after the function has
   returned.
2. The variable is used as the argument of a function call within the
   current compilation unit.
That doesn't solve it, though you'd think it would. Here's the 
counter-example:

x = 1
foo()
print x
y = 2
return y
You'd think that x and y could use the same memory location 
(register, variable--whatever), since ostensibly their lifetimes 
don't overlap. But continuation re-invocation can cause foo() to 
return multiple times, and each time it should print 1, but it 
won't if x and y use the same slot (it would print 2 each time 
after the first). In truth, their lifetimes do overlap, due to the 
hidden (potential) loops created by continuations.
Except... we've already declared that return continuations are 
special, and preserve the registers in the 16-31 range. So when we 
return from foo, regardless of how or how many times, the pointer 
to x's PMC will be in a register if it was in there before the call 
to foo, if it's in the preserved range. So in this case there's no 
problem. Things'll look like:

  x = 1 # new P16, .Integer; P16 = 1 # P16 has pointer value 0x04
  foo() # foo invocation
  print x # P16 still has pointer value 0x04
  y = 2 # new P16, .Integer; P16 = 2 # P16 now has pointer value 0x08
  return y # Passes back 0x08
With more or less clarity.
But the problem isn't preservation per se. When a continuation 
(originally captured inside of foo) is invoked, the frame will be 
restored with the register contents it had when it last executed, so 
P16 in your annotations will have pointer value 0x08 after the 
first time that continuation is invoked (because y = 2 will have 
executed and changed the register contents).
Oh. No, it won't. We've declared that return continuations will 
always leave the top half registers in the state they were when the 
return continuation was taken. In this case, when it's taken to pass 
into foo, P16 is 0x04. When that return continuation is invoked, no 
matter where or how many times, P16 will be set to 0x04. This does 
make return continuations slightly different from 'plain' 
continuations, but I think this is acceptable.

None of this should have anything to do with return continuations 
specifically, since this is the case where the body of foo (or 
something called from it) creates a real continuation, which as I 
understand it is supposed to promote the return continuations to 
real continuations all the way up the stack.
The return continuations have to maintain their returny-ness 
regardless, otherwise they can't be trusted and we'd need to 
unconditionally reload the registers after the return from foo(), 
since there's no way to tell whether we were invoked via a normal 
return continuation chain invocation, or whether something funky 
happened down deep in the call chain.
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: Lexicals, continuations, and register allocation

2004-11-30 Thread Dan Sugalski
At 7:20 PM +0100 11/30/04, Thomas Seiler wrote:
At Tue 30 Nov 6:22pm, Dan Sugalski wrote:
 Architecture changes aren't an option we're entertaining until after we're
 functionally complete.
Just would like to ask a related question:
Is a change that invalidates existing precompiled bytecode, but not
the source code it was compiled from,
considered an architecture change?
Nope. This'll happen as ops come, go (if they're fundamentally broken 
and/or stupid), and get shifted around. Changes to the packfile format 
and the basic PMCs'll do it too.

(At one point, packfiles contained a versioning scheme to catch
changes in the ABI.
 Don't know about current state, though)
It still does, pretty reliably. (Or, rather, completely reliably so 
far as I can tell. Which is very nice)

We've not yet declared full backward compatibility for bytecode 
files. That's likely not going to happen until after the first final 
release.
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: continuation enhanced arcs

2004-11-30 Thread Dan Sugalski
At 11:20 AM -0800 11/30/04, Jeff Clites wrote:
On Nov 30, 2004, at 10:27 AM, Dan Sugalski wrote:
At 10:15 AM -0800 11/30/04, Jeff Clites wrote:

None of this should have anything to do with return continuations 
specifically, since this is the case where the body of foo (or 
something called from it) creates a real continuation, which as 
I understand it is supposed to promote the return continuations to 
real continuations all the way up the stack.
The return continuations have to maintain their returny-ness 
regardless, otherwise they can't be trusted and we'd need to 
unconditionally reload the registers after the return from foo(), 
since there's no way to tell whether we were invoked via a normal 
return continuation chain invocation, or whether something funky 
happened down deep in the call chain.
Yeah, so I think that won't work correctly.
Ah, you see, that's where the Cunning Plan comes in. :)
 Here's an example from Ruby which I posted in a previous thread. If 
the return from the call to strange() by outer() always restores the 
registers as of the point the (return) continuation was created, 
then the below would print out a = 1 over and over, but really 
it's intended that the value should increase, so with the behavior 
you describe, the following Ruby code wouldn't work right:
But it will, you see. All that happens is that the PMC register that 
holds a reference to a continues to hold a reference to a. The 
*variable* that the register refers to is constant. The value, 
though, isn't, and will change over time as it ought to. This isn't 
any different than the way we're handling globals and lexicals -- 
something like, assuming a_variable has an initial value of 0:

foo = global "a_variable"
foo = foo + 1
bar = global "a_variable"
print bar
print foo
this'd print "1 1".
In this example:
% cat continuation6.ruby
def strange
callcc {|continuation| $saved = continuation}
end
def outer
a = 0
strange()
a = a + 1
print "a = ", a, "\n"
end
Through the joys of reference types, a will continue to increase 
forevermore, assuming the compiler hasn't incorrectly put a in an int 
register. (Which'd be wrong) Remember the PMC and string registers 
hold pointers to pmc/string structures, which is all we're preserving 
-- the *pointer* to the structure, not the contents of the structure. 
(Though that's an interesting thing to do for other reasons, like, 
say, transactions, we're not going to be doing that) The contents can 
change over and over without the register itself ever changing.

# these two lines are main
outer()
$saved.call
% ruby continuation6.ruby
a = 1
a = 2
a = 3
a = 4
a = 5
a = 6
a = 7
a = 8
a = 9
a = 10
...infinite loop, by design
JEff

--
Dan
--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: continuation enhanced arcs

2004-11-29 Thread Dan Sugalski
At 11:06 AM -0500 11/29/04, Matt Fowles wrote:
All~
On Mon, 29 Nov 2004 14:51:43 +0100, Leopold Toetsch [EMAIL PROTECTED] wrote:
 Luke Palmer [EMAIL PROTECTED] wrote:
  It seems to me that there is no good solution to this problem without
  annotating the register set or killing the register allocator.
 I think I've proposed a reasonable solution: putting lexicals in
 registers.
I would appreciate it if Dan (who I cc'ed this to directly) would
weigh in on this thread.  I think that we have present most of the
options as clearly as we can, and a decision on how to move forward
would be appreciated.
I expect I will in a bit -- work's got me backed up and I've got 
too-big a backlog of perl6-internals mail to read. (This isn't a good 
time to slip things in past me, though :) I'll try and dig through 
this thread later today.
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Objects, classes, metaclasses, and other things that go bump in the night

2004-11-29 Thread Dan Sugalski
 them as a sub-object, and 
will have the master composing object as a property. We can get into 
how this is handled later. The important thing here is that we are 
going to have a flag bit that says "I'm only part of an object", and 
whenever a method call is made on a sub-object parrot will 
automatically look for the 'master' object and make the method call 
on it instead.

Parrot's object system is going to be generally class-based -- that 
is, each object has a class that's responsible for managing the 
object. Classes, being objects, themselves have classes, but past 
that we don't look too closely or we'll end up getting bitten by the 
snake. (No, not that snake, the other one) This does mean that 
prototype-based object systems are going to be kinda annoying, but if 
you do things right then the object you're using as a prototype is 
your class anyway, so it ought to work out OK. I think. Hopefully, at 
least, since prototype-based object systems aren't our main interest.

A class is responsible for instantiating objects, providing basic 
information, providing subclasses of itself, merging in with another 
class for multiple inheritance, and managing methods. Basically we 
access the class object when we want to subclass a class, add a class 
into another class' inheritance hierarchy,  make new objects, and 
mange a namespace. (I think. we might go with classes managing their 
own methods, or may just declare that classes are going to find their 
methods in a namepsace of the same name so had darned well better go 
look there. I'm as yet somewhat undecided -- selector namespaces are 
awfully tempting, and there's MMD to consider)

So. What does a class need to be able to do? I'm thinking the 
following basic methods (the names are up in the air):

subclass - To create a subclass of a class object
add_parent - To add a parent to the class this is invoked on
become_parent - Called on the class passed as a parameter to add_parent
class_type - returns some unique ID or other so all classes in one
 class family have the same ID
instantiate - Called to create an object of the class
add_method - called to add a method to the class
remove_method - called to remove a method from a class
namespace_name - returns the name of this class' namespace
All objects also must be able to perform the method:
get_anonymous_subclass - to put the object into a singleton anonymous
 subclass
I can see adding some information-fetching methods for introspection 
to this list, so I'm up for those too.

Some fallout you might not have thought about: Objects are ultimately 
responsible for handling method finding, which means that each object 
controls the search order of its parent classes when looking for 
methods, so if you want to fiddle with that, it's fine. It also means 
you can mess around with the class hierarchy as it appears to 
external code (fibbing about who you are, or aren't) and suchlike 
things.

Now, there is one big gotcha here -- multimethod dispatch. I'm not 
entirely sure that we can do this properly and still give classes 
full control over how methods are looked for and where they go. I 
*think* it can be done, but I'm not feeling clever enough to see how 
without having a relatively costly double lookup on every method call 
(first to see if there's a registered MMD method for the named 
method, then the regular dispatch if there's not), so I'm not sure we 
will. Efficient MMD wins, if we can make it look like 
perl/python/ruby/tcl's method lookup rules are in force, even if they 
really aren't under the hood.
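
For the record, the double lookup I'm worried about looks roughly 
like this in C -- the tables and names are all hypothetical, it's 
just the shape of the cost, one extra lookup on every call:

  #include <stdio.h>
  #include <string.h>

  typedef void (*Method)(void);

  /* step 1: is there a registered MMD method under this name? */
  static Method find_mmd_method(const char *name) {
      (void)name;
      return NULL;               /* pretend nothing's registered */
  }

  /* step 2: the regular per-class lookup */
  static void regular_method(void) { puts("regular dispatch"); }

  static Method find_class_method(const char *name) {
      return strcmp(name, "frob") == 0 ? regular_method : NULL;
  }

  static Method dispatch(const char *name) {
      Method m = find_mmd_method(name);  /* first lookup, every call */
      if (m)
          return m;
      return find_class_method(name);    /* second lookup if no MMD hit */
  }

  int main(void) {
      Method m = dispatch("frob");
      if (m)
          m();
      return 0;
  }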
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: eof opcode

2004-11-29 Thread Dan Sugalski
At 12:23 PM -0500 11/29/04, brian wheeler wrote:
Fair enough.  However, shouldn't the rest of the opcodes with an IO
object as their parameter be methods as well?  It's not a lot of ops, but
it would trim down the core a bit.
They should -- it'll make it easier to abstract things out later when 
people start wanting to do bizarre things with pseudo-filehandles.

On Thu, 2004-11-25 at 08:00 +0100, Leopold Toetsch wrote:
 Brian Wheeler [EMAIL PROTECTED] wrote:
  I noticed a hole in the io.ops where the PIO stuff wasn't covered.  This
  patch creates an eof opcode which checks for end of file.
 Please just use the eof method of the PIO object:
$I0 = $P0.eof()
--
Dan
--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: [CVS ci] opcode cleanup 1 - minus 177 opcodes

2004-11-29 Thread Dan Sugalski
At 8:29 AM +0100 11/28/04, Leopold Toetsch wrote:
Thomas Seiler [EMAIL PROTECTED] wrote:
 Dan Sugalski wrote:
 At 10:34 AM +0100 11/27/04, Leopold Toetsch wrote:
 See also subject Too many opcodes.
   [...]
  
 Could you undo this please? Now is not the time to be trimming ops out.
When is the time? After another 1000 opcodes are in, which all ought to
be functions?
Yes. Y'know, when we start doing the optimization based on a fully 
designed and implemented engine. Anything before that's premature. 
(Shall I go dig up a half dozen or more archive references with you 
chiding me for premature optimizations?)

  OTOH, it won't hurt anyone and it is already in.
That's my point.
Then your point's wrong. This patch broke a lot of my code.
You keep wanting to chop things out of the core. Stop. That's not 
your call -- it's mine, and it will be made, but not now.

Put these back.
--
Dan
--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: Too many opcodes

2004-11-29 Thread Dan Sugalski
At 9:20 AM +0100 11/24/04, Leopold Toetsch wrote:
Too many opcodes
Bluntly, no. Not too many opcodes.
This has been an ongoing issue. I'm well aware that you've wanted to 
trim down the opcode count for ages and replace a lot of them with 
functions with a lightweight calling convention. Well, we already 
*have* that. We call them (wait for it) *opcodes*. That's one of the 
really big points of all this. You're micro-optimizing things, and 
you're not going the right way with it.

Yes, I'm well aware that the computed goto and switch cores are big, 
and problematic. The answer isn't to reduce the op count. The 
answer's to make the cores manageable, which doesn't require tossing 
ops out. It requires being somewhat careful with what ops we put *in*.

It's perfectly fine for a good chunk of the ops to not be in the main 
switch or cgoto loop, and have to be dispatched as indirect 
functions, the same as any opcode function from a loadable opcode 
library is. (Hell, some of these can go into a loadable opcode 
library if we want, to make sure the infrastructure works, including 
the packfile metadata that indicates which loadable op libraries need 
to be loaded) I'm also fine with making some of the ops phantom 
opcodes, ones that the assembler quietly rewrites. That's fine too, 
and something I'd like to get in.

So, short answer: Ops aren't going away.
Longer answer: We need to add in the following facilities:
 1) Op functions tagged (either in their definitions for all 
permutations, or in the ops numbering metadata file for individual 
functions) as to whether they're in the core loop or not. Ones that 
aren't hit the switch's default: case (and the cgoto core's 
equivalent, and the JIT's perfectly capable of handling this too) and 
get dispatched indirectly.

 2) The assembler and PIR compiler need to be taught appropriate 
transforms, which then *could* allow for "add N2, I3, N3" to be 
turned into "add N2, N3, I3" if we decide that in commutative IxN ops 
it's OK to make them NxI and so on. (Comparisons too, up to a point 
-- we can't do this with PMCs)

 3) The loadable opcode library stuff needs to be double-checked to 
make sure it works right, so we can create loadable libraries and 
actually load them in

 4) The metadata in packfiles to indicate which loadable opcode 
libraries are in force for code in each segment needs to be 
double-checked to make sure it works right

 5) The ops file to C converter needs to have a knockout list so we 
can note which combinations aren't supported (and believe me, I fully 
plan on trimming hard, but only *after* we're functionally complete) 
or, if we'd rather, it can respect the ops numbering list and just 
not generate ops not on it.

Once this is done the only difference between 'real' opcodes and 
fixed-arg low-level functions is which are in the switch/cgoto/jit 
cores and which aren't, something that should be transparent to the 
bytecode and tunable as we need to. Which is as it should be.
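
A skeletal C version of point 1, just so it's clear what I mean -- 
the opcodes and the table are made up, this isn't parrot's dispatch 
code: the hot ops live in the switch, and everything else falls 
through to default: and gets dispatched indirectly via the op table.

  #include <stdio.h>

  typedef long opcode_t;
  typedef const opcode_t *(*op_func_t)(const opcode_t *pc);

  enum { OP_NOOP, OP_END, OP_FANCY_MATH };   /* pretend opcode numbers */

  static const opcode_t *op_fancy_math(const opcode_t *pc) {
      puts("fancy math, dispatched indirectly");
      return pc + 1;
  }

  static const op_func_t op_table[] = { NULL, NULL, op_fancy_math };

  static void run(const opcode_t *pc) {
      for (;;) {
          switch (*pc) {
          case OP_NOOP:                      /* in the core loop */
              pc++;
              break;
          case OP_END:
              return;
          default:                           /* everything else */
              pc = op_table[*pc](pc);        /* indirect dispatch */
              break;
          }
      }
  }

  int main(void) {
      static const opcode_t program[] = { OP_NOOP, OP_FANCY_MATH, OP_END };
      run(program);
      return 0;
  }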

The list of opcode functions is going to grow a lot, and there's 
really no reason that it shouldn't. With proper infrastructure there 
just isn't any need for there to be a difference between opcode 
functions and library functions.
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: [CVS ci] opcode cleanup 1 - minus 177 opcodes

2004-11-29 Thread Dan Sugalski
At 8:32 PM -0500 11/29/04, Michael Walter wrote:
There is also such a thing as premature pessimization. I'm not in the
position to judge whether it is appropriate in this case, though.
Oh, absolutely. In this case the issues are personal taste (Leo 
doesn't like the big list) and issues with specific inefficiencies in 
the way we've got some of the automated infrastructure being built. 
While there are pessimal things that need fixing, it's important to 
look at the things that are broken, not the things that we don't like.

It would've been clearer had I taken things in front-to-back order, 
but I didn't -- there's a longish message that came after this one 
explaining what needs to be done.

On Mon, 29 Nov 2004 20:25:48 -0500, Dan Sugalski [EMAIL PROTECTED] wrote:
 At 8:29 AM +0100 11/28/04, Leopold Toetsch wrote:
 Thomas Seiler [EMAIL PROTECTED] wrote:
   Dan Sugalski wrote:
   At 10:34 AM +0100 11/27/04, Leopold Toetsch wrote:
 
   See also subject Too many opcodes.
 
 [...]

   Could you undo this please? Now is not the time to be trimming ops out.
 
 When is the time? After another 1000 opcodes are in, which all ought to
 be functions?
 Yes. Y'know, when we start doing the optimization based on a fully
 designed and implemented engine. Anything before that's premature.
 (Shall I go dig up a half dozen or more archive references with you
 chiding me for premature optimizations?)
OTOH, it won't hurt anyone and it is already in.
 
 That's my point.
 Then your point's wrong. This patch broke a lot of my code.
 You keep wanting to chop things out of the core. Stop. That's not
 your call -- it's mine, and it will be made, but not now.
  Put these back.
--
Dan
--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: Too many opcodes

2004-11-29 Thread Dan Sugalski
At 8:46 PM -0500 11/29/04, Dan Sugalski wrote:
It requires being somewhat careful with what ops we put *in*.
And since I wasn't clear (This stuff always obviously makes little 
sense only after I send things...), I meant in the switch/cgoto/jit 
core loop, not what ops are actually ops.
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Last string checkin

2004-11-27 Thread Dan Sugalski
Okay, I've made the final checkin to the string branch, one that 
SKIPs the tests that fail because of the (very temporary, I hope) 
lack of unicode support.

The branch is ready to get merged into the trunk and we can deal with 
the fallout from that. Unfortunately this is *not* quite as finished 
as I'd like it to be, but it does work and things are stubbed in 
enough to let folks with more time fill in the blanks. Basically 
everything works except the dynamic loading of encoding and charset 
libraries -- right now the basic ones are yanked in at compile-time. 
The string.c guts likely have more knowledge of things than they 
ought to as well, but that's not going to get shaken out until we 
have a real variable-width encoding and multi-codepoint charset 
working. (Like, say, unicode... :)

Anyway, I'd like to get this merged in. If a few folks could try and 
give it a whirl I'd much appreciate it.
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: EcmaScript

2004-11-27 Thread Dan Sugalski
At 10:48 PM +0100 11/27/04, liorean wrote:
On Sat, 27 Nov 2004 21:11:07 +, Herbert Snorrason
[EMAIL PROTECTED] wrote:
 On Sat, 27 Nov 2004 21:43:01 +0100, liorean [EMAIL PROTECTED] wrote:
  Are there any projects to create an implementation of 
EcmaScript/JavaScript that will run on top of parrot?
 I believe not. That's really something that should get done, though...
CLR, JVM and C/C++ implementations exist. As parrot is supposed to be
better for dynamic languages, I guess EcmaScript 3.0 would fit right
in with parrot.
I have only the smallest knowledge of other languages (have made some
tries at Scheme and Ruby, but I don't really feel comfortable with
them), but I have used JavaScript since first introduced in nn2 and
I'd love to contribute. Could one write an initial compiler in
JavaScript, compile from SEE or SpiderMonkey and then run the compiler
on that implementation?
Absolutely. Compilers do *not* have to be integrated in with parrot 
-- my current work project uses Parrot as its back end, but the 
compiler's written in perl as a standalone program. Works just fine. 
(Though a Javascript compiler written in Javascript could bootstrap 
itself pretty nicely. That'd be cool... :)
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: [CVS ci] opcode cleanup 1 - minus 177 opcodes

2004-11-27 Thread Dan Sugalski
At 10:34 AM +0100 11/27/04, Leopold Toetsch wrote:
See also subject Too many opcodes.
* other VMs might already have a negative opcode count w this change ;)
* there are 3 incompatible changes: see ABI_CHANGES
* all other removed opcodes get replaced with equivalent ops
* opcodes got renumbered and shuffled - please recompile your PBCs [1]
* ops/ops.num is now the ultimate source of valid opcodes, except for
  opcodes in ops/experimental.ops, which are included in core_ops*.c
  nevertheless
Could you undo this please? Now is not the time to be trimming ops 
out. We can do that later, closer to release, if we choose to do it 
at all.
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: [perl #32418] Re: [PATCH] Register allocation patch - scales better to more symbols

2004-11-23 Thread Dan Sugalski
At 11:35 AM +0100 11/17/04, Leopold Toetsch wrote:
Dan Sugalski wrote:
Okay. I'll apply it and take a shot. May take a few hours to get a 
real number.
How does it look? Any results already?
Okay, got some time this morning. Two of the patch hunks were already 
in, so I skipped 'em. The results... not good.

The parrot I have, which is a day or two out of date, takes 7m to 
churn through one of my pir files. With this patch, I killed the run 
at 19.5 minutes.
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: [perl #32418] Re: [PATCH] Register allocation patch - scales better to more symbols

2004-11-23 Thread Dan Sugalski
At 3:54 PM +0100 11/23/04, Leopold Toetsch wrote:
Dan Sugalski wrote:
The parrot I have, which is a day or two out of date, takes 7m to 
churn through one of my pir files. With this patch, I killed the 
run at 19.5 minutes.
Sh... That's one of the smaller ones I presume.
Nope, one of the biggest. Sixth largest, at 800KB of pir code.
 How many basic blocks and variables are listed with -v?
sh-2.05a$ ~/src/parrot/parrot -v -o forms/shipper.pbc forms/shipper.imc
debug = 0x0
Reading forms/shipper.imc
using optimization '0' (0)
Starting parse...
build_reglist: 9941 symbols
allocate_non_interfering, now: 3167 symbols
sub _MAIN:
registers in .imc:   I2891, N0, S912, P6115
0 labels, 0 lines deleted, 0 if_branch, 0 branch_branch
0 used once deleted
0 invariants_moved
registers needed:I3597, N0, S962, P6207
registers in .pasm:  I32, N0, S32, P32 - 271 spilled
5679 basic_blocks, 47459 edges
build_reglist: 1342 symbols
allocate_non_interfering, now: 1003 symbols
sub __Internal_Startup:
registers in .imc:   I25, N0, S0, P1290
0 labels, 0 lines deleted, 0 if_branch, 0 branch_branch
0 used once deleted
0 invariants_moved
registers needed:I34, N0, S3, P2828
registers in .pasm:  I31, N0, S7, P32 - 476 spilled
650 basic_blocks, 663 edges
Past that there are a mass of very small subs that don't spill.
This is interesting, though. I'd not looked at the numbers too 
closely, and I've been assuming that the _MAIN sub was the real cause 
of the register coloring code going insane, but the internal_startup 
code doesn't look too good there either. I think I'll go see what 
things look like for the big evil program.
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: [perl #32418] Re: [PATCH] Register allocation patch - scales better to more symbols

2004-11-23 Thread Dan Sugalski
At 4:02 PM +0100 11/23/04, Leopold Toetsch wrote:
Dan Sugalski wrote:
The parrot I have, which is a day or two out of date, takes 7m to 
churn through one of my pir files. With this patch, I killed the 
run at 19.5 minutes.
One more note: be sure to compile Parrot optimized - the new 
reg_alloc.c has some very expensive sanity checks in debug mode, or 
better if NDEBUG isn't defined.
I can't. My dev machine's running gcc 2.95.4, and gcc throws lisp 
error messages compiling the switch core if I turn on optimizations.
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: phantom core files

2004-11-23 Thread Dan Sugalski
At 10:15 AM -0500 11/23/04, Matt Fowles wrote:
All~
Aperiodically I notice that my parrot directory has quite a few core
files in it (usually around 6), even though I have done nothing with
it except cvs -q update -dP; [make realclean;perl Configure.pl;]
make; make [full]test.  Usually it says that all of the tests pass or
one or two tests fail, but I don't notice it dying painfully...  Are
these core files anything of significance?
Yeah, they are. You shouldn't ever get core files if things work out 
properly.
Pull 'em into gdb (gdb ./parrot core_file_name) and poke around -- 
that'll give you an idea what's going on and why they died.

GDB's got good help, and I don't have a handy core file to poke 
around with, so I'm not sure how to get the command-line params out 
of the core file. After loading it up, though, a bt will show you 
the call stack at the time things died.
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: [perl #32418] Re: [PATCH] Register allocation patch - scales better to more symbols

2004-11-23 Thread Dan Sugalski
At 5:40 PM +0100 11/23/04, Leopold Toetsch wrote:
Dan Sugalski [EMAIL PROTECTED] wrote:
 I can't. My dev machine's running gcc 2.95.4, and gcc throws lisp
 error messages compiling the switch core if I turn on optimizations.
You could try:
- perl Configure.pl --optimize
- make -s
- wait a bit until first files start compiling
- interrupt it
- put #if 0 / #endif around the big switch
- make -s
- don't run parrot -S ;)
*But*, I've looked again at the new reg_alloc.c code. It seems to have a
piece of code with cubic order in registers, which is for sure killing
all performance advantage it has for a few hundreds of symbols.
So the "scales better to more symbols" has some limits when "more"
reaches 10K ;)
I'll hold off then. I can't picture anything that -O3 could do that 
wouldn't get swamped by a cubic time algorithm.
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: [perl #32418] Re: [PATCH] Register allocation patch - scales better to more symbols

2004-11-23 Thread Dan Sugalski
At 9:17 AM -0800 11/23/04, Bill Coffman wrote:
Wait, I just thought of a huge change.
Dan, Does the patch you have implement Leo's U_NON_VOLATILE patch?
It was the patch originally attached to this ticket, over a stock 
parrot from CVS. If there's something else to try let me know -- I'm 
all for it. :)

 If
so, that restricts from 32 to 16 registers, in various cases for
*non-volatile* symbols (did I get that right?).  Anyway, the symbols
that cross sub calls can only use 16 registers, where they used 32
before.  That could have a huge effect in that more variables are
spilling.
Well, if you don't have that patch, then back to the drawing board.
~Bill
On Tue, 23 Nov 2004 11:55:47 -0500, Dan Sugalski [EMAIL PROTECTED] wrote:
 At 5:40 PM +0100 11/23/04, Leopold Toetsch wrote:
 *But*, I've looked again at the new reg_alloc.c code. It seems to have a
 piece of code with qubic order in registers, which is for sure killing
 all performance advantage it has for a few hundreds of symbols.
 
 So the scales better to more symbols has some limits when more
 reaches 10K ;)
 I'll hold off then. I can't picture anything that -O3 could do that
  wouldn't get swamped by a cubic time algorithm.
--
Dan
--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


Re: [perl #32418] Re: [PATCH] Register allocation patch - scales better to more symbols

2004-11-23 Thread Dan Sugalski
At 12:27 PM -0500 11/23/04, Dan Sugalski wrote:
At 9:17 AM -0800 11/23/04, Bill Coffman wrote:
Wait, I just thought of a huge change.
Dan, Does the patch you have implement Leo's U_NON_VOLATILE patch?
It was the patch originally attached to this ticket, over a stock 
parrot from CVS. If there's something else to try let me know -- I'm 
all for it. :)
I should point out that stock parrot does, amazingly, manage to 
compile the Evil Program. And from the -v output for the first sub:

build_reglist: 33064 symbols
allocate_non_interfering, now: 8312 symbols
sub _MAIN:
registers in .imc:   I9980, N0, S2895, P20164
0 labels, 0 lines deleted, 0 if_branch, 0 branch_branch
0 used once deleted
0 invariants_moved
registers needed:I9988, N0, S2900, P20290
registers in .pasm:  I31, N0, S31, P32 - 37 spilled
14722 basic_blocks, 271989 edges
That's quite a feat. And yes, that's 270K edges. It's no wonder this 
thing takes 3G of RAM to build... (I really need to see about 
reducing the edge count there)
--
Dan

--it's like this---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk


  1   2   3   4   5   6   7   8   9   10   >