Re: Musing on registerable event handlers for some specific events

2005-06-09 Thread Nigel Sandever
On Wed, 8 Jun 2005 18:57:30 -0700, [EMAIL PROTECTED] (Larry Wall) wrote:
 On Wed, Jun 08, 2005 at 11:04:30PM +0300, Gaal Yahas wrote:
 : On Wed, Jun 08, 2005 at 12:29:33PM -0700, Larry Wall wrote:
 : To take a notorious example, you mentioned fork() -- if this event manager
 : becomes part of Perl6, does that mean we're required to emulate fork()
 : on win32?
 
 Perl 5 manages it, and Perl 6 is expected to emulate Perl 5 when fed
 Perl 5 code.  It's one of the reasons Parrot is aiming to support an
 ithreads model in some fashion or other, I expect.  But it's okay if
 the Pugs interpreter punts on this for now.
 

If the only reason for using the ithreads model is to allow fork() to be 
emulated using the same mechanism as is used in P5 -- please don't. 

The reason for supporting fork() is (mostly) to allow unix fork() & exec() 
idioms to operate on Win32, but mostly they don't work anyway, because Win32 
doesn't support the other components (signals, SIG_CHLD etc.; COW) required to 
allow those idioms to run transparently. 

The p5 fork() emulation is barely usable, and has so many caveats that there 
will never be the possibility of transparent portability of programs that use 
fork() to Win32. It will always be necessary to test for platform and make 
special provisions.
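That pattern, a platform test plus a special provision, can be sketched in 
Python (standing in for Perl here; C<run_and_wait> is a hypothetical helper 
name): the real fork()/exec()/wait() idiom on POSIX, a plain spawn on Win32, 
which is all the thread-based emulation achieves anyway:

```python
import os
import subprocess
import sys

def run_and_wait(argv):
    """Hypothetical helper: the Unix fork()/exec()/wait() idiom, with the
    platform-specific special provision the text says is always needed."""
    if sys.platform != "win32":
        pid = os.fork()
        if pid == 0:                      # child: replace the process image
            os.execvp(argv[0], argv)
        _, status = os.waitpid(pid, 0)    # parent: just wait for the child
        return os.waitstatus_to_exitcode(status)
    # Win32: no real fork(), so spawn directly -- no interpreter clone,
    # no wasted copy of code and data.
    return subprocess.call(argv)

assert run_and_wait([sys.executable, "-c", "import sys; sys.exit(7)"]) == 7
```

The POSIX branch is exactly what the emulation mimics with a thread; the Win32 
branch shows how little of the copied state is ever needed.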

And the only instances where the fork emulation does work reasonably well are 
those that are doing fork() & exec(). But think about that. The emulation 
spawns a thread, duplicates all the code and all the data from the main thread, 
and then... starts a new process. All that copied code and data is never used, 
because all the spawned thread does is sit and wait for the new process to die.

Other uses of fork(), like implementing alarm(), also don't work on Win32.

Cygwin manages to perform a proper fork(). The code isn't even that complicated.

 Larry





[PATCH] Fix 3 of the spawnw.t failures.

2005-06-03 Thread Nigel Sandever
Further thoughts on the questions in comments invited.


njs


win32-exec.c.patch
Description: Binary data


Re: [PATCH] Fix 3 of the spawnw.t failures.

2005-06-03 Thread Nigel Sandever
Apologies for the wrong list. Should I resend to the correct one?
njs





[PATCH] README.Win32.patch

2005-06-02 Thread Nigel Sandever
This patch informs win32 users that nmake v1.5 is not capable of building 
Parrot.


READM.win32.patch
Description: Binary data


Parrot makefile on Win32

2005-05-31 Thread Nigel Sandever
The parrot makefile has several places where nmake baulks at the length of the 
expanded command lines.

I've found that I can work around this in some places using inline files, but 
I'm having trouble working out where/how to make the adjustments. 

I also have my doubts whether this would be compatible with other make programs.

Is anyone else successfully building parrot on win32 native? If so, how are 
they avoiding this problem? 

Thanks njs




Re: Parrot makefile on Win32

2005-05-31 Thread Nigel Sandever
On Tue, 31 May 2005 07:07:28 -0700, [EMAIL PROTECTED] (Jerry Gay) wrote:
 On 5/31/05, Nigel Sandever [EMAIL PROTECTED] wrote:
  The parrot makefile has several places where nmake baulks at the length of
  the expanded command lines.
 though you weren't explicit, i suspect you're using the ms c++ toolkit
 to build parrot on win32. some months ago, i ran into the same
 problem. since i have switched to msvc, i have not run into any
 command-line length problems, nor have i read any reports of these
 problems with cygwin or mingw.


jerry,

That was the clue-bat I needed. A bug in the pugs makefile where it was looking 
for 'nmake' rather than 'nmake.exe' caused it to download nmake v1.5. 

The ordering in my path meant that it was being found before nmake v7 (or v8).

Deleting it allowed v7 to be found, and Parrot now builds correctly.

It might be worth mentioning this nmake version dependency in readme.win32, 
where it also suggests getting nmake v1.5.

 
 ~jerry

Thanks for your help.

njs.






Re: Sun Fortress and Perl 6

2005-04-27 Thread Nigel Sandever
On 27 Apr 2005 08:21:27 -, [EMAIL PROTECTED] (Rafael Garcia-Suarez) 
wrote:
 Autrijus Tang wrote in perl.perl6.language :
 
  4. Software Transaction Memory
 
  Like GHC Haskell, Fortress introduces the `atomic` operator that takes a
  block, and ensures that any code running inside the block, in a
  concurrent setting, must happen transactionally -- i.e. if some
  precondition is modified by another thread before this block runs
  completely, then it is rolled back and tried again.  This slides covers
  some aspects of STM as used in GHC:
 
  http://homepages.inf.ed.ac.uk/wadler/linksetaps/slides/peyton-jones.ppt
 
  In Fortress, there is also an `atomic` trait for functions, that
  declares the entire function as atomic.
 
 Interesting; and this rolling-back approach avoids the deadlock issues
 raised by the use of semaphores (like in Java's synchronization
 approach).
 
 But, as the slides point out, you can't do I/O or syscalls from an
 atomic function; and in Haskell you can ensure that with the wise use of
 monads. Perl 6 has no monads, last time I checked...
 

For an alternative approach to concurrency control, one that sets out to be a 
possible future standard, is specifically designed to address the shortcomings 
of Java's semaphores, is in the public domain, has already been ported to 
several platforms in several languages, and is known to be implementable on 
both linux and win32, please see 

http://gee.cs.oswego.edu/dl/cpjslides/util.pdf

for the potted overview, and 

http://gee.cs.oswego.edu/dl/classes/EDU/oswego/cs/dl/util/concurrent/intro.html

for a fairly comprehensive examination.

Perl 6/Parrot probably doesn't need everything there, but it might form the 
basis for them.

njs




Re: Thunking semantics of :=

2005-04-24 Thread Nigel Sandever
On Sat, 23 Apr 2005 21:00:11 -0700, [EMAIL PROTECTED] (Larry Wall) wrote:
 On Sun, Apr 24, 2005 at 03:37:23AM +, Nigel Sandever wrote:
 : On Sun, 24 Apr 2005 03:47:42 +0800, [EMAIL PROTECTED] (Autrijus Tang) 
wrote:
 :  
 :  Oh well.  At least the same code can be salvaged to make iThreads
 : 
 : Please. No iThreads behaviour in Perl 6. 
 : 
 : Nobody uses them, and whilst stable, the implementation is broken in so many ways.
 : 
 : But worse, the underlying semantics are completely and utterly wrong.
 
 Are you confusing iThreads with pThreads?  Or are you recommending we
 go back to the pThreads fiasco?


I certainly am not advocating a shared-by-default, or everything-shared model.

 
 From what I've read, the trend in most modern implementations of
 concurrency is away from shared state by default, essentially because
 shared memory simply doesn't scale up well enough in hardware, and
 coordinating shared state is not terribly efficient without shared
 memory.  If you are claiming that modern computer scientists are
 completely and utterly wrong in moving that direction, well, that's
 your privilege.  But you should be prepared for a little pushback
 on that subject, especially as you are merely badmouthing something
 without goodmouthing something else in its place.
 

When I said iThreads, I was referring to the only definition I can find for 
iThreads: the Perl 5.8.x implementation. 

The broken underlying semantics are:

1) The fork-like, duplicate-everything-in-the-process-at-the-point-of-spawn.

This makes spawning a thread heavier than starting a process in many cases. It 
makes using (the p5 implementation of) iThreads an exercise in frustration, 
trying to avoid the semantics it imposes.

2) The shared-memory is really multiple-copies-with-ties.

This removes 90% of the benefits of shared memory, whilst doing nothing to make 
it easier to use. The user is still responsible for locking and synchronisation, 
but loses the ability to use either ties or object semantics to encapsulate it. 
That forces every programmer to re-invent the techniques in every program, as it 
is nearly impossible to write clean, efficient APIs and put them into modules.

The duplication and tied updates make using shared memory so slow and clumsy 
that it can be quicker to freeze/transmit-via-socket/thaw shared data between 
processes than to share a piece of memory within a process.

3) Using the low-level pthreads API as the basis for the user view of threading.

This precludes the use of native thread features on any platform that has an 
extended or better API, and forces the programmer to deal with the ultra-low 
level at which that API is defined, without any of the control he would have 
if using it directly.

 Larry

njs.





Re: Thunking semantics of :=

2005-04-23 Thread Nigel Sandever
On Sun, 24 Apr 2005 03:47:42 +0800, [EMAIL PROTECTED] (Autrijus Tang) wrote:
 
 Oh well.  At least the same code can be salvaged to make iThreads

Please. No iThreads behaviour in Perl 6. 

Nobody uses them, and whilst stable, the implementation is broken in so many ways.

But worse, the underlying semantics are completely and utterly wrong.




Re: Blocks, continuations and eval()

2005-04-21 Thread Nigel Sandever
On Thu, 21 Apr 2005 08:36:28 -0700, [EMAIL PROTECTED] (Larry Wall) wrote:
 
 Hmm, maybe that's not such a bad policy.  I wonder what other dangerous
 modules we might have.  Ada had UNCHECKED_TYPE_CONVERSION, for instance.
 

How about

    use RE_EVAL;   # or should that be REALLY_EVIL?

 Larry





Re: Parrot bytecode reentrancy

2005-04-15 Thread Nigel Sandever
On Thu, 31 Mar 2005 21:17:39 -0500, [EMAIL PROTECTED] (MrJoltCola) 
wrote:
 At 05:57 PM 3/31/2005, Nigel Sandever wrote:
 Is Parrot bytecode reentrant?
 
 Yes.
 
 That is, if I want to have two instances of a class in each of two 
 threads, will
 the bytecode for the class need to be loaded twice?
 
 No, just once.
 
 Also, will it be possible to pass objects (handles/references) between 
 threads?
 
 Yes, otherwise threads are no more useful than processes.
 
 -Melvin
 
Thanks. Another question arises.

When a sub that closes over a variable 

    my $closure = 0;
    sub do_something {
        return $closure++;
    }

is called from two threads, do the threads share a single closure or each get 
their own separate closure?
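For comparison, Python's answer to that question is a single shared closure: 
every thread sees the same closed-over cell, so the counter needs a lock. A 
minimal sketch (Python standing in for the Perl above):

```python
import threading

def make_counter():
    count = 0                      # the closed-over variable
    lock = threading.Lock()
    def bump():
        nonlocal count
        with lock:                 # one shared cell => explicit locking
            count += 1
            return count
    return bump

counter = make_counter()
results = []
threads = [threading.Thread(target=lambda: results.append(counter()))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
assert sorted(results) == [1, 2, 3, 4]   # all four threads bumped the SAME cell
```

The other possible answer, one closure per thread, would require something like 
thread-local storage instead of a plain lexical.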

njs




Re: Parrot bytecode reentrancy

2005-04-15 Thread Nigel Sandever
15/04/2005 10:35:56, Leopold Toetsch [EMAIL PROTECTED] wrote:

Nigel Sandever [EMAIL PROTECTED] wrote:

 When a sub that closes over a variable

    my $closure = 0;
    sub do_something {
        return $closure++;
    }

 is called from two threads, do the threads share a single closure or
 each get their own separate closure?

AFAIK: the closure bytecode is shared, 

Great.

the Closure PMC with the lexical pad is distinct. 

I think that makes perfect sense. No implicit sharing.

But that all isn't implemented yet.


Understood. I am being premature in thinking about this. 

But this is where I come unstuck. What would this mean/do when called from 2 
threads?

    my $closure :shared = 0;
    sub do_something {
        return $closure++;
    }

or this:

    our $closure :shared = 0;
    sub do_something {
        return $closure++;
    }
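The distinction being asked about can be made concrete in Python, where a 
module-level global plays the role of a shared C<our> variable and 
thread-local storage plays the role of an unshared lexical. A sketch:

```python
import threading

shared = 0                       # plays the role of an C<our $closure :shared>
shared_lock = threading.Lock()
per_thread = threading.local()   # plays the role of an unshared lexical

def bump_both():
    global shared
    with shared_lock:
        shared += 1                   # one cell, visible to every thread
    n = getattr(per_thread, "n", 0) + 1
    per_thread.n = n                  # each thread sees only its own copy
    return n

threads = [threading.Thread(target=bump_both) for _ in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
assert shared == 3        # three threads hit the one shared global
assert bump_both() == 1   # the main thread's private counter starts fresh
```

Note that only the shared cell needs the lock; the per-thread copies never do, 
which is the intuition behind "'my' vars never require semaphore checks" below.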

It struck me a while back that there is a contradiction in the idea of a 
shared 'my' variable. 

I want to say lexical, but a var declared with 'our' is in some sense lexical. 

Where I am going is that shared implies global. Access can be constrained by 
requiring a lexical declaration using 'our', but 'my' variables should not be 
able to be marked 'shared'.

One nice thing that falls out of that is that no 'my' vars would ever be 
shared, which means they never require semaphore checks. That would mean that a 
non-threaded app running on a multi-threaded build of Parrot need never incur a 
penalty of semaphore checks if it always uses 'my'. *I think*?

In effect, all vars declared 'our' would be implicitly shared (and would 
require semaphoring), removing the need for a 'shared' attribute. 

In P5, lexicals are already quicker than globals, so any additional penalty 
added to globals because of multithreading will not affect any single-threaded 
code that is striving for ultimate performance, because it would already be 
utilising lexicals. 

Equally, things like filehandles are inherently process-global in scope, and 
therefore sharable between threads, and so require semaphore checks. 

I only throw this into the thought-pot because there seems to me to be a 
natural symmetry between the concept of 'global' and the concept of 'shared'.

I won't argue the case for this, but I thought that if I mention it, it might 
also make some sense to others when the time comes for this stuff to be 
designed and implemented.

 njs

leo


njs







Parrot bytecode reentrancy

2005-03-31 Thread Nigel Sandever
Is Parrot bytecode reentrant?

That is, if I want to have two instances of a class in each of two threads, 
will 
the bytecode for the class need to be loaded twice?

Also, will it be possible to pass objects (handles/references) between threads?

Thanks njs.




Re: Valid hash keys?

2005-02-27 Thread Nigel Sandever
On Sun, 27 Feb 2005 02:20:59 -0700, [EMAIL PROTECTED] (Luke Palmer) wrote:
 Luke Palmer writes:
  Autrijus Tang writes:
   Just a quick question: Is Hash keys still Strings, or can they be
   arbitary values? 
  
  They can be declared to be arbitrary:
  
  my %hash is shape(Any);
  
  
   If the latter, can Int 2, Num 2.0 and Str 2 point to different
   values?
  
  That's an interesting question.  Some people would want them to, and
  some people would definitely not want them to.  I think the general
  consensus is that people would not want them to be different, since in
  the rest of perl, 2 and 2 are the same.
 
 I forgot an important concretity.  Hashes should compare based on the
 generic equal operator, which knows how to compare apples and apples,
 and oranges and oranges, and occasionally a red orange to an apple.
 
 That is:
 
  3 equal 3  == true
  3 equal { codeblock }  == false
  "3" equal 3  == true
 

I would have assumed a hash whose shape was defined as C<Int> to perform the 
hashing function directly on the (nominally 32-bit) binary representation of 
Int 2. 

Likewise, C<my %hash is shape(Double)> would perform the hashing on the binary 
rep of the (nominally 64-bit) Double value.

And C<my %hash is shape(Ref)>, on the address of the key passed?

By extension, a C<%hash is shape(Any)> would hash the binary representation of 
whatever (type of) key it was given, which would make keys of 2, 2.0, '2', 
'2.0', (Int2)2 etc. all map to different keys.

If C<%hash is shape(Any)> maps all the above representations of 2 to the same 
value, then C<my %hash is shape(Any)> becomes a synonym for 

C<%hash is shape(Scalar)>.

(is that the same as C<my %hash of Scalar>?).

If my assumption is correct, then that would imply that each type has or 
inherits a .binary or .raw method to allow access to the internal 
representation?
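For what it's worth, Python answers the same design question the other way: 
numeric keys that compare equal hash equal, while a shape(Any)-style 
type-discriminating hash can be approximated by keying on (type, value). A 
sketch:

```python
d = {}
d[2] = "int"
d[2.0] = "num"   # 2 == 2.0 and hash(2) == hash(2.0): same slot, value replaced
d["2"] = "str"   # the string is a distinct key
assert len(d) == 2
assert d[2] == "num"

# A shape(Any) that hashes the representation rather than the value can be
# approximated by keying on (type, value), making 2, 2.0 and "2" all distinct:
e = {(type(k), k): v for k, v in [(2, "int"), (2.0, "num"), ("2", "str")]}
assert len(e) == 3
assert e[(float, 2.0)] == "num"
```

So "all representations of 2 collide" and "all representations of 2 are 
distinct" are both implementable; the question is purely one of language design.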

 Luke
 




Re: Valid hash keys?

2005-02-27 Thread Nigel Sandever
On Sun, 27 Feb 2005 15:36:42 -0700, [EMAIL PROTECTED] (Luke Palmer) wrote:
 Nigel Sandever writes:
  On Sun, 27 Feb 2005 02:20:59 -0700, [EMAIL PROTECTED] (Luke Palmer) wrote:
   I forgot an important concretity.  Hashes should compare based on the
   generic equal operator, which knows how to compare apples and apples,
   and oranges and oranges, and occasionally a red orange to an apple.
   
   That is:
   
    3 equal 3  == true
    3 equal { codeblock }  == false
    "3" equal 3  == true
   
  
  I would have assumed a hash whose shape was defined as C<Int> to perform
  the hashing function directly on the (nominally 32-bit) binary
  representation of Int 2. 
 
 I wasn't even thinking about implementation.  Sometimes it's good to let
 implementation drive language, but I don't think it's appropriate here.
 
 When we're talking about hashes of everything, there are a couple of
 routes we can take.  We can go Java's way and define a .hash method on
 every object.  We can go C++'s way and not hash at all, but instead
 define a transitive ordering on every object.  We can go Perl's way and
 find a string representation of the object and map the problem back to
 string hashing, which we can do well.
 
 But the biggest problem is that if the user overloads 'equal' on two
 objects, the hash should consider them equal.  We could require that to
 overload 'equal', you also have to overload .hash so that you've given
 some thought to the thing.  The worry I have is that people will do:
 
 method hash() { 0 }
 
 But I suppose that's okay.  That just punts the work off to 'equal',
 albeit in linear time.
 
 That may be the right way to go.  Use a Javaesque .hash method with a
 sane default (an introspecting default, perhaps), and use a sane
 equivalent default for 'equal'.  
 
 As far as getting 2, 2.0, and 2 to hash to the same object, well, we
 know they're 'equal', so we just need to know how to hash them the same
 way.  In fact, I don't believe 2.0 can be represented as a Num.  At
 least in Perl 5, it translates itself back to an int.  So we can just
 stringify and hash for the scalar types.


My thought is that if C<my %hash is shape(Any)> uses the stringified values of 
the keys, then it is no different to C<my %hash>.

I think it would be useful for shape(Any) to be different to an ordinary hash, 
hashing the binary representation of the key instead, so that 

(Int)2, (Num)2, (String)2, (uint)2, (uint4)2 etc.

would be a useful way of collating things according to their type rather than 
their value?
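Luke's Javaesque .hash method with a sane default, and the feared 
C<method hash() { 0 }> punt, both map directly onto Python's __eq__/__hash__ 
contract, which can be sketched as:

```python
class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y
    def __eq__(self, other):            # overloading 'equal'...
        return (self.x, self.y) == (other.x, other.y)
    def __hash__(self):                 # ...obliges a consistent .hash
        return hash((self.x, self.y))

assert Point(1, 2) == Point(1, 2)
assert len({Point(1, 2), Point(1, 2)}) == 1   # equal objects share a slot

class Lazy(Point):
    def __hash__(self):                 # the degenerate 'method hash() { 0 }'
        return 0                        # still correct, merely linear-time

assert len({Lazy(1, 2), Lazy(3, 4)}) == 2     # 'equal' does all the work
```

As Luke says, the constant-hash punt is legal: it just pushes everything onto 
'equal', turning lookups linear.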
 
 Luke

njs.





Re: Valid hash keys?

2005-02-27 Thread Nigel Sandever
On Sun, 27 Feb 2005 15:36:42 -0700, [EMAIL PROTECTED] (Luke Palmer) wrote:
 Nigel Sandever writes:
 
 When we're talking about hashes of everything, there are a couple of
 routes we can take.  We can go Java's way and define a .hash method on
 every object.  We can go C++'s way and not hash at all, but instead
 define a transitive ordering on every object.  

The more I think about this, please, no. The reason hashes are called hashes is 
because they hash.

If we need bags, sets, or orderThingies with overloadable transitive ordering, 
they can be written as classes--that possibly overload the hash syntax 

objkey = value

or whatever, but don't tack all that overhead on to the basic primitive.


 We can go Perl's way and
 find a string representation of the object and map the problem back to
 string hashing, which we can do well.

The only question is why does the thing that gets hashed have to be stringified 
first?

In p5, I often use $hash{ pack 'V', $key } = $value;  # or 'd'

1) Because for large hashes using numeric keys, it uses up less space for the 
keys: 4 bytes rather than up to 10 for 2**32.

2) By using all 256 values of each byte, it tends to spread the keys more 
evenly across fewer buckets;

use Devel::Size qw[total_size size];
undef $h{ pack 'V', $_ } for map{ $_ * 1  } 0 .. 99;

print total_size \%h;
18418136

print scalar %h;
292754/524288

versus

use Devel::Size qw[total_size size];
undef $h{ $_ } for map{ $_ * 1  } 0 .. 99;

print total_size \%h;
48083250

print scalar %h;
644301/1048576

It would also avoid the need for hacks like Tie::RefHash, by hashing the 
address of the ref rather than the stringified ref, which forces the key to be 
stored twice and creates zillions of anonymous arrays to hold the unstringified 
ref+value.

The same could be extended to hashing the composite binary representations of 
whole structures and objects.
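The space argument in 1) holds in any language with raw packing. A Python 
sketch, where struct's '<I' format plays the role of Perl's pack 'V' (a 
little-endian unsigned 32-bit integer):

```python
import struct

n = 2**32 - 1
packed = struct.pack("<I", n)   # same 4-byte layout as Perl's pack 'V'
assert len(packed) == 4         # four bytes of key...
assert len(str(n)) == 10        # ...versus ten for the decimal string
assert struct.unpack("<I", packed)[0] == n   # and it round-trips losslessly
```

The even spread over all 256 byte values is a side effect of the same packing.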

njs.




Re: Junction Values

2005-02-20 Thread Nigel Sandever
On Sat, 19 Feb 2005 18:42:36 +1100, [EMAIL PROTECTED] (Damian Conway) wrote:

 the Awesome Power of Junctions:

As I tried to express elsewhere, this is what I'm looking for. 

Instinctively, and for a long time since I first came across Q::S, I have 
thought that the killer app of Junctions is in there somewhere; I'm just not 
seeing it yet.

I'd really like to see some practical demonstrations of the Awesome Power. 

Something that goes beyond producing a boolean result from a set of values, 
which could equally be done using hyperoperators?

Njs.




Re: Junction Values

2005-02-16 Thread Nigel Sandever
On Wed, 16 Feb 2005 12:17:35 +1100, [EMAIL PROTECTED] (Damian Conway) wrote:
 
 .values tells you what raw values are inside the junction. The other kind of 
 introspection that's desirable is: what raw values can *match* this 
 junction. There would probably be a .states method for that.
 
 To see the difference between the two, consider:
 
   my $ideal_partner = all( any(<tall dark rich>),
                            any(<rich old frail>),
                            any(<Australian rich>),
                          );
 
 $ideal_partner.values would return the three distinct values in the junction:
 
   ( any(<tall dark rich>),
     any(<rich old frail>),
     any(<Australian rich>),
   );
 
 But $ideal_partner.states would return only those non-junctive values that 
 (smart-)match the junction. Namely, "rich".
 
 
 

I, and I think many others, have been trying to follow along on the discussions 
regarding junctions, and I have to say that for the most part, much of it goes 
[insert graphic of open hand, palm down, waving to and fro above head].

Any chance that you could provide one or two simple but realistic examples of 
using Junctions and their operators?

I see individual snippets of use and they appear to make sense, but when I try 
to envisage using them in code I have recently written, I find nothing leaping 
off the page at me as an obvious candidate.

Thanks.




Re: Junction Values

2005-02-16 Thread Nigel Sandever
On Wed, 16 Feb 2005 09:18:42 -0600, [EMAIL PROTECTED] (Patrick R. Michaud) 
wrote:
 On Wed, Feb 16, 2005 at 01:06:22PM +, Nigel Sandever wrote:
  
  Any chance that you could provide one or two simple but realistic examples 
of 
  using Junctions and their operators?
 
 I'll give it a shot, but keep in mind that I'm somewhat new to this 
 also.  :-)
 
 First, junctions are an easy way to represent the relationships
 any, all, one, or none.  So, where we once had to say:
 
if ($x==$a or $x==$b or $x==$c or $x==$d) { ... }
 
 we can now say
 
if $x == any($a, $b, $c, $d) { ... }
 
 Similarly, where we once had to say
 
    if (grep { $x == $^z } @list) { ... }
    if ((grep { $x == $^z } @list) == 0) { ... }


 we can now say
 
if $x == any(@list) { ... }
if $x == none(@list) { ... }
 

I'd write the P5 none() case as 

    if ( !grep { $x == $_ } @list ) { ... }
or
    unless ( grep { $x == $_ } @list ) { ... }

which tends to reduce the sugaryness of the new syntax slightly.

 And for fun, try writing the equivalent of 
 
if $x == one($a, $b, $c, $d) { ... }

I'm also not entirely sure at this point whether that means

if( grep( { $x == $_ } $a, $b, $c, $d ) == 1 ) { ... }

or something different?

 
 without a junction.  (Okay, one can cheat with C<grep>.)
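As a crib for the four junctive tests quoted above (and the one() puzzle), each 
reduces to counting matches. A Python sketch:

```python
def any_of(x, vs):
    return any(x == v for v in vs)

def all_of(x, vs):
    return all(x == v for v in vs)

def none_of(x, vs):
    return not any_of(x, vs)

def one_of(x, vs):                        # the case with no neat grep analogue
    return sum(x == v for v in vs) == 1   # a match-count of exactly one

vals = [3, 5, 3, 9]
assert any_of(3, vals)
assert none_of(4, vals)
assert one_of(5, vals)
assert not one_of(3, vals)   # 3 matches twice, so it is not 'one'
assert not all_of(3, vals)
```

So one() is indeed the match-count-equals-1 reading asked about below, which is 
exactly the grep-based "cheat".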


But most of those examples are pretty clear and understandable. I'm not sure 
that I see their convenience alone as a convincing argument for the existence 
of Junctions. I keep thinking that there is something altogether more 
fundamentally useful and ...well... cleverer underlying their inclusion, but 
at this stage I am not seeing it. 

In this usage, they appear to be replicating functionality that is already 
being provided via the inclusion of the hyper-operators.
('scuse me if I skip the unicode and/or get the syntax wrong!)

    if(   $x >>==<< @list ){ ... }   ## any?
    if(   $x >>!=<< @list ){ ... }   ## none?
    if( ( $x >>==<< @list ) == 1 ) { ... }   ## one?
    if( ( $x >>==<< @list ) == @list ) { ... }   ## all?

It would very much depend upon what a hyper operator returns in a boolean 
context, but they /could/ be made to work as I've indicated, I think. 

If the hyper operator returned one boolean result for each comparison it made, 
and if a list of boolean values in a boolean context collapsed to a count of 
the trues/1s it contained, I think those would work. You would also get the 
bonus of

    if( ( $x >>==<< @list ) == 2 ) { ... }   ## two() ( three(), ten() etc. )
    if( ( $x >>==<< @list ) == @list/2 ) { ... }   ## fifty percent below...

But I think the promise of Junctions appears to come from using them in 
conjunction with each other. 

    $x = any( @list1 );
    $y = all( @list2 );
    $z = none( @list3 );
    if( $x >= $y ) { ... }
    if( any( $x != $y ) or all( $y == $z ) ) { ... }

except that I haven't got a clue what those mean? 

I've sat and stared at the worked examples given elsewhere in this and the 
other thread, and whilst I can follow how the results are derived when someone 
lays them out for me--though I fell into the trap of accepting the one where 
Damian made a mistake, which probably means I wasn't paying close enough 
attention--when I get to the last line and see that the answer is true (or 
false), I look back at the original unexpanded statement and think: 

Yes! But what does it mean? When and where is that going to be useful?
 
 A programmer can easily use these without having to worry about the
 notions of autothreading or superpositions or the like, and
 their translations to English are fairly obvious.  I suspect that
 constructs like the above will be the majority of use for junctions.
 
 Things start to get weirder when we start storing junctions into
 variables, and/or passing those variables/junctions into subroutines.
 But I think this is not *too* far removed from the idea of placing a 
 subroutine or a rule into a scalar -- i.e., using a scalar to represent
 something more than a single-valued primitive type.  Thus, just as one
 can write
 
$x = new CGI;
$y = rule { \d+ };
$z = sub { ... };
 
 and then use $x, $y, and $z later on for an object, rule, and sub 
 (which may have a variety of side-effects associated with them), it makes
 sense that one can write
 
$w = any(@list);
 
 and then use $w to refer to the junction later.  And a programmer who
 does this is taking on some responsibility for understanding that $w
 isn't one of the trivial scalars anymore (same as for $x, $y, and $z).
 
 However, one difference is that when one typically uses an object, rule,
 or subroutine in a scalar there is some syntax that makes their nature
 apparent.  Some feel that Junctions might be just a bit too magical
 in this respect (e.g., Luke has made some syntactic suggestions to try
 make the existence/threading of a junction more apparent).


I see the dangers of scalars that aren't

Re: specifying the key Type for a Hash

2004-12-06 Thread Nigel Sandever
On Sat, 04 Dec 2004 08:01:46 -0700, [EMAIL PROTECTED] (David Green) wrote:
 In article [EMAIL PROTECTED],
  [EMAIL PROTECTED] (Larry Wall) wrote:
 S9 talk about it.  We current have things like:
 my Cat %pet is shape(Str);
 and parameters to types are in square brackets, so it's more like:
 my %pet is Hash[:shape(Str) :returns(Cat)];
 
 I still prefer shaped, for pronounceability.  Although shape is a 
 bit of a stretch for something that's really more like size, and even 
 stretchier for describing hash keys.  I'm not sure what better word we 
 could use, though.
 
  is built   # a constructive choice
  is determined  # good for typing practice  =P
  is bound   # what if you're bound AND determined?
  is disposed# sounds like a destructor
  is composed# I kinda like this one
  is arrayed # oh, array in that other sense
  is reckoned# bet no other language has that as a keyword
  is cinched # it sounds so easy
  is confined# to quarters
  is walled  # now we're just being silly (no offense to Larry)
  is earmarked   # some people wouldn't hear of it
  is indexed # a bit better than is keyed (especially if it's your car)
  is sized   # I think this was already rejected
  is like# works really well if your type happens to be 'Totally'
  is thus# very vague, but short
 
 Hm.  
 
 On the other hand, imagining Type-shaped holes into which your hash 
 keys fit *does* have a rather picturesque appeal...
 
 
-David the thesaurus is your friend (sometimes) Green

I probably missed the comprehensive dismissal thread, but why not 'type'?

my %pet is Hash[:type(Str) :returns(Cat)];

njs




Re: Synopsis 9 draft 1

2004-09-04 Thread Nigel Sandever
On Fri, 3 Sep 2004 17:08:00 -0700, [EMAIL PROTECTED] (Larry Wall) wrote:
 On Fri, Sep 03, 2004 at 05:45:12PM -0600, John Williams wrote:
 : On Thu, 2 Sep 2004, Larry Wall wrote:
 : 
 :  The argument to a shape specification is a semicolon list, just like
 :  the inside of a multidimensional subscript.  Ranges are also allowed,
 :  so you can pretend you're programming in Fortran, or awk:
 : 
 :  my int @ints is shape(1..4;1..2); # two dimensions, @ints[1..4; 1..2]
 : 
 : What happens when the Pascal programmer declares
 : 
 : my int @ints is shape(-10..10);
 : 
 : Does it blow up?
 
 No.
 
 : If not, does  @ints[-1]  mean the element with index -1 or the last element?
 
 The element with index -1.  Arrays with explicit ranges don't use the
 minus notation to count from the end.  We probably need to come up
 with some other notation for the beginning and end indexes.  But it'd
 be nice if that were a little shorter than:
 
 @ints.shape[0].beg
 @ints.shape[0].end
 
 Suggestions?  Maybe we just need integers with whence properties... :-)

How about keywords clo and chi?

 
 Larry





Compile-time undefined sub detection

2004-03-05 Thread Nigel Sandever
On the basis of what is known so far, will p6 be able to detect undefined subs 
at compile time?

Regards, Nigel.




Re: [perl #27301] [PATCH] \t\pmc\exec.t Tests spawnw opcode

2004-03-02 Thread Nigel Sandever
On Tue, 02 Mar 2004 02:31:47 -0800, [EMAIL PROTECTED] 
(Nigelsandever @ Btconnect . Com) wrote:
 
 # New Ticket Created by  [EMAIL PROTECTED] 
 # Please include the string:  [perl #27301]
 # in the subject line of all future correspondence about this issue. 
 # URL: http://rt.perl.org:80/rt3/Ticket/Display.html?id=27301 
 
 
 Does the naming and placement of this test fit convention?
 
 The opcode is called 'spawnw'
 The function is 'Parrot_Run_OS_Command'
 The file 'config/gen/platform/*/exec.c'
 
 A little guidance would help occasionally. 
 If it were forthcoming, I would try to write them up into a beginners FAQ.
 
 


If this patch is no good, would someone point out what is wrong with it, or 
where in the documentation I should be looking for this guidance?

If my lack of experience with OSS/unix makes my willingness to contribute to 
the project unviable, then a simple statement, "stop bothering us", will stop 
me wasting your time and mine.







Re: Threads: Time to get the terminology straight

2004-01-05 Thread Nigel Sandever

05/01/04 04:51:20, Sam Vilain [EMAIL PROTECTED] wrote:

On Mon, 05 Jan 2004 15:43, Nigel Sandever wrote;

   I accept that it may not be possible on all platforms, and it may
   be too expensive on some others. It may even be undesirable in the
   context of Parrot, but I have seen no argument that goes to
   invalidate the underlying premise.

I think you missed this:

LT Different VMs can run on different CPUs. Why should we make atomic
LT instructions out of these? We have a JIT runtime performing at 1
LT Parrot instruction per CPU instruction for native integers. Why
LT should we slow down that by a magnitude of many tenths?

LT We have to lock shared data, then you have to pay the penalty, but
LT not for each piece of code.
.
So far, I have only suggested using the mechanism in conjunction with
PMCs and PMC registers. 

You can't add an in-use flag to a native integer. But then, native integers
are not a part of the VHLLs (Perl/Python/Ruby). They are a constituent part 
of scalars, but those use a different register set and opcodes. Copying the
integer value of a scalar into an I register would require locking the scalar's
PMC. Once the value is in the I register, operations performed on it would
not need to be synchronised. Once the resultant is calculated, it needs to be 
moved back to the PMC and the lock cleared. There should be no need to 
interlock on most opcodes dealing with the I and R register sets.

The S registers are a different kettle of fish, and I haven't worked through 
the implications for these. My gut feeling is that the C-style strings pointed 
at by S registers would be protected by the in-use flag set on the PMCs for
the scalars from which they are derived. 

This means that when a single PMC opcode results in a sequence of non-PMC
operations, other shared threads would be blocked from operations 
until the sequence of non-PMC ops in the first shared thread were complete.
But ONLY if they attempt access to the same PMC. 

If they are processing PMC or non-PMC operations that do not involve the 
in-use PMC, then they will not be blocked and will be scheduled for their 
timeslices in the normal way.


and this:

LT I think, that you are missing multiprocessor systems totally.


If the mutex mechanism that is used to block the shared threads
is SMP-, NUMA-, AMP- etc. safe, then the mechanism I describe is also 
safe in these environments.

You are effectively excluding true parallelism by blocking other
processors from executing Parrot ops while one has the lock. 
.
The block only occurs *IF* concurrent operations on the same data
are attempted.

 You may
as well skip the thread libraries altogether and multi-thread the ops
in a runloop like Ruby does.

But let's carry the argument through, restricting it to UP systems,
with hyperthreading switched off, and running Win32.  Is it even true
that masking interrupts is enough on these systems?
.
No masking of interrupts is involved anywhere! 
I don't know where the idea arises, but it wasn't from me.


Win32 `Critical Sections' must be giving the scheduler hints not to
run other pending threads whilst a critical section is running.  Maybe
it uses the CPU sti/cli flags for that, to avoid the overhead of
setting a memory word somewhere (bad enough) or calling the system
(crippling).  In that case, setting STI/CLI might only incur a ~50%
performance penalty for integer operations.
.
I don't have access to the sources, but I do know that when one 
thread has entered a critical section, all other threads and processes
continue to be scheduled in the normal way, except those that also try
to enter the critical section. 

Scheduling is only disabled for those threads that *ask* to be so,
and no others, whether within the same process or other processes. How the 
mechanism works I can only speculate, but no CLI/STI instructions are 
involved.

total speculation 
When the first thread enters the critsec, a flag is set in the 
critsec memory.

When a second thread attempts to enter the critsec, a flag is 
set in the corresponding scheduler table to indicate that it should 
not be scheduled again until the flag is cleared. 

When the first thread leaves the critsec, the flag in the critsec
memory is cleared and the flag(s) in the scheduler tables for any 
thread(s) blocking on the critsec are also cleared. 

Whichever of the blocked threads is next scheduled acquires the
critsec, sets the flag in the critsec memory, and the process repeats.

/total speculation

No masking of interrupts is involved.


but then there's this:

  NS Other internal housekeeping operations, memory allocation, garbage
  NS collection etc. are performed as sysopcodes, performed by the VMI
  NS within the auspices of the critical section, and thus secured.

UG there may be times when a GC run needs to be initiated DURING a VM
UG operation. if the op requires an immediate large chunk of ram it
UG can trigger a GC pass or allocation request. you can't force those
UG things to only

Re: Threads: Time to get the terminology straight

2004-01-04 Thread Nigel Sandever
On Sun, 4 Jan 2004 15:47:35 -0500, [EMAIL PROTECTED] (Dan Sugalski) wrote:

 *) INTERPRETER - those bits of the Parrot_Interp structure that are 
 absolutely required to be thread-specific. This includes the current 
 register sets and stack pointers, as well as security context 
 information. Basically if a continuation captures it, it's the 
 interpreter.
 
 *) INTERPRETER ENVIRONMENT - Those bits of the Parrot_Interp 
 structure that aren't required to be thread-specific (though I'm not 
 sure there are any) *PLUS* anything pointed to that doesn't have to 
 be thread-specific.
 
 The environment includes the global namespaces, pads, stack chunks, 
 memory allocation regions, arenas, and whatnots. Just because the 
 pointer to the current pad is thread-specific doesn't mean the pad 
 *itself* has to be. It can be shared.
 

 *) SHARED THREAD - A thread that's part of a group of threads sharing 
 a common interpreter environment.

Ignoring the implementation of the synchronisation required, the basic
premise of my long post was that each SHARED THREAD should have its
own INTERPRETER (a VM in my terms), and that these should share a 
common INTERPRETER ENVIRONMENT.

Simplistically, 5005threads shared an INTERPRETER ENVIRONMENT 
and a single INTERPRETER. Synchronising threaded access to the shared
INTERPRETER (rather than its environment) was the biggest headache.
(I *think*).

With ithreads, each SHARED THREAD has its own INTERPRETER *and*
INTERPRETER ENVIRONMENT. This removes the contention for, and the
need to synchronise access to, the INTERPRETER, but requires the 
duplication of shared elements of the INTERPRETER ENVIRONMENT and
copy-on-read, with the inherent costs of the duplication at start-up 
and slow, indirect access to shared entities across the duplicated 
INTERPRETER ENVIRONMENTS.

My proposal was that each SHARED THREAD 
should have a separate copy of the INTERPRETER, 
but share a copy of the INTERPRETER ENVIRONMENT.

Everything else was my attempt at solving the requirements of 
synchronisation that this would entail, whilst minimising the cost
of that synchronisation: avoiding the need for a mutex on every
shared entity, and the cost of attempting to acquire a mutex except
when two SHARED THREADS attempted concurrent access to a
shared entity. 

I think that by having SHARED THREAD == INTERPRETER, sharing
a common INTERPRETER ENVIRONMENT, you can avoid (some) of 
the problems associated with 5005threads but retain the direct 
access to shared entities. 

This imposes its own set of requirements and costs, but (I believe)
the ideas that underlie the mechanisms I offered as solutions
are sound. The specific implementation is a platform specific detail
that could be pushed down to a lower level.

 ...those bits of the Parrot_Interp 
 structure that aren't required to be thread-specific (though I'm not 
 sure there are any) 

This is where I have a different (and quite possibly incorrect) view.
My mental picture of the INTERPRETER ENVIRONMENT includes
both the implementation of all the classes in the process *plus* all
the memory of every instance of those classes. 

I think your statement above implies that these would not be a part
of the INTERPRETER ENVIRONMENT per se, but would be allocated 
from global heap and only referenced from the bytecode that would live
in the INTERPRETER ENVIRONMENT? 

I realise that this is possible, and maybe even desirable, but the cost
of the GC walking a global heap, especially in the situation of a single 
process that contains two entirely separate instances of the 
INTERPRETER ENVIRONMENT, would be (I *think*) rather high.

I realise that this is a fairly rare occurrence on most platforms,
but in the win32 situation of emulated forks, each pseudo-process
must have an entirely separate INTERPRETER ENVIRONMENT,
potentially with each having multiple SHARED THREADS. 

If the memory for all entities in all pseudo-processes is allocated from 
a (real) process-global heap, then the multiple GCs required by the 
multiple pseudo-processes are going to be walking the same heap,
possibly concurrently.  I realise that this problem (if it is such)
does not occur on platforms that have real forks available, but it
would be useful if the high level design would allow for the use of
separate (virtual) heaps tied to the INTERPRETER ENVIRONMENTs
which win32 has the ability to do.


 
  Dan


Nigel.





Re: Threads: Time to get the terminology straight

2004-01-04 Thread Nigel Sandever
05/01/04 01:22:32, Sam Vilain [EMAIL PROTECTED] wrote:

[STUFF] :)

In another post you mentioned intel hyperthreading: 
essentially, duplicate sets of registers within a single CPU.

Do these need to apply a lock on every machine level entity that
they access? No. 

Why not?

Because they can only modify an entity if it is loaded into a register
and the logic behind hyperthreading won't allow both register sets 
to load the same entity concurrently. 

(I know this is a gross simplification of the interactions 
between the on-board logic and L1/L2 caching!)

--- Not an advert or glorification of Intel. Just an example -

Hyper-Threading Technology provides thread-level-parallelism (TLP) 
on each processor resulting in increased utilization of processor 
execution resources. As a result, resource utilization yields higher 
processing throughput. Hyper-Threading Technology is a form of 
simultaneous multi-threading technology (SMT) where multiple 
threads of software applications can be run simultaneously on one
 processor. 

This is achieved by duplicating the *architectural state* on each 
processor, while *sharing one set of processor execution resources*.
--

The last paragraph is the salient one as far as I am concerned.

The basic premise of my original proposal was that multi-threaded,
machine level applications don't have to interlock on machine level 
entities, because each operation they perform is atomic. 

Whilst higher level objects, of which the machine level 
objects are a part, may have their state corrupted by two 
threads modifying things concurrently, the state of the threads
(register sets + stacks) themselves cannot be. 

This is because they have their own internally consistent state,
that only changes atomically, and that is completely separated,
each from the other. They only share common data (code is data
to the CPU, just as bytecode is data to a VM).

So, if you are going to emulate a (hyper)threaded CPU in a 
register-based virtual machine interpreter, and allow for
concurrent threads of execution within that VMI,
then one way of ensuring that the internal state of the 
VMI is never corrupted would be to have each thread keep
its own copy of the *architectural state* of the VM, whilst
sharing *one set of processor execution resources*.

For this to work, you would need to achieve the same opcode
atomicity at the VMI level. Interlocking the threads so that
one shared thread cannot start an opcode until another shared 
thread has completed gives this atomicity. The penalty is that
if the interlocking is done for every opcode, then shared 
threads end up with very long virtual timeslices. To prevent 
that being the case (most of the time), the interlocking should
only come into effect *if* concurrent access to a VM level 
entity is imminent. 

As the VMI cannot access (modify) the state of a VM level
entity (PMC) until it has loaded it into a VM register, the
interlocking need only come into effect *if* the entity
whose reference is being loaded into a PMC register is 
currently in-use by (another) thread. 

The state of a PMC's in-useness can be flagged by a single bit
in its header. This can be detected by a shared thread when
the reference to it is loaded into the PMC register, and 
when it is, that shared thread then waits on the single,
shared mutex before proceeding.

It is only when the combination of atomised VM opcodes,
and lightweight in-use detection come together, that the
need for a mutex/entity can be avoided.

If the mutex used is capable of handling SMP, NUMA,
clusters etc., then the mechanism will work. 

If the lightweight bit-test-set opcode isn't available,
then a heavyweight equivalent could be used, though the
advantages would be reduced.


Sam Vilain, [EMAIL PROTECTED]

I hope that clarifies my thinking and how I arrived at it.

I accept that it may not be possible on all platforms, and
it may be too expensive on some others. It may even be 
undesirable in the context of Parrot, but I have seen no
argument that goes to invalidate the underlying premise.

Regards, Nigel





Re: Threads Design. A Win32 perspective.

2004-01-03 Thread Nigel Sandever
On Sat, 03 Jan 2004 01:48:07 -0500, [EMAIL PROTECTED] (Uri Guttman) wrote:
 ding! ding! ding! you just brought in a cpu specific instruction which
 is not guaranteed to be on any other arch. in fact many have such a
 beast but again, it is not accessible from c.

 you can't bring x86 centrism into this. the fact that redmond/intel
 threads can make use of this instruction to do 'critical sections' is a
 major point why it is bad for a portable system with many architectures
 to be supported.

 i disagree. it is covered for the intel case only and that is not good
 enough.

 again, intel specific and it has to be tossed out.

 but atomic operations are not available on all platforms.

 not in a unix environment. real multi-threaded processes already have
 this problem.

 that is a kernel issue and most signalling systems don't have thread
 specific arguments AFAIK.

 virtual ram is what counts on unix. you can't request some large amount
 without using real swap space. 

 again, redmond specific. 

 what win32 does is not portable and not useful at a VM level.

 ok, i can see where you got the test/set and yield stuff from now
 finally, it is redmond specific again and very unportable. i
 have never heard of this fibre concept on any unix flavor.

 and that is not possible on other platforms.

 that is a big point and one that i don't see as possible. redmond can do
 what they want with their kernel and user procs. parrot can only use
 kernel concepts that are reasonably portable across most platforms.
 kernel threads are reasonably portable (FSDO of reasonable) but anything
 beyond that such as fibres, test/set in user space, etc is not. locks
 have to be in kernel space since we can't do a fibre yield in user space
 on any other platform. so this rules out user space test/set as well
 since that would need a thread to spin instead of blocking.
 
 your ideas make sense but only on redmond/intel which is not the target
 space for parrot.

That's pretty much the crux. I don't know what is available (in detail) on
other platforms. Hence I needed to express the ideas in terms I understand
and explain them sufficiently that they could be viewed, interpreted, and 
either related to similar concepts on other platforms, or shot down.

I accept your overall judgement, though not necessarily all the specifics.

Maybe it would be possible (for me + others) to write the core of a win32 specific,
threaded VM interpreter that would take parrot byte code and run it. Thereby,
utilising all the good stuff that precedes the VM interpreter, plus probably large 
chunks of the parrot VM, but providing it with a more natively compatible target. 

That's something that is obviously not a simple project and is beyond the scope of 
this list. 

Thanks for taking the time to review this.

Nigel Sandever.




Re: Threads Design. A Win32 perspective.

2004-01-03 Thread Nigel Sandever
On Sat, 3 Jan 2004 11:35:37 +0100, [EMAIL PROTECTED] (Leopold Toetsch) wrote:
 Nigel Sandever [EMAIL PROTECTED] wrote:
 
  VIRTUAL MACHINE INTERPRETER
 
  At any given point in the running of the interpreter, the VM register
  set, program counter and stack must represent the entire state for
  that thread.
 
 That's exactly, what a ParrotInterpreter is: the entire state for a
 thread.

This is only true if a thread == interpreter. 
If a single interpreter can run 2 threads then that single interpreter 
cannot represent the state of both threads safely.

 
  I am completely against the idea of VM threads having a 1:1
  relationship with interpreters.
 

With 5005threads, multiple threads exist in a single interpreter.

 All VHLL level data is shared without duplication, but locking has
 to be performed on each entity. This model is more efficient than
 ithreads. 
 However, it was extremely difficult to prevent unwanted interaction
 between the threads corrupting the internal state of the interpreter,
 given the internal architecture of P5, and so it was abandoned.

With ithreads, each thread is also a separate interpreter.

 This insulates the internal state of one interpreter from the other
 but also insulates *all* perl level program data in one interpreter
 from the perl level data in the other. 

 Spawning a new thread becomes a process of duplicating everything:
 the interpreter, the perl program, and all its existing data. 

 Sharing data between the threads/interpreters is implemented by 
 tieing the two copies of the variables to be shared, and each time 
 a STORE is performed in one thread, the same STORE has to be 
 performed on the copy of that var held in every other thread's 
 dataspace.

 If 2 threads need to share a scalar, but the program has 10 other
 threads, then each write to the shared scalar requires the update
 of all 12 copies of that scalar. There is no way to indicate that 
 you only need to share it between threads x and y.

 With ithreads, there can be no shared references, so no shared
 objects and no shared compound data structures

 While these can be separated its not efficient. Please note that Parrot
 is a register-based VM. Switching state is *not* switching a
 stack-pointer and a PC thingy like in stack-based VMs.
  ... The runtime costs of duplicating all
  shared data between interpreters, tying all shared variables, added to
  the cost of locking, throw away almost all of the benefit of using
  threads.
 No, shared resources are not duplicated, nor tied. The locking of course
 remains, but that's it.

The issues above are what make p5 ithreads almost unusable. 

If Parrot has found a way of avoiding these costs and limitations
then everything I offered is a waste of time, because these are
the issues I was attempting to address.

However, I have seen no indication here, in the sources or anywhere 
else that this is the case. I assume that the reason Dan opened the
discussion up in the first place is because the perception by those
looking on was that the p5 ithreads model was being suggested as the
way Parrot was going to go. 

And those who have tried to make use of ithreads
under p5 are all too aware that replicating them for Parrot would
be ... [phrase deleted as too emotionally charged :)]

 leo

Nigel.





Re: Threads Design. A Win32 perspective.

2004-01-03 Thread Nigel Sandever
On Sat, 3 Jan 2004 21:00:31 +0100, [EMAIL PROTECTED] (Leopold Toetsch) wrote:
  That's exactly, what a ParrotInterpreter is: the entire state for a
  thread.
 
  This is only true if a thread == interpreter.
  If a single interpreter can run 2 threads then that single interpreter
  cannot represent the state of both threads safely.
 
 Yep. So if a single interpreter (which is almost a thread state) should
 run two threads, you have to allocate and swap all. 

When a kernel level thread is spawned, no duplication of application memory 
is required; only a set of registers, program counter and stack. These 
represent the entire state of that thread.

If a VM thread mirrors this, by duplicating the VM program counter, 
VM registers and VM stack, then this VM thread context can also
avoid the need to replicate the rest of the program data (interpreter).

 What should the
 advantage of such a solution be?

The avoidance of duplication. 
Transparent interlocking of VHLL fat structures performed automatically
by the VM itself. No need for :shared or lock().


 
  With 5005threads, multiple threads exist in a single interpreter.
 
 These are obsolete.

ONLY because they couldn't be made to work properly. The reasons 
that was so are entirely due to the architecture of P5.

Dan Sugalski suggested in this list back in 2001, that he would prefer
pthreads to ithreads. 

I've used both in p5, and pthreads are vastly more efficient, but flaky and
difficult to use well. These limitations are due to the architecture upon 
which they were built. My interest is in seeing the Parrot architecture
not exclude them.

 
  With ithreads, each thread is also a seperate interpreter.
 
   Spawning a new thread becomes a process of duplicating everything.
   The interpreter, the perl program, and all it existing data.
 
 Partly yes. A new interpreter is created, the program, i.e. the opcode
 stream is *not* duplicated, but JIT or prederef information has to be
 rebuilt (on demand, if that run-core is running), and existing
 non-shared data items are cloned.
 

Only duplicating shared data on demand (COW) may work well on systems
that support COW in the kernel. But on systems that don't, this has to be
emulated in user space, with all the inherent overhead that implies.

My desire was that the VM_Spawn_Thread, VM_Share_PMC and 
VM_Lock_PMC opcodes could be coded such that on those platforms where
the presence of kernel level COW and other native features means that
the ithreads-style model of VMthread == kernel thread + interpreter 
is the best way to go, that would be the underlying implementation.

On those platforms where VMthread == kernel thread + VMthread context
is the best way, that would be the underlying implementation.

In order for this to be possible, it implies that a certain level of support for
both be ingrained in the design of the interpreter.

My (long) original post, with all the subjects covered and details given, 
was my attempt to describe the support required in the design for the 
latter. It would be necessary to consider all the elements, and the way 
they interact, and take these into consideration when implementing
Parrot's threading in order that this would be achievable.

Each element (the separation of the VM state from the interpreter state,
the atomisation of VM operations, the automated detection and locking of
concurrent access attempts, and the serialisation of the VM threads when 
contention is detected) needs support at the highest level before it may be 
implemented at the lowest (platform specific) levels.

It simply isn't possible to implement them on one platform at the lowest
levels unless the upper levels of the design are constructed with the 
possibilities in mind.

   Sharing data between the threads/interpreters is implemented by
   tieing
 
 Parrot != perl5.ithreads
 
  If Parrot has found a way of avoiding these costs and limitations
  then everything I offered is a waste of time, because these are
  the issues  was attempting to address.
 
 I posted a very premature benchmark result, where an unoptimized Parrot
 build is 8 times faster than the equivalent perl5 code.
 
  And those who have tried to make use of ithreads
  under p5 are all too aware that replicating them for Parrot would
  be ... [phrase deleted as too emotionally charged :)]
 
 I don't know how ithreads are working internally WRT the relevant issues
 like object allocation and such. But threads at the OS level provide
 shared code and data segments. So at the VM level you have to unshare
 non-shared resources at thread creation. 

You only need to copy them if the two threads can attempt to modify
the contents of the objects concurrently. This possibility is precluded
by atomising VMthread level operations: by preventing a new VM thread
from being scheduled until any other VM thread completes its current 
operation, and ensuring that each VMthread's state is in a complete and
coherent state before another VM 

Re: Threads Design. A Win32 perspective.

2004-01-02 Thread Nigel Sandever
On Thu, 01 Jan 2004 21:32:22 -0500, [EMAIL PROTECTED] (Uri Guttman) wrote:

UG Uri Guttman
NS Nigel Sandever. (Mostly not reproduced here!)

  NS REENTRANCY

 UG this is true for c level threads but not necessarily true for VM level
 UG threads.  if the VM is atomic at its operation level and can't be
 UG preempted (i.e. it is not using kernel threads with time slicing), 
 
One of the major points of what I am (was?) proposing, is that there 
would be a 1:1 mapping between VM threads and kernel threads. 

The second point is that atomic operation of the VM thread does not imply
that it cannot be preempted. It only means that when one VM thread is 
preempted by the scheduler, no other thread that is waiting to enter the 
same critical section as the first thread is in will be scheduled 
(by the scheduler) until the first thread exits that critical section.
 
 UG then
 UG it could use thread unsafe calls (as long as it keeps those silly static
 UG buffers clean). 
 
The point about avoiding calling the CRT directly from the main bulk of the 
code does not preclude the use of (suitably thread-aware) CRT calls completely.

The idea is to acknowledge that:
  A) Not all CRTs are equally well crafted or complete.
  B) Thread-aware CRTs are not available with all compilers for all platforms.
  
By writing the main bulk of the code in terms of a virtual CRT using macros, 
the main bulk of the code becomes platform independent. The only change required 
to port the code to different platforms is to write a set of suitable expansions 
for the macros for each platform/compiler. Where the available CRT is up to the 
job, the macros expand directly to the underlying CRT call. Where it is not, 
a suitable alternative can be written that bypasses the CRT and goes directly to 
the OS. 

As an example: compiling P5 with Borland 5.5 for NT, I encountered the restriction
that the Borland runtime didn't support fseek() or ftell() on files > 4GB. The 
underlying platform does, and it was relatively trivial to replace the calls to 
the CRT functions with appropriate NT syscalls and so enable both USE_LARGE_FILES
and PERLIO. 

The only difficulty was that instead of being able to make the changes in a single
place, it was necessary to make essentially the same change in 3 or 4 different files.
To compound matters, the modifications required conditional compilation directives at
all 4 places, and these had to be interspersed with other #ifdefs for other platforms 
and other purposes. 

The idea was to avoid this morass of conditional compilation and copy/paste coding by
pushing the conditional directives in the main code to a single:

  #if defined( THIS_PLATFORM )
  #  include "this_platform_crt.h"
  #endif
 
Within that header would be a simple set of #ifdefs to include compiler specific
header files. And within those, all the CRT macro expansions. Thus, I sought to 
ease both porting and maintenance. 

It would also become possible to add virtual syscall definitions to the high level
code, and expand them differently, perhaps radically so, on a platform by platform 
basis. In essence, creating a Virtual OS for the VM.
 
 UG parrot will (according to dan) use one interpreter per
 UG VM thread and those may run on kernel threads. 
 
From what I have read, the decision as to whether each VM thread also got a 
separate interpreter was one of the things Dan was opening up to discussion?

Personally, with the perspective of (the forkless) NT platform, and my experience
with trying to make good use of both p5's 5005thread and ithread implementations,
I am completely against the idea of VM threads having a 1:1 relationship with 
interpreters. The runtime costs of duplicating all shared data between interpreters,
tying all shared variables, added to the cost of locking, throw away almost all of 
the benefit of using threads. Combine that overhead with the restrictions of 

 a) No shared references therefore:
 b) No shared objects, therefore:
 c) No shared methods.
 d) No shared compound data structures.
 
And the only hope for multiprocessing becomes forking, which Win32 doesn't support 
natively. And which, without the benefits of process level COW performed by the 
kernel, results in a kludged emulation of a non-native facility that will never 
be usable in any real sense of the word.
 
 UG it may be possible to
 UG disable preemption and/ or time slicing so the VM threads will be atomic
 UG at the VM operation level and then we don't have to worry as much about
 UG thread unsafe libs. 
 
I was trying to demonstrate that there is no need to disable the scheduler in order
for VM threads to achieve atomic VM level operations.
 
 UG but i gather that people want real preemption and
 UG priorities and time slicing so that idea may be moot.
 
Yes. I am one of them. :)

 UG but on most
 UG platforms that support kernel threads there are thread safe versions of
 UG most/ all the c lib stuff. 
 
As I said above, this is true, but it is also compiler

Threads Design. A Win32 perspective.

2004-01-01 Thread Nigel Sandever
This is going to be extremely light on details with respect to the current state of 
the Parrot interpreter. 

It is also going to be expressed in terms of Win32 APIs.

For both of these I apologise in advance. Time, and the "or forever hold your peace" 
imperative, has overridden my desire to do otherwise. 

My attempts to get up to speed enough on the sources to work out how to apply the ideas I 
am going to express, and to become conversant with the *nix 
pthreads implementation and terminology, have moved too slowly to afford me the 
opportunity to do more.

Hoping this stimulates discussion rather than a flame war.

Regards, Nigel Sandever.

PS.  Sorry if this gets posted twice.


THE BASIS OF THE IDEA

Modern OSs succeed in having multiple threads of execution share a single copy of 
process memory without the operations of one thread being able to 
interfere with the state of another. The state of the code running in those threads 
may be corrupted through mismanagement. But the state of the 
threads themselves, their program counters, registers and stacks cannot. The 
mechanisms for this incorruptibility are: 

Each operation (opcode) performed by the thread is atomic. 
The scheduler can never interrupt a thread whilst an operation is in 
progress. Only between operations.

Before the start of an operation, and after the end of one, the state of the 
thread is entirely encapsulated within the registers and stack. 
By swapping the entire state of the CPU register set, when switching from 
one thread to another, the state of each thread is preserved
and reconstituted. No other mechanisms or interlocks are required.

By analogy, a virtual machine that wishes to have multiple threads of execution must 
achieve the same level of atomicity for each operation it performs. 

VIRTUAL MACHINE INTERPRETER

At any given point in the running of the interpreter, the VM register set, program 
counter and stack must represent the entire state for that thread. 
Once an opcode has started execution on a given thread, no other thread of execution 
within that interpreter must be allowed to start an operation until 
the first thread completes its opcode. 

NON-VMI THREADS 

ASYNCHRONOUS IO

Note that this does not mean that no other thread in the process can take a timeslice, 
only that any thread that is allowed to run should not be able to 
affect the state of the VM in question. A thread charged with performing asynchronous 
reads on behalf of the user program running within the VM 
interpreter can go ahead so long as it doesn't directly modify the VMI state. 

EVENT MANAGER THREAD

Equally, an event thread can also be run concurrently in the background to receive 
asynchronous notifications (signals, messages, asynchronous read 
completions etc.). It can then queue these events and set a flag that the VMI can 
inspect between each iteration of the opcode execution loop and action 
appropriately. This gives the benefit of safe signals along with safe and timely 
processing of other, similarly asynchronous events.

GARBAGE COLLECTION

The garbage collector would need to run *synchronously* with the interpreter opcode 
loop. Effectively, it would be another (possibly long running) opcode. 
An analogy of this is the set of long running syscalls that operate within a 
multi-tasking OS, e.g. synchronous IO, virtual memory allocation, swapping etc. 
Just as virtual memory operations suspend the affected processes until the operations 
are complete, so garbage collection can be seen as a virtual memory 
operation for the virtual machine that requires the VMI to be suspended until the 
operation is complete.

PREREQUISITES

The key is for the VM to be reentrant, and the use of (in win32 terms) a critical 
section.

REENTRANCY

Not only must the VMI be coded in a reentrant fashion, with all state addressed 
through pointers (references) loaded into it's Virtual register set. All 
the code underlying it, including syscalls and CRT must equally be reentrant. Many 
APIs within many CRTs *are not reentrant* (eg. strtok()). All state 
must be on a per-thread not a per-process basis.

To this end, I would propose that no CRT APIs are used directly from the main code!

Instead, a full set of CRT-like macros would be inherited from a header file, where 
either the real CRT API would be called, or an alternative 
implementation. This header file would be conditionally included on the basic of the 
target platform. This concentrates the bulk, if not all, platform 
specific code into a single file (or set of files). 

EXPANDING THE VIRTUAL MACHINE

What I am suggesting here is that the virtual machine analogy be extended to encompass 
not just the VHLL-user-program view of the registers and stack, 
but also it's view of the entire machine and underlying OS. 

By so doing, not only does it allow those working on the core of the VMI to be 
isolated from

Re: More Windows dev questions: Core dumps

2004-01-01 Thread Nigel Sandever
On Wed, 31 Dec 2003 12:17:21 -0500, [EMAIL PROTECTED] (Dan Sugalski) wrote:

 Does Windows do this? (I know other OSes, like VMS, do *not* do it) 
 If so, how do I enable it? If not, I presume there's some reasonably 
 simple way to attach a debugger to a process that's died. (I hope)

You can persuade Dr.Watson (how I hate that cutesy name!) to produce a dump file (of 
sorts) when a trap occurs. 

To configure this, type drwtsn32 at a command prompt and follow the prompts for the 
naming and locating of the dump file, what gets dumped etc.

See [http://windows.about.com/library/weekly/aa000903b.htm] here for a fairly brief 
but clear overview of the configuration options.

Personally, I haven't made much use of these as I didn't find them very useful, but 
then I didn't make much of a unix dump last (only) time I looked at one either.

For a free, powerful (though GUI) debugger that can be used to debug literally any 
win32 process (even without the presence of symbol files, though these are strongly 
recommended), see [http://www.smidgeonsoft.prohosting.com/#PEBrowse ] . There are 
several other very useful tools and some good information from the home page at 
[http://www.smidgeonsoft.com/].

Regards, Nigel Sandever.


PS. Sorry if this ends up getting posted twice. I didn't realise I had to do the 
confirmation step after subscribing.




Re: More Windows dev questions: Core dumps

2004-01-01 Thread Nigel Sandever
On Wed, 31 Dec 2003 12:17:21 -0500, [EMAIL PROTECTED] (Dan Sugalski) wrote:
 If so, how do I enable it?

It is possible to configure DrWatson (Stupid cutesy name) to create a dump file, 
though I haven't ever found it very useful.

 If not, I presume there's some reasonably simple way to attach a debugger to a 
 process that's died.

There are several very powerful and useful utilities available for free download at 
[http://www.smidgeonsoft.com/] 

Particularly useful is [http://www.smidgeonsoft.prohosting.com/#PEBrowse ] 
Professional Interactive (Debugger).




A high-level Win32 view of threading in the interpreter.

2004-01-01 Thread Nigel Sandever
This is going to be extremely light on details with respect to the current state of 
the Parrot interpreter. 

It is also going to be expressed in terms of Win32 APIs.

For both of these I apologise in advance. Time, and the "speak now or forever hold 
your peace" imperative, have overridden my desire to do otherwise.

My attempts to get up to speed on the sources enough to work out how to apply the 
ideas I am going to express, and to become conversant with the *nix pthreads 
implementation and terminology, have moved too slowly to afford me the opportunity 
to do more.

Hoping this stimulates ideas rather than a flame war.

Regards, Nigel Sandever.



THE BASIS OF THE IDEA

Modern OSs succeed in having multiple threads of execution share a single copy of 
process memory without the operations of one 
thread being able to interfere with the state of another. The state of the code 
running in those threads may be corrupted through 
mismanagement. But the state of the threads themselves, their program counters, 
registers and stacks cannot. The mechanisms for 
this incorruptibility are: 

Each operation (opcode) performed by the thread is atomic. The scheduler can 
never interrupt a thread whilst an operation 
is in progress. Only between operations.

Before the start of an operation, and after the end of one, the state of the 
thread is entirely encapsulated within the 
registers and stack. By swapping the entire state of the CPU register set, when 
switching from one thread to another, the state of 
each thread is preserved and reconstituted. No other mechanisms or interlocks are 
required.

By analogy, a virtual machine that wishes to have multiple threads of execution must 
achieve the same level of atomicity for each 
operation it performs. 

VIRTUAL MACHINE INTERPRETER

At any given point in the running of the interpreter, the VM register set, program 
counter and stack must represent the entire 
state for that thread. Once an opcode has started execution on a given thread, no 
other thread of execution within that interpreter 
must be allowed to start an operation until the first thread completes its opcode. 

NON-VMI THREADS 

ASYNCHRONOUS IO

Note that this does not mean that no other thread in the process can take a timeslice, 
only that any thread that is allowed to run 
should not be able to affect the state of the VM in question. A thread charged with 
performing asynchronous reads on behalf of the 
user program running within the VM interpreter can go ahead so long as it doesn't 
directly modify the VMI state. 

EVENT MANAGER THREAD

Equally, an event thread can also be run concurrently in the background to receive 
asynchronous notifications (signals, messages, 
asynchronous read completions etc.). It can then queue these events and set a flag 
that the VMI can inspect between each iteration 
of the opcode execution loop and action appropriately. This gives the benefit of 
safe signals along with safe and timely 
processing of other, similarly asynchronous events.

GARBAGE COLLECTION

The garbage collector would need to run *synchronously* with the interpreter opcode 
loop. Effectively, it would be another (possibly 
long running) opcode. An analogy for this is the long-running syscalls that 
operate within a multi-tasking OS, e.g. synchronous IO, 
virtual memory allocation, swapping etc. Just as virtual memory operations suspend the 
affected processes until the operations are 
complete, so garbage collection can be seen as a virtual memory operation for the 
virtual machine that requires the VMI to be 
suspended until the operation is complete.

PREREQUISITES

The key is for the VM to be reentrant, and the use of (in win32 terms) a critical 
section.

REENTRANCY

Not only must the VMI be coded in a reentrant fashion, with all state addressed 
through pointers (references) loaded into its virtual register set; all the code 
underlying it, including syscalls and the CRT, must equally be reentrant. Many 
APIs within many CRTs 
*are not reentrant* (eg. strtok()). All state must be on a per-thread not a 
per-process basis.

To this end, I would propose that no CRT APIs are used directly from the main code!

Instead, a full set of CRT-like macros would be inherited from a header file, where 
either the real CRT API would be called, or an 
alternative implementation. This header file would be conditionally included on the 
basis of the target platform. This concentrates 
the bulk, if not all, platform specific code into a single file (or set of files). 

EXPANDING THE VIRTUAL MACHINE

What I am suggesting here is that the virtual machine analogy be extended to encompass 
not just the VHLL-user-program view of 
the registers and stack, but also its view of the entire machine and underlying OS. 

By so doing, not only does it allow those working on the core of the VMI to be 
isolated from the underlying machine and OS architecture; it allows them to extend 
the VMI's