Re: iThreads and selective variable copying (was Destructors and iThreads)

2004-07-04 Thread Nick Ing-Simmons
Dave Mitchell [EMAIL PROTECTED] writes:

1. It would be very hard to create these options.
2. Any programmer that used an 'only these' option would almost
certainly create a program that at best would not work, and at worst would
coredump. What happens if the user forgot to copy $/ ? What does Perl do
the next time it tries to read from a file and wants to know the current
line delineator?

Then there's stuff like stashes - %main:: is a hash that indirectly
references just about every object in the perl interpreter. Does the
programmer have to remember to exclude that?

You are suggesting opening up a can of worms which I have no great desire
to see opened.

Much as I philosophically like Eric's idea, this does indeed look too 
messy for perl5. Let's see whether perl6 can fix this, or already has.



Dave.



Re: [PROPOSAL] Cstat opcode and interface

2004-03-22 Thread Nick Ing-Simmons
Dan Sugalski [EMAIL PROTECTED] writes:
At 10:11 AM -0800 3/10/04, Brent "Dax" Royal-Gordon wrote:
Josh Wilmes wrote:
It's also quite possible that miniparrot is a waste of time.  I'm 
pretty much of the opinion myself that it's an academic exercise at 
this point, but one which keeps us honest, even if we don't use it.

Miniparrot, or something very much like it, is the final build system.

Yep. We need to make sure it always works.

Which, unfortunately, will end up making things a hassle, since 
there's no platform-independent way to spawn a sub-process, dammit. :(

On that topic specifically - the DOS style spawn() API is 
easy to fake with fork/exec but the converse is NOT true.


i.e. if Miniparrot assumes:

pid_t my_spawn(const char *progname,int argc,const char *argv[]);
int my_wait(pid_t proc);

then Unix-oids can have

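pid_t my_spawn(const char *progname,int argc,const char *argv[])
{
 pid_t pid = fork();
 if (pid)
  return pid;   /* parent: child's pid, or -1 if fork() failed */
 /* child: execv() takes no argc; argv[] must end with a NULL entry */
 execv(progname,(char * const *)argv);
 _exit(127);    /* only reached if execv() failed */
}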

Unidirectional popen() is also reasonably portable.
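
For completeness, the Unix side of the my_spawn()/my_wait() pair could be filled in roughly as below. This is only a sketch under assumptions: argv[] is taken to be NULL-terminated, and the choice to have my_wait() return the child's exit status (or -1 on error or death by signal) is mine, since the prototype above does not say what the int means.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Spawn progname with a NULL-terminated argv; returns child pid or -1. */
pid_t my_spawn(const char *progname, int argc, const char *argv[])
{
    pid_t pid;
    (void)argc;                 /* execv() finds the end via the NULL entry */
    pid = fork();
    if (pid != 0)
        return pid;             /* parent, or -1 if fork() failed */
    execv(progname, (char *const *)argv);
    _exit(127);                 /* only reached if execv() failed */
}

/* Block until the child exits. Returns its exit status, or -1 on error
   or if the child was killed by a signal (an assumed convention). */
int my_wait(pid_t proc)
{
    int status;
    if (proc < 0 || waitpid(proc, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A platform with a native spawn() would implement the same two prototypes without fork() at all, which is the direction that is easy.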




Re: [PROPOSAL] Cstat opcode and interface

2004-03-22 Thread Nick Ing-Simmons
Dan Sugalski [EMAIL PROTECTED] writes:
At 11:12 AM -0800 3/10/04, Brent "Dax" Royal-Gordon wrote:
Dan Sugalski wrote:
Which, unfortunately, will end up making things a hassle, since 
there's no platform-independent way to spawn a sub-process, dammit. 
:(

Unixen seem to support system().

D'oh! It's C89 standard. I'm getting stuck in the 80s with the 
multitude of exec variants. Yeah, with that issue taken care of it's 
a lot more doable. Nevermind...

But:
  A. system() is blocking. 
  B. system() takes a single string, so whatever calls system()
     has to be aware of the system's quoting rules.
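
To make point B concrete, here is a sketch of the quoting a caller would need before handing one argument to system() on a POSIX sh platform. The helper name sh_quote() is hypothetical, and the rule it implements (single quotes, with each embedded ' rewritten as '\'') applies to POSIX shells only; other command interpreters need different rules, which is exactly the problem.

```c
#include <stdlib.h>
#include <string.h>

/* Quote one argument for a POSIX sh command line: wrap it in single
   quotes and turn each embedded ' into '\'' . Caller frees the result. */
char *sh_quote(const char *arg)
{
    size_t len = 2;                     /* opening and closing quote */
    const char *p;
    char *out, *q;

    for (p = arg; *p; p++)
        len += (*p == '\'') ? 4 : 1;
    out = malloc(len + 1);
    if (!out)
        return NULL;
    q = out;
    *q++ = '\'';
    for (p = arg; *p; p++) {
        if (*p == '\'') {
            memcpy(q, "'\\''", 4);      /* close, escaped quote, reopen */
            q += 4;
        } else {
            *q++ = *p;
        }
    }
    *q++ = '\'';
    *q = '\0';
    return out;
}
```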



Re: Dates and times again

2004-03-22 Thread Nick Ing-Simmons
Larry Wall [EMAIL PROTECTED] writes:

That would seem like good future proofing.  Someday every computer will
have decentish subsecond timing.  I hope to see it in my lifetime...

It isn't having the sub-second time in the computer, it is the API 
to get at it... 


My guess is that eventually they'll decide to put a moratorium on
leap seconds, with the recommendation that the problem be revisited
just before 2100, on the assumption that we'll add all of a century's
leap seconds at once at the end of each century.  That would let
civil time drift by at most a minute or two before being hauled
back to astronomical time.  

Given that most people live more than a minute or two from their 
civil-time meridian who will notice? (Says me about 8 minutes west of 
GMT.)


I'd say what's missing are the error bars.  I don't mind if the
timestamp comes back integral on machines that can't support subsecond
timing, but I darn well better *know* that I can't sleep(.25), or
strange things are gonna happen.

But you can fake sleep() with select() or whatever.
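
A minimal sketch of that trick, assuming a POSIX-style select() is available: call it with no file descriptors and only a timeout. The function name frac_sleep() is made up for the example, and the actual resolution still depends on the platform's timer.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Sleep for a possibly fractional number of seconds by calling select()
   with no file descriptors and only a timeout. Returns 0 on timeout,
   -1 on error (including interruption by a signal). */
int frac_sleep(double seconds)
{
    struct timeval tv;
    if (seconds < 0)
        return -1;
    tv.tv_sec  = (long)seconds;
    tv.tv_usec = (long)((seconds - (double)tv.tv_sec) * 1e6);
    return select(0, NULL, NULL, NULL, &tv);
}
```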




Re: Using Ruby Objects with Parrot

2004-03-22 Thread Nick Ing-Simmons
Mark Sparshatt [EMAIL PROTECTED] writes:

I'm not 100% certain about the details but I think this is how it works.

In languages like C++ objects and classes are completely separate.
Classes form an inheritance hierarchy and objects are instances of a
particular class.

However in some languages (I think that Smalltalk was the first) there's
the idea that everything is an object, including classes. So while an
object is an instance of a class, that class is an instance of another
class, which is called the metaclass. I don't think there's anything special
about these classes other than the fact that their instances are also
classes.


Thinking about it I think you may have the relationship between
ParrotObject and ParrotClass the wrong way around. Since a class is an
object but an object isn't a class it would be better for ParrotClass
to inherit from ParrotObject, rather than the other way round.

In Ruby when you create a class Foo, the Ruby interpreter automatically
creates a class Foo' and sets the klass attribute of Foo to point to Foo'.

This is important since class methods of Foo are actually instance
methods of Foo'. Which means that method dispatch is the same whether
you are calling an instance or class method.

So in perl5-ese when you call 

   Foo->method

you are actually calling sub Foo::method which is in some sense
a method of the %Foo:: stash object.

So what you suggest is as if perl5 compiled Foo->method
into (\%Foo::)->method and the %Foo:: 'stash' was blessed...



foo.method()

looks at foo's klass attribute then checks the returned class object
(Foo) for method

Foo.method()

looks at Foo's klass attribute and again checks the returned class
object (Foo') for method.
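
The two lookups described above can be sketched in C as a single dispatch routine. Everything here is illustrative: the struct layout, the one-entry "method table" and the names find_method, FooMeta, Foo and foo are assumptions for the example, not Ruby's or Parrot's actual data structures, and inheritance is left out entirely.

```c
#include <stddef.h>
#include <string.h>

typedef const char *(*method_fn)(void);

/* Every object, including a class, starts with a klass pointer. */
typedef struct Class Class;
typedef struct Object { Class *klass; } Object;

struct Class {
    Object      base;          /* a class is itself an object */
    const char *method_name;   /* one-entry "method table" for brevity */
    method_fn   method;
};

/* One dispatch routine serves instances and classes alike:
   follow klass, then look the method up there. */
method_fn find_method(Object *receiver, const char *name)
{
    Class *k = receiver->klass;
    if (k && k->method_name && strcmp(k->method_name, name) == 0)
        return k->method;
    return NULL;
}

static const char *instance_hello(void) { return "instance method"; }
static const char *class_create(void)   { return "class method"; }

/* Foo' (the metaclass) holds Foo's class methods; Foo holds the
   instance methods; foo is an instance of Foo. */
Class  FooMeta = { { NULL },     "create", class_create   };
Class  Foo     = { { &FooMeta }, "hello",  instance_hello };
Object foo     = { &Foo };
```

Calling find_method(&foo, "hello") searches Foo, while find_method(&Foo.base, "create") searches Foo', with no special case for class methods.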

The Pickaxe book has got a better explanation of this (at
http://www.rubycentral.com/book/classes.html though without any diagrams
:( )

In Python when defining a class it's possible to set an attribute in the
class that points to the class's metaclass. The metaclass itself is just
a normal class that defines methods which override the normal behaviour
of the class.

IIRC Python has got both class methods and meta class instance methods
which work almost (but not quite) in the same way as each other.

Hopefully someone with more experience with Python will be able to
explain better.

I'm not sure if this has cleared things up or just made them more confusing.



Testing XS modules on Ponie

2004-03-19 Thread Nick Ing-Simmons
Arthur Bergman [EMAIL PROTECTED] writes:
This is Ponie, development release 2


   And, isn't sanity really just a one-trick ponie anyway? I mean all 
you get is one trick, rational thinking, but when you're good and 
crazy, oooh, oooh, oooh, the sky is the limit. -- the tick


Welcome to this second development release of ponie, the mix of perl5 
and parrot. Ponie embeds a parrot interpreter inside perl5 and hands 
off tasks to it, the goal of the project is to hand off all data and 
bytecode handling to parrot.

With this release all internal macros that poke at perl data types are 
converted to be real C functions and to check if they are dealing with 
traditional perl data types or PMC (Parrot data types) data. Perl 
lvalues, arrays and hashes are also hidden inside PMCs but still access 
their core data using traditional macros. The goal and purpose of this 
release is to make sure this approach keeps on working with the XS 
modules available on CPAN and to let people test with their own source 
code. No changes were made to any of the core XS modules.

So ponie-2 compiles and passes all its tests for me.
So how do I see if it can handle the XS module from hell - Tk ?




Re: [perl #16689] [NIT] trailing commas in enumerator lists bad

2002-08-21 Thread Nick Ing-Simmons

Jarkko Hietaniemi [EMAIL PROTECTED] writes:
# New Ticket Created by  Jarkko Hietaniemi 
# Please include the string:  [perl #16689]
# in the subject line of all future correspondence about this issue. 
# URL: http://rt.perl.org/rt2/Ticket/Display.html?id=16689 


Freshly checked out parrot moans a lot:

cc: Info: ./include/parrot/string.h, line 56: Trailing comma found in enumerator 
list. (trailcomma)
} TAIL_flags;
^

Trailing commas in enumerator lists are unportable behaviour in C.

And in case anyone has not come across the trick before it is not uncommon
to have 

enum foo {
/* auto-generated stuff */
  foo_MAX
};

where foo_MAX is a handy number of entries value as well 
as avoiding the trailing comma issue.
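
A small worked instance of the trick, with made-up names (colour, colour_name, colour_to_name): the sentinel both closes the list without a trailing comma and sizes the lookup table that goes with the enum.

```c
#include <stddef.h>

enum colour {
    colour_RED,
    colour_GREEN,
    colour_BLUE,
    colour_MAX      /* not a real colour: counts the entries above */
};

/* Sized by the sentinel, so it grows with the enum. */
static const char *const colour_name[colour_MAX] = {
    "red", "green", "blue"
};

/* Returns the name, or NULL if c is out of range. */
const char *colour_to_name(int c)
{
    return (c >= 0 && c < colour_MAX) ? colour_name[c] : NULL;
}
```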


-- 
Nick Ing-Simmons
http://www.ni-s.u-net.com/




Re: [perl #15006] [PATCH] Major GC Refactoring

2002-07-17 Thread Nick Ing-Simmons

# New Ticket Created by  Mike Lambert 
# Please include the string:  [perl #15006]
# in the subject line of all future correspondence about this issue. 
# URL: http://bugs6.perl.org/rt2/Ticket/Display.html?id=15006 

Tickets from RT don't have an address in the To: line
and so my mailfilter is filing them as SPAM

--
Nick Ing-Simmons
http://www.ni-s.u-net.com/




Re: The internal string API

2001-06-20 Thread Nick Ing-Simmons

Jarkko Hietaniemi [EMAIL PROTECTED] writes:
 Taiwanese read traditional chinese characters, but PRC people read
 simplified chinese. Even we take the same data, and same program (code),
 people just read differently. As an end user, I want to make the decision.
 It will drive me crazy if Perl render/display the text file using
 traditional
 chinese just because it was tagged as Big5.

Perl will (probably, whispers he, crossing his fingers) never
translate data that far.  Perl (5) does not display chr(0x1234) to
me using Unicode fonts, it just pushes the octets to a file
descriptor/handle.  Unicode is language-neutral.

Perl may not, but I assume someone will be fool enough to give it a GUI.
perl5.7.1+/Tk803.???-to-be will now make a stab at rendering Unicode
(not a very good one I am the 1st to admit which is why it isn't released!).

It would be good if Tk-for-perl6 did not have to break the rules or 
provide its own hooks for meta data and could use the string API.

-- 
Nick Ing-Simmons
http://www.ni-s.u-net.com/




Re: Should the op dispatch loop decode?

2001-06-13 Thread Nick Ing-Simmons

Benjamin Stuhl [EMAIL PROTECTED] writes:
I don't see where shadow functions are really necessary -
after all, no one has ever complained that you can't do 

pp_chomp(sv); /* or pp_add(sv1, sv2), for that matter */

in Perl 5. 

Yes we did. And note the doop.c file which is part answer
to the shadows.

Given the inner functions we could presumably generate the decode
functions (c.f. xsubpp)

-- 
Nick Ing-Simmons




Re: Stacks, registers, and bytecode. (Oh, my!)

2001-05-30 Thread Nick Ing-Simmons

Uri Guttman [EMAIL PROTECTED] writes:
  DS The one handy thing about push and pop is you don't need to go
  DS tracking the stack manually--that's taken care of by the push and
  DS pop opcodes. They can certainly be replaced with manipulations of
  DS a temp register and indirect register stores or loads, but that's
  DS more expensive--you do the same thing only with more dispatch
  DS overhead.

  DS And I'm considering the stack as a place to put registers
  DS temporarily when the compiler runs out and needs a spot to
  DS squirrel something away, rather than as a mechanism to pass
  DS parameters to subs or opcodes. This is a stack in the traditional
  DS scratch-space sense.

i agree with that. the stack here is mostly a call stack which
save/restores registers as we run out. with a large number like 64, we
won't run out until we do some deep calls. then the older registers (do
we have an LRU mechanism here?) get pushed by the sub call prologue which
then uses those registers for its my vars.

I don't like push/pop - they imply a lot of stack limit checking word-by-word
when it is less overhead for the compiler to analyse the needs of a whole basic-block,
check-for/make-space-on the stack _once_, then just address it.
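
As a rough illustration of "check once, then just address it", the sketch below reserves all the temporaries a block needs in one call and then uses plain indexing. The VStack type and the function names are hypothetical, and the doubling growth policy is an arbitrary choice for the example.

```c
#include <stdlib.h>

typedef struct {
    long  *base;     /* start of the allocation */
    size_t top;      /* number of slots in use */
    size_t cap;      /* number of slots allocated */
} VStack;

/* One-off check: make sure `need` more slots exist above top.
   Returns a pointer to the first free slot, or NULL on failure. */
long *vstack_reserve(VStack *s, size_t need)
{
    if (s->top + need > s->cap) {
        size_t ncap = (s->top + need) * 2;
        long *nbase = realloc(s->base, ncap * sizeof *nbase);
        if (!nbase)
            return NULL;
        s->base = nbase;
        s->cap  = ncap;
    }
    return s->base + s->top;
}

/* A "basic block" computing x*y+z: one reserve, then plain indexing
   with no further limit checks. */
long mul_add(VStack *s, long x, long y, long z)
{
    long *f = vstack_reserve(s, 2);   /* the block needs two temps */
    if (!f)
        return 0;
    f[0] = x * y;
    f[1] = f[0] + z;
    return f[1];
}
```

With push/pop opcodes each of those two stores would carry its own limit check.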


is the sub call/return stack also the data (scratch) stack? i think
separate ones makes sense here. the data stack is just PMC pointers, the
code call stack has register info, context, etc.

One stack is more natural for translation to C (which has just one).
One problem with FORTH was allocating two growable segments for its 
two stacks - one always ended up 2nd class.

-- 
Nick Ing-Simmons
who is looking for a new job see http://www.ni-s.u-net.com/




Re: Stacks, registers, and bytecode. (Oh, my!)

2001-05-30 Thread Nick Ing-Simmons

Uri Guttman [EMAIL PROTECTED] writes:

think of this as classic CISC code generation with plenty of registers
and a scratch stack. this is stable technology. we could even find a
code generator guru (i don't know any obvious ones in the perl6 world)

Classic CISC code generation taught us that CISC is a pain to code-gen.
(I am not a Guru but did design TMS320C80's RISC specifically to match 
gcc of that vintage, and dabbled in a p-code for Pascal way back.)


   special registers ($_, @_, events, etc.) are indexed with a starting
   offset of 64, so general registers are 0-63.

  DS I'd name them specially (S0-Snnn) rather than make them a chunk of the 
  DS normal register set.

All that dividing registers into sub-classes does is cause you to do 
register-register moves when things are in the wrong sort of register.
Its only real benefit is for encoding density as you can imply part
of the register number by requiring addresses to be in address registers
etc. It is not clear to me that perl special variables map well to that.
Mind you the names are just a human thing - it is the bit-pattern that 
compiler cares about.




oh, they have macro names which are special. something like:

#define MAX_PLAIN_REG   64  /* 0 - 63 are plain regs */
#define REG_ARG         64  /* $_ */
#define REG_SUB_ARG     65  /* @_ */
#define REG_ARGV        66  /* @ARGV */
#define REG_INT1        67  /* integer 1 */
#define REG_INT2        68  /* integer 2 */

uri
-- 
Nick Ing-Simmons
who is looking for a new job see http://www.ni-s.u-net.com/




Re: Stacks, registers, and bytecode. (Oh, my!)

2001-05-30 Thread Nick Ing-Simmons

Dan Sugalski [EMAIL PROTECTED] writes:
At 02:08 PM 5/30/2001 +, Nick Ing-Simmons wrote:
Classic CISC code generation taught us that CISC is a pain to code-gen.
(I am not a Guru but did design TMS320C80's RISC specifically to match
gcc of that vintage, and dabbled in a p-code for Pascal way back.)

Right, but in this case we have the advantage of tailoring the instruction 
set to the language, and given the overhead inherent in op dispatch we also 
have an incentive to hoist opcodes up to as high a level as we can manage.

That is of course what they/we all say ;-)

The 68K for example matched quite well to the low-tech compiler technology
of its day, as did UCSD's p-code for UCSD Pascal, and DSPs have their own 
reasons (inner loops are more important than generic C) for their CISC nature.

Even the horrible x86 architecture is quasi-sane if you assume all variables
are on the stack addressed by the Base Pointer.

It is interesting now that people are looking at building chips for JVM
how much cursing there is about certain features - though I don't have 
the references to hand.

The overhead of op dispatch is a self-proving issue - if you have complex
ops they are expensive to dispatch. 
In the limit FORTH-like threaded code 

   while (1) (*(*op_ptr++))();

is not really very expensive; it is then up to the op to adjust op_ptr
for in-line args etc. The downside is size: an op is at least the size of a pointer.

With a 16-bit opcode as-per-Uri that becomes:

   while (1) (*table[*op_ptr++])();

(Assuming we don't need to check bounds 'cos we won't generate bad code...)

One can then start adding decode to the loop:
 
   while (1) {
     op_t op = *op_ptr++;
     switch(NUM_ARGS(op)) {
      case 1:
       (*table[FUNC_NUM(op)])(*op_ptr++);
       break;
      case 3:
       (*table[FUNC_NUM(op)])(op_ptr[0],op_ptr[1],op_ptr[2]);
       op_ptr += 3;
       break;
      ...
     }
   }

Then one can do byte-ordering and mis-aligned hackery and index into reg-array

   while (1) {
     op_t op = GET16BITS(*op_ptr);
     switch(NUM_ARGS(op)) {
      case 1:
       (*table[FUNC_NUM(op)])(reg_ptr[GET8BITS(*op_ptr)]);
       break;
      ...
     }
   }
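
To show the table-dispatch variant end to end, here is a small self-contained sketch. It is not the proposed perl6 design: it uses 8-bit opcodes rather than 16-bit ones, a four-entry register file, and invented op names (OP_LOAD, OP_ADD, OP_MUL, OP_HALT). As in the loops above, there is no bounds checking, on the assumption that the compiler never emits bad code.

```c
#include <stdint.h>

typedef void (*op_fn)(const uint8_t *args);

static long reg[4];                      /* tiny register file */

static void op_load(const uint8_t *a) { reg[a[0]] = a[1]; }   /* reg = constant */
static void op_add (const uint8_t *a) { reg[a[0]] = reg[a[1]] + reg[a[2]]; }
static void op_mul (const uint8_t *a) { reg[a[0]] = reg[a[1]] * reg[a[2]]; }

enum { OP_HALT, OP_LOAD, OP_ADD, OP_MUL, OP_MAX };

static const op_fn   table[OP_MAX]  = { 0, op_load, op_add, op_mul };
static const uint8_t n_args[OP_MAX] = { 0, 2, 3, 3 };

/* Fetch an opcode, dispatch through the table, then let the loop
   (rather than the op) step over the in-line arguments. */
long run(const uint8_t *op_ptr)
{
    for (;;) {
        uint8_t op = *op_ptr++;
        if (op == OP_HALT)
            return reg[0];
        (*table[op])(op_ptr);
        op_ptr += n_args[op];
    }
}
```

Here the argument count comes from a second table; the switch in the loops above is the alternative way to do the same decode.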



-- 
Nick Ing-Simmons
who is looking for a new job see http://www.ni-s.u-net.com/




Re: Stacks, registers, and bytecode. (Oh, my!)

2001-05-30 Thread Nick Ing-Simmons

Dave Mitchell [EMAIL PROTECTED] writes:

There's no reason why you can't have a hybrid scheme. In fact I think
it's a big win over a pure register-addressing scheme. Consider...

Which was more or less my own position...


At the start of a new scope, the stack is extended by N to create a new
stack frame (including a one-off check that the stack can be
extended).  There is then a 'stack pointer' (sp) which is initialised
to the base of the new frame, or an initial offset thereof. (So sp is
really just a temporary index within the current frame.)

Then some opcodes can use implicit addressing, while others can be explicit,
or a mixture.

Explicit opcodes specify one or more 'registers' - ie indexes within the
current frame, while implicit opcodes use the current value of sp as an
implicit index, and may alter sp as a side effect. So an ADD opcode
would use sp[0], sp[-1] to find the 2 operands and would store a pointer
to the result at sp[-1], then sp--. The compiler plants code in such a way
that it will never allow sp to go outside the current stack frame.

This allows a big win on the size of the bytecode, and in terms of the
time required to decode each op.

Consider the following code.

$a = $x*$y+$z

Suppose we have r5 and r6 available for scratch use, and that for some
reason we wish to keep a pointer to $a in r1 at the end (perhaps we use
$a again a couple of lines later):


This might have the following bytecode with a pure register scheme:

GETSV('x',r5)  # get pointer to global $x, store in register 5
GETSV('y',r6)
MULT(r5,r5,r6)  # multiply the things pointed to by r5 and r6; store ptr to
   # result in r5
GETSV('z',r6)
ADD(r5,r5,r6)
GETSV('a',r1)
SASSIGN(r1,r5)

Globals are a pain. Consider this code:

sub foo
{
 my ($x,$y,$z) = @_;
 return $x*$y+$z;
}

In the pure register (RISC-oid) scheme the bytecode should be:

FOO:
  MULT(arg1,arg2,tmp1)
  ADD(tmp1,arg3,result)
  RETURN

That is lexicals get allocated registers at compile time, and ops
just go get them.

In the stack-with-alloc (x86-oid) scheme it should be
  ENTER +1                 # need a temp
  MULT SP[1],SP[2],SP[4]   # $x*$y
  ADD SP[4],SP[3],SP[1]    # temp + $z -> result
  RETURN -2                # Lose temp and non-results

And in a pure stack (FORTH, PostScript) style it might be
  rot 3    # reorder stack to get x y on top
  mpy
  add
  ret
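
The pure-stack version can be traced with a toy stack machine. This is a sketch only: the fixed-size stack has no overflow checks, and rot3() is defined here to do what the comment on "rot 3" asks for (x y z becomes z x y), which is an assumption about the intended operation and not the same as FORTH's standard ROT.

```c
enum { STACK_MAX = 16 };
static long stack[STACK_MAX];
static int  depth;             /* number of items on the stack */

static void push(long v) { stack[depth++] = v; }
static long pop(void)    { return stack[--depth]; }

/* Move the top item beneath the two items below it: x y z -> z x y. */
static void rot3(void)
{
    long z = pop(), y = pop(), x = pop();
    push(z); push(x); push(y);
}
static void mpy(void) { long b = pop(), a = pop(); push(a * b); }
static void add(void) { long b = pop(), a = pop(); push(a + b); }

/* foo($x,$y,$z): arguments arrive pushed in order, result is left on top. */
long foo(long x, long y, long z)
{
    push(x); push(y); push(z);
    rot3();
    mpy();
    add();
    return pop();
}
```

None of the three ops names an operand, which is where the bytecode density comes from; the price is the explicit reordering step.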


but might be like this in a hybrid scheme:

SETSP(5)   # make sp point to r5
GETSV('x') # get pointer to global $x, store at *sp++
GETSV('y')
MULT
GETSV('z')
ADD
GETSV('a')
SASSIGN
SAVEREG(r1)# pop pointer at *sp, and store in register 1

The problem that the hybrid scheme glosses over is the re-order-the-args
issue that is handled by register numbers, stack addressing or 
FORTH/PostScript stack re-ordering.
It avoids it by expensive long range global fetches - which is indeed
what humans do when writing PostScript - use globals - but compilers
can keep track of such mess for us.



Both use the same registers, have the same net result, but the explicit
scheme requires an extra 11 numbers in the bytecode, not to mention all
the extra cycles required to extract out those numbers from the bytecode
in the first place.
-- 
Nick Ing-Simmons
who is looking for a new job see http://www.ni-s.u-net.com/




Re: Stacks, registers, and bytecode. (Oh, my!)

2001-05-30 Thread Nick Ing-Simmons

Uri Guttman [EMAIL PROTECTED] writes:
 NI == Nick Ing-Simmons [EMAIL PROTECTED] writes:

  NI The overhead of op dispatch is a self-proving issue - if you
  NI have complex ops they are expensive to dispatch.

but as someone else said, we can design our own ops to be as high level
as we want. lowering the number of op calls is the key. that loop will
be a bottleneck as it is in perl5 unless we optimize it now.

  NI With a 16-bit opcode as-per-Uri that becomes:

  NI    while (1) (*table[*op_ptr++])();

  NI (Assuming we don't need to check bounds 'cos we won't generate bad code...)

i dropped the 16 bit idea in favor of an extension byte code that zhong
mentioned. it has several wins, no ordering issues, it is pure 'byte'
code. 

  NI One can then start adding decode to the loop:
 
  NI    while (1) {
  NI  op_t op = *op_ptr++;
  NI  switch(NUM_ARGS(op))

no switch, a simple lookup table:

   op_cnt = op_counts[ op ] ;

Myths of 21st Century Computing #1:
 Memory lookups are cheap

Most processors have only one memory unit and it typically has
a long pipeline delay. But many have several units that can do 
compare etc.

A lookup table may or may-not be faster/denser than a switch.
A lookup may take 9 cycles down a memory pipe while

ans = (op > 16) ? 2 : (op > 8) ? 1 : 0;

might super-scalar issue in 1 cycle.  Code at high level and let 
C compiler know what is best. C will give you a lookup if that 
is best.
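
The two alternatives can be written side by side as below. This is a sketch under assumptions: opcodes are taken to be numbered so that a threshold test yields the argument count, the thresholds 8 and 16 come from the expression above, the direction of the comparison and the 32-entry opcode space are guesses, and the function names are invented. Which version is faster is left to measurement on the target machine.

```c
#include <stdint.h>

enum { N_OPS = 32 };            /* hypothetical size of the opcode space */

/* Compare chain: no memory access, only ALU work. */
int nargs_compare(unsigned op)
{
    return (op > 16) ? 2 : (op > 8) ? 1 : 0;
}

/* Table version: one load per opcode, dependent on the opcode fetch. */
static uint8_t nargs_table[N_OPS];

void nargs_table_init(void)
{
    unsigned op;
    for (op = 0; op < N_OPS; op++)
        nargs_table[op] = (uint8_t)nargs_compare(op);
}

int nargs_lookup(unsigned op)
{
    return nargs_table[op];
}
```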

Memory ops need not be expensive if they pipeline well, but 
making one memory op depend on the result of another is a bad idea
e.g.

   op   = *op_ptr++;
   arg1 = *op_ptr++;
   arg2 = *op_ptr++;

may appear to happen in 3 cycles, as all the loads can be issued in a pipelined
manner and ++s issued in parallel. While

   op  = *op_ptr++;
   ans = table[op];

could seem to take 18 cycles, as the 2nd load can't start till the 1st one completes.

I have been meaning to try and prove my point with 
a software-pipelined dispatch loop which is fetching one op,
decoding previous one and executing one before that.

-- 
Nick Ing-Simmons
who is looking for a new job see http://www.ni-s.u-net.com/




Re: Stacks registers

2001-05-27 Thread Nick Ing-Simmons

Uri Guttman [EMAIL PROTECTED] writes:
  NI No - you keep the window base handy and don't keep re-fetching it,
  NI same way you keep program counter and stack pointer handy.

  NI Getting  
  NIwindow[N] 
  NI is same cost as 
  NInext = *PC++; 

  NI My point is that to avoid keeping too-many things handy window
  NI base and stack pointer should be the same (real machine) register.

if we can control that. 

Maybe not directly, but most compilers will keep common base registers
in machine registers if you code things right.

but i see issues too. i mentioned the idea of
having $_ and other special vars and stuff would have their own PMC's in
this register set. 

Why does it have to be _this_ register set - globals can go in another
register set - SPARC's register scheme has global registers too.

That said my guess is that $_ is usually save/restored across sub/block
boundaries.

dan likes the idea. that doesn't map well to a window
as those vars may not change when you call subs. i just don't see
register windows as useful at the VM level.

Call it what you will - I am arguing for an addressable stack
not for windows as such.



   i am just saying register windows don't seem to be any win for us
   and cost an extra indirection for each data access. my view is let
   the compiler keep track of the register usage and just do
   individual push/pops as needed when registers run out.

  NI That makes sense if (and only if) virtual machine registers are real 
  NI machine registers. If virtual machine registers are in memory then 
  NI accessing them on the stack is just as efficient (perhaps more so)
  NI than at some other special location. And it avoids need for 
  NI memory-to-memory moves to push/pop them when we do spill.

no, the idea is the VM compiler keeps track of IL register use for the
purpose of code generating N-tuple op codes and their register
arguments. this is a pure IL design thing and has nothing to do with
machine registers. at this level, register windows don't win IMO.

That quote is a little misleading. My point is that UNLESS (real)
machine registers are involved then all IL Registers are 
in memory. Given that they are in memory they should be grouped with
and addressed-via-same-base-as other memory that a sub is accessing.
(The sub will be accessing the stack (or its PAD if you like), and the 
op-stream for sure, and possibly a few hot globals.)

The IL is going to be CISC-ish - so treat it like an x86 where 
you operate on things where-they-are (e.g. on the stack) 

   add 4,BP[4]

rather than RISC where you 

   ld BP[4],X
   add 4,X
   ST X,BP[4]
   
If registers are really memory the extra moves of a RISC scheme
are expensive.

What we _really_ don't want is the worst of both worlds:

   push BP[4];
   push 4
   add
   pop  BP[4] 


i am thinking about writing a short pseudo code post about the N-tuple
op codes and the register set design. the ideas are percolating in my
brane.

uri
-- 
Nick Ing-Simmons
who is looking for a new job see http://www.ni-s.u-net.com/




Re: Stacks registers

2001-05-26 Thread Nick Ing-Simmons

Uri Guttman [EMAIL PROTECTED] writes:
  NI i.e. 
  NI  R4 = frame[N]
  NI is same cost as
  NI  R4 = per_thread[N]
  NI and about the same as
  NI  extern REGISTER GlobalRegs4 
  NI  R4 = GlobalRegs4;

well, if there is no multithreading then you don't need the per_thread
lookup. 

Well:
 (a) I thought the plan was to design threads in from the beginning this time.
 (b) I maintain that cost is about the same as global variables anyway.

The case for (b) is as follows:
on RISC hardware

R4 = SomeGlobal;

becomes two instructions:

loadhigh SomeGlobal.high,rp 
ld rp(SomeGlobal.low),R4

The C compiler will try and factor out the loadhigh instruction, leaving
you with an indexed load. In most cases 

ld rp(RegBase.low+4),R4

is just as valid and takes the same number of cycles, and there is normally
a form like

ld rp(rn),R4

Which allows index by variable amount.


On CISC machines, then either there is an invisible RISC (e.g. Pentium)
which behaves as above or you get something akin to PDP-11 where indirection
reads a literal address via the program counter.

move [pc+n],r4

In such cases 

move [regbase+n],r4 

is going to be just as fast - the issue is the need for a (real machine)
register to hold 'regbase'.

and the window base is not accounted for. you would need 2
indirections, the first to get the window base and the second to get the
register in that window. 

No - you keep the window base handy and don't keep re-fetching it,
same way you keep program counter and stack pointer handy.

Getting  
   window[N] 
is same cost as 
   next = *PC++; 

My point is that to avoid keeping too-many things handy window base
and stack pointer should be the same (real machine) register.
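
One way to picture "window base and stack pointer in the same register" is the sketch below, where a single pointer per thread serves as both. The Thread struct, the REG() macro and the enter/leave helpers are hypothetical names for illustration; a real VM would size and grow the stack differently.

```c
#include <stddef.h>

enum { STACK_SLOTS = 64 };

typedef struct {
    long  stack[STACK_SLOTS];
    long *frame;               /* stack pointer and window base in one */
} Thread;

/* VM register n of the current sub is just frame[n]: one indexed load. */
#define REG(t, n) ((t)->frame[(n)])

/* Calling a sub that needs `nregs` registers slides the base past the
   caller's registers. The caller's registers stay where they are in
   memory, so nothing is copied. Returns -1 if the stack would overflow. */
int enter_frame(Thread *t, size_t caller_regs, size_t nregs)
{
    size_t used = (size_t)(t->frame - t->stack);
    if (used + caller_regs + nregs > STACK_SLOTS)
        return -1;
    t->frame += caller_regs;
    return 0;
}

/* Returning slides the base back to the caller's registers. */
void leave_frame(Thread *t, size_t caller_regs)
{
    t->frame -= caller_regs;
}
```

If the C compiler keeps t->frame in a machine register, REG(t, n) costs the same as any other base-plus-offset load.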

i am just saying register windows don't seem to
be any win for us and cost an extra indirection for each data access. my
view is let the compiler keep track of the register usage and just do
individual push/pops as needed when registers run out.

That makes sense if (and only if) virtual machine registers are real 
machine registers. If virtual machine registers are in memory then 
accessing them on the stack is just as efficient (perhaps more so)
than at some other special location. And it avoids need for 
memory-to-memory moves to push/pop them when we do spill.
 
-- 
Nick Ing-Simmons
who is looking for a new job see http://www.ni-s.u-net.com/




Re: Stacks registers

2001-05-24 Thread Nick Ing-Simmons

Uri Guttman [EMAIL PROTECTED] writes:
 NI == Nick Ing-Simmons [EMAIL PROTECTED] writes:

  NI We need to decide where a perl6 sub's local variables are going
  NI to live (in the recursive case) - if we need a stack anyway it
  NI may make sense for VM to have ways of indexing the local frame
  NI rather than having global registers (set per thread by the way?)

i made that thread point too in my long reply to dan.

but indexing directly into a stack frame is effectively a register
window. the problem is that you need to do an indirection through the
window base for every access and that is slow in software (but free in
hardware).

It isn't free in hardware either, but cost may be lower.
Modern machines should be able to schedule indirection fairly efficiently.
But I would contend we are going to have at least one index operation
anyway - if only from the thread pointer, or global base - so 
with careful design so that registers are at right offset from the base
we can subsume the register lookup index into that.

i.e. 
 R4 = frame[N]
is same cost as
 R4 = per_thread[N]
and about the same as
 extern REGISTER GlobalRegs4;
 R4 = GlobalRegs4;




-- 
Nick Ing-Simmons
who is looking for a new job see http://www.ni-s.u-net.com/




Re: Stacks registers

2001-05-24 Thread Nick Ing-Simmons

Alan Burlison [EMAIL PROTECTED] writes:
 1. When you call deep enough to fall off the end of the large register
file an expensive system call is needed to save some registers
at the other end to memory and wrap, and then again when you
come back to the now-in-memory registers.

Not a system call but a trap - they aren't the same thing (pedant mode off
;-).  The register spill trap handler copies the relevant registers onto the
stack - each stack frame has space allocated for this.

Pedant mode accepted - and I concur. But trap handler is still significant
overhead compared to just doing the moves (scheduled) inline 
as part of normal code. So register windows win if you stay in bounds
but lose quite seriously if you have active deep calls.

(My own style is to write small functions rather than #define or inline,
for cache reasons - this has tended to make the above show up. I am delighted to 
say that _modern_ (Sun) SPARCs have deep enough windows even for me - 
but SPARCStation1+ and some of the lowcost CPUs didn't.)



Alan Burlison
-- 
Nick Ing-Simmons
who is looking for a new job see http://www.ni-s.u-net.com/




Re: PDD: Conventions and Guidelines for Perl Source Code

2001-05-10 Thread Nick Ing-Simmons

Alan Burlison [EMAIL PROTECTED] writes:

I strongly agree.  The current macro mayhem in perl is an utter abomination,
and drastically reduces the maintainability of the code.  I think the
performance argument is largely specious, and while abstraction is a
laudable aim, in the case of perl it has turned from abstraction into
obfuscation.

As I have said more than once before, excessive use of macros can 
be a performance killer. It is better to have slabs of common stuff
in a real function (which is cached) rather than replicated all over the 
place. That is the style I use in my own (whoops sorry, TI's) code and 
it does not seem to hurt even on X86 CISC machines.

-- 
Nick Ing-Simmons
who is looking for a new job see http://www.ni-s.u-net.com/




Re: Tying Overloading

2001-04-24 Thread Nick Ing-Simmons

Larry Wall [EMAIL PROTECTED] writes:
Nick Ing-Simmons writes:
: You really have to talk about overloading boolean context
: in general.
: 
: Only if you are going to execute the result in the normal perl realm.
: Consider using the perl parser to build a parse tree - e.g. one to 
: read perl5 and write perl 6. This works for all expressions except
: &&, || and ?: because perl5 cannot overload those - so 
: 
: $c = ($a && $b) ? $d : $e;
: 
: calls the bool-ness of $a and in the deferred execution mode of a translator
: it wants to return not true/false but it depends on what $a is at run-time.
: It cannot do that and is not passed $b so cannot return 

I think using overloading to write a parser is going to be a relic of
Perl 5's limitations, not Perl 6's.

I am _NOT_ using overloading to write a parser. 
Parse::Yapp is just fine for writing parsers. I am trying to re-use
a parser that already exists - perl5's parser. 

I am using overloading to get at the parse tree that the _existing_ parser 
has produced. 

So I can get at perly.y's : 

term:   ...
|   '!' term
|   term ADDOP term

etc. but NOT 

|   term ANDAND term
|   term OROR term
|   term '?' term ':' term

; 

I can get at the former because overload maps via newBINOP/newUNOP
just fine, I cannot get at latter group because newLOGOP/newCONDOP
don't do overloading.

What I _really_ want to do is a dynamically scoped peep-hole optimize
(actually a rewrite) of the op tree - written in perl.

But I can't do that, so I fake it by having 

sub construct (&) { ... }

and then 

construct { 
  # expression(s) here 
}

and have construct() call the ops with the overload stuff returning a tree.
These days I suppose one could use B:: to poke about in the CV



Larry
-- 
Nick Ing-Simmons [EMAIL PROTECTED]
Via, but not speaking for: Texas Instruments Ltd.




Re: Split PMCs

2001-04-23 Thread Nick Ing-Simmons

Dan Sugalski [EMAIL PROTECTED] writes:
At 07:39 PM 4/19/2001 +, [EMAIL PROTECTED] wrote:
Depends what they are. The scheme effectively makes the part mandatory
as we will have allocated space whether used or not.

Well, we were talking about all PMCs having an int, float, and pointer 
part, so it's not like we'd be adding anything. Segregating them out might 
make things faster for those cases where we don't actually care about the 
data. OTOH that might be a trivially small percentage of the times the 
PMC's accessed, so...

What is the plan for arrays these days? - if the float parts 
of the N*100 entries in a perl5-oid AV were collected you might 
get packed arrays by the back door.


So it depends if access pattern means that the part is seldom used,
or used in a different way.
As you say works well for GC of PMCs - and also possibly for compile-time
or debug parts of ops but is not obviously useful otherwise.

That's what I was thinking, but my intuition's rather dodgy at this level. 
The cache win might outweigh other losses.

 I'm thinking that passing around an
 arena address and offset and going in as a set of arrays is probably
 suboptimal in general,

You don't, you pass PMC * and have offset embedded within the PMC
then arena base is (pmc - pmc->offset) iff you need it.

I was trying to avoid embedding the offset in the PMC itself. Since it was 
calculatable, it seemed a waste of space.

But passing extra args around is fairly expensive when they are 
seldom going to be used. Passing an extra arg through N-levels is
going to consume instructions and N * 32 bits of memory or so.


If we made sure the arenas were on some power-of-two boundary we could just 
mask the low bits off the pointer for the base arena address. Evil, but 
potentially worth it at this low a level.

That would work ;-)
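
A minimal sketch of that masking trick, assuming arenas come from C11 aligned_alloc() on an ARENA_SIZE boundary. The names Arena, PMC, ARENA_SIZE, arena_new and arena_of are hypothetical, made up for illustration only:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define ARENA_SIZE 4096u   /* assumed value; must be a power of two */

typedef struct PMC { long ival; double nval; void *pval; } PMC;

typedef struct Arena {
    size_t used;           /* header sits at the arena base */
    PMC    slots[];        /* PMCs follow inside the same aligned block */
} Arena;

/* Allocate one arena whose address is a multiple of ARENA_SIZE. */
static Arena *arena_new(void)
{
    Arena *a = aligned_alloc(ARENA_SIZE, ARENA_SIZE);
    assert(a != NULL && ((uintptr_t)a & (ARENA_SIZE - 1)) == 0);
    a->used = 0;
    return a;
}

/* Recover the arena base from any PMC pointer by clearing the low bits. */
static Arena *arena_of(const PMC *pmc)
{
    return (Arena *)((uintptr_t)pmc & ~(uintptr_t)(ARENA_SIZE - 1));
}
```

No offset is stored in the PMC and no extra argument is passed around; the cost is that every arena must be exactly aligned.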

-- 
Nick Ing-Simmons [EMAIL PROTECTED]
Via, but not speaking for: Texas Instruments Ltd.




Re: PDD for code comments ????

2001-03-26 Thread Nick Ing-Simmons

David L . Nicol [EMAIL PROTECTED] writes:
Jarkko Hietaniemi wrote:

 Some sort of simple markup embedded within the C comments.  Hey, let's
 extend pod!  Hey, let's use XML!  Hey, let's use SGML!  Hey, let's use
 XHTML!  Hey, let's use lout!  Hey, ...

Either run pod through a pod puller before the C preprocessor gets to
the code, or figure out a set of macros that can quote and ignore pod.

The second is Yet Another Halting Problem so we go with the first?

Which means a little program to depod the source before building it,
or a -HASPOD extension to gcc

Or just getting in the habit of writing 

/*
=pod


and

=cut
*/

Perhaps we could teach pod that /* was alias for =pod
and */ an alias for =cut ?
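
For illustration, the habit might look like this in a C source file. This is only a sketch: clamp() is a made-up function, and the blank lines around the pod directives are there on the assumption that stricter pod tools want them:

```c
#include <assert.h>

/*

=pod

=head2 clamp(v, lo, hi)

Returns C<v> limited to the range C<lo>..C<hi>.

=cut

*/
static int clamp(int v, int lo, int hi)
{
    assert(lo <= hi);
    return v < lo ? lo : (v > hi ? hi : v);
}
```

The C compiler sees one ordinary comment; a pod extractor sees only the text between =pod and =cut.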


-- 
Nick Ing-Simmons [EMAIL PROTECTED]
Via, but not speaking for: Texas Instruments Ltd.




Re: Vtables: what do we know so far?

2001-02-02 Thread Nick Ing-Simmons

Edwin Steiner [EMAIL PROTECTED] writes:
Filipe Brandenburger wrote:
[...]
 struct sv {
 vtable_sv  * ptr_to_vtable;
 void   * ptr_to_data;
 void   * gc_data;
 };
[...]
 I don't think I can get further from here. Note that, in all examples,
 I didn't write the `this' pointer that every function would receive.
 This would correspond to the `ptr_to_data' from the struct sv.

I think the `this' pointer should be the SV* (== ptr_to_vtable) so virtual functions 
can themselves call virtual functions on the same object.

Definitely. It also allows them to change what ptr_to_data is for example.
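
A small sketch of why passing the SV itself helps, using the struct layout quoted above. The vtable members (get_int, set_int, increment) and the int_* functions are hypothetical, invented for this example:

```c
#include <assert.h>
#include <stddef.h>

typedef struct sv SV;

typedef struct vtable_sv {
    long (*get_int)(SV *self);
    void (*set_int)(SV *self, long v);
    long (*increment)(SV *self);
} vtable_sv;

struct sv {
    const vtable_sv *ptr_to_vtable;
    void            *ptr_to_data;
    void            *gc_data;
};

static long int_get(SV *self)         { return *(long *)self->ptr_to_data; }
static void int_set(SV *self, long v) { *(long *)self->ptr_to_data = v; }

/* Because self is the SV (not just its data), this method can call
 * other virtual functions on the same object. */
static long int_increment(SV *self)
{
    assert(self != NULL && self->ptr_to_vtable != NULL);
    long v = self->ptr_to_vtable->get_int(self) + 1;
    self->ptr_to_vtable->set_int(self, v);
    return v;
}

static const vtable_sv int_vtable = { int_get, int_set, int_increment };
```

With only ptr_to_data in hand, int_increment could not reach the vtable, nor could a method repoint ptr_to_data at a different payload.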


-Edwin
-- 
Nick Ing-Simmons




Modular subsystem design (was Re: Speaking of signals...)

2001-01-11 Thread Nick Ing-Simmons

Filipe Brandenburger [EMAIL PROTECTED] writes:
 
But, back to the efficiency issue, I _THINK_ the scenario I described is not 
inefficient. What it does differently from a monolithic system: it uses 
callbacks instead of fixed function calls, and it doesn't inline the 
functions. First, Callbacks take at most 1 cycle more than fixed function 
calls (is this right???), 

No - a memory fetch can take a long time (10s of cycles).
Mostly that can be hidden by a pipeline, but branches (i.e. calls)
tend to expose it more. But we are already thinking of "vtables" which 
are no better.

because the processor must fetch the code address 
from an address of memory, instead of just branching to a fixed memory 
address. Comparing to all the code Perl uses to handle SVs and such stuff, I 
think 1 cycle wouldn't kill us at all! 
 
Well, inline functions _CAN_ make a difference if there are many calls to 
one function inside a loop, or something like this. And this _CAN_ be a 
bottleneck. 

Inline functions can also cost you - the out-of-line function 
may be in the cache, and the plethora of inline functions not in cache,
or extra code size thrashes cache.

Well, I have one idea that keeps our design modular, breaks 
dependencies between subsystems (like that of using async i/o system without 
having to link to the whole thing), and achieves efficiency through inline 
functions. We could develop a tool that works in the source code level and 
does the inlining of functions for us. I mean a perl program that opens the 
C/C++ source of the kernel, looks for pre-defined functions that should be 
inlined, and outputs processed C/C++ in ``spaghetti-style'', very messy, 
very human-unreadable, and very efficient. 

And already discussed ;-) 

-- 
Nick Ing-Simmons [EMAIL PROTECTED]
Via, but not speaking for: Texas Instruments Ltd.




Re: perl IS an event loop (was Re: Speaking of signals...)

2001-01-08 Thread Nick Ing-Simmons

Simon Cozens [EMAIL PROTECTED] writes:
On Fri, Jan 05, 2001 at 11:42:32PM -0500, Uri Guttman wrote:
   SC 5x slowdown. 
 
 not if you just check a flag in the main loop. you only check the event
 system if you have pending events or signals, etc. the key is not
 checking all events on each pass thru the loop. 

Which is exactly what Chip did in his safe-signals patch. 33% slowdown.

I don't believe it - can we add a stub test and benchmark it?

-- 
Nick Ing-Simmons [EMAIL PROTECTED]
Via, but not speaking for: Texas Instruments Ltd.




Re: perl IS an event loop (was Re: Speaking of signals...)

2001-01-08 Thread Nick Ing-Simmons

Dan Sugalski [EMAIL PROTECTED] writes:
At 01:02 PM 1/6/01 -0500, Uri Guttman wrote:
that is what i would expect from a simple flag test and every N tests
doing a full event poll. and even up to 5-10% slowdown i would think is
a good tradeoff for the flexibility and ease of design win we get in the
i/o and event guts. but then, i have always traded off speed for
flexibility and ease. hey, so has perl! :)

Not always. :) The flexibility really does need to balance out the speed 
hit. (If Nick wasn't in the middle of rewriting the whole IO system, I'd 
probably be assaulting sv_gets to make up for the speed hit I introduced 
way back with the record reading code...)

Nick has yet to touch sv_gets() - partly 'cos it was too scary to mess
with - so you can if you like ;-)

-- 
Nick Ing-Simmons [EMAIL PROTECTED]
Via, but not speaking for: Texas Instruments Ltd.




Re: perl IS an event loop (was Re: Speaking of signals...)

2001-01-08 Thread Nick Ing-Simmons

Bart Lateur [EMAIL PROTECTED] writes:

Apropos safe signals, isn't it possible to let perl6 handle avoiding
zombie processes internally? What use does having to do wait() yourself,
have anyway?

Valid point - perl could have a CHLD handler in C and stash away returned
status to pass to wait() when it did get called.


-- 
Nick Ing-Simmons [EMAIL PROTECTED]
Via, but not speaking for: Texas Instruments Ltd.




Re: Speaking of signals...

2001-01-05 Thread Nick Ing-Simmons

Uri Guttman [EMAIL PROTECTED] writes:

but the question remains, what code triggers a signal handler? would you
put a test in the very tight loop of the op dispatcher? 

Not a test. The C level signal handler just fossicks with the variables
that very tight loop is using. 



  n But if "runops" looked like:

  n while (PL_op = PL_next_op)
  n  {
  n   PL_op->perform(); # assigns PL_next_op;
  n  }

  n (Which is essentially FORTH-like) then there is little to get in a mess.
  n The above is simplistic - we need a way to "disable interrupts" too.

and where is the event test call made? 

It isn't. PL_next_op is set by C signal handler.

In practice I suspect we need the test :

while (PL_op = (PL_sig_op) ? PL_sig_op : PL_next_op)
 {
  PL_op->perform();
 }

or somehow the next op delivered
will be the next baseline op or the dispatch check op. that is basically
the same as my ideas above, just a different style loop.

What I am trying to get to is adding minimal extra tests to the tight loop.
We probably need at least ONE test in the loop - let us try and make 
that usable for all the "abnormal" cases.
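
A rough sketch of such a loop with exactly one test per op. It assumes the C handler only sets a sig_atomic_t flag (the one thing ISO C lets a handler touch safely) rather than writing an op pointer directly, which is a deliberate variation on the loop above. The OP layout, PL_sig_pending and the trace array are hypothetical:

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>

typedef struct op OP;
struct op {
    void (*perform)(OP *self);  /* does the work and assigns PL_next_op */
    OP   *next;
    int   id;
};

static OP *PL_op, *PL_next_op;
static OP *PL_sig_op;                        /* op that runs the handler, set up in advance */
static volatile sig_atomic_t PL_sig_pending; /* the only thing the C handler touches */

static int trace[16];                        /* records the order ops ran in */
static int ntrace;

static void c_signal_handler(int signo)
{
    (void)signo;
    PL_sig_pending = 1;
}

static void plain_perform(OP *self)
{
    assert(ntrace < 16);
    trace[ntrace++] = self->id;
    PL_next_op = self->next;
}

/* The signal op leaves PL_next_op alone, so baseline flow resumes after it. */
static void sig_perform(OP *self)
{
    assert(ntrace < 16);
    trace[ntrace++] = self->id;
}

static void runops(OP *start)
{
    PL_next_op = start;
    for (;;) {
        if (PL_sig_pending) {   /* the single test in the tight loop */
            PL_sig_pending = 0;
            PL_op = PL_sig_op;
        } else {
            PL_op = PL_next_op;
        }
        if (PL_op == NULL)
            break;
        PL_op->perform(PL_op);
    }
}
```

The same flag could serve the other "abnormal" cases (pending events, disabled interrupts) if the dispatch op looks at what is actually pending.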



uri
-- 
Nick Ing-Simmons [EMAIL PROTECTED]
Via, but not speaking for: Texas Instruments Ltd.




Re: Anyone want to take a shot at the PerlIO PDD?

2001-01-03 Thread Nick Ing-Simmons

Dan Sugalski [EMAIL PROTECTED] writes:
Would someone like to take a crack at a PDD for the PerlIO system? It 
doesn't need to be particularly fancy (nor complete) to start with, but 
having one will give us a place to work from. (Waiting for me to spec it 
out may take a while...)

I am willing to cast bleadperl5's PerlIO into the form of a _draft_ PDD
for perl6 - i.e. "this is what it does now", not "this is what it should do".

Then we can discuss it here some more.

-- 
Nick Ing-Simmons




Re: standard representations

2000-12-31 Thread Nick Ing-Simmons

Dan Sugalski [EMAIL PROTECTED] writes:
 That's fine. I was thinking of smaller processors that might be used in
 embedded apps and such. (I'm also not sure what's the most efficient
 integer representation on things like the ARM microprocessors)

ARM7/ARM9 are both 32-bit
MIPS has both 32-bit and 64-bit variants.

That's good. Though do either of them have 16-bit data busses?

Not at the CPU no - what happens at chip boundary depends on what customer
asks for.

The 68XXX in Palm-Pilots are the issue there.


DSPs are more messy.

That's probably a bit too specialized a piece of hardware to worry about. 
Unless things have changed lately, they're not really general-purpose CPUs.

Some of them are.


It is micro-controllers that you have to worry about

Yeah, I know a lot of the old 8 and 16 bit chips are in use as control 
devices places. Those are the ones I'm thinking about. (Not that hard, but 
I don't want to rule them out needlessly)

I suspect that any that are up to running anything approximating perl
will have 32-bit ops in a library in any case.


-- 
Nick Ing-Simmons




Re: standard representations

2000-12-30 Thread Nick Ing-Simmons

Dan Sugalski [EMAIL PROTECTED] writes:

Anyone know of a good bigint/bigfloat library whose terms are such that we 
can just snag the source and use it in perl?

There was some traffic on gcc list recently about a GNU one (presumably GPL
only).

I don't really care to write 
the code for division, 

As I recall Knuth has something on it.
I know that some hardware FPUs do division (N/M) by 
Newton-Raphson expansion of 1/M and then do N*(1/M).
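
As a sketch of that approach in software (on doubles rather than big floats, and not any particular FPU's algorithm): Newton-Raphson on g(x) = 1/x - M gives the step x = x*(2 - M*x), which needs only multiply and subtract. The function names, the range reduction by halving/doubling, and the linear first guess are assumptions of this example:

```c
#include <assert.h>
#include <float.h>

/* Approximate 1/m for finite m > 0 using multiply and subtract only
 * (the two constant divisions below are folded at compile time). */
static double nr_reciprocal(double m)
{
    assert(m > 0.0 && m <= DBL_MAX);
    double scale = 1.0;
    /* Bring m into [0.5, 1); scaling by powers of two is exact. */
    while (m >= 1.0) { m *= 0.5; scale *= 0.5; }
    while (m <  0.5) { m *= 2.0; scale *= 2.0; }
    double x = 48.0 / 17.0 - (32.0 / 17.0) * m;  /* first guess for 1/m */
    for (int i = 0; i < 5; i++)
        x = x * (2.0 - m * x);                   /* error roughly squares each step */
    return x * scale;                            /* undo the range reduction */
}

/* N/M computed as N * (1/M). */
static double nr_divide(double n, double m)
{
    return n * nr_reciprocal(m);
}
```

The fixed count of five steps is the kind of "known precision" assumption that a big float would have to replace with a count derived from its own mantissa size.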

let alone the transcendental math ops...

TI's sources for those cite some book or other.
The snag with those and sqrt() etc. is that the published algorithms
"know" how many terms of power series are needed to reach (say) IEEE-754
"double". 

Thus a "big float" still needs to decide how precise it is going to 
be or atan2(1,1)*4 (aka PI) is going to take a while to compute...


-- 
Nick Ing-Simmons




Re: standard representations

2000-12-30 Thread Nick Ing-Simmons

Dan Sugalski [EMAIL PROTECTED] writes:
At 01:05 PM 12/29/00 +, Nick Ing-Simmons wrote:
Dan Sugalski [EMAIL PROTECTED] writes:
 
 I'm reasonably certain that all platforms that perl will ultimately run on
 can muster hardware support for 16-bit integers.

Hmm, most modern RISCs are very bad at C-like 16-bit arithmetic - they have
a tendency to widen to 32-bits.

That's fine. I was thinking of smaller processors that might be used in 
embedded apps and such. (I'm also not sure what's the most efficient 
integer representation on things like the ARM microprocessors)

ARM7/ARM9 are both 32-bit
MIPS has both 32-bit and 64-bit variants.
DSPs are more messy.

It is micro-controllers that you have to worry about 


-- 
Nick Ing-Simmons




Re: standard representations

2000-12-29 Thread Nick Ing-Simmons

Dan Sugalski [EMAIL PROTECTED] writes:

Strings can be of three types--binary data, platform native, and UTF-32. 
No, we are not messing around with UTF-8 or 16, nor are we messing with 
EBCDIC, shift-JIS, or any of that stuff. 

I don't understand that in the light of supporting "platform native".
That could easily be any of those as you note below. So what operations
are supported on "platform native" strings? Are we at the mercy of locale's
idea of upper/lower case, sort order etc.?

Strings can be stored internally 
that way (and the native form might be one of them) but as far as the 
interface is concerned we have only three. Yes, this does mean if we mess 
with strings in UTF-8 format on a non-UTF-8 system they'll need to be fed 
out in UTF-32. It's bigger, but we can deal.

-- 
Nick Ing-Simmons




Re: standard representations

2000-12-29 Thread Nick Ing-Simmons

Dan Sugalski [EMAIL PROTECTED] writes:

I'm reasonably certain that all platforms that perl will ultimately run on 
can muster hardware support for 16-bit integers. 

Hmm, most modern RISCs are very bad at C-like 16-bit arithmetic - they have
a tendency to widen to 32-bits.

I also expect that they 
can all muster at least software support for 32-bit integers. However

The issue isn't support, it's efficiency. Since we're not worrying about 
loss of precision (as we will be upconverting as needed) the next issue is 
speed, and that's where we want things to be in a platform convenient size.

I honestly can't think of any reason why the internal representation of an 
integer matters to the outside world, but if someone can, do please 
enlighten me. :)

I can't think of anything except the range that is affected by the 
representation.

-- 
Nick Ing-Simmons




Re: standard representations

2000-12-29 Thread Nick Ing-Simmons

Dan Sugalski [EMAIL PROTECTED] writes:

BigInt and BigFloat are both pure perl, and as such their speed leaves a 
*lot* to be desired. Fixing that (at least yanking some of it to XS) has 
been on my ToDo list for a while, but other stuff keeps getting in the 
way... :)

My own "evolutionary" view of things is that if we did XS versions 
of BigInt and BigFloat for perl5 we would learn some issues that might 
affect Perl6. i.e. the vtable entries for "ints" may be influenced 
by their use as building blocks for "floats".

For example the choice of radix in the BigInt case - should it be N*16-bits
or should we try and squeeze 32-bits - or to avoid issues with sign 
should that be 15 or 31? (If we assume we use 2's complement then LS words
are treated as unsigned only MS word has sign bit(s).)
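
To make the radix question concrete, here is a sketch of magnitude addition with 16-bit unsigned words held LS word first. It assumes sign is handled elsewhere, and big_add is a made-up name. With 16-bit words a 32-bit accumulator has room for the carry (and would have room for a 16x16 product too), which is the attraction of the smaller radix:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* sum = a + b over n words each, least significant word first.
 * Returns the carry out of the most significant word (0 or 1). */
static unsigned big_add(uint16_t *sum, const uint16_t *a,
                        const uint16_t *b, size_t n)
{
    assert(sum != NULL && a != NULL && b != NULL);
    uint32_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        uint32_t t = (uint32_t)a[i] + b[i] + carry;  /* at most 0x1FFFF */
        sum[i] = (uint16_t)(t & 0xFFFFu);
        carry  = t >> 16;
    }
    return (unsigned)carry;
}
```

Squeezing 32 bits per word halves the loop count but needs a 64-bit accumulator, or a 31-bit radix, to keep the carry.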

BigFloat could well build on BigInt for its "mantissa" and have another
int-of-some-kind as its exponent. We don't need to pack it tightly
so we should probably avoid IEEE-like hidden MSB. The size of exponent 
is one area where "known range of int" is important.

-- 
Nick Ing-Simmons




Re: mixed numeric and string SVs.

2000-12-21 Thread Nick Ing-Simmons

David Mitchell [EMAIL PROTECTED] writes:
  2. Each SV has 2 vtable pointers - one for its numeric representation
  (if any), and one for its string representation (if any). Flexible, but
  may require an extra 4/8 bytes per SV.
 
 It may not be terrible. How big is the average SV already anyway?

True, but I've just realised a complication with my suggestion. If
there are a multiple vtable ptrs per SV, which type 'owns' the SV carcass,

Perl owns the carcass. Each vtable would have its own payload portion
and be responsible for its destruction and cleanup.
This is classical "multiple inheritance" scheme.

and is responsible for destruction, and has permission to put its
own stuff in the payload area etc? I think madness might this way lie.

So here's a modified suggestion. Rather than having 2 vtable ptrs per scalar,
we allow a string type to contain an optional pointer to another
subsidiary SV containing its numeric value. (And vice versa).

That would work too.


Then for example the getint() method for a utf8 string type might look like:

utf8_getint(SV *sv) {
   if (sv->subsidiary_numeric_sv == NULL) {
       sv->subsidiary_numeric_sv = Numeric->new(aton(sv->value));
   }
   return sv->subsidiary_numeric_sv->getint();
}

(utf8 stringy methods that alter the string value of the SV are then
responsible for either destroying the subsidiary numeric SV, or for making
sure its value gets updated, or for setting a flag warning that its
value needs recalculating.)

Similarly, the stringy methods for numeric types are wrappers that
optionally create a subsidiary string SV, then pass the call onto that
object.

Or to avoid the conditional each time, there could be 2 vtables for each
type, containing 'with subsidiary' and 'without subsidiary' methods;
the role of the latter being to create the subsidiary SV and update the
type of the main SV to the 'with subsidiary' type.
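
A minimal sketch of that last variant, assuming a cached long stands in for the subsidiary numeric SV. All names here (str_with_num, str_without_num, cached_int, conversions) are hypothetical:

```c
#include <assert.h>
#include <stdlib.h>

typedef struct sv SV;
typedef struct vtable { long (*getint)(SV *sv); } vtable;

struct sv {
    const vtable *vt;
    const char   *value;        /* string payload */
    long          cached_int;   /* stands in for the subsidiary numeric SV */
    int           conversions;  /* counts string->number conversions, for illustration */
};

static long getint_with(SV *sv);
static long getint_without(SV *sv);

static const vtable str_with_num    = { getint_with };
static const vtable str_without_num = { getint_without };

/* 'with subsidiary': no conditional, just hand back the cached value. */
static long getint_with(SV *sv)
{
    return sv->cached_int;
}

/* 'without subsidiary': convert once, then retype the SV. */
static long getint_without(SV *sv)
{
    assert(sv->value != NULL);
    sv->cached_int = strtol(sv->value, NULL, 10);
    sv->conversions++;
    sv->vt = &str_with_num;     /* later calls skip the conversion */
    return sv->cached_int;
}
```

A method that changes the string would have to set vt back to the 'without' table, which is the invalidation duty described above.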
-- 
Nick Ing-Simmons [EMAIL PROTECTED]
Via, but not speaking for: Texas Instruments Ltd.




Re: String representation

2000-12-21 Thread Nick Ing-Simmons

Nicholas Clark [EMAIL PROTECTED] writes:
 
 where it is possible to get "smart" when one arg is a "special case" of 
 the other.

 And similarly numbers must be convertible to "complex long double" or
 what ever is the top of the built-in tree ? (NV I guess - complex is
 over-kill.)

 It is the how do we do the generic case that worries me.

Maybe this is a digression, but it does suggest that there may not
be 1 top to the tree (at least for builtin numbers). Which may also hold
for strings.

Which is why it worries me. If I invent a new number type (say),
what vtable entries must it have to allow all the generic things
to function? Given a choice between NV/UV/IV possibles on what basis do we 
choose one branch over the other?



 We old'ns need people that don't know "it can't be done" to tell us
 how to do it - but we reserve the right to say "we tried that it didn't
 work" too.
  ^ because

Nicholas Clark
-- 
Nick Ing-Simmons [EMAIL PROTECTED]
Via, but not speaking for: Texas Instruments Ltd.




Re: mixed numeric and string SVs.

2000-12-20 Thread Nick Ing-Simmons

David Mitchell [EMAIL PROTECTED] writes:
Has anyone given thought to how an SV can contain both a numeric value
and string value in Perl6?
Given the arbitrary number of numeric and string types that the vtable
scheme of Perl6 supports, it will be unviable to have special types
for all permutations (eg, utf8_nv, unicode32_iv, ascii_bitint, ad nauseam).

It seems to me the following options are possible:

1. We no longer save conversions, so
   $i="3"; $j+=$i for (...);
does an aton() or similar each time round the loop

Well just the 1st time - then it is a number...


2. Each SV has 2 vtable pointers - one for its numeric representation
(if any), and one for its string representation (if any). Flexible, but
may require an extra 4/8 bytes per SV.

This is my favourite.


3. We decree that all string to numeric conversions should return
a particular numeric type (eg NV), and that all numeric to string
conversions should similarly convert to a fixed string type (eg utf8).
(Although I'm not sure that really helps.)

I can't see how that helps.

-- 
Nick Ing-Simmons [EMAIL PROTECTED]
Via, but not speaking for: Texas Instruments Ltd.




Re: String representation

2000-12-20 Thread Nick Ing-Simmons
e the mess, or
*might* increase it, depending on how its done. 

If it "depends" then it isn't strictly "orthogonal".


One final thing - I'm fairly new to this game (I thought the start of Perl6
would be a good time to get involved, without having to understand
the horrors of perl5 internals in depth), which means I run more of a risk
than most of speaking from my derriere. So far I have been reluctant to
put forward any really substantial suggestions as to how to handle
all this stuff, mainly for fear of irritating people who know what
they are talking about, and who have to take time out to explain to me why I'm
wrong! On the other hand, I do seem to have ended up talking a lot about
this subject on perl6-internals!!
So, should I have the courage of my convictions and let rip, or should I
just leave this to wiser people? Answers on a postcard, please

We old'ns need people that don't know "it can't be done" to tell us
how to do it - but we reserve the right to say "we tried that it didn't
work" too.


-- 
Nick Ing-Simmons




Re: String representation

2000-12-18 Thread Nick Ing-Simmons

David Mitchell [EMAIL PROTECTED] writes:

Personally I feel that that string part of the SV API should include most
(if not all) string functions, including regex matching and substitution.

What are string functions in your view?
  m//
  s///
  join()
  substr
  index
  lc, lcfirst, ...
  & | ~
  ++
  vec  
  '.'
  '.='

It rapidly gets out of hand.
  
Why not eval "$string" as well ? ;-)
then in the limit perl can just become eval scalar(<ARGV>);

Seriously - I think we need to consider the original question 
"What is the representation" based on perl5 hindsight, then think what 
operations we want to perform on it, then divide those into the ones
which make sense to be "methods" (vtable entries) of string, 
those that are part of string API, and those which are just ops messing 
with strings.

That way there can be multiple regex implementations to handle different
cases (eg  fast one(s) for fixed width ASCII, UTF-32 etc, and a slow horrible one
for variable-length UTF-8, etc). Of course perl itself could provide a default regex
engine usable by all string types, but implementors would then be free to add
variants for custom string types.

I would argue one does that by making the regex API more modular.

-- 
Nick Ing-Simmons [EMAIL PROTECTED]
Via, but not speaking for: Texas Instruments Ltd.




Re: String representation

2000-12-18 Thread Nick Ing-Simmons

Simon Cozens [EMAIL PROTECTED] writes:

So, before we start even thinking about what we need, it's time to look at the
vexed question of string representation. How do we do Unicode without getting
into the horrendous non-Latin1 cockups we're seeing on p5p right now? 

Well - my theorist's answer is that everything is Unicode - like Java.
As I pointed out on p5p even EBCDIC machines can use that model - but 
the downside is that ord('A') == 65 which will break backward compatibility 
with EBCDIC scripts. 

If perl5.7+ EBCDIC continues down its alternate road
and we need to be able to translate perl5 -> perl6 I strongly suspect 
that perl6 cannot use the "java-oid" model either as the programmer's
intent will not be obvious enough to auto-translate.
I still haven't grasped what the current EBCDIC "model as seen by perl
programmer" _is_.

Larry
suggested aeons ago that everything is an array of numbers, and Perl shouldn't
care what those numbers represent. But at some point, it has to, and that
means things have to be tagged with their character repertoires and encodings.

Tagging a string with a repertoire and encoding is horrible - you are aware 
of the trickiness of even getting the SvUTF8 bit "right". To have 
a general representation carried around we need a pointer rather than just a bit
and we cannot say 
   if (SvUTF8(sv))

we have to say 

   if (SvENCODING(sv)->some_predicate)

e.g. 

   if (SvENCODING(sv_a) != SvENCODING(sv_b))
{
 if (SvENCODING(sv_a)->is_superset_of(SvENCODING(sv_b)))
  {
   sv_upgrade_to(sv_b,SvENCODING(sv_a));
  }
 else if (SvENCODING(sv_b)->is_superset_of(SvENCODING(sv_a)))
  {
   sv_upgrade_to(sv_a,SvENCODING(sv_b));
  }
 else
  {
   Encoding *x = find_superset_encoding(SvENCODING(sv_a),SvENCODING(sv_b));
   sv_upgrade_to(sv_a,x);
   sv_upgrade_to(sv_b,x);
  }
} 

Personally I would not use such a beast 

The only sane compromise I can imagine is close to what we have at the 
moment with maybe a few extra special cases in the "flags" bits:
   ASCII only   (0..7f)
   Native-single-byte   (iso8859-x, IBM1047)
   wchar_t 
   UTF-8
   UNICODE

There needs to be a hierarchy of _repertoires_ such that:

ASCII is subset of Native is subset of wchar_t is subset of UNICODE.
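
If the repertoires really do form a single chain like that, the superset question reduces to comparing ranks. A sketch, with hypothetical names, treating UTF-8 as an encoding of the UNICODE repertoire rather than a repertoire of its own:

```c
#include <assert.h>

/* Ranks follow the chain: each repertoire contains all the earlier ones. */
typedef enum {
    REP_ASCII = 0,
    REP_NATIVE,
    REP_WCHAR,
    REP_UNICODE
} Repertoire;

static int rep_is_superset_of(Repertoire big, Repertoire small)
{
    return big >= small;
}

/* Smallest repertoire able to hold both operands: the larger rank. */
static Repertoire rep_common(Repertoire a, Repertoire b)
{
    assert(a <= REP_UNICODE && b <= REP_UNICODE);
    return a > b ? a : b;
}
```

This only works because the order is total; two unrelated native encodings would put us back in the find_superset_encoding() case.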


The "Native-single-byte" would have one - global-to-interpreter
encoding object - not just iso8859-1 - basically the one that LC_CTYPE
gives the "right answers for" - though how the  "£!$^¬!*% one is supposed 
to find that out is beyond me - so we would presumably invert that 
and use the Unicode CTYPE-oid stuff to do isALPHA() etc.

-- 
Nick Ing-Simmons [EMAIL PROTECTED]
Via, but not speaking for: Texas Instruments Ltd.




Re: String representation

2000-12-18 Thread Nick Ing-Simmons

David Mitchell [EMAIL PROTECTED] writes:
 Personally I would not use such a beast 

But with different encodings implemented by different SV types - each with their
own vtable - surely most of this will "come out in the wash", by the correct
method automatically being called. I thought that was the big selling point
of vtables :-)

(Or to put it another way - is the debate about handling multiple string
encodings really just the same debate as the handling of multiple numeric types
(but harder...) ?)

It is exactly the same as the 
   enormous_int ** complex_rational  problem.

   if ("\N{gamma}".title_case(join($klingon,@welsh)) =~ /$urdu/)

whose operators get called ? 


-- 
Nick Ing-Simmons




Re: String representation

2000-12-18 Thread Nick Ing-Simmons

Nicholas Clark [EMAIL PROTECTED] writes:
On Fri, Dec 15, 2000 at 11:18:00AM -0600, Jarkko Hietaniemi wrote:

 As painful as it may sound (codingwise) I would urge to spare some
 thought to using (internally) UTF-32 for those encodings for which
 UTF-8 would be *longer* than the UTF-32 (mainly the Asian scripts).

most CPUs can load a 32 bit quantity in 1 machine instruction
most CPUs would take 2 or 3 machine instructions to load 2 or 3 bytes of
variable length encoding, and I'd guess that on most RISC CPUs those
three instructions take three times the space, 

Okay so far.

(and take 3 times the
single load instruction)

Almost certainly more than the single load, but much less than 3 
due to cache effects.

And that's ignoring the code to bit shuffle those bytes that make up the
character.

So it may be more total space efficient to use 32 bits for data.
And although it feels like we'll be shifting 32 bits of data round per
character instead of 8-40 with an average less than 32, it might still take
longer because we're doing it less efficiently.

My big worry is that "strings" would fill the data cache much more quickly.



Just a passing thought. Extrapolated up from 1 RISC CPU I know quite well.

Nicholas Clark
-- 
Nick Ing-Simmons




Re: String representation

2000-12-18 Thread Nick Ing-Simmons

David Mitchell [EMAIL PROTECTED] writes:
Nick Ing-Simmons [EMAIL PROTECTED] wrote:
 What are string functions in your view?
   m//
   s///
   join()
   substr
   index
   lc, lcfirst, ...
   & | ~
   ++
   vec  
   '.'
   '.='
 
 It rapidly gets out of hand.

Perhaps, but consider that somewhere within the perl internals there
have to be functions which implement all these ops anyway. If we
provide vtable slots for all these functions and just fill most of the
slots with pointers to the 'default' Perl implementation, we haven't
really lost anything, except possibly a slight delay due to the extra
indirection (which may be compensated for elsewhere). On the other
hand, we have gained the ability to replace the default implementation
with something more efficient where it suits us.

I have just been through exactly that process with the PerlIO stuff.
So I hope you will not take offence when I say that your observation above
is simplistic. The problem is "what are the (types of) the arguments passed
to the functions?" - the existing code will be expecting its args in 
a particular form. So your wondrous new function must accept exactly 
those args and types - and convert them as necessary before becoming 
more efficient. So to get any win the args/types of all the functions 
has to be designed with pluggable-ness in mind from the outset.
At best this means taking an indirection hit for all the args as well 
as the function (this is what PerlIO does - PerlIO is now essentially 
a FILE ** rather than a FILE *). 

At worst we have to write a "worst case" override entry for each op and 
then work what it needs back - this is exemplified by PerlIO_getpos()
the "position" arg had to stop being an Fpos_t and become an SV *
so that stdio could stuff an Fpos_t in it, but a transcoding layer
could put the Fpos_t, and the escape-state and partial characters in as 
well.
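
To illustrate the extra indirection only (this is not the real PerlIO layout; Layer, IOHandle and io_push are made-up names): the caller holds a pointer to a slot, and the slot points at the top layer, so layers can be pushed underneath the caller without the caller's handle changing.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Layer Layer;
struct Layer {
    const char *name;
    Layer      *next;     /* the layer underneath this one */
};

typedef Layer *IOHandle;  /* callers pass IOHandle *, i.e. a Layer ** */

/* Push a layer on top; the caller's IOHandle * stays valid. */
static void io_push(IOHandle *h, Layer *l)
{
    assert(h != NULL && l != NULL);
    l->next = *h;
    *h = l;
}
```

Every call then pays one more pointer fetch to find the top layer, which is the indirection hit on the args mentioned above.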



Take the example of substr() - if this is a standalone function, then
it has to work without reference to any of the internals of its args,
and thus has to rely on extracting a 'standard' representation of the
string value from the SV in order to operate upon it. This then implies
messiness of coding and inefficiency, with all the unicode hell that
infects perl5 re-appearing.  If substr() were a per-type op, then the
messy details of UTF8 would lie almost completely within the internal
implementation of that datatype.

True, but the messy details would now occur multiple times,
as soon as substr_utf8 exists then _ALL_ the other string ops 
_must_ be overridden as well because nothing but string_utf8 "class" 
knows what is going on.


In fact, I would argue that in general most if not all the operations currently
performed by pp_* should have vtable equivalents, both for numeric and string
types (including unary ops, mutators, binops etc etc).

Hmm - that is indeed a logical position. 



 Seriously - I think we need to consider the original question 
 "What is the representation" based on perl5 hindsight, then think what 
 operations we want to perform on it, then divide those into the ones
 which make sense to be "methods" (vtable entries) of string, 
 those that are part of string API, and those which are just ops messing 
 with strings.

If an "op messing with strings" might be able to do a faster job given
access to the internals of that string type, then I'd argue that that op
should be in the vtable too.

I can see your position. 

perl6 = Union_of(I32_perl, I64_perl, float_perl, double_perl, long_double_perl,
  ASCII_perl,  UTF8_perl, ShiftJis_perl, 
  Complex_rational_perl, right_to_left_perl,
)

or 

class perl
{
 virtual SV *add(SV *,SV *);
 ...
 virtual SV *y(SV *,SV *); 
}

The snag here is that the volume of code explodes and gets splattered 
all over the sub-classes. So to fix a bug in the '+' operator (pp_plus)
one has to go visit lots of places - but, presumably, the bug will 
only be in one of them.

If this is to fly (and I am not saying it cannot), then the 
"multiple despatch" issue needs to have a clean process so that 
it is clear what happens if someone writes:

  my $complex_rational = $urdu_string / sqrt(-$big_integer);

The string needs to get converted to a number knowing which characters
are digits and what the Urdu for 'i' is. The big integer needs to get 
negated (no sweat) then someone's sqrt() gets called and had better not 
barf on the -ve value, then complex_rational can do the right thing.

In other words - string ops on strings of uniform type, math ops on 
well understood hierarchies etc. are all easy enough - it is the 
combinations that get very messy very very quickly. 
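
A minimal sketch of why the combinations multiply, assuming a table indexed by (left type, right type). The type tags, the value struct and the function names are all hypothetical and are not taken from any perl source; the sketch only shows that every binary operator needs one entry per pair of types.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type tags; a real interpreter would have many more. */
typedef enum { T_INT, T_DOUBLE, T_COUNT } type_tag;

typedef struct {
    type_tag tag;
    union { long i; double d; } u;
} value;

typedef value (*binop_fn)(value, value);

static value mk_int(long i)      { value v; v.tag = T_INT;    v.u.i = i; return v; }
static value mk_double(double d) { value v; v.tag = T_DOUBLE; v.u.d = d; return v; }

/* Convert either representation to double for the mixed cases. */
static double as_double(value v) { return v.tag == T_INT ? (double)v.u.i : v.u.d; }

static value add_int_int(value a, value b) { return mk_int(a.u.i + b.u.i); }
static value add_promote(value a, value b) { return mk_double(as_double(a) + as_double(b)); }

/* One entry per (left, right) pair: N types need N*N entries for each
   binary operator, which is where the volume of code grows. */
static const binop_fn add_table[T_COUNT][T_COUNT] = {
    /* left T_INT    */ { add_int_int, add_promote },
    /* left T_DOUBLE */ { add_promote, add_promote },
};

static value value_add(value a, value b)
{
    assert(a.tag < T_COUNT && b.tag < T_COUNT);
    return add_table[a.tag][b.tag](a, b);
}
```

With two types the table has four entries; adding a string type and a complex type would take it to sixteen for '+' alone, before any conversions between them are written.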


 
   


-- 
Nick Ing-Simmons




Re: String representation

2000-12-18 Thread Nick Ing-Simmons

Jarkko Hietaniemi [EMAIL PROTECTED] writes:
On Mon, Dec 18, 2000 at 03:21:05PM +, Nick Ing-Simmons wrote:
 Simon Cozens [EMAIL PROTECTED] writes:
 
 So, before we start even thinking about what we need, it's time to look at the
 vexed question of string representation. How do we do Unicode without getting
 into the horrendous non-Latin1 cockups we're seeing on p5p right now? 
 
 Well - my theorist's answer is that everything is Unicode - like Java.

That would be nice, yes.

 As I pointed out on p5p even EBCDIC machines can use that model - but 
 the downside is that ord('A') == 65 which will break backward compatibility 
 with EBCDIC scripts. 

Maybe we need $ENV{PERL_ENCODING} to control ord() and chr(), too?

That was my suggestion last week some time - though not stated as clearly!


 Tagging a string with a repertoire and encoding is horrible - you are aware 

Indeed.  We have had a very rough ride trying to get just two
encodings to play well together, trying to support more simultaneously
would be pure combinatorial masochism.  I say we should strive for
converting everything to/from one agreed-upon internal encoding.  Yes,
this is somewhat counter to the idea 'no preferred internal encoding'.
After pondering about the issue I have come around to "Oh, yes, there
should be one preferred internal encoding.", otherwise we banish
ourselves to much gnashing of the teeth.  Off-hand, I think it's only
when there would be information loss when the One True Encoding
conversion shouldn't be done.  What's the OTE, then?  Well, UTF-16 or
UTF-32, I guess.  The redeeming features of UTF-8, that it is 1:1 for
ASCII, and also compact for ASCII, frankly are getting rather thin in
my eyes.

But not in mine (yet) - but then IO is just throwing gobs of bytes about
and regexps are introspecting. (And Encode has to handle variable-length
multi-byte gunk anyway.) 

-- 
Nick Ing-Simmons




Re: Opcodes (was Re: The external interface for the parser piece)

2000-12-12 Thread Nick Ing-Simmons

David Mitchell [EMAIL PROTECTED] writes:

I think this boils down to 2 important questions, and I'd be interested in
hearing people's opinions of them.

1. Does the Perl 6 language require some explicit syntax and/or semantics to
handle multiple and user-defined numeric types?
Eg "my type $scalar",  "$i + integer($r1+$r2)" and so on.

That is a Language and not an internals issue - Larry will tell us.
But I suspect the answer is that it should "work" without any special 
stuff for simple perl5-ish types - because you need to be able to 
translate 98% of 98% of perl5 programs.

So we should start from the premise "no" and see where we get ...

2. If the answer to (1) is yes, is it possible to decide what the numeric part of
the vtable API should be until the details of (1) have been agreed on?

I suspect the answers are yes and no.

I suspect the answers are "no" and (2) is eliminated as "dead code" ;-)


Dave.
-- 
Nick Ing-Simmons




Re: The external interface for the parser piece

2000-11-30 Thread Nick Ing-Simmons

Nicholas Clark [EMAIL PROTECTED] writes:

We're trying to make this an easy embedding API.

Yes, and we are in danger of "premature optimization" of the _interface_.  

What we need to start with is a list of "what we need to know" - they may 
as well be separate parameters at this point - then we can decide 
how best to group them and provide wrapper(s) that call the zillion 
parameter version. If there turns out to be only one sensible wrapper
then it can become _the_ interface.
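
A minimal sketch of that approach, assuming a full-parameter entry point plus one convenience wrapper. Every name and parameter here is hypothetical (the thread has not settled what the parser needs to know), and the "parser" is a stub that only records what it was given.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record of what the parser was asked to do. */
typedef struct {
    const char *source;     /* program text */
    size_t      length;     /* bytes of source */
    const char *filename;   /* name used in diagnostics */
    int         first_line; /* line number of the first line */
    int         warnings;   /* non-zero to enable warnings */
} parse_request;

/* The "zillion parameter" version: every input is a separate argument. */
static parse_request parse_full(const char *source, size_t length,
                                const char *filename, int first_line,
                                int warnings)
{
    parse_request r;
    assert(source != NULL && filename != NULL);
    r.source = source;
    r.length = length;
    r.filename = filename;
    r.first_line = first_line;
    r.warnings = warnings;
    return r;
}

/* A wrapper for one common case; it only supplies defaults. */
static parse_request parse_string(const char *source)
{
    return parse_full(source, strlen(source), "(string)", 1, 0);
}
```

If parse_string() turns out to be the only wrapper anyone calls, it can be promoted to the interface without changing parse_full().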

-- 
Nick Ing-Simmons [EMAIL PROTECTED]
Via, but not speaking for: Texas Instruments Ltd.




Re: The external interface for the parser piece

2000-11-29 Thread Nick Ing-Simmons

Tom Hughes [EMAIL PROTECTED] writes:
In message [EMAIL PROTECTED]
Dan Sugalski [EMAIL PROTECTED] wrote:

 At 10:42 AM 11/29/00 +, Nick Ing-Simmons wrote:

 FILE * is not a good idea. PerlIO * is fine.
 
 The problem with that is we're potentially getting the filehandle from
 something that isn't perl. Or so my thinking went at the time. Right
 now I'm thinking that I need to rethink things.

That was my point. The Parser API should stick to PerlIO * - which is 
an abstract interface. How that interface gets provided is none
of the _parser's_ business.

There is another side to this - perl itself (particularly on Win32 or 
other places where stdio is "broken") may not have a FILE * to give 
you - it may only have PerlIO *.


That shouldn't matter so long as there's a simple way to create
a PerlIO * from a FILE * or whatever.

Bleadperl work on PerlIO is teaching that it is not necessarily "simple" to 
convert one to the other. One can wrap a FILE * inside a PerlIO simply 
enough, provided that the provider then promises not to touch it in 
any way while perl is messing with it, but the FILE *-ness gets
exposed. For example there are issues with FILE *'s
'textmode' and PerlIO's crlf layer fighting. Unless we inherit perl5's
twin-IO * concept (which I would not recommend at this stage) there are also 
issues with bi-directional things like sockets.

It is (currently) much better to open a PerlIO * from the outset,
either from a pathname, or a low level "file descriptor" (what a
low-level descriptor is on non-UNIX is work in progress).

Now we should be able to clean that up some (even in perl5) but 
we don't want to expose all the mess to the _parser_ API.

If we say FILE * we _may_ have to say - FILE *, open in binmode, not 
line buffered, not to a socket on Win32, ... and a whole host 
of other gunk. And then (presumably) inside the parser wrapper do 
the right thing to turn it into a PerlIO * so we can use UTF8, 
CRLF, encode/decode etc.

Better (IMHO) to do that _outside_ the parser API under another 
committee's jurisdiction. 


That's probably something that needs to be specific to different
language bindings - if you're embedding perl in C++ you probably
have an iostream, and if you're embedding in Java you'll have a
Java stream object. In each case you'll want an easy way to create
a PerlIO object from that.

Why not export the PerlIO API and have the language call that?

But if not possible then we can write 

parse_FILE(FILE *x,...)

which takes the FILE *, wraps it in a PerlIO *, calls the generic parser,
unwraps, cleans up and returns.
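
A minimal sketch of such a wrapper, assuming an abstract handle with a single read callback. The stream struct is a hypothetical stand-in and is not the real PerlIO API; parse_stream() is a stub "parser" that only counts the bytes it can read.

```c
#include <assert.h>
#include <stdio.h>
#include <stddef.h>

/* Hypothetical abstract IO handle; the parser sees only this. */
typedef struct stream {
    size_t (*read)(struct stream *self, char *buf, size_t len);
    void *impl; /* provider-private data, e.g. a FILE * */
} stream;

/* Stand-in for the generic parser: it reads until end of input. */
static long parse_stream(stream *s)
{
    char buf[256];
    long total = 0;
    size_t n;
    while ((n = s->read(s, buf, sizeof buf)) > 0)
        total += (long)n;
    return total;
}

/* Adapter that implements the abstract read on top of stdio. */
static size_t file_read(stream *self, char *buf, size_t len)
{
    return fread(buf, 1, len, (FILE *)self->impl);
}

/* Wrap the FILE *, call the generic parser, unwrap and return.
   The caller still owns the FILE * and must not touch it meanwhile. */
static long parse_FILE(FILE *fp)
{
    stream s;
    assert(fp != NULL);
    s.read = file_read;
    s.impl = fp;
    return parse_stream(&s);
}
```

The point of the shape is that textmode, buffering and socket issues stay inside file_read() and its siblings; parse_stream() never learns that a FILE * was involved.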


Tom
-- 
Nick Ing-Simmons [EMAIL PROTECTED]
Via, but not speaking for: Texas Instruments Ltd.




Re: To get things started...

2000-11-28 Thread Nick Ing-Simmons

Bart Lateur [EMAIL PROTECTED] writes:

But what if you choose wrong, forgot a really important one, and this
instruction gets a multibyte representation? We're stuck with it
forever...?

I have had some thoughts on "dynamic opcodes", where the meaning of
opcode bytes needn't be fixed, but can be dynamically assigned,
depending on how often they occur (for example). A bit like how a
Huffman compressor may choose shorter representations for the most
occurring byte patterns.

This is just like HW processor opcodes.  x86 has lasted so well
because the initial guess at the short/common opcodes was not too bad.
But the escape bytes are getting out of hand now...

-- 
Nick Ing-Simmons [EMAIL PROTECTED]
Via, but not speaking for: Texas Instruments Ltd.




Re: Perl Implementation Language

2000-09-20 Thread Nick Ing-Simmons

Tom Hughes [EMAIL PROTECTED] writes:

What I'd like to see us avoid is the current situation where trying
to examine the value of an SV in the debugger is all but impossible
for anybody other than a minor god.

What is so hard about:

(gdb) call Perl_sv_dump(sv)

???



Tom
-- 
Nick Ing-Simmons [EMAIL PROTECTED]
Via, but not speaking for: Texas Instruments Ltd.




Re: A tentative list of vtable functions

2000-09-14 Thread Nick Ing-Simmons

Nathan Torkington [EMAIL PROTECTED] writes:
Dan Sugalski writes:
 It's possible, for example, for a tied/overloaded/really-darned-strange 
 variable to look true but still be false. If you do:
 
$foo = $bar || $baz;
 
 and both $bar and $baz are objects, the 'naive' way is to make $foo be 
 $bar. But it's distinctly possible that $bar really should be treated as a 
 false value and $baz be used instead. Why? Dunno. Serious hand-waving here. 
 (And yes, I know that's a danger sign... :) But I don't see any reason to 
 preclude the possibility.

You can do that right now in perl5, by using overload.pm and supplying
a 'bool' method.

In practice both Damian and I have been bitten by inability to overload || 
and && - you can indeed pick which side is kept but you cannot make 
it keep both. So "deferred" action is not possible.

I can make $a + $b return bless ['+',$a,$b],'OperatorNode' but you cannot
get $a && $b to produce bless ['&&',$a,$b],'OperatorNode' whatever you do. 

-- 
Nick Ing-Simmons




Re: RFCs for thread models

2000-09-11 Thread Nick Ing-Simmons

Steven W McDougall [EMAIL PROTECTED] writes:
1. All threads execute the same op tree

Consider an op, like

   fetch(b)

If you actually compile a Perl program, like

   $a = $b
   
and then look at the op tree, you won't find the symbol "$b", or "b"
anywhere in it. 

But it isn't very far away (at least for lexicals) ;-)

The fetch() op does not have the name of the variable
$b; rather, it holds a pointer to the value for $b.

It holds an index into the scratch-pad. Subs have scratch-pads 
which are cloned as needed during recursion etc. 


If each thread is to have its own value for $b, then the fetch() op
can't hold a pointer to *the* value. 

Each thread's view of the sub has its own scratch-pad - value is at same 
index in each.
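
A minimal sketch of that arrangement, assuming a pad is just a small array of integer slots. The struct names and PAD_SIZE are hypothetical simplifications, not perl5's actual pad layout.

```c
#include <assert.h>
#include <stddef.h>

#define PAD_SIZE 4 /* arbitrary size for the sketch */

/* Hypothetical scratch-pad: one slot per lexical variable. */
typedef struct { long slot[PAD_SIZE]; } pad;

/* The op lives in the shared op tree and stores only an index,
   never a pointer to a particular thread's value. */
typedef struct { size_t pad_index; } fetch_op;

/* Each thread passes its own pad, so the same op yields the value
   that belongs to the calling thread. */
static long run_fetch(const fetch_op *op, const pad *thread_pad)
{
    assert(op->pad_index < PAD_SIZE);
    return thread_pad->slot[op->pad_index];
}
```

Two threads can therefore execute the same fetch_op concurrently without sharing any data, provided each one owns its pad.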

-- 
Nick Ing-Simmons




Re: Event model for Perl...

2000-09-09 Thread Nick Ing-Simmons

Grant M. [EMAIL PROTECTED] writes:
I am reading various discussions regarding threads, shared objects,
transaction rollbacks, etc., and was wondering if anyone here had any
thoughts on instituting an event model for Perl6? I can see an event model
allowing for some interesting solutions to some of the problems that are
currently being discussed.

Yes - Uri has started [EMAIL PROTECTED] to discuss that stuff.


Grant M.
-- 
Nick Ing-Simmons




Re: A tentative list of vtable functions

2000-09-09 Thread Nick Ing-Simmons

Ken Fox [EMAIL PROTECTED] writes:
Short
circuiting should not be customizable by each type for example.

We are already having that argument^Wdiscussion elsewhere ;-)

But I agree variable vtables are not the place for that.

-- 
Nick Ing-Simmons




Re: RFC 178 (v2) Lightweight Threads

2000-09-08 Thread Nick Ing-Simmons

Alan Burlison [EMAIL PROTECTED] writes:
Nick Ing-Simmons wrote:

 The tricky bit i.e. the _design_ - is to separate the op-ness from the
 var-ness. I assume that there is something akin to hv_fetch_ent() which
 takes a flag to say - by the way this is going to be stored ...

I'm not entirely clear on what you mean here - is it something like
this, where $a is shared and $b is unshared?

   $a = $a + $b;

because there is a potential race condition between the initial fetch of
say $a and the assignment to it?  

My response to this is simple - tough.  

That is mine too - I was trying to deduce why you thought op tree had to change.

I can make a weak case for 

   $a += $b;

Expanding to 

   a->vtable[STORE](DONE => 1) = a->vtable[FETCH](LVALUE => 1) + 
 b->vtable[FETCH](LVALUE => 0);
   
but that can still break easily if b turns out to be tied to something 
that also dorks with a.

-- 
Nick Ing-Simmons




Re: RFC 130 (v4) Transaction-enabled variables for Perl6

2000-09-08 Thread Nick Ing-Simmons

Bart Lateur [EMAIL PROTECTED] writes:
On Wed, 06 Sep 2000 11:23:37 -0400, Dan Sugalski wrote:

Here's some high-level emulation of what it should do.

 eval {
 my($_a, $_b, $_c) = ($a, $b, $c);
 ...
 ($a, $b, $c) = ($_a, $_b, $_c);
 }

Nope. That doesn't get you consistency. What you need is to make a local 
alias of $a and friends and use that.

My example should have been clearer. I actually intended that $_a would
be a variable of the same name as $a. It's a bit hard to write currently
valid code that way. Second attempt:

   eval {
   ($a, $b, $c) = do {
   local($a, $b, $c) = ($a, $b, $c); #or my(...)
   ... # code which may fail
   ($a, $b, $c);
   };
   };

So the final assignment of the local values to the outer scoped
variables will happen, and in one go, only if the whole block has been
executed successfully.

So what is wrong with (if you mean that) saying:

 eval {
 my($_a, $_b, $_c) = ($a, $b, $c);
 ...
 lock $abc_guard;
 ($a, $b, $c) = ($_a, $_b, $_c);
 }

Then no one has to guess what is going on?

But what do you do if $b (say) is tied so that assign to it needs
a $abc_guard lock in another thread for assign to complete?
i.e. things get hairy in the "final assignment".


I would simply block ALL other threads while the final group assignment
is going on. This should finish typically in a few milliseconds.

So we "only" stall the other CPUs for a few million instructions each ;-)


It also means that if we're including *any* sort of external pieces (even 
files) in the transaction scheme we need to have some mechanism to roll 
back changes. If a transaction fails after truncating a 12G file and 
writing out 3G of data, what do we do?

That does not belong in the kernel of a language. All that you may
expect, is transactions on simple variables; plus maybe some hooks to
attach external transaction code (transactions on files etc) to it. A
simple "create a new file, and rename to the old filename when done"
will usually do.

I am concerned that this is making "simple things easyish, BUT hard things 
impossible". i.e. we have a scheme which will be hard to explain, 
will only cover a few fairly uninteresting cases, and get in the 
way of doing it "properly".


-- 
Nick Ing-Simmons




Re: RFC 178 (v2) Lightweight Threads

2000-09-08 Thread Nick Ing-Simmons

Chaim Frenkel [EMAIL PROTECTED] writes:

What tied scalar? All you can contain in an aggregate is a reference
to a tied scalar. The bucket in the aggregate is a regular bucket. No?

A tied scalar is still a scalar and can be stored in an aggregate.

Well if you want to place that restriction on perl6 so be it but in perl5
I can say 

tie $a[4],'Something';

Indeed that is exactly how tied arrays work - they (automatically) add 
'p' magic (internal tie) to their elements.

Tk apps do this all the time :

 $parent->Label(-textvariable => \$somehash{'Foo'});

The reference is just to get the actual element rather than a copy.
Tk then ties the actual element so it can see STORE ops and update the 
label.

-- 
Nick Ing-Simmons




Re: RFC 130 (v4) Transaction-enabled variables for Perl6

2000-09-07 Thread Nick Ing-Simmons

Dlux [EMAIL PROTECTED] writes:
| I've  deemed  to be  "too  complex".)  (Also  note  that I'm  not  a
| database
| guru, so  please bear with  me, and don't ask  me to write  the code
| :-)

Implementing threads  must be  done in  a very clever  way. It  may be
put in  a shared library (mutex  handling code, locking, etc.),  but I
think there  are more clever  guys out  there who are  more competent
in this, and I think it is covered with some other RFCs...

If amazingly clever threads handling is a requirement of this RFC 
then it is probably doomed. Multi-processing needs detailed explicit 
specifications to be done right - not vague requests.



I also  don't like the overhead,  that's why I made  the "simple" mode
default (look  at the "use  transaction" pragma again...).  This means
NO  overhead,  

Not none, perhaps minimal ;-) - it has at least got to be looking 
at something pragma can set.

no  locking  between  threads:  this  can  be  used  in
single-thread  or multi-process  environment. Other  modes CAN  switch
on locking functions,  but this is not default! If  you implement that
intelligently (separated .so  for the thread handling),  then it means
minimal overhead (some more callback call, and that's all).

I would need to understand just where the thread hooks need to go.
So far my non-detailed reading suggests that the hooks are pretty 
fundamental.

-- 
Nick Ing-Simmons [EMAIL PROTECTED]
Via, but not speaking for: Texas Instruments Ltd.




Re: RFC 178 (v2) Lightweight Threads

2000-09-07 Thread Nick Ing-Simmons

Chaim Frenkel [EMAIL PROTECTED] writes:
 "JH" == Jarkko Hietaniemi [EMAIL PROTECTED] writes:

JH Multithreaded programming is hard and for a given program the only
JH person truly knowing how to keep the data consistent and threads not
JH strangling each other is the programmer.  Perl shouldn't try to be too
JH helpful and get in the way.  Just give user the bare minimum, the
JH basic synchronization primitives, and plenty of advice.

The problem I have with this plan, is reconciling the fact that a
database update does all of this and more. And how to do it is a known
problem, its been developed over and over again.

Yes - by the PROGRAMMER that does the database access code - that is far higher
level than typical perl code. 

If all your data lives in database and you are prepared to lock database
while you get/set them. 

Sure we can apply that logic to making statements coherent in perl:

while (1)
 {
  lock PERL_LOCK; 
  do_state_ment
  unlock PERL_LOCK;
 }

So ONLY 1 thread is ever _in_ perl at a time - easy!
But now _by constraint_ a threaded perl program can NEVER be a performance
win. 

The reason this isn't a pain for databases is they have other things
to do while they wait ...

-- 
Nick Ing-Simmons [EMAIL PROTECTED]
Via, but not speaking for: Texas Instruments Ltd.




Re: RFC 178 (v2) Lightweight Threads

2000-09-07 Thread Nick Ing-Simmons

Chaim Frenkel [EMAIL PROTECTED] writes:

Some series of points (I can't remember what they are called in C)

Sequence points.

where operations are considered to have completed will have to be
defined, between these points operations will have to be atomic.

No, quite the reverse - absolutely no promises are made as to state of
anything between sequence points - BUT - the state at the sequence 
points is _AS IF_ the operations between them had executed in sequence.

So it is not that the sub-operations _inside_ these points are atomic, but
rather that the sequence of operations as a whole is atomic.

The problem with big "atoms" is that if CPU A is doing a 
complex atomic operation, then CPU B has to stop working on perl and go 
find something else to do till it finishes.


chaim
-- 
Nick Ing-Simmons [EMAIL PROTECTED]
Via, but not speaking for: Texas Instruments Ltd.




Re: Profiling

2000-09-05 Thread Nick Ing-Simmons

[EMAIL PROTECTED] writes:
 
 Anyone surprised by the top few entries:

Nope. It looks close to what I saw when I profiled perl 5.004 and 5.005
running over innlog.pl and cleanfeed. The only difference is the method
stuff, since neither of those were OO apps. The current Perl seems to
spend most of its time in the op dispatch loop and in dealing with
internal data structures.

What initially surprised me is why the op-despatch loop spends so long in 'self' code
when there is so little of it. My assumption is this is where we see 
the "cache miss" time. 

-- 
Nick Ing-Simmons [EMAIL PROTECTED]
Via, but not speaking for: Texas Instruments Ltd.




Re: A tentative list of vtable functions

2000-09-01 Thread Nick Ing-Simmons

Dan Sugalski [EMAIL PROTECTED] writes:
is_equal (true if this thing is equal to the parameter thing)
is_same (True if this thing is the same thing as the parameter thing)

is_equal in what sense? (String, Number, ...)

and how is is_same different from just comparing addresses of the things?

-- 
Nick Ing-Simmons




Re: RFC 146 (v1) Remove socket functions from core

2000-08-30 Thread Nick Ing-Simmons

David L . Nicol [EMAIL PROTECTED] writes:
Nick Ing-Simmons wrote:

 We need to distinguish "module", "overlay", "loadable", ... if we are
 going to get into this type of discussion. Here is my 2¢:
 
 Module   - separately distributable Perl and/or C code.  (e.g. Tk800.022.tar.gz)
 Loadable - OS loadable binary e.g. Tk.so or Tk.dll
 Overlay  - Tightly coupled ancillary loadable which is no use without
its "base"  - e.g. Tk/Canvas.so which can only be used
when a particular Tk.so has already been loaded.

I know I've got helium Karma around here these days but I don't like
"overlay" it is reminiscent of old IBM machines swapping parts of the
program out because there isn't enough core.  

Which is exactly why I chose it - the places these things make sense are 
on little machines where memory is at a premium. 

Linux modules have
dependencies on each other and sometimes you have to load the more basic
ones first or else get symbol-undefined errors.  So why not follow
that lead and call Overlays "dependent modules."

A. Name is too long.
B. That does not have same "feel" as what we have.


If a dependent module knows what it depends on, that module can be
loaded on demand for the dependent one.

But - like old-style overlays our add-ons are going to be loaded on need
by the parent and only depend on the parent.

e.g. perl discovers it needs getpwuid() so it loads the thing
that has those functions.

We are not going to be in the middle of getpwuid() and decide we need perl...


-- 
Nick Ing-Simmons [EMAIL PROTECTED]
Via, but not speaking for: Texas Instruments Ltd.




The evils of #define ...

2000-08-29 Thread Nick Ing-Simmons

Jarkko Hietaniemi [EMAIL PROTECTED] writes:
On Tue, Aug 29, 2000 at 01:46:17AM -, [EMAIL PROTECTED] wrote:
 
 This is a build failure report for perl from [EMAIL PROTECTED],
 generated with the help of perlbug 1.32 running under perl v5.7.0.

Now I tracked this one down (change #6891).  The hunt mainly consisted
of debugging the following charming line :-)

SV *perinterp_sv = * Perl_hv_fetch(((PerlInterpreter 
*)pthread_getspecific((*Perl_Gthr_key_ptr(((void *)0) )) ) )  ,   
(*Perl_Imodglobal_ptr(((PerlInterpreter 
*)pthread_getspecific((*Perl_Gthr_key_ptr(((void *)0) )) ) )   ))  ,
"Storable(" "0.703"  ")"  ,  sizeof("Storable(" "0.703"  ")" )-1 ,  (1)  )  ;
stcxt_t *cxt  = ( stcxt_t * )(perinterp_sv  ((  perinterp_sv  )-sv_flags   
0x0001 )? (  stcxt_t *  )(unsigned long )(  ((XPVIV*)  (  perinterp_sv  
)-sv_any )-xiv_iv  )  : ((void *)0) )  ;  (  cxt  = (  stcxt_t 
*)Perl_safesysmalloc  ((size_t  )((  1 )*sizeof(  stcxt_t , (__extension__ 
(__builtin_constant_p (   (  1 )*sizeof(  stcxt_t )  )  (   (  1 )*sizeof(  stcxt_t 
)  ) = 16? ((   (  1 )*sizeof(  stcxt_t )  ) == 1? ({ void *__s = (   
(char*)(  cxt )   );   *((__uint8_t *) __s) = (__uint8_t)0  ; __s; })  : 
({ void *__s = (   (char*)(  cxt )   );   union { unsigned int __ui;  
unsigned short int __usi;   unsigned char __uc; } *__u = __s;   __uint8_t __c = 
(__uint8_t) (   0  );  switch ((unsigned int) ( (  1 )*sizeof(  stcxt_t ) 
  )) {   case 15:__u-__ui = __c * 0x01010101;   __u = __extension__ 
(void *)((char *) __u + 4); case 11:__u-__ui = __c * 0x01010101;   __u = 
__extension__ (void *)((char *) __u + 4); case 7: __u-__ui = __c * 0x01010101;   __u 
= __extension__ (void *)((char *) __u + 4); case 3: __u-__usi = (unsigned short int) 
__c * 0x0101; __u = __extension__ (void *)((char *) __u + 2); __u-__uc = (unsigned 
char) __c;break;  case 14:__u-__ui = __c * 0x01010101;   __u = 
__extension__ (void *)((char *) __u + 4); case 10:__u-__ui = __c * 
0x01010101;   __u = __extension__ (void *)((char *) __u + 4); case 6: __u-__ui = __c 
* 0x01010101;   __u = __extension__ (void *)((char *) __u + 4); case 2: __u-__usi = 
(unsigned short int) __c * 0x0101; break;  case 13:__u-__ui = __c * 
0x01010101;   __u = __extension__ (void *)((char *) __u + 4); case 9: __u-__ui = __c 
* 0x01010101;   __u = __extension__ (void *)((char *) __u + 4); 
case 5: __u-__ui = __c * 0x01010101;   __u = __extension__ (void *)((char *) __u + 
4); case 1: __u-__uc = (unsigned char) __c;break;  case 16:__u-__ui 
= __c * 0x01010101;   __u = __extension__ (void *)((char *) __u + 4); case 12:
__u-__ui = __c * 0x01010101;   __u = __extension__ (void *)((char *) __u + 4); case 
8: __u-__ui = __c * 0x01010101;   __u = __extension__ (void *)((char *) __u + 4); 
case 4: __u-__ui = __c * 0x01010101;   case 0: break;  }   __s; }) )   : 
(__builtin_constant_p ( 0 )  ( 0 ) == '\0'  ? ({ void *__s = (  (char*)(  cxt )  ); 
__builtin_memset ( __s , '\0',  (  1 )*sizeof(  stcxt_t )   ) ; __s; }) : 
memset (  (char*)(  cxt )  ,  0 ,(  1 )*sizeof(  stcxt_t )     ;  
Perl_sv_setiv(((PerlInterpreter *)pthread_getspecific((*Perl_Gthr_key_ptr(((void *)0) 
)) ) )  ,   perinterp_sv ,  ( IV )(unsigned long )(  cxt  )   )  ;





-- 
Nick Ing-Simmons




Re: RFC 155 - Remove geometric functions from core

2000-08-29 Thread Nick Ing-Simmons

David L . Nicol [EMAIL PROTECTED] writes:

does sysV shm not support the equivalent security as the file system?

mmap() has the file system.


Did I not just describe how a .so or a DLL works currently?

And behind the scenes that does something akin to:

int fd = open("file_of_posn_indepenant_byte_code",O_RDONLY);
struct stat st;
fstat(fd,&st);
code_t *code = mmap(NULL,st.st_size,PROT_READ,MAP_SHARED,fd,0);
close(fd);

strace (linux) or truss (solaris) will show you what I mean.

And then trusts to OS to honour MAP_SHARED.  (mmap() is POSIX.)

Win32 has "something similar" but I don't remember the function names off
hand.

Or you can embed your bytecode in 

const char script[] = {...};

and link/dlopen() it and then you have classical shared text.



-- 
Nick Ing-Simmons




Re: RFC 155 - Remove geometric functions from core

2000-08-29 Thread Nick Ing-Simmons

Sam Tregar [EMAIL PROTECTED] writes:
On Tue, 29 Aug 2000, Nick Ing-Simmons wrote:
 David L . Nicol [EMAIL PROTECTED] writes:
 
 does sysV shm not support the equivalent security as the file system?
 
 mmap() has the file system.

I wasn't aware that mmap() was part of SysV shared memory. 

It is NOT. It is another (POSIX) way of getting shared memory between 
processes. Even without MAP_SHARED OS will share un-modified pages 
between processes.

It happens to be the way modern UNIX implements "shared .text".
i.e. the ".text" part of the object file is mmap()'ed  into 
each process.

My
mistake?  It's not on the SysV IPC man pages on my Linux system.  The mmap
manpage doesn't mention SysV IPC either.

SysV IPC is a mess IMHO. 

My point was that if the "file system" is considered
sufficient then mmap()ing file system objects will get you "shared code"
or "shared data" without any tedious reinventing of wheels.

-- 
Nick Ing-Simmons




Re: RFC 161 (v2) OO Integration/Migration Path

2000-08-28 Thread Nick Ing-Simmons

Nathan Torkington [EMAIL PROTECTED] writes:
Dan Sugalski writes:
 If the vtable stuff goes into the core perl engine (and it probably will,
 barring performance issues), then what could happen in the

I have a lot of questions.  Please point me to the appropriate place
if they are answered elsewhere.

vtables are tables of C functions?  

I am using them as tables of machine-code functions (compiled from C 
being the obvious but not the only way to create those).

Perl functions?  

Not directly. But given a "C" API it is normally easy enough to 
wrap the perl function (e.g. the FETCH/GET tie methods layered under
"magic" in perl5).

Either?  How
would you use them to handle overloading of operators?  One function
in the vtable for every operation?  

If the table is with the data yes, else if table is with the code 
one function for every type.
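
A minimal sketch of the "table is with the data" case, assuming each value carries a pointer to a table of C functions. The slot names (get_integer, get_string) and the two example types are hypothetical and only illustrate the layout: one function per operation, one table per type.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct thing;

/* One slot per operation; a real table would have many more. */
typedef struct {
    long (*get_integer)(const struct thing *self);
    void (*get_string)(const struct thing *self, char *buf, size_t len);
} vtable;

/* The table pointer travels with the data. */
typedef struct thing {
    const vtable *vt;
    union { long i; const char *s; } u;
} thing;

/* Integer type: native integer, formatted on demand. */
static long int_get_integer(const thing *self) { return self->u.i; }
static void int_get_string(const thing *self, char *buf, size_t len)
{
    snprintf(buf, len, "%ld", self->u.i);
}

/* String type: native string, parsed on demand. */
static long str_get_integer(const thing *self)
{
    return strtol(self->u.s, NULL, 10);
}
static void str_get_string(const thing *self, char *buf, size_t len)
{
    assert(len > 0);
    strncpy(buf, self->u.s, len - 1);
    buf[len - 1] = '\0';
}

static const vtable int_vtable = { int_get_integer, int_get_string };
static const vtable str_vtable = { str_get_integer, str_get_string };

static thing make_int(long i)        { thing t; t.vt = &int_vtable; t.u.i = i; return t; }
static thing make_str(const char *s) { thing t; t.vt = &str_vtable; t.u.s = s; return t; }
```

A caller writes x.vt->get_integer(&x) without knowing which type x is. A user-defined implementation of an existing operation is just another table, but a brand-new operation means adding a slot to every table.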

How does that extend to
user-defined operators?

Badly. But it makes user-defined implementations of existing operators easy.



Nat
-- 
Nick Ing-Simmons




RE: RFC 146 (v1) Remove socket functions from core

2000-08-28 Thread Nick Ing-Simmons

Fisher Mark [EMAIL PROTECTED] writes:
Leaping to conclusions based on no tests at all is even worse...

Will anyone bite the bullet and write the "Internals Decisions should
be based on actual tests on multiple platforms" RFC ?

BTW, I have access to Rational Software's Quantify (and PureCoverage and
Purify) on WinNT and HP-UX 10.20 which I'd be glad to use for such tests.

If you want to get "in the mood" it would be good to fire it up on 
(say) perl5.6.0 and see where the hot-spots are.


===
Mark Leighton Fisher          [EMAIL PROTECTED]
Thomson Consumer Electronics  Indianapolis IN
"Display some adaptability." -- Doug Shaftoe, _Cryptonomicon_
-- 
Nick Ing-Simmons




Re: RFC 155 - Remove geometric functions from core

2000-08-27 Thread Nick Ing-Simmons

Jarkko Hietaniemi [EMAIL PROTECTED] writes:
                                                               bytes

microperl, which has almost nothing os dependent (*) in it   1212416
shared libperl 1277952 bytes + perl 32768 bytes              1310720
dynamically linked perl                                      1376256
statically linked perl with all the core extensions          2129920

  (*) I haven't tried building it in non-UNIX boxes, so I can't be certain
  of how fastidiously features have been disabled.

"bytes" of what? - size of executable, size of .text, ???
If we are talking executable with -g size then a lot of that is symbol-table
and is tedious repetition of "sv.h" & co. reiterated in each .o file.

But the basic point is that these things are small.


So ripping all this 'cruft' would save us about 100-160 kB, still
leaving us with well over a 1MB-plus executable.  It's Perl itself
that's big, not the thin glue to the system functions.

My support for the idea is not to reduce the size of perl in the UNIX
case, but to allow replacement. I would also like to have the mechanism
worked out and "proven" on something that we know gets used so 
that we can have good solid testing of the mechanism. Then something 
less obvious (say Damian's any/all operators) which might be major
extra size and not of universal appeal can use a well tried mechanism,
and we can flip default to re-link sockets or sin/cos/tan into the core.

-- 
Nick Ing-Simmons




RE: RFC 146 (v1) Remove socket functions from core

2000-08-27 Thread Nick Ing-Simmons

Al Lipscomb [EMAIL PROTECTED] writes:
I wonder if you could arrange things so that you could have statically
linked and dynamic linked executable. Kind of like what they do with the
Linux kernel. When your installation is configured in such a way as to make
the dynamic linking a problem, just compile a version that has (almost)
everything bolted in. Otherwise compile the features as modules.

If we make it possible to move socket or math functions out of executable
into "overlays" then there will always be an option NOT to do that
and build one executable - (and that will probably be the default!).

We need to distinguish "module", "overlay", "loadable", ... if we are 
going to get into this type of discussion. Here is my 2¢:

Module   - separately distributable Perl and/or C code.  (e.g. Tk800.022.tar.gz)
Loadable - OS loadable binary e.g. Tk.so or Tk.dll
Overlay  - Tightly coupled ancillary loadable which is no use without
   its "base"  - e.g. Tk/Canvas.so which can only be used 
   when a particular Tk.so has already been loaded.  

Tk has these "overlays" - I think DBI has something similar. perl5 itself
does not as such (although POSIX.so is close).

_I_ would like to see RFC 146 mutate into or be replaced by an RFC 
which said perl should have a mechanism to allow parts of functionality
to be split out into separate binary (sharable) files.


-- 
Nick Ing-Simmons




Re: RFC 146 (v1) Remove socket functions from core

2000-08-27 Thread Nick Ing-Simmons

Michael G Schwern [EMAIL PROTECTED] writes:
Like all other optimizing attempts, the first step is analysis.
People have to sit down and systematically go through and find out
what parts of perl (and Perl) are eating up space and speed.  The
results will be very surprising, I'm sure, but it will give us a
concrete idea of what we can do to really help out perl's performance.

There should probably be an RFC to this effect, and I'm just visiting
here in perl6-language so I dump it on somebody else.

Alan Burlison [EMAIL PROTECTED] writes:

Drawing conclusions based on a single test can be
misleading.

Leaping to conclusions based on no tests at all is even worse...

Will anyone bite the bullet and write the "Internals Decisions should
be based on actual tests on multiple platforms" RFC ?

-- 
Nick Ing-Simmons




Re: RFC 155 (v1) Remove geometric functions from core

2000-08-25 Thread Nick Ing-Simmons

Chaim Frenkel [EMAIL PROTECTED] writes:
I don't think that you should require a use. That is too violent a
change. Moving things that were in the core of Perl5 out should be
invisible to the user.

I strenuously object to having to add use, for every stupid module.

Don't worry - so do Dan and I at least.


Anything that is part of the shipped perl should not need a use.

That is the "definition" of the "shipped perl" ;-)

Of course we need a new name (not perl or Perl) for the "bundle of 
a perl and some handy Modules" which will be perl-6.0.0.tar.gz

The entire set of constants and namespace should be immediately
available.

The only possible use for a use for core functions would be to pass
options or perhaps to select a non-default version.

Yes - 

use math 'vector-processor';

use socket 'IPv7';

use getpw  qw(paranoid); 




Modules that are from CPAN or local should be able to be promoted to
autoloadable by some simple mechanism.

Once we have a fast easy to use way of loading adjuncts for 
math/socket/gpw*/Registry/... and we can "byte compile" or whatever 
a module there should be no reason why not.

-- 
Nick Ing-Simmons




Re: RFC 146 (v1) Remove socket functions from core

2000-08-25 Thread Nick Ing-Simmons

Bart Lateur [EMAIL PROTECTED] writes:
On Fri, 25 Aug 2000 12:19:24 -0400, Dan Sugalski wrote:

Code you don't call won't eat up any cache space, nor crowd 
out some other code. And if you do call it, well, it ought to be in the cache.

Probably a stupid question... But can't you group the code for the most
often used constructs? 

We can - and we will once we know what the "often used constructs" will be
in perl6.

Larry started well with pp_hot.c in perl5 - but over the years
the bells and whistles have got tacked on where it seemed easiest and 
now perl5 needs a re-write to clean it up - perl6 will be that thing
but while perl5 runs a language called Perl5, perl6 (being defined here)
will run a language called Perl6 - being defined on [EMAIL PROTECTED]

So that, if one of those things is loaded in the
cache, the others are in there with it?

That is the first approximation to what happens - but it is a start...


If all the less needed stuff is more at the back of the executable, it
wouldn't even have to be loaded, most of the time.

Besides, I'm more worried about unnecessarily loading 600k from disk,
than from main memory to cache. For short-lived scripts, this loading
overhead could be quite significant.

Most modern (and sane) OSes will keep "useful" pages in memory till 
they need them for something else. This would be _the_ win for 
true byte-compiled (not modified at runtime) scripts/modules - those
pages would not be re-loaded either.

-- 
Nick Ing-Simmons




Re: RFC 127 (v1) Sane resolution to large function returns

2000-08-24 Thread Nick Ing-Simmons

Dan Sugalski [EMAIL PROTECTED] writes:
At 02:25 PM 8/24/00 -0400, Chaim Frenkel wrote:
But

($foo, $baz, @bar) = (1,(2,3),4) # $foo = 1 $baz=2, @bar=(3,4)

Actually, looking at it like that makes it an ugly situation. The 'new'
expectation would be to have it become
 # $foo=1 $baz=2 @bar=(4)

Wouldn't that be $baz = 3, since the middle list would be taken in scalar 
context?

Which has sanely become the length of the list rather than last element.

-- 
Nick Ing-Simmons




Re: Vtable speed worry

2000-08-20 Thread Nick Ing-Simmons

David L . Nicol [EMAIL PROTECTED] writes:
No, because each table lookup takes less time than comparing one
letter of a text string.

Er, I don't think so. 

A lookup takes several cycles on a RISC machine 
due to memory latency even to the cache. A pipelined string compare takes
less than a cycle per char.

Also what has comparing got to do with SvPVX ?


 sv->vtable->svpvx;
 
 Isn't this going to really, really hurt?
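
To make the cost being argued about concrete, here is a minimal sketch of the two access paths. The struct layout and every name in it (struct sv_vtable, direct_pvx, vtable_pvx) are made up for illustration and are not taken from any perl source. The direct path is one dependent load; the vtable path loads the table pointer, then the function pointer, makes an indirect call, and only then loads the buffer pointer.

```c
#include <assert.h>
#include <stddef.h>

struct sv;                              /* hypothetical value handle */

struct sv_vtable {
    char *(*svpvx)(struct sv *);        /* fetch the string buffer */
};

struct sv {
    const struct sv_vtable *vtable;     /* first load: the table pointer */
    char *pv;                           /* the string buffer itself */
};

/* Direct access: a single dependent load (sv -> pv). */
char *direct_pvx(struct sv *sv)
{
    assert(sv != NULL);
    return sv->pv;
}

/* The accessor a plain string scalar would install in its vtable. */
char *plain_svpvx(struct sv *sv)
{
    return sv->pv;
}

const struct sv_vtable plain_vtable = { plain_svpvx };

/* Vtable access: sv -> vtable, vtable -> svpvx, indirect call, sv -> pv. */
char *vtable_pvx(struct sv *sv)
{
    assert(sv != NULL && sv->vtable != NULL);
    return sv->vtable->svpvx(sv);
}
```

Whether the extra loads matter in practice depends on whether the vtable is already in cache, which is exactly the kind of thing that needs measuring rather than guessing.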
-- 
Nick Ing-Simmons




Re: Design by Contract for perl internals

2000-08-17 Thread Nick Ing-Simmons

Michael G Schwern [EMAIL PROTECTED] writes:

I wouldn't mind an optional OO contract system in the core of Perl,
but this may be a case of "why do it in core when a module will work?"

I _think_ the proposal was to have design-by-contract in the perl core
in the sense that contract is checked when one part of the core 
calls another.  e.g. that when someone does

char *s = SvPV(sv);

Something checks that 'sv' is a SV * and not an HV * or whatever.
Obviously this is a perl-core-compile-time option and would NOT be 
on for production perl as it is SLOW.

Sort of automatically and liberally inserted assert() statements.
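
A rough sketch of what such a compile-time-optional check might look like. The type tags, the CONTRACT_CHECKS switch and thing_pv() are all hypothetical names for this example; real perl5 keeps the type inside sv_flags and SvPV is a macro.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type tags, standing in for SV / AV / HV. */
enum thing_type { THING_SV, THING_AV, THING_HV };

struct thing {
    enum thing_type type;
    char *pv;               /* string buffer, meaningful for THING_SV only */
};

/* With CONTRACT_CHECKS defined, verify the argument really is a scalar.
 * Without it the check compiles to nothing, so production builds pay zero. */
#ifdef CONTRACT_CHECKS
#define REQUIRE_SCALAR(t) assert((t) != NULL && (t)->type == THING_SV)
#else
#define REQUIRE_SCALAR(t) ((void)0)
#endif

/* Stand-in for SvPV: return the string buffer of a scalar. */
char *thing_pv(struct thing *t)
{
    REQUIRE_SCALAR(t);
    return t->pv;
}
```

Building the core with -DCONTRACT_CHECKS would turn every such call into a checked one; passing an array or hash where a scalar is expected would then abort at the call site instead of corrupting something later.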

-- 
Nick Ing-Simmons




Re: Threaded In-Line Code (was Re: Typed Intermediate Language)

2000-08-16 Thread Nick Ing-Simmons

Chaim Frenkel [EMAIL PROTECTED] writes:
 "DS" == Dan Sugalski [EMAIL PROTECTED] writes:

DS I was actually thinking that @b * @c would boil down to a single vtable 
DS call--we'd just hit the multiply function for variable @b, and pass it a 
DS pointer to @c, and let it Do The Right Thing.

But that was my question in the _other_ thread. How?

Given N different fundamental types, we end up with NxN vtbl entries.

Which is not necessarily a problem if N is small (a 4x4 vtable is easy).
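
For N = 4 the whole table fits on a screen. The sketch below assumes four invented types and treats anything involving a string or reference as unsupported, purely to keep it short; none of these names come from a real design.

```c
#include <assert.h>

/* Hypothetical fundamental types; T_COUNT is N = 4. */
enum num_type { T_INT, T_NUM, T_STR, T_REF, T_COUNT };

struct value {
    enum num_type type;
    double n;               /* numeric payload, kept simple for the sketch */
};

typedef double (*mul_fn)(const struct value *, const struct value *);

double mul_numeric(const struct value *a, const struct value *b)
{
    return a->n * b->n;
}

double mul_unsupported(const struct value *a, const struct value *b)
{
    (void)a;
    (void)b;
    return 0.0;             /* a real implementation would raise an error */
}

/* One entry per (left type, right type) pair: N x N = 16 slots. */
const mul_fn mul_table[T_COUNT][T_COUNT] = {
    /*            INT              NUM              STR              REF */
    /* INT */ { mul_numeric,     mul_numeric,     mul_unsupported, mul_unsupported },
    /* NUM */ { mul_numeric,     mul_numeric,     mul_unsupported, mul_unsupported },
    /* STR */ { mul_unsupported, mul_unsupported, mul_unsupported, mul_unsupported },
    /* REF */ { mul_unsupported, mul_unsupported, mul_unsupported, mul_unsupported },
};

/* Dispatch on both operand types with a single two-dimensional lookup. */
double multiply(const struct value *a, const struct value *b)
{
    assert(a->type < T_COUNT && b->type < T_COUNT);
    return mul_table[a->type][b->type](a, b);
}
```

The table grows as N squared per operator, so this only stays pleasant while the set of fundamental types is small.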

-- 
Nick Ing-Simmons [EMAIL PROTECTED]
Via, but not speaking for: Texas Instruments Ltd.




Re: Internal Filename Representations (was Re: Summary of I/O related RFCs)

2000-08-16 Thread Nick Ing-Simmons

Jarkko Hietaniemi [EMAIL PROTECTED] writes:
On Fri, Aug 11, 2000 at 02:16:31AM -0700, Nathan Wiger wrote:
 [cc'ed on internals as FYI]
 
  =item 36 (v1): Structured Internal Representation of Filenames
 
 I think this should be discussed a good amount. I think URIs are cool,
 but too much trouble for simple stuff. I don't want to have to write
 "file:///etc/motd" everytime I want to address a file. Too cumbersome.

URI's have thought of that already - you can have a "relative URI" with 
a parent specified. 

We would have a default parent of file://localhost/$PWD


The (vague) idea wasn't that "everything shall be an URI".  It was
the other way round: "the representation should be generic enough so
that also URIs could be handled".  In other words: things like
the protocol, the port number, the username, the password, could
be part of a "file spec".

Quite.


-- 
Nick Ing-Simmons




Re: Internal Filename Representations (was Re: Summary of I/O related RFCs)

2000-08-16 Thread Nick Ing-Simmons

Johan Vromans [EMAIL PROTECTED] writes:
Nathan Wiger [EMAIL PROTECTED] writes:

$fo = open "C:\Windows\System\IOSUBSYS\RMM.PDR";
$fo->pathdrive = "C:" ;

I think the drive is "C", not "C:".

The reason for including the ':' is so that the rule for reconstructing 
the path is easy and we don't need another slot for 'drive separator'.
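
As an illustration of why keeping the ':' with the drive simplifies things, here is a sketch of a joiner that needs to know only one separator. join_path() and its parameters are invented for this example and say nothing about what the real interface would be.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Rebuild a path from a drive prefix and its components.
 * The drive string carries its own ':' (or is empty), so the only
 * separator the joiner has to know is the one between components.
 * Returns 0 on success, -1 if the buffer is too small. */
int join_path(char *out, size_t outsz, const char *drive,
              const char *const *parts, size_t nparts, char sep)
{
    size_t used, i;

    assert(out != NULL && drive != NULL);
    used = strlen(drive);
    if (used >= outsz)
        return -1;
    memcpy(out, drive, used);

    for (i = 0; i < nparts; i++) {
        size_t len = strlen(parts[i]);
        if (used + 1 + len >= outsz)    /* separator + part + final NUL */
            return -1;
        out[used++] = sep;
        memcpy(out + used, parts[i], len);
        used += len;
    }
    out[used] = '\0';
    return 0;
}
```

With drive "C:" and separator '\\' this rebuilds the Windows path quoted above, and with an empty drive and '/' it rebuilds the Unix one. Network names, versions and ACLs are of course not covered by anything this simple.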


$fo->patharray = [ Windows, System, IOSUBSYS, RMM.PDR ];

I think the patharray is [ Windows, System, IOSUBSYS ].
The file name is RMM, the extension is PDR.

$fo = open "/etc/inet/inetd.conf";
$fo->pathdrive = ""; 

I think this should be the mount point, e.g., "/".

 Splitting apart or putting together either one of these paths is trivial

I think it's far from trivial, especially if you want to take into
account network names, file versions, protection attributes and ACLs, ...

-- Johan
-- 
Nick Ing-Simmons




Re: Typed Intermediate Language

2000-08-15 Thread Nick Ing-Simmons

David L . Nicol [EMAIL PROTECTED] writes:
Just in case I'm not the only one here who doesn't know what TIL means:

http://www.cs.cornell.edu/home/jgm/tilt.html

Well I have been using 'TIL' to mean "Threaded Interpretive Language"

There is a Z80 FORTH clone defined in :

 "Threaded Interpretive Languages"
 R. G. Loeliger.
 Byte Books / McGraw-Hill 1981
 ISBN 0-07-038360-X
 

-- 
Nick Ing-Simmons [EMAIL PROTECTED]
Via, but not speaking for: Texas Instruments Ltd.




Re: RFC 35 (v1) A proposed internal base format for perl

2000-08-14 Thread Nick Ing-Simmons

Larry Wall [EMAIL PROTECTED] writes:
Nick Ing-Simmons writes:
: It's not clear to me whether the intrinsic types should have a different
: solution to this than the extrinsic types.
: 
: _This_ thread is about using vtables for intrinsic types. If we cannot 
: make them work there then the proposed innermost SV * replacement is flawed.

Sure, but we may have to warp our ideas of what a vtable is to encompass
the notion of a vtable that is the cross-product of two vtables.

That wouldn't be a 'vector' table but a 'matrix' table ! only 1/2 ;-)



Larry
-- 
Nick Ing-Simmons




Re: vector and matrix calculations in core? (was: Re: Ramblings on base class for SV etc.)

2000-08-13 Thread Nick Ing-Simmons

Bart Lateur [EMAIL PROTECTED] writes:
On Wed, 09 Aug 2000 12:46:32 -0400, Dan Sugalski wrote:

@foo = @bar * @baz;

Given that the default action of the multiply routine for an array in 
non-scalar context would be to die, allowing user-overrides of the 
functions would probably be a good idea... :)

[Is this still -internals? Or should we stop CC'ing?]

One problem: overloading requires objects, or at least one. Objects are
(currently) scalars. You can't make an array into an object.

We are thinking of adding "objects" in the implementation of perl.
i.e. perl's primitive "things" (scalars, arrays, hashes) will have 'vtables'
(table of functions that do the work). So in that sense an array as in @foo
can be an "object" at some level of meaning while not being an "object" 
at the perl level.

-- 
Nick Ing-Simmons




Re: Method call optimization.

2000-08-12 Thread Nick Ing-Simmons

David L . Nicol [EMAIL PROTECTED] writes:

One assumes that if you redefine (@ISA) perl5 throws away this cache?

Not all at once. It increments a "generation number".
When perl finds it is about to use a cached method it checks to see
if the value post-dates the current generation number, or does 
the re-lookup.
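
The idea can be sketched as below. This is not perl5's actual code: the struct, the counters and the function names are all invented for the example, and the slow lookup is a stub.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*method_fn)(void);

/* Bumped whenever @ISA changes or a sub is (re)defined. */
unsigned long method_generation = 1;

struct method_cache_entry {
    method_fn fn;               /* cached result of the last full lookup */
    unsigned long generation;   /* value of method_generation at that time */
};

/* Counts full lookups, so the effect of the cache is observable. */
unsigned long slow_lookups = 0;

void example_method(void) { }

/* Stand-in for the full walk of the inheritance tree. */
method_fn slow_lookup(void)
{
    slow_lookups++;
    return example_method;
}

/* Use the cached method if it is still current, otherwise look it up again. */
method_fn find_method(struct method_cache_entry *e)
{
    assert(e != NULL);
    if (e->fn == NULL || e->generation != method_generation) {
        e->fn = slow_lookup();
        e->generation = method_generation;
    }
    return e->fn;
}

/* Called on @ISA assignment or sub definition. */
void invalidate_method_caches(void)
{
    method_generation++;
}
```

Nothing is thrown away eagerly: bumping the counter costs one increment, and each stale entry pays for its own re-lookup the next time it is used.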



If D isa C isa B and D looked up method f and found it in B's methods,
then C redefines itself as an A, does perl5 figure out to throw away
D->f ?

Sub definition increments the generation number too.


I mean, who redefines ISA at  run time?  

A. Almost all perl code prior to invention of 'use base' in perl5.00404.
   (Had to do @ISA = ... as part of run phase.)

B. Any module loaded after another is compiled sets _its_ ISA after
   the base class has been compiled.


And how did Ing-Simmons get on the reply-to-all CC list twice?

Posted from work and from home.


-- 
Nick Ing-Simmons




Re: Method call optimization.

2000-08-10 Thread Nick Ing-Simmons

Dan Sugalski [EMAIL PROTECTED] writes:
At 03:35 PM 8/9/00 -0700, Damien Neil wrote:
On Wed, Aug 09, 2000 at 03:32:41PM -0400, Chaim Frenkel wrote:

Each sub is assigned an index.  This index is unique for the package
the sub is in, and all ancestor packages.

Add all sibling packages of all the packages involved ;-)

If we are not careful we can end up making the compile NP complete.

We just had all the numbers nicely sorted and then someone reads in:

package Foo;
use base qw(Meth_is_1 Other_is_1);

sub Meth ...

sub Other ...

And now we have to recompute the whole tree so that Meth and Other don't
share the index.


The first runtime reassignment of @ISA shoots this one down hard. Sorry. 
(MI also makes it more difficult, since dependency trees will have to be 
built...)

Yes - this is why Malcolm dodged MI with 'fields' module.

-- 
Nick Ing-Simmons




Re: Language RFC Summary 4th August 2000

2000-08-09 Thread Nick Ing-Simmons

Dan Sugalski [EMAIL PROTECTED] writes:
At 11:40 AM 8/5/00 +, Nick Ing-Simmons wrote:
Damian Conway [EMAIL PROTECTED] writes:
 It definitely is, since formats do things that can't be done in 
 modules.
 
 Such as???

Quite.

Even in perl5 an XS module can do _anything at all_.

It can't access data the lexer's already tossed out. 

A source filter can, but not elegantly.

That's where the 
current format format (so to speak) runs you into trouble.

   Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
  teddy bears get drunk
-- 
Nick Ing-Simmons




Re: RFC 61 (v2) Interfaces for linking C objects into pe

2000-08-08 Thread Nick Ing-Simmons

Perl6 Rfc Librarian [EMAIL PROTECTED] writes:
This and other RFCs are available on the web at
  http://dev.perl.org/rfc/

=head1 TITLE

Interfaces for linking C objects into perlsubs

=head1 VERSION

  Maintainer: David Nicol [EMAIL PROTECTED]
  Date: 7 Aug 2000
  Version: 2
  Mailing List: [EMAIL PROTECTED]
  Number: 61

As this is all about what the interface looks like and has no details
of implementation it is not really appropriate (IMHO) for internals list
yet.



This document is not precisely concerned with the details of the implementation
of the interfaces it specifies, beyond a general attempt to restrict
itself to the possible.

But this list is _only_ concerned with implementation details.

-- 
Nick Ing-Simmons




Re: pramgas as compile-time-only

2000-08-08 Thread Nick Ing-Simmons

Chaim Frenkel [EMAIL PROTECTED] writes:
 "GB" == Graham Barr [EMAIL PROTECTED] writes:

 A different op would be a better performance win. Even those sections
 that didn't want the check has to pay for it.

GB That may not be completely true. You would in effect be increasing the
GB size of code for perl itself. Whether or not it would be a win would
GB depend on how many times the extra code caused a cache miss and a fetch
GB from main memory.

GB As Chip says, human intuition is a very bad benchmark.

Does the cache hit/miss depend on the nearness of the code 

To some extent.

or simply
on code path? 

That can have an effect too - not just caches but pre-fetch and branch 
prediction come into play here as well.

Obviously having the checked version be a wrapper of
the base op and near it on the same page would be a VM win.

Caches work well with small-ish linear-ish hotspots that keep being re-used.
When the access pattern does not follow that model things get (gradually) worse.

How gradual and how -ish depends on cache architecture which is 
fun, often proprietary and off-topic ;-) - I can write a quick "tutorial"
e-mail if there is general interest (and I must have a bibliography somewhere
at work).


-- 
Nick Ing-Simmons




Re: RFC 35 (v1) A proposed internal base format for perl

2000-08-06 Thread Nick Ing-Simmons

Ken Fox [EMAIL PROTECTED] writes:

When we document this, can we move the low level interfaces out of the
pod directory? It would be a shame to have people accidentally start using
the internal interfaces just because they're well documented. ;) 

If they are well documented then the risks they will be taking will be 
obvious.

(And if
we say something like "this is fast" people will ignore all the warnings.)

As one of the worst offenders I certainly will ;-)


- Ken
-- 
Nick Ing-Simmons




Re: Ramblings on base class for SV etc.

2000-08-06 Thread Nick Ing-Simmons

Ken Fox [EMAIL PROTECTED] writes:

This got me thinking about whether it's necessary to define exactly what
an SV struct is. The following seems over-specified:


Dan's struct that includes thread sync stuff is also over-specified.

I think the only thing we have to standardize on is the vtable interface
and the flags. This seems like a good thing, at least during early
experimentation with perl 6.


True. I think just the vtable and flags is the minimal "interface" - the
rest of the stuff is just data that access functions mess with (even the
thread sync stuff).

None the less - it makes sense to have a "straw man" of how the essential
types will be implemented.


We could wrap the basic operations on an SV with inlines so that the
abstraction won't kill performance. The entire low-level definition of SV
could be done in a header file. When building perl just pick what header
you want. This would have no effect on external modules since they all go
through the public interface.


BTW, SV isn't a good name for this struct. 

Agreed.

It's really a value
binding, not the value itself. 

Nor is this "base" just for scalars - so apart from the fact it isn't a
scalar and isn't a value "scalar value" is ideal :-(

IMHO it would be a lot easier to read the
code if we clearly differentiated between what's now called SV and the
collection of xpv*'s.

 inline IV
 SvIV(SV *sv)
 {
  return (*sv->vtable.SvIV)(sv);
 }
 
 With the simplest case being
 
 IV
 nativeIV(SV *sv)
 {
  return sv->data.words.iv;
 }

What component of the system is responsible for representation
shifting? 

The vtable functions.

It might be a really clean design for us to always shift
representations so that the current vtable always points to the
current semantics. (Look, ma, no flags required!)

We need to see what the flags turn out to be.
Perl5 has things like IOK, ROK, NOK, POK, UTF8, ...
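
One way the "vtable functions do the shifting" idea could look is sketched here. Everything in it is an assumption made for the example: the struct val layout, the two vtables and the inline 32-byte string buffer are not from any proposal in this thread.

```c
#include <assert.h>
#include <stdio.h>

struct val;

struct val_vtable {
    const char *name;
    long (*get_iv)(struct val *);
    const char *(*get_pv)(struct val *);
};

struct val {
    const struct val_vtable *vtable;
    long iv;
    char pv[32];            /* inline string storage, so no malloc is needed */
};

long iv_get_iv(struct val *v) { return v->iv; }
const char *dual_get_pv(struct val *v) { return v->pv; }
const char *int_get_pv(struct val *v);

/* "dual" values hold a valid integer and a valid string at the same time. */
const struct val_vtable dual_vtable = { "dual", iv_get_iv, dual_get_pv };
/* "int" values hold only the integer until someone asks for a string. */
const struct val_vtable int_vtable  = { "int",  iv_get_iv, int_get_pv };

/* First string access on an integer: build the string, then switch vtables
 * so that later calls go straight to dual_get_pv. */
const char *int_get_pv(struct val *v)
{
    assert(v->vtable == &int_vtable);
    snprintf(v->pv, sizeof v->pv, "%ld", v->iv);
    v->vtable = &dual_vtable;
    return v->pv;
}
```

In this arrangement the vtable pointer itself records "has a string form", which is the job POK does in perl5. It does not answer the DWIM question of whether the thing was "naturally" a number, so some flags may well survive.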

While those are not always necessary with the vtable stuff, they tend to 
get used by Perl in DWIM mode - it looks at flags to see if thing is 
"naturally" a string or a number.

-- 
Nick Ing-Simmons




Re: RFC 35 / Re: perl6-internals-gc sublist

2000-08-06 Thread Nick Ing-Simmons

Chaim Frenkel [EMAIL PROTECTED] writes:

And why carry around IV/NV and ptr? Make it into a union, to allow room.

_my_ original "ramblings" posting did.

I kept the triple to pad the thing to 8 words.
Partly as devil's advocate against the "squeeze it all into one word" camp ;-)

Essentially what I am proposing is removing the sv_any indirection for 
the simple scalar cases - this reduces housekeeping and indirections, 
and keeps actual data near "SV/PMC" for cache reasons.


The string/number duality should be handled by the vtbl. So a string
that is never accessed as a number, doesn't waste the space. And
numbers that are rarely accessed as a string save some room. And
as needed the vtbl can be promoted to a duality version that maintains
both.

All true - but we can minimize the malloc-ing that such conversions entail
if we have a "reasonable" amount of (multi-purpose) storage in the root.

-- 
Nick Ing-Simmons




Re: Ramblings on base class for SV etc.

2000-08-06 Thread Nick Ing-Simmons

Dan Sugalski [EMAIL PROTECTED] writes:

The rest is also there to optimize the common case. (Though I do think it's 
overkill in many circumstances if all variables share the same base 
structure--arrays don't really need an integer portion, neither do hashes) 

So you re-use the space for AvLEN or whatever is "hot" for arrays and 
hashes. 

That's not a bad thing and, like I said, the big win's not in optimizing 
scalars, it's in optimizing hashes and arrays. Skimping here's likely not 
worth it in the long run.

-- 
Nick Ing-Simmons




Re: C--

2000-08-05 Thread Nick Ing-Simmons

John Tobey [EMAIL PROTECTED] writes:
Joshua N Pritikin [EMAIL PROTECTED] wrote:
 A few more clicks and I found:
 
   http://www.cminusminus.org/

Thanks, Joshua.  Quickie summary.  Implementations: one[1] semi-free
(non-DFSG-compliant) complete.  Others in progress.

Why not specify as a C extension: I'm still looking for that.

Could one do a GCC front end for C-- ?

-- 
Nick Ing-Simmons




Re: RFC: Foreign objects in perl

2000-08-05 Thread Nick Ing-Simmons

Benjamin Stuhl [EMAIL PROTECTED] writes:
--- Dan Sugalski [EMAIL PROTECTED] wrote:
 actual work. The
 dispatch routine has a function signature like so:
 
   int status = dispatch(void *native_obj, sv *perl_scalar,
                         char *method_called,
                         int *num_args_in, perl_arg_stack *arg_stack,
                         int *num_args_out, perl_arg_stack *return_values);

One thing: remember, there is a lot of talk about having
perl6 use Unicode internally, which means that things like
method names should be wchar_t * (or whatever).

I doubt that - I guess names will be UNICODE but will be encoded in UTF8 rather
than as wide chars.
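
For reference, a sketch of what UTF-8 encoding involves, which shows why it suits method names: anything in the ASCII range stays one byte, so existing char * names are already valid. The function name is made up, and the 4-byte upper limit follows the present-day definition of UTF-8 rather than anything decided for perl6.

```c
#include <assert.h>
#include <stddef.h>

/* Encode one Unicode code point as UTF-8 into out (room for 4 bytes).
 * Returns the number of bytes written, or 0 for an invalid code point. */
size_t utf8_encode(unsigned long cp, unsigned char *out)
{
    assert(out != NULL);
    if (cp < 0x80) {                    /* ASCII: unchanged, one byte */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                   /* surrogates are not encodable */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```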

-- 
Nick Ing-Simmons




Re: RFC 35 (v1) A proposed internal base format for perl

2000-08-05 Thread Nick Ing-Simmons

Perl6 Rfc Librarian [EMAIL PROTECTED] writes:

This is similar to the structure used in perl 5, with one major
difference. Rather than having all the intelligence needed to use a
variable separate from that variable, this RFC embeds that information
into the variable itself. This allows for more efficient code to
access the variables, and it lets us add in variable types on the
fly. This way perl doesn't, for example, have to know how to access an
individual element of an array of integers--it just asks the array to
return it a particular element. Code MUST use the vtable functions to
get or set values from variables. They MUST NOT directly access the data.

This base structure should be considered immobile, so it's safe to
maintain pointers to it. The data portion of a variable should be
considered moveable, and may be shuffled around if a variable changes
its type, or the garbage collector needs to compact the heap.

Implementation on various types (arrays, hashes, scalars) as well as
sub-types (integer scalars, string scalars, objects) is left to
another RFC.

All good so far.


=head1 IMPLEMENTATION

The base variable structure looks like:

struct {
  IV GC_data;
  void *variable_data;
  IV flags;
  void *vtable;
  void *sync_data;
}

The fields, in order, are:

=over 4


=item variable_data

Equivalent to perl5's sv_any pointer, this is a pointer to the actual
data structure for the variable. It may, in certain cases, be coopted
to hold the actual value. (This is likely the case for a scalar that
holds just an integer, where the native int size is equal to or
smaller than the native pointer size)

I think we should allow more than just a pointer, and that really 
simple variables (IV, NV) should be able to use _just_ 
the structure above without auxiliary malloc'ed data.


The actual structure that hangs off will depend both on the class of
variable (scalar, hash, array) and the type of that class (integer
array, integer scalar, filehandle, reference) and isn't specified here

=item flags

This field holds various flags that hold the status of the
variable. (Flags to be RFC'd later)

=item vtable

The vtable field holds a pointer to the vtable for a variable. Each
variable type has its own vtable, holding pointers to functions for
the variable. Vtables are shared between variables of the same
type. (All integer arrays have the same vtable, as do all string
scalars and so on)

vtable contents will be RFCd separately. All variables will share a
common set of functions, though scalars, arrays, and hashes will have
their own set of extensions on top of that.

The vtable should be non-opaque to the perl-core.


=head1 IMPACT ON EMBEDDING

None. Generally embedding apps won't deal with actual perl data

=head1 IMPACT ON EXTENSIONS

None. Extensions get pointers to this structure, which as far as they
know is a magic cookie. (In fact the official perl term for the thing
handed to extensions is a Perl Magic Cookie, or PMC) Knowledge of the
internals is a no-no at this level.

Moot - I think there are two classes of "extension":
  A. As above that treat this stuff as opaque.
  B. External "ops" - which will assume same as ops below - i.e.
 they can call via vtable etc.



=head1 IMPACT ON OP FUNCTIONS

Op functions have intimate knowledge of the internals and unrestricted
access. Therefore they're assumed to know what they're doing, and will
therefore heed the info in this RFC.

=head1 REFERENCES

sv.h
-- 
Nick Ing-Simmons




Re: RFC 35 / Re: perl6-internals-gc sublist

2000-08-05 Thread Nick Ing-Simmons

John Tobey [EMAIL PROTECTED] writes:
Nick Ing-Simmons [EMAIL PROTECTED] wrote:
 John Tobey [EMAIL PROTECTED] writes:
 Dan Sugalski [EMAIL PROTECTED] wrote:
  Yup, and I realized one of my big problems to GCs that move memory 
  (references that are pointers and such) really isn't, if we keep the 
  two-level variable structure that we have now. The 'main' SV structure 
  won't move, while the guts that the equivalent of sv_any points to can 
  without a problem.
 
 I certainly hope this data layout factoid is still subject to change.
 
 Having an SV have a fixed address is handy for C extensions.
 The 'entity' has got to have some 'handle' to define its existence.
 If not the SV* data structure then what is it that defines the thing?

Just like in Perl, if you want a reference to it, put it somewhere and
use a pointer.  Otherwise, use it by value.  (three words, or two if
flags are dropped, or four if vptr is added, or one if everything is
crammed into a pointer with low bits commandeered).

So what exactly is your point here ?
We currently (perl5) have the "token" being:

struct sv {
void*   sv_any; /* pointer to something */
U32 sv_refcnt;  /* how many references to us */
U32 sv_flags;   /* what we are */
};

3 words (which is a "funny" number in a binary world).

(With the type hiding in flags as a small int.)

Current perl6 "token" proposed by RFC 35 is:

struct {
  IV GC_data;             // REFCNT
  void *variable_data;    // sv_any, possibly used for IV, RV
  IV flags;               // sv_flags
  vtable_t *vtable;       // _new_ - explicit type
  void *sync_data;        // _new_ - for threads etc.
}

5 words (which is if anything a slightly worse number...)

I think I would make it 4 or 8 words:

struct {
  vtable_t *vtable;       // _new_ - explicit type
  IV flags_and_GC;        // sv_flags
  void *sync_data;        // _new_ - for threads etc.
  void *variable_data;    // sv_any, possibly used for IV, RV
}

That squeezes flags and GC into a word - no big deal for 'mark' bit
but if we have a REFCNT it had better be 8 or 16 bits or update will 
be too many cycles.

So my own favourite right now allows a little more - to keep data and token
together for cache, and to avoid extra malloc() in simple cases.
Some variant like:

struct {
  vtable_t *vtable;       // _new_ - explicit type
  IV flags;               // sv_flags
  void *sync_data;        // _new_ - for threads etc.
  IV GC_data;             // REFCNT
  void *ptr;              // SvPV, SvRV
  IV   iv;                // SvIV, SvCUR/SvOFF
  NV   nv;                // SvNV
}

The other extreme might be just a pointer with LS 3 bits snaffled for 
"mark" and two "vital" flags - but that just means above lives via the 
pointer and everything is one more de-ref away _AND_ needs masking 
except for "all flags zero" case (which had better be the common one).
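
The pointer-with-snaffled-bits extreme looks roughly like this. It assumes the pointed-to body is at least 8-byte aligned so the low three bits are free; handle_t and the function names are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

#define TAG_MASK ((uintptr_t)0x7)   /* low 3 bits, free if bodies are 8-byte aligned */
#define TAG_MARK ((uintptr_t)0x1)   /* GC mark bit */

typedef uintptr_t handle_t;

/* Combine an aligned body pointer with up to three tag bits. */
handle_t make_handle(void *body, uintptr_t tags)
{
    assert(((uintptr_t)body & TAG_MASK) == 0);  /* body must be 8-byte aligned */
    assert((tags & ~TAG_MASK) == 0);
    return (uintptr_t)body | tags;
}

uintptr_t handle_tags(handle_t h)
{
    return h & TAG_MASK;
}

/* Every dereference pays for this mask, unless all tags are known to be zero. */
void *handle_body(handle_t h)
{
    return (void *)(h & ~TAG_MASK);
}
```

The token is a single word, but the flags, vtable and data all sit behind handle_body(), one more dereference away.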

As I recall my LISP it has two pointers + flags

-- 
Nick Ing-Simmons




Re: Language RFC Summary 4th August 2000

2000-08-05 Thread Nick Ing-Simmons

Damian Conway [EMAIL PROTECTED] writes:
It definitely is, since formats do things that can't be done in modules.

Such as???

Quite.

Even in perl5 an XS module can do _anything at all_.

-- 
Nick Ing-Simmons




Re: RFC 27 (v1) Coroutines for Perl

2000-08-05 Thread Nick Ing-Simmons

Dan Sugalski [EMAIL PROTECTED] writes:
At 01:17 PM 8/4/00 +0500, Tom Scola wrote:
 [I think this belongs on the language list, FWIW, Cc'd there]
 
 I like this, but I'd like to see this, inter-thread queues, and events all
 use the same communication method. Overload filehandles to pass events
 around instead, so:

I'm proposing that events and threads be dropped in lieu of coroutines.

Not gonna happen. Tk and signals, at the very least, will see to that. 

As far as I am aware any multi-processing problem can be reduced to message
passing and these "co routines as IO" are just one stab at that.
For example occurrence of a signal could just "print" down the handler "pipe".
Likewise mouse click could just "print" down the Tk-ish ButtonPress-1 pipe.

It is the "return path" that bothers me - and of course the thread behind 
the co routine still has locking issues if it updates "global" state.

-- 
Nick Ing-Simmons




Re: inline mania

2000-08-03 Thread Nick Ing-Simmons

Dan Sugalski [EMAIL PROTECTED] writes:
At 05:39 PM 8/2/00 +0100, Tim Bunce wrote:
On Wed, Aug 02, 2000 at 12:05:20PM -0400, Dan Sugalski wrote:
 
  Reference counting is going to be a fun one, that's for sure.
 
  I'd like the interface to be something like:
 
 stat = perl_get_value(sv *, int what, destination)

And what type is perl_get_value declared as returning?

An integer--it is a status value after all...

Are we sure it should not be the value that is returned, with the status as 
the extra arg?

It is neater to be able to say 

   int err;
   int circ = perl_get_value(radius_sv,PL_INTEGER,&err)*2*M_PI;

rather than:

   int radius;
   int err = perl_get_value(radius_sv,PL_INTEGER,&radius);
   int circ = radius*2*M_PI; 
   
Remember the compiler cannot put anything which has its address taken 
in a register - so if the value is likely to be used in an expression 
it is better to avoid forcing it to the stack.
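
The two interface shapes side by side, as a self-contained sketch. struct value, the status codes and the function names are placeholders and not the proposed perl_get_value() API.

```c
#include <assert.h>
#include <stddef.h>

struct value { long iv; int is_number; };   /* hypothetical stand-in for an SV */

enum { STATUS_OK = 0, STATUS_NOT_A_NUMBER = 1 };

/* Style 1: the status is returned, the value comes back through a pointer,
 * so the caller's result variable has its address taken. */
int get_int_status_first(const struct value *v, long *dest)
{
    assert(v != NULL && dest != NULL);
    if (!v->is_number)
        return STATUS_NOT_A_NUMBER;
    *dest = v->iv;
    return STATUS_OK;
}

/* Style 2: the value is returned, the status comes back through a pointer,
 * so only the (rarely used) status variable has its address taken. */
long get_int_value_first(const struct value *v, int *status)
{
    assert(v != NULL && status != NULL);
    *status = v->is_number ? STATUS_OK : STATUS_NOT_A_NUMBER;
    return v->is_number ? v->iv : 0;
}

/* With style 2 the result feeds straight into an expression. */
long diameter(const struct value *radius, int *status)
{
    return get_int_value_first(radius, status) * 2;
}
```

How much the register argument matters will vary by compiler; once the accessor is inlined a good optimizer may keep both variables in registers either way.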

-- 
Nick Ing-Simmons [EMAIL PROTECTED]
Via, but not speaking for: Texas Instruments Ltd.




RE: inline mania

2000-08-02 Thread Nick Ing-Simmons

Brent Fulgham [EMAIL PROTECTED] writes:
  Having thought about it a bunch more (because of this) I'm 
  proposing we let the compiler decide. The caller doesn't 
  know enough to make that decision. 
 
 Read carefully.  I said we *let* the caller decide, not *make* the
 caller decide.  What, specifically, disturbs you about my proposal?
 

The 'inline' keyword is just a hint to the compiler.  If optimization
is turned off, no inlining is done.  If optimization is on, the
compiler may or may not decide to inline.  Performance on different
compilers will vary.  

To repeat:  Even if I say "inline" on everything, the compiler is
free to disregard that if its optimization routines decide not to.
(Also, if I fail to say "inline" on something, the compiler may
decide to inline if optimization is active).

So aren't we all saying the same thing?

I don't think so - it is a question which way we code the source:

A. Use 'inline' everywhere and trust compiler not to do what we told it 
   if it knows better.
B. No inline hints in the source and trust the compiler to be able to 
   do the right thing when prodded with -O9 or whatever.
C. Make "informed" guesses as to which calls should be inlined.

My view is that (B) is the way to go, aiming for (C) eventually, because 
(A) gives worst-case cache purging.



-Brent
-- 
Nick Ing-Simmons



