Re: iThreads and selective variable copying (was Destructors and iThreads)

2004-07-04 Thread Nick Ing-Simmons
Dave Mitchell <[EMAIL PROTECTED]> writes:
>
>1. It would be very hard to create these options.
>2. Any programmer that used an 'only these' option would almost
>certainly create a program that at best would not work, and at worst would
>coredump. What happens if the user forgot to copy $/ ? What does Perl do
>the next time it tries to read from a file and wants to know the current
>line delimiter?
>
>Then there's stuff like stashes - %main:: is a hash that indirectly
>references just about every object in the perl interpreter. Does the
>programmer have to remember to exclude that?
>
>You are suggesting opening up a can of worms which I have no great desire
>to see opened.

Much as I philosophically like Eric's idea, this does indeed look too 
messy for perl5. Let's see if perl6 can fix this - or already has.


>
>Dave.



Re: Using Ruby Objects with Parrot

2004-03-22 Thread Nick Ing-Simmons
Mark Sparshatt <[EMAIL PROTECTED]> writes:
>
>I'm not 100% certain about the details but I think this is how it works.
>
>In languages like C++, objects and classes are completely separate.
>Classes form an inheritance hierarchy and objects are instances of a
>particular class.
>
>However in some languages (I think that Smalltalk was the first) there's
>the idea that everything is an object, including classes. So while an
>object is an instance of a class, that class is an instance of another
>class, which is called the metaclass. I don't think there's anything special
>about these classes other than the fact that their instances are also
>classes.
>
>
>Thinking about it I think you may have the relationship between
>ParrotObject and ParrotClass the wrong way around. Since a class is an
>object but an object isn't a class, it would be better for ParrotClass
>to inherit from ParrotObject, rather than the other way round.
>
>In Ruby when you create a class Foo, the Ruby interpreter automatically
>creates a class Foo' and sets the klass attribute of Foo to point to Foo'.
>
>This is important since class methods of Foo are actually instance
>methods of Foo'. Which means that method dispatch is the same whether
>you are calling an instance or a class method.

So in perl5-ese when you call 

   Foo->method

you are actually calling sub Foo::method which is in some sense
a "method" of the %Foo:: "stash" object.

So what you suggest is as if perl5 compiled Foo->method
into (\%Foo::)->method and the %Foo:: 'stash' was blessed...


>
>foo.method()
>
>looks at foo's klass attribute then checks the returned class object
>(Foo) for method
>
>Foo.method()
>
>looks at Foo's klass attribute and again checks the returned class
>object (Foo') for method.
>
>The Pickaxe book has got a better explanation of this (at
>http://www.rubycentral.com/book/classes.html though without any diagrams
>:( )
>
>In Python when defining a class it's possible to set an attribute in the
>class that points to the class's metaclass. The metaclass itself is just
>a normal class that defines methods which override the normal behaviour
>of the class.
>
>IIRC Python has got both class methods and meta class instance methods
>which work almost (but not quite) in the same way as each other.
>
>Hopefully someone with more experience with Python will be able to
>explain better.
>
>I'm not sure if this has cleared things up or just made them more confusing.



Re: Dates and times again

2004-03-22 Thread Nick Ing-Simmons
Larry Wall <[EMAIL PROTECTED]> writes:
>
>That would seem like good future proofing.  Someday every computer will
>have decentish subsecond timing.  I hope to see it in my lifetime...

It isn't having the sub-second time in the computer that is the problem,
it is the API to get at it...

>
>My guess is that eventually they'll decide to put a moratorium on
>leap seconds, with the recommendation that the problem be revisited
>just before 2100, on the assumption that we'll add all of a century's
>leap seconds at once at the end of each century.  That would let
>civil time drift by at most a minute or two before being hauled
>back to astronomical time.  

Given that most people live more than a minute or two from their 
civil-time meridian, who will notice? (Says me, about 8 minutes west of 
GMT.)

>
>I'd say what's missing are the error bars.  I don't mind if the
>timestamp comes back integral on machines that can't support subsecond
>timing, but I darn well better *know* that I can't sleep(.25), or
>strange things are gonna happen.

But you can fake sleep() with select() or whatever.
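
For what it's worth, a minimal sketch of that trick on POSIX-ish systems
(my illustration, not from the original post):

    #include <sys/select.h>
    #include <sys/time.h>

    /* fractional sleep faked with select(): no fd sets, just the timeout */
    static void fractional_sleep(double seconds)
    {
        struct timeval tv;
        tv.tv_sec  = (time_t)seconds;
        tv.tv_usec = (suseconds_t)((seconds - (double)tv.tv_sec) * 1e6);
        (void)select(0, NULL, NULL, NULL, &tv);
    }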




Re: [PROPOSAL] C opcode and interface

2004-03-22 Thread Nick Ing-Simmons
Dan Sugalski <[EMAIL PROTECTED]> writes:
>At 11:12 AM -0800 3/10/04, Brent "Dax" Royal-Gordon wrote:
>>Dan Sugalski wrote:
>>>Which, unfortunately, will end up making things a hassle, since 
>>>there's no platform-independent way to spawn a sub-process, dammit. 
>>>:(
>>
>>Unixen seem to support system().
>
>D'oh! It's C89 standard. I'm getting stuck in the 80s with the 
>multitude of exec variants. Yeah, with that issue taken care of it's 
>a lot more doable. Nevermind...

But:
  A. system() is blocking. 
  B. system() takes a single string, so whatever calls system()
     has to be aware of the system shell's quoting rules.



Re: [PROPOSAL] C opcode and interface

2004-03-22 Thread Nick Ing-Simmons
Dan Sugalski <[EMAIL PROTECTED]> writes:
>At 10:11 AM -0800 3/10/04, Brent "Dax" Royal-Gordon wrote:
>>Josh Wilmes wrote:
>>>It's also quite possible that miniparrot is a waste of time.  I'm 
>>>pretty much of the opinion myself that it's an academic exercise at 
>>>this point, but one which keeps us honest, even if we don't use it.
>>
>>Miniparrot, or something very much like it, is the final build system.
>
>Yep. We need to make sure it always works.
>
>Which, unfortunately, will end up making things a hassle, since 
>there's no platform-independent way to spawn a sub-process, dammit. :(

On that topic specifically - the DOS-style spawn() API is 
easy to fake with fork/exec, but the converse is NOT true.


i.e. if Miniparrot assumes:

pid_t my_spawn(const char *progname,int argc,const char *argv[]);
int my_wait(pid_t proc);

then Unix-oids can have

pid_t my_spawn(const char *progname, int argc, const char *argv[])
{
 pid_t pid = fork();
 if (pid)
  return pid;                 /* parent (or -1 if the fork failed) */
 execv(progname, (char * const *)argv);  /* execv takes argv only, no argc */
 _exit(127);                  /* only reached if execv itself fails */
}
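
A matching my_wait() for the Unix-oid side might just wrap waitpid() -
a sketch, not part of the original proposal:

    #include <sys/wait.h>

    int my_wait(pid_t proc)
    {
        int status = 0;
        /* block until the spawned child exits, hand back its raw status */
        if (waitpid(proc, &status, 0) < 0)
            return -1;
        return status;
    }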

Unidirectional popen() is also reasonably portable.




Re: Dates and times again

2004-03-22 Thread Nick Ing-Simmons
Dan Sugalski <[EMAIL PROTECTED]> writes:
>In an attempt to drain the swamp...
>
>So far as I can see, we need, in descending order of importance (and 
>speed) (And if there's stuff missing, add them):
>
>1) A timestamp value
>2) A way to chop the timestamp to pieces
>3) A way to turn the timestamp into a string
>4) A way to turn pieces to a timestamp
>5) A way to turn the string into a timestamp
>
>All of which is confounded by the joys of timezones and platform limitations.
>
>As far as I can tell, the only thing we can absolutely count on are:
>
>  asctime, ctime, difftime, gmtime, localtime, mktime, strftime

Everything gives you a "ticks of size ? since ?" hook or three.
In most places the ticks are less than a second.

All the stringify and human/planet-izing seems to be library fodder.

gettimeofday() is widely available or fakeable. (Tk uses it.)
Seems $^T could start based on time(2) and then get deltas using 
something finer. Perhaps aspire to clock_gettime() and fake
that interface where not available?
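
Something along those lines, as a sketch only (the HAS_* symbols here are
assumptions, not real config names):

    #include <time.h>
    #include <sys/time.h>

    /* $^T-style base from time(), deltas from whatever finer source exists */
    double fine_timestamp(void)
    {
    #if defined(HAS_CLOCK_GETTIME)
        struct timespec ts;
        clock_gettime(CLOCK_REALTIME, &ts);
        return (double)ts.tv_sec + ts.tv_nsec / 1e9;
    #elif defined(HAS_GETTIMEOFDAY)
        struct timeval tv;
        gettimeofday(&tv, NULL);
        return (double)tv.tv_sec + tv.tv_usec / 1e6;
    #else
        return (double)time(NULL);     /* whole seconds only */
    #endif
    }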


>
>We can't even count on timegm, unfortunately. Neither can we count on 
>getting fractional time. (Or even really count on getting a GMT time 
>that's actually GMT, as far as that goes, but that's 
>user-misconfiguration and there's a limit to what I'm willing to care 
>about) Nor strptime for time parsing, though a case could be made 
>there that we could do our own. (Cases to be made for that should be 
>accompanied by unencumbered source to do the parsing ;) Can't even 
>count on the full range of output when splitting the time up--if you 
>check the CVS logs you'll see I yanked out a few elements because 
>they're not C89 and there wasn't any way to synthesize them easily 
>that I know of.
>
>That means we can't convert to TAI, since that needs leap second info 
>we don't have, so base time can't be TAI. From what I can tell from 
>the interfaces and long painful experience we can't convert to and 
>from anything other than the current system timezone. (Maybe. I'm not 
>100% sure that's reliable either)
>
>Right now, you can get a black-box integer timestamp that's fixed to 
>GMT time, and you can disassemble that timestamp into day/month/year 
>pieces. I adjusted the year to be a real year, but I haven't adjusted 
>the month. We can do that, though. We can easily:
>
>*) Give a float timestamp
>*) Adjust the timestamp to various base dates (I've already made my
>preferences clear :)
>
>My general rule-of-thumb for ops is that they must:
>
>*) Be something we want to guarantee behaviour on everywhere
>*) Require C code
>*) Have a fixed signature
>
>Being primitive isn't necessary as such, but doesn't hurt. Having to 
>be required present at all times isn't necessary either, though we 
>should nail down lexical oplibs if we want to start talking about 
>secondary libraries of this stuff.
>
>Anyway, given the restrictions on what we have to work with, the 
>first question is:
>
>*) Is what we're providing correct
>*) What *aren't* we providing that we must to allow full and
>proper date processing in modules without much pain?



Testing XS modules on Ponie

2004-03-19 Thread Nick Ing-Simmons
Arthur Bergman <[EMAIL PROTECTED]> writes:
>This is Ponie, development release 2
>
>
>   "And, isn't sanity really just a one-trick ponie anyway? I mean all 
>you get is one trick, rational thinking, but when you're good and 
>crazy, oooh, oooh, oooh, the sky is the limit." -- the tick
>
>
>Welcome to this second development release of ponie, the mix of perl5 
>and parrot. Ponie embeds a parrot interpreter inside perl5 and hands 
>off tasks to it; the goal of the project is to hand off all data and 
>bytecode handling to parrot.
>
>With this release all internal macros that poke at perl data types are 
>converted to be real C functions and to check if they are dealing with 
>traditional perl data types or PMC (Parrot data types) data. Perl 
>lvalues, arrays and hashes are also hidden inside PMCs but still access 
>their core data using traditional macros. The goal and purpose of this 
>release is to make sure this approach keeps on working with the XS 
>modules available on CPAN and to let people test with their own source 
>code. No changes were made to any of the core XS modules.

So ponie-2 compiles and passes all its tests for me.
So how do I see if it can handle the XS module from hell - Tk ?




Re: Perl and Parrot disagree about sched_yield on Solaris

2004-03-16 Thread Nick Ing-Simmons
Andrew Dougherty <[EMAIL PROTECTED]> writes:
>Whilst trying to build ponie-2 on Solaris 8, I came across the following
>issue:  In order to use threads, both perl-5.[89].x and parrot need to
>call some sort of yield() function.
>
>In parrot, sched_yield is used; this function is available in the -lrt
>library, so the solaris hints file adds that in.  There appears to be no
>way to override this from the Configure.pl command line.
>
>In perl, the plain yield() function is used; this function is available in
>the standard C library, so -lrt is not used.  (Indeed, it's not even
>mentioned in Configure.)  The hints/solaris_2.sh file unconditionally sets
>sched_yield='yield' (bad hints file! I'll supply a patch for that
>separately.)
>
>Underneath it all, it doesn't matter -- both functions are the same on
>Solaris -- but leaving things the way they are gives the following error
>message for ponie-2:

The worry might be that adding -lrt may have other side effects.
Alan?

>
>cc -L/usr/lib -L/usr/ccs/lib -L/opt/SUNWspro/SC4.2/lib 
>-L/home/doughera/src/parrot/ponie-andy/parrot/blib/lib -o miniperl \
>miniperlmain.o opmini.o libperl.a -lsocket -lnsl -ldl -lm -lpthread -lc -lparrot
>Undefined   first referenced
> symbol in file
>sched_yield 
>/home/doughera/src/parrot/ponie-andy/parrot/blib/lib/libparrot.a(thread.o)
>
>One fix for ponie may be to add rt to the $libswanted when it calls perl's
>Configure (adding -A prepend:libswanted=rt should do the trick, but I
>haven't tested it).
>
>Longer term, it'd probably be good for parrot to have yield() vs.
>sched_yield() set by some Configure.pl-time variable. For now, it would
>likely be sufficient to simply key off of perl5's $Config{sched_yield}.
>
>Similarly, perl's Configure/hints combination ought to be a bit more
>flexible about yield() vs. sched_yield().  This would help make perl
>slightly more flexible in adapting to being extended or embedded.
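
A Configure-keyed wrapper along the lines Andy suggests might look
something like this (a sketch; HAS_SCHED_YIELD here stands in for
whatever the probe would actually define):

    #if defined(HAS_SCHED_YIELD)
    #  include <sched.h>
    #  define PARROT_YIELD() sched_yield()
    #else
    #  define PARROT_YIELD() yield()  /* the plain Solaris yield(), as perl's hints use */
    #endif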



Re: [perl #16689] [NIT] trailing commas in enumerator lists bad

2002-08-21 Thread Nick Ing-Simmons

Jarkko Hietaniemi <[EMAIL PROTECTED]> writes:
># New Ticket Created by  Jarkko Hietaniemi 
># Please include the string:  [perl #16689]
># in the subject line of all future correspondence about this issue. 
># http://rt.perl.org/rt2/Ticket/Display.html?id=16689 >
>
>
>Freshly checked out parrot moans a lot:
>
>cc: Info: ./include/parrot/string.h, line 56: Trailing comma found in enumerator 
>list. (trailcomma)
>} TAIL_flags;
>^
>
>Trailing commas in enumerator lists are unportable behaviour in C.

And in case anyone has not come across the "trick" before, it is not uncommon
to have 

enum foo {
/* auto-generated stuff */
  foo_MAX
};

where foo_MAX is a handy "number of entries" value as well 
as avoiding the trailing comma issue.
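
A tiny illustration of the pay-off (my example, not from the thread):

    enum foo {
        foo_alpha,
        foo_beta,
        foo_MAX          /* count of entries, and no trailing comma */
    };

    static const char *foo_names[foo_MAX];   /* table sized automatically */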


-- 
Nick Ing-Simmons
http://www.ni-s.u-net.com/




Re: [perl #15006] [PATCH] Major GC Refactoring

2002-07-17 Thread Nick Ing-Simmons

># New Ticket Created by  Mike Lambert 
># Please include the string:  [perl #15006]
># in the subject line of all future correspondence about this issue. 
># http://bugs6.perl.org/rt2/Ticket/Display.html?id=15006 >

Tickets from RT don't have an address in the To: line
and so my mailfilter is filing them as SPAM

--
Nick Ing-Simmons
http://www.ni-s.u-net.com/




Re: De Morgan's theorum

2002-02-20 Thread Nick Ing-Simmons

Brian Lee Ray <[EMAIL PROTECTED]> writes:
>From: "Nicholas Clark" <[EMAIL PROTECTED]>
>Sent: Tuesday, February 19, 2002 3:15 PM
>Subject: De Morgan's theorum
>> I have remembered the name correctly, haven't I?
>Yes. If we were really serious about optimizing logical expressions,
>we would probably want to use Karnaugh maps.

Karnaugh maps are for humans, with visual ways of understanding.
There is an easy-to-code algorithm (Quine-McCluskey) which does
the job for computers - it can handle what would be (a projection of)
an n-dimensional hypercube of a Karnaugh map.

>However, I just don't
>think most programs spend enough time doing logical comparison to
>really matter. Besides which, such techniques work best on complex
>expressions, which are rare indeed.
>I could be wrong, of course. Maybe someone could run some benchmarks?
>
>brian.
--
Nick Ing-Simmons
http://www.ni-s.u-net.com/






Re: De Morgan's theorum

2002-02-20 Thread Nick Ing-Simmons

Nicholas Clark <[EMAIL PROTECTED]> writes:
>I have remembered the name correctly, haven't I?
>
>Would it gain us much implementing De Morgan's theorem in the peephole
>optimiser?

It gets even more fun when there are NOTs on the other side as well...
Speaking of which, why does NOT have two UNOPs - can we collapse
those?


>
>nick@Bagpuss [nick]$ perl -MO=Terse -e 'print 0+(!$l && !$r)'
>LISTOP (0x164048) leave [1]
>OP (0x164070) enter
>COP (0x164008) nextstate
>LISTOP (0x163fc0) print
>OP (0x163fe8) pushmark
>BINOP (0x163f98) add [1]
>SVOP (0x163db0) const  IV (0xed2b8) 0
>UNOP (0x163f78) null
>LOGOP (0x10db20) and
>UNOP (0x163e68) not
>UNOP (0x163e48) null [15]

Why two UNOPs ?

>SVOP (0x163dd0) gvsv  GV (0x10a974) *l
>UNOP (0x163f58) not
>UNOP (0x163f38) null [15]
>SVOP (0x163e88) gvsv  GV (0x10a98c) *r
>-e syntax OK
>nick@Bagpuss [nick]$ perl -MO=Terse -e 'print 0+!($l || $r)'
>LISTOP (0x164028) leave [1]
>OP (0x164050) enter
>COP (0x163fe8) nextstate
>LISTOP (0x163fa0) print
>OP (0x163fc8) pushmark
>BINOP (0x163f78) add [1]
>SVOP (0x163db0) const  IV (0xed2b8) 0
>UNOP (0x163f58) not
>UNOP (0x163f38) null
>LOGOP (0x10db20) or
>UNOP (0x163e48) null [15]
>SVOP (0x163dd0) gvsv  GV (0x10a974) *l
>UNOP (0x163f18) null [15]
>SVOP (0x163e68) gvsv  GV (0x10a98c) *r
>-e syntax OK
>
>
>For "much" equal to 1 op in total.
>
>I think that the answer is "no, do it by hand if it matters that much",
>doesn't it?
>
>This also might be a perl6 question, for a more "serious" -O2 optimiser.
>Hmm. Would parrot benefit from nand and nor ops?
>
>[beware of cross posting when replying]
>
>Nicholas Clark
--
Nick Ing-Simmons
http://www.ni-s.u-net.com/






Re: [PATCH] Stop win32 popping up dialogs on segfault

2002-02-10 Thread Nick Ing-Simmons

Michael G Schwern <[EMAIL PROTECTED]> writes:
>On Fri, Feb 08, 2002 at 07:19:26PM +0100, Mattia Barbon wrote:
>> The following patch adds a Parrot_nosegfault() function
>> to win32.c; after it is called, a segmentation fault will print
>> "This process received a segmentation violation exception"
>> instead of popping up a dialog. I think it might be useful
>> for tinderbox clients.
>
>I don't suppose you could put something like this into Perl 5?

We could. However I would __MUCH__ rather that Perl 5 did not segfault.
Irritating though the popups are, they do at least allow me to get 
a backtrace to the segfault.

Maybe have the handler unless -DDEBUGGING?
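
One possible shape for that (a sketch of the idea only, not Mattia's
actual patch; Parrot_nosegfault is his name, the rest is mine):

    #ifndef DEBUGGING
    #include <windows.h>
    #include <stdio.h>

    static LONG WINAPI
    segv_filter(EXCEPTION_POINTERS *info)
    {
        if (info->ExceptionRecord->ExceptionCode == EXCEPTION_ACCESS_VIOLATION)
            fputs("This process received a segmentation violation exception\n",
                  stderr);
        return EXCEPTION_EXECUTE_HANDLER;   /* die quietly, no dialog */
    }

    void
    Parrot_nosegfault(void)
    {
        SetUnhandledExceptionFilter(segv_filter);
    }
    #endif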


-- 
Nick Ing-Simmons
http://www.ni-s.u-net.com/




Re: The internal string API

2001-06-20 Thread Nick Ing-Simmons

Jarkko Hietaniemi <[EMAIL PROTECTED]> writes:
>> Taiwanese read traditional Chinese characters, but PRC people read
>> simplified Chinese. Even if we take the same data, and the same program (code),
>> people just read differently. As an end user, I want to make the decision.
>> It will drive me crazy if Perl renders/displays the text file using
>> traditional
>> Chinese just because it was tagged as "Big5".
>
>Perl will (probably, whispers he, crossing his fingers) never
>translate data that far.  Perl (5) does not "display" chr(0x1234) to
>me using Unicode fonts, it just pushes the octets to a file
>descriptor/handle.  Unicode is language-neutral.

Perl may not, but I assume someone will be fool enough to give it a GUI.
perl5.7.1+/Tk803.???-to-be will now make a stab at rendering Unicode
(not a very good one, I am the first to admit, which is why it isn't released!).

It would be good if Tk-for-perl6 did not have to break the rules or 
provide its own hooks for meta data and could use "the" string API.

-- 
Nick Ing-Simmons
http://www.ni-s.u-net.com/




Re: Should the op dispatch loop decode?

2001-06-13 Thread Nick Ing-Simmons

Benjamin Stuhl <[EMAIL PROTECTED]> writes:
>I don't see where shadow functions are really necessary -
>after all, no one has ever complained that you can't do 
>
>pp_chomp(sv); /* or pp_add(sv1, sv2), for that matter */
>
>in Perl 5. 

Yes we did. And note the doop.c file, which is in part an answer
to the shadows.

Given the inner functions we could presumably generate the decode
functions (cf. xsubpp).

-- 
Nick Ing-Simmons




RE: Should we care much about this Unicode-ish criticism?

2001-06-07 Thread Nick Ing-Simmons

Dan Sugalski <[EMAIL PROTECTED]> writes:
>
>I think I'd agree there. Different versions of a glyph are more a matter of 
>art and handwriting styles, and that's not really something we ought to get 
>involved in. 

But the human sitting in front of the machine cannot see the bit pattern;
they can only push the available keys and look at the presented glyphs.
There is indeed a similarity to locales - without a choice of glyph being 
presented, Asian texts will be as readable to a native as if 
English were rendered in Cyrillic or Greek alphabets.
(Would you recognise Delta-alpha-nu as "Dan"?)


>The european equivalent would be to have many versions of "A", 
>so we could represent the different ways it was drawn in various 
>illuminated manuscripts. That seems rather excessive.

Not entirely true. Consider German "ß" vs "ss", or French not normally putting 
accents on upper-case vowels; even in English I detest spell checkers
which prefer naive over naïve and role over rôle.  


>
>   Dan
>
>--"it's like this"---
>Dan Sugalski  even samurai
>[EMAIL PROTECTED]             have teddy bears and even
>  teddy bears get drunk
-- 
Nick Ing-Simmons
who is looking for a new job see http://www.ni-s.u-net.com/




Re: Should we care much about this Unicode-ish criticism?

2001-06-07 Thread Nick Ing-Simmons

Dan Sugalski <[EMAIL PROTECTED]> writes:
>
>It does bring up a deeper issue, however. Unicode is, at the moment, 
>apparently inadequate to represent at least some part of the asian 
>languages. Are the encodings currently in use less inadequate? I've been 
>assuming that an Anything->Unicode translation will be lossless, but this 
>makes me wonder whether that assumption is correct.

One reason perl5.7.1+'s Encode does not do Asian encodings yet is that 
the tables I have found so far (mainly Unicode 3.0 based) are lossy.


-- 
Nick Ing-Simmons
who is looking for a new job see http://www.ni-s.u-net.com/




Re: Stacks, registers, and bytecode. (Oh, my!)

2001-06-06 Thread Nick Ing-Simmons

Dan Sugalski <[EMAIL PROTECTED]> writes:
>
>I'm not entirely sure of that one--processing a full regex requires the 
>perl interpreter, it's not all that modular. Though whether being able to 
>yank out the RE engine and treat it as a standalone library is important 
>enough to warrant being treated as a design goal or not is a separate 
>issue. (I think so, as it also means I can treat it as a black box for the 
>moment so there's less to try and stuff in my head at once)

We are way past that point in perl5 - having tried to use perl's regexps
in a non-perl app, you need perl there: perhaps not the whole interpreter,
but bits of the tokenizer and of course SV *.

So to make it modular in perl6 you have to re-write it, and I am with 
Larry - let's make the main op-state machine handle those ops too.
That will help with the need to special-case regexps for signal despatch 
etc.

>
>*) It makes the amount of mental space the core interpreter takes up smaller

But surely we are considering an "expandable" interpreter already?
Adding regexp ops is just one such extension.

>*) It can make performance tradeoffs separately from the main perl engine
>*) We can probably snag the current perl 5 source without much change

I doubt that.

>*) The current RE engine's scared (or is that scarred?) me off enough that 
>I'd as soon leave it to someone who's more tempermentally suited for such 
>things.
>*) Treating regexes as non-atomic operations brings some serious threading 
>issues into things.

Leaving them atomic does as well - I will switch threads as soon
as the regexp completes ...

-- 
Nick Ing-Simmons




Re: Stacks, registers, and bytecode. (Oh, my!)

2001-05-30 Thread Nick Ing-Simmons

Uri Guttman <[EMAIL PROTECTED]> writes:
>>>>>> "NI" == Nick Ing-Simmons <[EMAIL PROTECTED]> writes:
>
>  NI> The "overhead of op dispatch" is a self-proving issue - if you
>  NI> have complex ops they are expensive to dispatch.
>
>but as someone else said, we can design our own ops to be as high level
>as we want. lowering the number of op calls is the key. that loop will
>be a bottleneck as it is in perl5 unless we optimize it now.
>
>  NI> With a 16-bit opcode as-per-Uri that becomes:
>
>  NI>while (1) *(table[*op_ptr++])();
>
>  NI> (Assuming we don't need to check bounds 'cos we won't generate bad code...)
>
>i dropped the 16 bit idea in favor of an extension byte code that zhong
>mentioned. it has several wins, no ordering issues, it is pure 'byte'
>code. 
>
>  NI> One can then start adding decode to the loop:
> 
>  NI>while (1) {
>  NI>  op_t op = *op_ptr++;
>  NI>  switch(NUM_ARGS(op))
>
>no switch, a simple lookup table:
>
>   op_cnt = op_counts[ op ] ;

Myths of 21st Century Computing #1:
 "Memory lookups are cheap"

Most processors have only one memory unit and it typically has
a long pipeline delay. But many have several units that can do 
compares etc.

A lookup table may or may not be faster/denser than a switch.
A lookup may take 9 cycles down a memory pipe, while

ans = (op > 16) ? 2 : (op > 8) ? 1 : 0;

might super-scalar issue in 1 cycle.  Code at a high level and let the
C compiler decide what is best - it will give you a lookup if that 
is best.

Memory ops need not be expensive if they pipeline well, but 
making one memory op depend on the result of another is a bad idea,
e.g.

   op   = *op_ptr++;
   arg1 = *op_ptr++;
   arg2 = *op_ptr++;

may appear to happen in 3 cycles, as all the loads can be issued in a pipelined
manner and the ++s issued in parallel. While

   op  = *op_ptr++;
   ans = table[op];

could seem to take 18 cycles, as the 2nd load can't start till the 1st one completes.

I have been meaning to try and prove my point with 
a software-pipelined dispatch loop which is fetching one op,
decoding the previous one and executing the one before that.
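
Something like this is what I have in mind - a rough illustration only,
with made-up op and table types, no termination condition and no
argument decode:

    typedef void (*op_func_t)(void);
    extern op_func_t table[];          /* one function per opcode */

    void pipelined_runops(const unsigned char *op_ptr)
    {
        /* prime the pipeline: decode op 0, fetch op 1 */
        unsigned char fetched = *op_ptr++;
        op_func_t     decoded = table[fetched];
        fetched = *op_ptr++;

        for (;;) {
            op_func_t to_run = decoded;  /* oldest op, ready to execute     */
            decoded = table[fetched];    /* decode the op fetched last time */
            fetched = *op_ptr++;         /* start fetching the next op      */
            to_run();                    /* execute while the loads drain   */
        }
    }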

-- 
Nick Ing-Simmons
who is looking for a new job see http://www.ni-s.u-net.com/




Re: Stacks, registers, and bytecode. (Oh, my!)

2001-05-30 Thread Nick Ing-Simmons

Dave Mitchell <[EMAIL PROTECTED]> writes:
>
>There's no reason why you can't have a hybrid scheme. In fact I think
>it's a big win over a pure register-addressing scheme. Consider...

Which was more or less my own position...

>
>At the start of a new scope, the stack is extended by N to create a new
>stack frame (including a one-off check that the stack can be
>extended).  There is then a 'stack pointer' (sp) which is initialised
>to the base of the new frame, or an initial offset thereof. (So sp is
>really just a temporary index within the current frame.)
>
>Then some opcodes can use explicit addressing, while others can be implicit,
>or a mixture.
>
>Explicit opcodes specify one or more 'registers' - ie indexes within the
>current frame, while implicit opcodes use the current value of sp as an
>implicit index, and may alter sp as a side effect. So an ADD opcode
>would use sp[0], sp[-1] to find the 2 operands and would store a pointer
>to the result at sp[-1], then sp--. The compiler plants code in such a way
>that it will never allow sp to go outside the current stack frame.
>
>This allows a big win on the size of the bytecode, and in terms of the
>time required to decode each op.
>
>Consider the following code.
>
>$a = $x*$y+$z
>
>Suppose we have r5 and r6 available for scratch use, and that for some
>reason we wish to keep a pointer to $a in r1 at the end (perhaps we use
>$a again a couple of lines later):
>
>
>This might have the following bytecode with a pure register scheme:
>
>GETSV('x',r5)  # get pointer to global $x, store in register 5
>GETSV('y',r6)
>MULT(r5,r5,r6)  # multiply the things pointed to by r5 and r6; store ptr to
>   # result in r5
>GETSV('z',r6)
>ADD(r5,r5,r6)
>GETSV('a',r1)
>SASSIGN(r1,r5)

Globals are a pain. Consider this code:

sub foo
{
 my ($x,$y,$z) = @_;
 return $x*$y+$z;
}

In the pure register (RISC-oid) scheme the bytecode should be:

FOO:
  MULT(arg1,arg2,tmp1)
  ADD(tmp1,arg3,result)
  RETURN

That is lexicals get allocated registers at compile time, and ops
just go get them.

In the pure stack-with-alloc (x86-oid) scheme it should be
  ENTER +1                 # need a temp
  MULT SP[1],SP[2],SP[4]   # $x*$y
  ADD SP[4],SP[3],SP[1]    # temp + $z -> result
  RETURN -2                # lose temp and non-results

And in a pure stack (FORTH, PostScript) style it might be
  rot 3# reorder stack to get x y on top
  mpy
  add
  ret   

>
>but might be like this in a hybrid scheme:
>
>SETSP(5)   # make sp point to r5
>GETSV('x') # get pointer to global $a, store at *sp++
>GETSV('y')
>MULT
>GETSV('z')
>ADD
>GETSV('a')
>SASSIGN
>SAVEREG(r1)# pop pointer at *sp, and store in register 1

The problem that the hybrid scheme glosses over is the reorder-the-args
issue, which is handled by register numbers, stack addressing or 
FORTH/PostScript stack re-ordering.
It avoids it by expensive long-range global fetches - which is indeed
what humans do when writing PostScript (use globals) - but compilers
can keep track of such mess for us.

>
>
>Both use the same registers, have the same net result, but the explicit
>scheme requires an extra 11 numbers in the bytecode, not to mention all
>the extra cycles required to extract out those numbers from the bytecode
>in the first place.
-- 
Nick Ing-Simmons
who is looking for a new job see http://www.ni-s.u-net.com/




Re: Stacks, registers, and bytecode. (Oh, my!)

2001-05-30 Thread Nick Ing-Simmons

Dan Sugalski <[EMAIL PROTECTED]> writes:
>At 02:08 PM 5/30/2001 +0000, Nick Ing-Simmons wrote:
>>Classic CISC code generation taught "us" that CISC is a pain to code-gen.
>>(I am not a Guru but did design TMS320C80's RISC specifically to match
>>gcc of that vintage, and dabbled in a p-code for Pascal way back.)
>
>Right, but in this case we have the advantage of tailoring the instruction 
>set to the language, and given the overhead inherent in op dispatch we also 
>have an incentive to hoist opcodes up to as high a level as we can manage.

That is of course what they/we all say ;-)

The 68K for example matched quite well to the low-tech compiler technology
of its day, as did UCSD's p-code for UCSD Pascal, and DSPs have their own 
reasons (inner loops are more important than generic C) for their CISC nature.

Even the horrible x86 architecture is quasi-sane if you assume all variables
are on the stack addressed by the Base Pointer.

It is interesting now that people are looking at building chips for JVM
how much cursing there is about certain features - though I don't have 
the references to hand.

The "overhead of op dispatch" is a self-proving issue - if you have complex
ops they are expensive to dispatch. 
In the limit FORTH-like threaded code 

   while (1) *(*op_ptr++)();

is not really very expensive, it is then up to the "op" to adjust op_ptr
for in-line args etc. Down sides are size op is at least size of a pointer.

With a 16-bit opcode as-per-Uri that becomes:

   while (1) (*table[*op_ptr++])();

(Assuming we don't need to check bounds 'cos we won't generate bad code...)

One can then start adding decode to the loop:
 
   while (1) {
     op_t op = *op_ptr++;
     switch (NUM_ARGS(op)) {
      case 1:
       (*table[FUNC_NUM(op)])(*op_ptr++);
       break;
      case 3:
       (*table[FUNC_NUM(op)])(op_ptr[0],op_ptr[1],op_ptr[2]);
       op_ptr += 3;
       break;
      ...
     }
   }

Then one can do byte-ordering and mis-aligned hackery and index into a reg-array:

   while (1) {
     op_t op = GET16BITS(*op_ptr);
     switch (NUM_ARGS(op)) {
      case 1:
       (*table[FUNC_NUM(op)])(reg_ptr[GET8BITS(*op_ptr)]);
       break;
      ...
     }
   }


>
-- 
Nick Ing-Simmons
who is looking for a new job see http://www.ni-s.u-net.com/




Re: Stacks, registers, and bytecode. (Oh, my!)

2001-05-30 Thread Nick Ing-Simmons

Uri Guttman <[EMAIL PROTECTED]> writes:
>
>think of this as classic CISC code generation with plenty of registers
>and a scratch stack. this is stable technology. we could even find a
>code generator guru (i don't know any obvious ones in the perl6 world)

Classic CISC code generation taught "us" that CISC is a pain to code-gen.
(I am not a Guru but did design TMS320C80's RISC specifically to match 
gcc of that vintage, and dabbled in a p-code for Pascal way back.)

>
>  >> special registers ($_, @_, events, etc.) are indexed with a starting
>  >> offset of 64, so general registers are 0-63.
>
>  DS> I'd name them specially (S0-Snnn) rather than make them a chunk of the 
>  DS> normal register set.

All that dividing registers into sub-classes does is cause you to do 
register-register moves when things are in the wrong sort of register.
Its only real benefit is encoding density, as you can "imply" part
of the register number by requiring addresses to be in address registers
etc. It is not clear to me that perl special variables map well to that.
Mind you, the names are just a human thing - it is the bit-pattern that 
the compiler cares about.



>
>oh, they have macro names which are special. something like:
>
>#define MAX_PLAIN_REG   64  /* 0 - 63 are plain regs */
>#define REG_ARG         64  /* $_ */
>#define REG_SUB_ARG     65  /* @_ */
>#define REG_ARGV        66  /* @ARGV */
>#define REG_INT1        67  /* integer 1 */
>#define REG_INT2        68  /* integer 2 */
>
>uri
-- 
Nick Ing-Simmons
who is looking for a new job see http://www.ni-s.u-net.com/




Re: Stacks, registers, and bytecode. (Oh, my!)

2001-05-30 Thread Nick Ing-Simmons

Uri Guttman <[EMAIL PROTECTED]> writes:
>  DS> The one handy thing about push and pop is you don't need to go
>  DS> tracking the stack manually--that's taken care of by the push and
>  DS> pop opcodes. They can certainly be replaced with manipulations of
>  DS> a temp register and indirect register stores or loads, but that's
>  DS> more expensive--you do the same thing only with more dispatch
>  DS> overhead.
>
>  DS> And I'm considering the stack as a place to put registers
>  DS> temporarily when the compiler runs out and needs a spot to
>  DS> squirrel something away, rather than as a mechanism to pass
>  DS> parameters to subs or opcodes. This is a stack in the traditional
>  DS> scratch-space sense.
>
>i agree with that. the stack here is mostly a call stack which
>save/restores registers as we run out. with a large number like 64, we
>won't run out until we do some deep calls. then the older registers (do
>we have an LRU mechnism here?) get pushed by the sub call prologue which
>then uses those registers for its my vars.

I don't like push/pop - they imply a lot of stack limit checking word-by-word,
when it is less overhead for the compiler to analyse the needs of a whole basic
block, check for/make space on the stack _once_, and then just address it.
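
As a sketch of the contrast (every name here is illustrative, not an
actual parrot API):

    typedef struct Interp Interp;    /* stand-ins, not parrot's real types */
    typedef struct PMC    PMC;
    extern void  stack_push(Interp *, PMC *);
    extern PMC  *stack_pop(Interp *);
    extern PMC **stack_reserve(Interp *, int n);  /* one check for n slots */
    extern PMC  *fetch_global(Interp *, const char *name);
    extern PMC  *pmc_add(Interp *, PMC *, PMC *);

    /* checked push/pop: a limit test buried in every push */
    void add_globals_pushpop(Interp *interp)
    {
        stack_push(interp, fetch_global(interp, "x"));
        stack_push(interp, fetch_global(interp, "y"));
        stack_push(interp, pmc_add(interp, stack_pop(interp),
                                           stack_pop(interp)));
    }

    /* analyse the block, make space once, then plain indexed addressing */
    void add_globals_frame(Interp *interp)
    {
        PMC **frame = stack_reserve(interp, 3);
        frame[0] = fetch_global(interp, "x");
        frame[1] = fetch_global(interp, "y");
        frame[2] = pmc_add(interp, frame[0], frame[1]);
    }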

>
>is the sub call/return stack also the data (scratch) stack? i think
>separate ones makes sense here. the data stack is just PMC pointers, the
>code call stack has register info, context, etc.

One stack is more natural for translation to C (which has just one).
One problem with FORTH was allocating two growable segments for its 
two stacks - one always ended up 2nd class.

-- 
Nick Ing-Simmons
who is looking for a new job see http://www.ni-s.u-net.com/




Re: Stacks & registers

2001-05-27 Thread Nick Ing-Simmons

Uri Guttman <[EMAIL PROTECTED]> writes:
>  NI> No - you keep the window base "handy" and don't keep re-fetching it,
>  NI> same way you keep "program counter" and "stack pointer" "handy".
>
>  NI> Getting  
>  NI>window[N] 
>  NI> is same cost as 
>  NI>next = *PC++; 
>
>  NI> My point is that to avoid keeping too-many things "handy" window
>  NI> base and stack pointer should be the same (real machine) register.
>
>if we can control that. 

Maybe not directly, but most compilers will keep common base registers
in machine registers if you code things right.

>but i see issues too. i mentioned the idea of
>having $_ and other special vars and stuff would have their own PMC's in
>this register set. 

Why does it have to be _this_ register set - globals can go in another
register set - SPARC's register scheme has global registers too.

That said, my guess is that $_ is usually saved/restored across sub/block
boundaries.

>dan like the idea. that doesn't map well to a window
>as those vars may not change when you call subs. i just don't see
>register windows as useful at the VM level.

Call it what you will - I am arguing for an addressable stack
not for windows as such.


>
>  >> i am just saying register windows don't seem to be any win for us
>  >> and cost an extra indirection for each data access. my view is let
>  >> the compiler keep track of the register usage and just do
>  >> individual push/pops as needed when registers run out.
>
>  NI> That makes sense if (and only if) virtual machine registers are real 
>  NI> machine registers. If virtual machine registers are in memory then 
>  NI> accessing them "on the stack" is just as efficient (perhaps more so)
>  NI> than at some other "special" location. And it avoids need for 
>  NI> memory-to-memory moves to push/pop them when we do "spill".
>
>no, the idea is the VM compiler keeps track of IL register use for the
>purpose of code generating N-tuple op codes and their register
>arguments. this is a pure IL design thing and has nothing to do with
>machine registers. at this level, register windows don't win IMO.

That quote is a little misleading. My point is that UNLESS real
machine registers are involved, all IL "registers" are 
in memory. Given that they are in memory, they should be grouped with,
and addressed via the same base as, the other "memory" that a sub is accessing.
(The sub will be accessing the stack (or its PAD if you like), and the 
op-stream for sure, and possibly a few hot globals.)

The IL is going to be CISC-ish - so treat it like an x86 where 
you operate on things where-they-are (e.g. "on the stack") 

   add 4,BP[4]

rather than RISC where you 

   ld BP[4],X
   add 4,X
   ST X,BP[4]
   
If "registers" are really memory the extra "moves" of a RISC scheme
are expensive.

What we _really_ don't want is the worst of both worlds:

   push BP[4];
   push 4
   add
   pop  BP[4] 

>
>i am thinking about writing a short psuedo code post about the N-tuple
>op codes and the register set design. the ideas are percolating in my
>brane.
>
>uri
-- 
Nick Ing-Simmons
who is looking for a new job see http://www.ni-s.u-net.com/




Re: Stacks & registers

2001-05-26 Thread Nick Ing-Simmons

Uri Guttman <[EMAIL PROTECTED]> writes:
>  NI> i.e. 
>  NI>  R4 = frame[N]
>  NI> is same cost as
>  NI>  R4 = per_thread[N]
>  NI> and about the same as
>  NI>  extern REGISTER GlobalRegs4 
>  NI>  R4 = GlobalRegs4;
>
>well, if there is no multithreading then you don't need the per_thread
>lookup. 

Well:
 (a) I thought the plan was to design threads in from the beginning this time.
 (b) I maintain that the cost is about the same as for global variables anyway.

The case for (b) is as follows:
on RISC hardware

R4 = SomeGlobal;

becomes two instructions:

loadhigh SomeGlobal.high,rp 
ld rp(SomeGlobal.low),R4

The C compiler will try and factor out the loadhigh instruction, leaving
you with an indexed load. In most cases 

ld rp(RegBase.low+4),R4

is just as valid and takes the same number of cycles, and there is normally
a form like

ld rp(rn),R4

which allows "indexing" by a variable amount.


On CISC machines, either there is an invisible RISC (e.g. Pentium)
which behaves as above, or you get something akin to the PDP-11, where indirection
reads a literal address via the "program counter":

move [pc+n],r4

In such cases 

move [regbase+n],r4 

is going to be just as fast - the issue is the need for a (real machine)
register to hold 'regbase'.

>and the window base is not accounted for. you would need 2
>indirections, the first to get the window base and the second to get the
>register in that window. 

No - you keep the window base "handy" and don't keep re-fetching it,
same way you keep "program counter" and "stack pointer" "handy".

Getting  
   window[N] 
is same cost as 
   next = *PC++; 

My point is that to avoid keeping too-many things "handy" window base
and stack pointer should be the same (real machine) register.

>i am just saying register windows don't seem to
>be any win for us and cost an extra indirection for each data access. my
>view is let the compiler keep track of the register usage and just do
>individual push/pops as needed when registers run out.

That makes sense if (and only if) virtual machine registers are real 
machine registers. If virtual machine registers are in memory then 
accessing them "on the stack" is just as efficient (perhaps more so)
than at some other "special" location. And it avoids need for 
memory-to-memory moves to push/pop them when we do "spill".
 
-- 
Nick Ing-Simmons
who is looking for a new job see http://www.ni-s.u-net.com/




Re: Stacks & registers

2001-05-24 Thread Nick Ing-Simmons

Alan Burlison <[EMAIL PROTECTED]> writes:
>> 1. When you call deep enough to fall off the end of the large register
>>file an expensive "system call" is needed to save some registers
>>at the other end to memory and "wrap", and then again when you
>>come "back" to the now-in-memory registers.
>
>Not a system call but a trap - they aren't the same thing (pedant mode off
>;-).  The register spill trap handler copies the relevant registers onto the
>stack - each stack frame has space allocated for this.

Pedant mode accepted - and I concur. But the trap handler is still significant
overhead compared to just doing the moves (scheduled) inline 
as part of normal code. So register windows win if you stay in bounds
but lose quite seriously if you have "active" deep calls.

(My own style is to write small functions rather than #define or inline,
for cache reasons - this has tended to make the above show up. I am delighted to 
say that _modern_ (Sun) SPARCs have deep enough windows even for me - 
but the SPARCstation 1+ and some of the low-cost CPUs didn't.)


>
>Alan Burlison
-- 
Nick Ing-Simmons
who is looking for a new job see http://www.ni-s.u-net.com/




Re: Stacks & registers

2001-05-24 Thread Nick Ing-Simmons

Uri Guttman <[EMAIL PROTECTED]> writes:
>>>>>> "NI" == Nick Ing-Simmons <[EMAIL PROTECTED]> writes:
>
>  NI> "We" need to decide where a perl6 sub's local variables are going
>  NI> to live (in the recursive case) - if we need a "stack" anyway it
>  NI> may make sense for VM to have ways of indexing the local "frame"
>  NI> rather than having "global" registers (set per thread by the way?)
>
>i made that thread point too in my long reply to dan.
>
>but indexing directly into a stack frame is effectively a register
>window. the problem is that you need to do an indirection through the
>window base for every access and that is slow in software (but free in
>hardware).

It isn't free in hardware either, but the cost may be lower.
Modern machines should be able to schedule indirection fairly efficiently.
But I would contend we are going to have at least one index operation
anyway - if only from the "thread" pointer, or "global base" - so 
with careful design, so that "registers" are at the right "offset" from the "base",
we can subsume the register lookup index into that.

i.e. 
 R4 = frame[N]
is same cost as
 R4 = per_thread[N]
and about the same as
 extern REGISTER GlobalRegs4 
 R4 = GlobalRegs4;



>
-- 
Nick Ing-Simmons
who is looking for a new job see http://www.ni-s.u-net.com/




Re: Stacks & registers

2001-05-23 Thread Nick Ing-Simmons

Graham Barr <[EMAIL PROTECTED]> writes:
>On Wed, May 23, 2001 at 10:30:32AM -0700, Hong Zhang wrote:
>> I think "stack based =~ register based". If we don't have Java-like "jsr" 
>
>That comment reminds me of how the register file is implemented in
>a sun sparc. They have a large register file, but only some are accessable
>at any given time, say 16. 

32 IIRC, but the principle is correct.

>When you do a sub call you place your
>arguments in the high registers, say 4, and shift the window pointer
>by 12 (in this case).  What was r12-r15 now becomes r0-r3. On return
>the result is placed into r0-r3 which are then available to the
>caller as r12-r15.
>
>This allows very efficient argument passing without having to save
>registers to a stack and restor them later.

That cost can be over-played - most compilers need "scratch" registers
which have values that don't need to be saved; if your RISC
is set up to use arg registers as scratch, then most of the save/restores
can be eliminated, and others can be hidden in delay slots or super-scalar
executed in parallel with other operations.

The problems with the SPARC scheme are:

1. When you call deep enough to fall off the end of the large register
   file an expensive "system call" is needed to save some registers
   at the other end to memory and "wrap", and then again when you
   come "back" to the now-in-memory registers.

2. The large file has a large decode which is in "logical" critical
   path - so all kinds of bypass tricks have to be played to get 
   speed back.

Neither is too bad for a virtual machine though.

"We" need to decide where a perl6 sub's local variables are going to live
(in the recursive case) - if we need a "stack" anyway it may make sense
for VM to have ways of indexing the local "frame" rather than having 
"global" registers (set per thread by the way?)

What I do agree with is that 
   push a
   push b
   add
   pop  r 

is a lousy way to code r = a + b - too much pointless copying. 
We want 
   add #a,#b,#r
where #a is a small number indexing into "somewhere" where a is stored.


>
>Graham.
-- 
Nick Ing-Simmons
who is looking for a new job see http://www.ni-s.u-net.com/




Re: PDD: Conventions and Guidelines for Perl Source Code

2001-05-10 Thread Nick Ing-Simmons

Alan Burlison <[EMAIL PROTECTED]> writes:
>
>I strongly agree.  The current macro mayhem in perl is an utter abomination,
>and drastically reduces the maintainability of the code.  I think the
>performance argument is largely specious, and while abstraction is a
>laudable aim, in the case of perl it has turned from abstraction into
>obfustification.

As I have said more than once before, excessive use of macros can 
be a performance killer. It is better to have slabs of common stuff
in a real function (which is cached) than replicated all over the 
place. That is the style I use in my own (whoops, sorry, TI's) code and 
it does not seem to "hurt" even on x86 CISC machines.

-- 
Nick Ing-Simmons
who is looking for a new job see http://www.ni-s.u-net.com/




Re: Tying & Overloading

2001-04-24 Thread Nick Ing-Simmons

Larry Wall <[EMAIL PROTECTED]> writes:
>Nick Ing-Simmons writes:
>: >You really have to talk about overloading boolean context
>: >in general.
>: 
>: Only if you are going to execute the result in the normal perl realm.
>: Consider using the perl parser to build a parse tree - e.g. one to 
>: read perl5 and write perl 6. This works for all expressions except
>: &&, || and ?: because perl5 cannot overload those - so 
>: 
>: $c = ($a && &b) ? $d : $e;
>: 
>: calls the bool-ness of $a and in the defered execution mode of a translator
>: it wants to return not true/false but "it depends on what $a is at run-time".
>: It cannot do that and is not passed $b so cannot return 
>
>I think using overloading to write a parser is going to be a relic of
>Perl 5's limitations, not Perl 6's.

I am _NOT_ using overloading to write a parser. 
Parse::Yapp is just fine for writing parsers. I am trying to re-use
a parser that already exists - perl5's parser. 

I am using overloading to get at the parse "tree" that the _existing_ parser 
has produced. 

So I can get at perly.y's : 

term:   ...
|   '!' term
|   term ADDOP term

etc. but NOT 

|   term ANDAND term
|   term OROR term
|   term '?' term ':' term

; 

I can get at the former because overload maps via newBINOP/newUNOP
just fine, I cannot get at latter group because newLOGOP/newCONDOP
don't do overloading.

What I _really_ want to do is a dynamically scoped peep-hole "optimize"
(actually a rewrite) of the op tree - written in perl.

But I can't do that, so I fake it by having 

sub construct (&) { ... }

and then 

construct { 
  # expression(s) here 
}

and have construct() "call" the ops with the overload stuff returning a tree.
These days I suppose one could use B:: to poke about in the CV


>
>Larry
-- 
Nick Ing-Simmons <[EMAIL PROTECTED]>
Via, but not speaking for: Texas Instruments Ltd.




Re: Tying & Overloading

2001-04-23 Thread Nick Ing-Simmons

Larry Wall <[EMAIL PROTECTED]> writes:
>: At 06:20 PM 4/20/2001 -0300, Filipe Brandenburger wrote:
>: >Please tell me if there really is an use for overloading && and || that 
>: >would not be better done with source filtering, then I will (maybe) 
>: >reconsider my opinion.
>
>I think it's a category error to talk about overloading && and ||,
>which are not really operators so much as they are control flow
>constructs.  

I want to be able to "overload" those as well ;-)

>You really have to talk about overloading boolean context
>in general.

Only if you are going to execute the result in the normal perl realm.
Consider using the perl parser to build a parse tree - e.g. one to 
read perl5 and write perl 6. This works for all expressions except
&&, || and ?: because perl5 cannot overload those - so 

$c = ($a && $b) ? $d : $e;

calls the bool-ness of $a, and in the deferred execution mode of a translator
it wants to return not true/false but "it depends on what $a is at run-time".
It cannot do that, and is not passed $b, so cannot return 

   new Operator::->('&&',$a,$b)




-- 
Nick Ing-Simmons <[EMAIL PROTECTED]>
Via, but not speaking for: Texas Instruments Ltd.




Re: Tying & Overloading

2001-04-23 Thread Nick Ing-Simmons

Filipe Brandenburger <[EMAIL PROTECTED]> writes:
>
>The big problem with || and && is that they don't evaluate their second 
>argument until it's needed, that's what allows us to do something like 
>`$xxx || die'. 

That is still a run-time thing though - so no real barrier to overloading it. 

>I remember reading one possible use of it to do a seamless 
>Perl to SQL query engine, like writing
>
>select($name eq 'FILIPE' && $salary > 10);
>
>would do a SQL query. (It wasn't actually like this, but it was something 
>like it). I haven't seen any use other than this for overloading && and ||, 
>please tell me if there is one.

I have an app which uses overload to cause perl to build a parse 
tree (of perl objects). Not being able to overload && and || is 
an irritant in this area, as VHDL and Verilog - which that parse tree is 
going to be translated to - both have equivalent constructs.

>
>Now, this kind of thing above, this would actually be done much better by a 
>pluggable parser (or better, a pluggable code generator, that takes the 
>abstract syntax tree of the parsed perl code), that instead of generating 
>perl bytecode (or whatever bytecode), generates a SQL query.

The aim of my scheme above is to use the perl parser - as a parser 
for a language the user is familiar with - and build the 
alien parse tree from it. 

>
>As I foresee it, dealing with the late evaluation needed for && and || 
>wouldn't be worth the little win of being able of defining the Perl<->SQL 
>translator with overload. 

Dealing with the late eval is easy - just let the overload code do it,
the same as for any other overload.

>
>Please tell me if there really is an use for overloading && and || that 
>would not be better done with source filtering, then I will (maybe) 
>reconsider my opinion.
>
>- Branden
-- 
Nick Ing-Simmons <[EMAIL PROTECTED]>
Via, but not speaking for: Texas Instruments Ltd.




Re: Split PMCs

2001-04-23 Thread Nick Ing-Simmons

Dan Sugalski <[EMAIL PROTECTED]> writes:
>At 07:39 PM 4/19/2001 +, [EMAIL PROTECTED] wrote:
>>Depends what they are. The scheme effectively makes the part "mandatory"
>>as we will have allocated space whether used or not.
>
>Well, we were talking about all PMCs having an int, float, and pointer 
>part, so it's not like we'd be adding anything. Segregating them out might 
>make things faster for those cases where we don't actually care about the 
>data. OTOH that might be a trivially small percentage of the times the 
>PMC's accessed, so...

What is the plan for arrays these days? - if the float parts 
of the N*100 entries in a perl5-oid AV were collected you might 
get "packed" arrays by the back door.

>
>>So it depends if access pattern means that the part is seldom used,
>>or used in a different way.
>>As you say works well for GC of PMCs - and also possibly for compile-time
>>or debug parts of ops but is not obviously useful otherwise.
>
>That's what I was thinking, but my intuition's rather dodgy at this level. 
>The cache win might outweigh other losses.
>
>> >I'm thinking that passing around an
>> >arena address and offset and going in as a set of arrays is probably
>> >suboptimal in general,
>>
>>You don't, you pass PMC * and have offset embedded within the PMC
>>then arena base is (pmc - pmc->offset) iff you need it.
>
>I was trying to avoid embedding the offset in the PMC itself. Since it was 
>calculatable, it seemed a waste of space.

But passing extra args around is fairly expensive when they are 
seldom going to be used. Passing an extra arg through N-levels is
going to consume instructions and N * 32 bits of memory or so.

>
>If we made sure the arenas were on some power-of-two boundary we could just 
>mask the low bits off the pointer for the base arena address. Evil, but 
>potentially worth it at this low a level.

That would work ;-)
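
For illustration, the masking trick would look something like this
(the size and names are mine, not parrot's):

    #include <stdint.h>

    #define ARENA_SIZE  (64 * 1024)        /* must be a power of two */

    static void *arena_base_of(void *pmc)
    {
        /* clear the low bits of the PMC's address to recover its arena */
        return (void *)((uintptr_t)pmc & ~(uintptr_t)(ARENA_SIZE - 1));
    }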

-- 
Nick Ing-Simmons <[EMAIL PROTECTED]>
Via, but not speaking for: Texas Instruments Ltd.




Re: PDD for code comments ????

2001-03-26 Thread Nick Ing-Simmons

David L . Nicol <[EMAIL PROTECTED]> writes:
>Jarkko Hietaniemi wrote:
>
>> Some sort of simple markup embedded within the C comments.  Hey, let's
>> extend pod!  Hey, let's use XML!  Hey, let's use SGML!  Hey, let's use
>> XHTML!  Hey, let's use lout!  Hey, ...
>
>Either run pod through a pod puller before the C preprocessor gets to
>the code, or figure out a set of macros that can quote and ignore pod.
>
>The second is Yet Another Halting Problem so we go with the first?
>
>Which means a little program to depod the source before building it,
>or a -HASPOD extension to gcc
>
>Or just getting in the habit of writing 
>
>/*
>=pod
>
>
>and
>
>=cut
>*/

Perhaps we could teach pod that /* was an alias for =pod
and */ an alias for =cut?


-- 
Nick Ing-Simmons <[EMAIL PROTECTED]>
Via, but not speaking for: Texas Instruments Ltd.




Re: Vtables: what do we know so far?

2001-02-02 Thread Nick Ing-Simmons

Edwin Steiner <[EMAIL PROTECTED]> writes:
>Filipe Brandenburger wrote:
>[...]
>> struct sv {
>> vtable_sv  * ptr_to_vtable;
>> void   * ptr_to_data;
>> void   * gc_data;
>> };
>[...]
>> I don't think I can get further from here. Note that, in all examples,
>> I didn't write the `this' pointer that every function would receive.
>> This would correspond to the `ptr_to_data' from the struct sv.
>
>I think the `this' pointer should be the SV* (== &ptr_to_vtable) so virtual functions 
>can themselves call virtual functions on the same object.

Definitely. It also allows them to change what ptr_to_data is for example.
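
A sketch of why that matters (the struct follows Filipe's layout above;
the helper and the iv_vtable are hypothetical):

    struct sv;

    typedef struct vtable_sv {
        long (*get_integer)(struct sv *self);
        void (*stringify)(struct sv *self);
    } vtable_sv;

    struct sv {
        vtable_sv *ptr_to_vtable;
        void      *ptr_to_data;
        void      *gc_data;
    };

    extern vtable_sv iv_vtable;             /* hypothetical */
    extern void *make_iv_body(long value);  /* hypothetical */

    /* a vtable method receiving the SV* itself can both re-dispatch
       through the object and change what ptr_to_data points at */
    static void stringify_by_upgrading(struct sv *self)
    {
        long value = self->ptr_to_vtable->get_integer(self);  /* virtual call */
        self->ptr_to_data   = make_iv_body(value);            /* swap body    */
        self->ptr_to_vtable = &iv_vtable;                      /* swap vtable  */
    }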

>
>-Edwin
-- 
Nick Ing-Simmons




Modular subsystem design (was Re: Speaking of signals...)

2001-01-11 Thread Nick Ing-Simmons

Filipe Brandenburger <[EMAIL PROTECTED]> writes:
> 
>But, back to the efficiency issue, I _THINK_ the scenario I described is not 
>inefficient. What it does differently from a monolithic system: it uses 
>callbacks instead of fixed function calls, and it doesn't inline the 
>functions. First, Callbacks take at most 1 cycle more than fixed function 
>calls (is this right???), 

No - a memory fetch can take a long time (10s of cycles).
Mostly that can be hidden by a pipeline, but branches (i.e. calls)
tend to expose it more. But we are already thinking of "vtables" which 
are no better.

>because the processor must fetch the code address 
>from an address of memory, instead of just branching to a fixed memory 
>address. Comparing to all the code Perl uses to handle SVs and such stuff, I 
>think 1 cycle wouldn't kill us at all! 
> 
>Well, inline functions _CAN_ make a difference if there are many calls to 
>one function inside a loop, or something like this. And this _CAN_ be a 
>bottleneck. 

Inline functions can also cost you - the out-of-line function 
may be in the cache, and the plethora of inline functions not in cache,
or extra code size thrashes cache.

>Well, I have one idea that keeps our design modular, breaks 
>dependencies between subsystems (like that of using async i/o system without 
>having to link to the whole thing), and achieves efficiency through inline 
>functions. We could develop a tool that works in the source code level and 
>does the inlining of functions for us. I mean a perl program that opens the 
>C/C++ source of the kernel, looks for pre-defined functions that should be 
>inlined, and outputs processed C/C++ in ``spaghetti-style'', very messy, 
>very human-unreadable, and very efficient. 

And already discussed ;-) 

-- 
Nick Ing-Simmons <[EMAIL PROTECTED]>
Via, but not speaking for: Texas Instruments Ltd.




Re: perl IS an event loop (was Re: Speaking of signals...)

2001-01-08 Thread Nick Ing-Simmons

Bart Lateur <[EMAIL PROTECTED]> writes:
>
>Apropos safe signals, isn't it possible to let perl6 handle avoiding
>zombie processes internally? What use does having to do wait() yourself,
>have anyway?

Valid point - perl could have a CHLD handler in C and stash away returned
status to pass to wait() when it did get called.
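(A rough sketch of what such a C handler might look like - the table size and
all names are made up here, this is not what perl5 actually does:)

    #include <signal.h>
    #include <sys/types.h>
    #include <sys/wait.h>

    #define MAX_REAPED 64

    static volatile sig_atomic_t n_reaped;
    static pid_t reaped_pid[MAX_REAPED];
    static int   reaped_status[MAX_REAPED];

    /* SIGCHLD handler: reap children now, stash (pid, status) so a later
       wait() from perl code can be satisfied from the stash. */
    static void chld_handler(int sig)
    {
        int   status;
        pid_t pid;
        (void) sig;
        while ((pid = waitpid(-1, &status, WNOHANG)) > 0) {
            if (n_reaped < MAX_REAPED) {
                reaped_pid[n_reaped]    = pid;
                reaped_status[n_reaped] = status;
                n_reaped = n_reaped + 1;
            }
        }
    }

    /* installed with e.g.:  signal(SIGCHLD, chld_handler); */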


-- 
Nick Ing-Simmons <[EMAIL PROTECTED]>
Via, but not speaking for: Texas Instruments Ltd.




Re: perl IS an event loop (was Re: Speaking of signals...)

2001-01-08 Thread Nick Ing-Simmons

Dan Sugalski <[EMAIL PROTECTED]> writes:
>At 01:02 PM 1/6/01 -0500, Uri Guttman wrote:
>>that is what i would expect form a simple flag test and every N tests
>>doing a full event poll. and even up to 5-10% slowdown i would think is
>>a good tradeoff for the flexibilty and ease of design win we get in the
>>i/o and event guts. but then, i have always traded off speed for
>>flexibility and ease. hey, so has perl! :)
>
>Not always. :) The flexibility really does need to balance out the speed 
>hit. (If Nick wasn't in the middle of rewriting the whole IO system, I'd 
>probably be assaulting sv_gets to make up for the speed hit I introduced 
>way back with the record reading code...)

Nick has yet to touch sv_gets() - partly 'cos it was too scary to mess
with - so you can if you like ;-)

-- 
Nick Ing-Simmons <[EMAIL PROTECTED]>
Via, but not speaking for: Texas Instruments Ltd.




Re: perl IS an event loop (was Re: Speaking of signals...)

2001-01-08 Thread Nick Ing-Simmons

Simon Cozens <[EMAIL PROTECTED]> writes:
>On Fri, Jan 05, 2001 at 11:42:32PM -0500, Uri Guttman wrote:
>>   SC> 5x slowdown. 
>> 
>> not if you just check a flag in the main loop. you only check the event
>> system if you have pending events or signals, etc. the key is not
>> checking all events on each pass thru the loop. 
>
>Which is exactly what Chip did in his safe-signals patch. 33% slowdown.

I don't believe it - can we add a stub test and benchmark it?

-- 
Nick Ing-Simmons <[EMAIL PROTECTED]>
Via, but not speaking for: Texas Instruments Ltd.




Re: standard representations

2001-01-08 Thread Nick Ing-Simmons

Nicholas Clark <[EMAIL PROTECTED]> writes:
>> sure if there are any non-two's complement machines out there anymore, 
>
>However, as perl5 has a few 2s complement assumptions already polluting the
>source, unless we can find a 1s complement (or other) machine to test on, it
>seems sensible (to me at least) to say that the initial implementation will
>assume 2s complement as we have nothing to test that we've got all the 2s
>complement assumptions out or the conditionally compiled non 2s complement
>code correct.

FWIW IEEE-754 Floating point isn't 2's complement for the mantissa.
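(A double is sign/exponent/magnitude - 1 sign bit, 11 exponent bits, 52
mantissa bits. A quick C illustration, assuming the platform's double is a
64-bit IEEE-754 one:)

    #include <stdio.h>
    #include <string.h>
    #include <stdint.h>

    int main(void)
    {
        double d = -1.5;
        uint64_t bits;
        memcpy(&bits, &d, sizeof bits);
        unsigned sign     = (unsigned)(bits >> 63);             /* sign-magnitude...      */
        unsigned exponent = (unsigned)((bits >> 52) & 0x7FF);   /* biased exponent        */
        uint64_t mantissa = bits & 0xFFFFFFFFFFFFFULL;          /* ...not 2's complement  */
        printf("sign=%u exponent=%u mantissa=0x%llx\n",
               sign, exponent, (unsigned long long)mantissa);
        return 0;
    }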

-- 
Nick Ing-Simmons <[EMAIL PROTECTED]>
Via, but not speaking for: Texas Instruments Ltd.




Re: Speaking of signals...

2001-01-05 Thread Nick Ing-Simmons

Uri Guttman <[EMAIL PROTECTED]> writes:
>
>but the question remains, what code triggers a signal handler? would you
>put a test in the very tight loop of the the op dispatcher? 

Not a test. The C level signal handler just fossicks with the variables
that very tight loop is using. 


>
>  n> But if "runops" looked like:
>
>  n> while (PL_op = PL_next_op)
>  n>  {
>  PL_op-> perform(); # assigns PL_next_op;
>  n>  }
>
>  n> (Which is essentially FORTH-like) then there is little to get in a mess.
>  n> The above is simplistic - we need a way to "disable interrupts" too.
>
>and where is the event test call made? 

It isn't. PL_next_op is set by C signal handler.

In practice I suspect we need the test :

while (PL_op = (PL_sig_op) ? PL_sig_op : PL_next_op)
 {
  PL_op->perform;
 }

>or somehow the next op delivered
>will be the next baseline op or the dispatch check op. that is basically
>the same as my ideas above, just a different style loop.

What I am trying to get to is adding minimal extra tests to the tight loop.
We probably need at least ONE test in the loop - let us try and make 
that usable for all the "abnormal" cases.
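(A minimal sketch of such a loop in C - perform(), PL_sig_op etc. are
illustrative names, not the real perl5 API; the signal op is expected to
chain back to the interrupted PL_next_op itself:)

    typedef struct op OP;
    struct op {
        void (*perform)(OP *self);   /* executes the op, assigns PL_next_op */
    };

    OP *PL_next_op;
    volatile OP *PL_sig_op;          /* set (only) from the C signal handler */

    void runops(void)
    {
        OP *PL_op;
        /* the only extra cost in the hot loop is the single PL_sig_op test */
        while ((PL_op = PL_sig_op ? (OP *)PL_sig_op : PL_next_op) != 0) {
            PL_sig_op = 0;
            PL_op->perform(PL_op);
        }
    }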


>
>uri
-- 
Nick Ing-Simmons <[EMAIL PROTECTED]>
Via, but not speaking for: Texas Instruments Ltd.




Re: Anyone want to take a shot at the PerlIO PDD?

2001-01-03 Thread Nick Ing-Simmons

Dan Sugalski <[EMAIL PROTECTED]> writes:
>Would someone like to take a crack at a PDD for the PerlIO system? It 
>doesn't need to be particularly fancy (nor complete) to start with, but 
>having one will give us a place to work from. (Waiting for me to spec it 
>out may take a while...)

I am willing to cast bleadperl5's PerlIO into the form of a _draft_ PDD
for perl6 - i.e. "this is what it does now", not "this is what it should do".

Then we can discuss it here some more.

-- 
Nick Ing-Simmons




Re: standard representations

2000-12-31 Thread Nick Ing-Simmons

Dan Sugalski <[EMAIL PROTECTED]> writes:
>> >That's fine. I was thinking of smaller processors that might be used in
>> >embedded apps and such. (I'm also not sure what's the most efficient
>> >integer representation on things like the ARM microprocessors are)
>>
>>ARM7/ARM9 are both 32-bit
>>MIPS has both 32-bit and 64-bit variants.
>
>That's good. Though do either of them have 16-bit data busses?

Not at the CPU, no - what happens at the chip boundary depends on what the
customer asks for.

The 68XXX in Palm-Pilots are the issue there.

>
>>DSPs are more messy.
>
>That's probably a bit too specialized a piece of hardware to worry about. 
>Unlss things have changed lately, they're not really general-purpose CPUs.

Some of them are.

>
>>It is micro-controllers that you have to worry about
>
>Yeak, I know a lot of the old 8 and 16 bit chips are in use as control 
>devices places. Those are the ones I'm thinking about. (Not that hard, but 
>I don't want to rule them out needlessly)

I suspect that any that are up to running anything approximating perl
will have 32-bit ops in a library in any case.

>
-- 
Nick Ing-Simmons




Re: standard representations

2000-12-30 Thread Nick Ing-Simmons

Dan Sugalski <[EMAIL PROTECTED]> writes:
>At 01:05 PM 12/29/00 +0000, Nick Ing-Simmons wrote:
>>Dan Sugalski <[EMAIL PROTECTED]> writes:
>> >
>> >I'm reasonably certain that all platforms that perl will ultimately run on
>> >can muster hardware support for 16-bit integers.
>>
>>Hmm, most modern RISCs are very bad at C-like 16-bit arithmetic - they have
>>a tendency to widen to 32-bits.
>
>That's fine. I was thinking of smaller processors that might be used in 
>embedded apps and such. (I'm also not sure what's the most efficient 
>integer representation on things like the ARM microprocessors are)

ARM7/ARM9 are both 32-bit
MIPS has both 32-bit and 64-bit variants.
DSPs are more messy.

It is micro-controllers that you have to worry about 

>
-- 
Nick Ing-Simmons




Re: standard representations

2000-12-30 Thread Nick Ing-Simmons

Dan Sugalski <[EMAIL PROTECTED]> writes:
>
>Anyone know of a good bigint/bigfloat library whose terms are such that we 
>can just snag the source and use it in perl?

There was some traffic on gcc list recently about a GNU one (presumably GPL
only).

>I don't really care to write 
>the code for division, 

As I recall Knuth has something on it.
I know that some hardware FPUs do division (N/M) by 
Newton-Raphson expansion of 1/M and then do N*(1/M).

>let alone the transcendental math ops...

TI's sources for those cite some book or other.
The snag with those and sqrt() etc. is that the published algorithms
"know" how many terms of power series are needed to reach (say) IEEE-754
"double". 

Thus a "big float" still needs to decide how precise it is going to 
be or atan2(1,1)*4 (aka PI) is going to take a while to compute...
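(For the record, the Newton-Raphson step is just x' = x*(2 - M*x), which
roughly doubles the number of correct digits per iteration. A toy C version,
with the iteration count simply hard-wired rather than derived from the
target precision:)

    #include <math.h>
    #include <stdio.h>

    /* Divide n by m via a Newton-Raphson approximation of 1/m.  m is first
       scaled into [0.5, 1) so a simple linear first guess converges. */
    static double nr_divide(double n, double m, int steps)
    {
        int e;
        double f = frexp(m, &e);        /* m = f * 2^e,  0.5 <= f < 1 */
        double x = 2.9142 - 2.0 * f;    /* rough first guess at 1/f   */
        while (steps--)
            x = x * (2.0 - f * x);      /* x' = x*(2 - f*x)           */
        return n * ldexp(x, -e);        /* n * (1/m)                  */
    }

    int main(void)
    {
        printf("%.15g\n", nr_divide(355.0, 113.0, 5));   /* ~3.14159292 */
        return 0;
    }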


-- 
Nick Ing-Simmons




Re: standard representations

2000-12-29 Thread Nick Ing-Simmons

Dan Sugalski <[EMAIL PROTECTED]> writes:
>
>BigInt and BigFloat are both pure perl, and as such their speed leaves a 
>*lot* to be desired. Fixing that (at least yanking some of it to XS) has 
>been on my ToDo list for a while, but other stuff keeps getting in the 
>way... :)

My own "evolutionary" view of things is that if we did XS versions 
of BigInt and BigFloat for perl5 we would learn some issues that might 
affect Perl6. i.e. the vtable entries for "ints" may be influenced 
by their use as building blocks for "floats".

For example the choice of radix in the BigInt case - should it be N*16-bits
or should we try and squeeze 32-bits - or to avoid issues with sign 
should that be 15 or 31? (If we assume we use 2's complement then LS words
are treated as unsigned; only the MS word has sign bit(s).)

BigFloat could well build on BigInt for its "mantissa" and have another
int-of-some-kind as its exponent. We don't need to pack it tightly
so we should probably avoid IEEE-like hidden MSB. The size of exponent 
is one area where "known range of int" is important.
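(For instance, with a radix of 2^16 per word the carry handling in the core
loops stays trivial, because digit+digit+carry always fits a 32-bit
accumulator. A hypothetical sketch, not a proposal for the real layout:)

    #include <stddef.h>
    #include <stdint.h>

    /* Add two equal-length magnitudes stored little-endian, 16 bits per
       "digit"; result must have room for n+1 digits.  Returns the length. */
    static size_t big_add(const uint16_t *a, const uint16_t *b,
                          size_t n, uint16_t *result)
    {
        uint32_t carry = 0;
        size_t i;
        for (i = 0; i < n; i++) {
            uint32_t sum = (uint32_t)a[i] + b[i] + carry;
            result[i] = (uint16_t)(sum & 0xFFFFu);
            carry     = sum >> 16;
        }
        result[n] = (uint16_t)carry;
        return n + (carry ? 1 : 0);
    }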

-- 
Nick Ing-Simmons




Re: standard representations

2000-12-29 Thread Nick Ing-Simmons

Dan Sugalski <[EMAIL PROTECTED]> writes:
>
>I'm reasonably certain that all platforms that perl will ultimately run on 
>can muster hardware support for 16-bit integers. 

Hmm, most modern RISCs are very bad at C-like 16-bit arithmetic - they have
a tendency to widen to 32-bits.

>I also expect that they 
>can all muster at least software support for 32-bit integers. However
>
>The issue isn't support, it's efficiency. Since we're not worrying about 
>loss of precision (as we will be upconverting as needed) the next issue is 
>speed, and that's where we want things to be in a platform convenient size.
>
>I honestly can't think of any reason why the internal representation of an 
>integer matters to the outside world, but if someone can, do please 
>enlighten me. :)

I can't think of anything except the range that is affected by the 
representation.

-- 
Nick Ing-Simmons




Re: standard representations

2000-12-29 Thread Nick Ing-Simmons

Dan Sugalski <[EMAIL PROTECTED]> writes:
>
>Strings can be of three types--binary data, platform native, and UTF-32. 
>No, we are not messing around with UTF-8 or 16, nor are we messing with 
>EBCDIC, shift-JIS, or any of that stuff. 

I don't understand that in the light of supporting "platform native".
That could easily be any of those as you note below. So what operations
are supported on "platform native" strings? Are we at the mercy of locale's
idea of upper/lower case, sort order etc.?

>Strings can be stored internally 
>that way (and the native form might be one of them) but as far as the 
>interface is concerned we have only three. Yes, this does mean if we mess 
>with strings in UTF-8 format on a non-UTF-8 system they'll need to be fed 
>out in UTF-32. It's bigger, but we can deal.

-- 
Nick Ing-Simmons




Re: String representation

2000-12-21 Thread Nick Ing-Simmons

Nicholas Clark <[EMAIL PROTECTED]> writes:
>> 
>> where it is possible to get "smart" when one arg is a "special case" of 
>> the other.
>
>> And similarly numbers must be convertable to "complex long double" or
>> what ever is the top if the built-in tree ? (NV I guess - complex is
>> over-kill.)
>
>> It is the how do we do the generic case that worries me.
>
>Maybe this is a digression, but it does suggest that there may not
>be 1 top to the tree (at least for builtin numbers). Which may also hold
>for strings.

Which is why it worries me. If I invent a new number type (say),
what vtable entries must it have to allow all the generic things
to function? Given a choice between NV/UV/IV possibles on what basis do we 
choose one branch over the other?


>
>> We old'ns need people that don't know "it can't be done" to tell us
>> how to do it - but we reserve the right to say "we tried that it didn't
>> work" too.
>  ^ because
>
>Nicholas Clark
-- 
Nick Ing-Simmons <[EMAIL PROTECTED]>
Via, but not speaking for: Texas Instruments Ltd.




Re: mixed numeric and string SVs.

2000-12-21 Thread Nick Ing-Simmons

David Mitchell <[EMAIL PROTECTED]> writes:
>> > 2. Each SV has 2 vtable pointers - one for it's numeric representation
>> > (if any), and one for its string represenation (if any). Flexible, but
>> > may require an extra 4/8 bytes per SV.
>> 
>> It may not be terrible. How big is the average SV already anyway?
>
>True, but I've just realised a complication with my suggestion. If
>there are a multiple vtable ptrs per SV, which type 'owns' the SV carcass,

Perl owns the carcass. Each vtable would have its own payload portion
and be responsible for its destruction and cleanup.
This is a classical "multiple inheritance" scheme.

>and is responsible for destruction, and has permission to put its
>own stuff in the payload area etc? I think madness might this way lie.
>
>So here's a modified suggestion. Rather than having 2 vtable ptrs per scalar,
>we allow a string type to contain an optional pointer to another
>subsidiary SV containing its numeric value. (And vice versa).

That would work too.

>
>Then for example the getint() method for a utf8 string type might look like:
>
>utf8_getint(SV *sv) {
>   if (sv->subsidiary_numeric_sv == NULL) {
>   sv->subsidiary_numeric_sv = Numeric->new(aton(sv->value));
>   }
>   return sv->subsidiary_numeric_sv->getint();
>}
>
>(uft8 stringgy methods that alter the string value of the SV are then
>responsible for either destroying the subsidiary numeric SV, or for making
>sure it's value gets updated, or for setting a flag warning that it's
>value needs recalculating.)
>
>Similarly, the stringy methods for numeric types are wrappers that
>optionally create a subsidiary string SV, then pass the call onto that
>object.
>
>Or to avoid the conditional each time, there could be 2 vtables for each
>type, containing 'with subsidiary' and 'without subsidiary' methods;
>the role of the latter being to create the subsidiary SV and update the
>type of the main SV to the 'with subsidiary' type.
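(A bare-bones C rendering of that subsidiary-value idea - every name below is
invented, and the constructor/accessor/parse helpers are assumed rather than
any real API:)

    typedef struct sv SV;

    typedef struct string_vtable {
        double (*get_num)(SV *self);
        /* ... the other string entries ... */
    } string_vtable;

    struct sv {
        string_vtable *vtbl;
        char          *string_value;
        SV            *subsidiary_numeric_sv;    /* NULL until first needed */
    };

    /* assumed helpers, not real functions */
    extern SV     *new_numeric_sv(double v);
    extern double  numeric_get_num(SV *numeric);
    extern double  parse_number(const char *s);   /* aton()-ish */

    /* get_num for a string type: build the numeric subsidiary lazily and
       forward to it; any string-mutating entry must drop or refresh it. */
    static double string_get_num(SV *self)
    {
        if (!self->subsidiary_numeric_sv)
            self->subsidiary_numeric_sv =
                new_numeric_sv(parse_number(self->string_value));
        return numeric_get_num(self->subsidiary_numeric_sv);
    }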
-- 
Nick Ing-Simmons <[EMAIL PROTECTED]>
Via, but not speaking for: Texas Instruments Ltd.




Re: String representation

2000-12-21 Thread Nick Ing-Simmons

Philip Newton <[EMAIL PROTECTED]> writes:
>On 18 Dec 00, at 15:21, Nick Ing-Simmons wrote:
>
>> There needs to be a hierachy of _repertoires_ such that:
>> 
>> ASCII is subset of Native is subset of wchar_t is subset of UNICODE.
>
>But we can't even rely on that. I can imagine a couple of Native 
>encodings around that fiddle with ASCII 

Then it isn't ASCII, it is ISO-646 or whatever.

-- 
Nick Ing-Simmons <[EMAIL PROTECTED]>
Via, but not speaking for: Texas Instruments Ltd.




Re: String representation

2000-12-20 Thread Nick Ing-Simmons
>> math ops on 
>> well understood hierachies etc. are all easy enough - it is the 
>> combinations that get very messy very very quickly. 
>
>I couldnt agree more - however, I think that issue is mostly orthogonal
>to whether most pp_ functions should have vtable equivalents. If the
>functionality is built dirrectly into pp_XXX, you still have a combinatorial
>mess to cope with - hiving off into vtables *may* reduce the mess, or
>*might* increase it, depending on how its done. 

If it "depends" then it isn't strictly "orthogonal".

>
>One final thing - I'm fairly new to this game (I thought the start of Perl6
>would be a good time to get involved, without having to understand
>the horrors of perl5 internals in depth), which means I run more of a risk
>than most of speaking from my derierre. So far I have been reluctant to
>put forward any really substantial suggestions as to how to handle
>all this stuff, mainly for fear of irritating people who know what
>they are talking about, and who have to take time out to explain to me why I'm
>wrong! On the other hand, I do seem to have ended up taking a lot about
>this subject on perl6-internals!!
>So, should I have the courage of my convictions and let rip, or should I
>just leave this to wiser people? Answers on a postcard, please

We old'ns need people that don't know "it can't be done" to tell us
how to do it - but we reserve the right to say "we tried that it didn't
work" too.


-- 
Nick Ing-Simmons




Re: mixed numeric and string SVs.

2000-12-20 Thread Nick Ing-Simmons

David Mitchell <[EMAIL PROTECTED]> writes:
>Has anyone given thought to how an SV can contain both a numeric value
>and string value in Perl6?
>Given the arbitrary number of numeric and string types that the vatble
>scheme of Perl6 support it will be unviable to to have special types
>for all permuations (eg, utf8_nv, unicode32_iv, ascii_bitint, ad nauseum).
>
>It seems to me the following options are poossible:
>
>1. We no longer save conversions, so
>   $i="3"; $j+=$i for (...);
>does an aton() or similar each time round the loop

Well just the 1st time - then it is a number...

>
>2. Each SV has 2 vtable pointers - one for it's numeric representation
>(if any), and one for its string represenation (if any). Flexible, but
>may require an extra 4/8 bytes per SV.

This is my favourite.

>
>3. We decree that all string to numeric conversions should return
>a particular numeric type (eg NV), and that all numeric to string
>conversions should similary convert to a fixed string type (eg utf8).
>(Although I'm not sure that really helps.)

I can't see how that helps.

-- 
Nick Ing-Simmons <[EMAIL PROTECTED]>
Via, but not speaking for: Texas Instruments Ltd.




Re: Garbage collector slowness

2000-12-20 Thread Nick Ing-Simmons

Mark-Jason Dominus <[EMAIL PROTECTED]> writes:
>> "The new version must be better because our gazillion dollar marketing
>> campaign said so.  (We didn't really *fix* anything.)  
>
>The part I found interesting was the part about elimination of the message.

printing messages can be surprisingly slow - if they go to 
unbuffered stderr which is an X window of some kind they can end up 
waiting for an ACK from the X server, which may have to wait for 
blanking and a move of a mega-pixel or two to do a scroll.

>
>Perceived slowness is also important.



-- 
Nick Ing-Simmons <[EMAIL PROTECTED]>
Via, but not speaking for: Texas Instruments Ltd.




Re: String representation

2000-12-18 Thread Nick Ing-Simmons

Jarkko Hietaniemi <[EMAIL PROTECTED]> writes:
>On Mon, Dec 18, 2000 at 03:21:05PM +0000, Nick Ing-Simmons wrote:
>> Simon Cozens <[EMAIL PROTECTED]> writes:
>> >
>> >So, before we start even thinking about what we need, it's time to look at the
>> >vexed question of string representation. How do we do Unicode without getting
>> >into the horrendous non-Latin1 cockups we're seeing on p5p right now? 
>> 
>> Well - my theorist's answer is that everything is Unicode - like Java.
>
>That would be nice, yes.
>
>> As I pointed out on p5p even EBCDIC machines can use that model - but 
>> the downside is that ord('A') == 65 which will breaks backward compatibility 
>> with EBCDIC scripts. 
>
>Maybe we need $ENV{PERL_ENCODING} to control ord() and chr(), too?

That was my suggestion last week some time - though not stated as clearly!

>
>> Tagging a string with a repertoire and encoding is horrible - you are aware 
>
>Indeed.  We have had a very rough ride trying to get just two
>encodings to play well together, trying to support more simultaneously
>would be pure combinatorial masochism.  I say we should strive for
>converting everything to/from one agreed-upon internal encoding.  Yes,
>this is somewhat counter to the idea 'no preferred internal encoding'.
>After pondering about the issue I have come around to "Oh, yes, there
>should be one preferred internal encoding.", otherwise we banish
>ourselves to much nashing of the teeth.  Off-hand, I think it's only
>when there would be information loss when the One True Encoding
>conversion shouldn't be done.  What's the OTE, then?  Well, UTF-16 or
>UTF-32, I guess.  The redeeming features of UTF-8, that it is 1:1 for
>ASCII, and also compact for ASCII, frankly are getting rather thing in
>my eyes.

But not in mine (yet) - but then IO is just throwing gobs of bytes about
and regexps are introspecting. (And Encode has to handle variable-length
multi-byte gunk anyway.) 

-- 
Nick Ing-Simmons




Re: String representation

2000-12-18 Thread Nick Ing-Simmons

David Mitchell <[EMAIL PROTECTED]> writes:
>Nick Ing-Simmons <[EMAIL PROTECTED]> wrote:
>> What are string functions in your view?
>>   m//
>>   s///
>>   join()
>>   substr
>>   index
>>   lc, lcfirst, ...
>>   & | ~
>>   ++
>>   vec  
>>   '.'
>>   '.='
>> 
>> It rapidly gets out of hand.
>
>Perhaps, but consider that somewhere within the perl internals there
>have to be functions which implement all these ops anyway. If we
>provide vtable slots for all these functions and just fill most of the
>slots with pointers to the 'default' Perl implementation, we havent
>really lost anything, except possibly a slight delay due to the extra
>indirection which that may be compensated for elsewhere). On the other
>hand, we have gained the ability to replace the default implementation
>with something more efficent where it suits us.

I have just been through exactly that process with the PerlIO stuff.
So I hope you will not take offence when I say that your observation above
is simplistic. The problem is "what are the (types of) the arguments passed
to the functions?" - the existing code will be expecting its args in 
a particular form. So your wonderous new function must accept exactly 
those args and types - and convert them as necessary before becoming 
more efficient. So to get any win the args/types of all the functions 
has to be designed with pluggable-ness in mind from the outset.
At best this means taking an indirection hit for all the args as well 
as the function (this is what PerlIO does - PerlIO is now essentially 
a FILE ** rather than a FILE *). 

At worst we have to write a "worst case" override entry for each op and 
then work what it needs back - this is exemplified by PerlIO_getpos()
the "position" arg had to stop being an Fpos_t and become an SV *
so that stdio could stuff an Fpos_t in it, but a transcoding layer
could put the Fpos_t, and the escape-state and partial characters in as 
well.


>
>Take the example of substr() - if this is a standalone function, then
>it has to work without reference to any of the internals of its args,
>and thus has to rely on extracting a 'standard' representation of the
>string value from the SV in order to operate upon it. This then implies
>messiness of coding and inefficiency, with all the unicode hell that
>infects perl5 re-appearing.  If substr() were a per-type op, then the
>messy details of UTF8 would lie almost completely within the internal
>implementation of that datatype.

True, but the messy details would now occur multiple times,
as soon as substr_utf8 exists then _ALL_ the other string ops 
_must_ be overridden as well because nothing but string_utf8 "class" 
knows what is going on.

>
>In fact, I would argue that in general most if not all the operations currently
>performed by pp_* should have vtable equivalents, both for numeric and string
>types (including unary ops, mutators, binops etc etc).

Hmm - that is indeed a logical position. 


>
>> Seriously - I think we need to considr the original question 
>> "What is the representation" based on perl5 hindsight, then think what 
>> operations we want to perform on it, then divide those into the ones
>> which make sense to be "methods" (vtable entries) of string, 
>> those that are part of string API, and those which are just ops messing 
>> with strings.
>
>If an "op messing with strings" might be able to do a faster job given
>access to the internals of that string type, then I'd argue that that op
>should be in the vtable too.

I can see your position. 

perl6 = Union_of(I32_perl, I64_perl, float_perl, double_perl, long_double_perl,
  ASCII_perl,  UTF8_perl, ShiftJis_perl, 
  Complex_rational_perl, right_to_left_perl,
)

or 

class perl
{
 virtual SV *add(SV *,SV *);
 ...
 virtual SV *y(SV *,SV *); 
}

The snag here is that the volume of code explodes and gets splattered 
all over the sub-classes. So to fix a bug in the '+' operator (pp_plus)
one has to go visit lots of places - but, presumably, the bug will 
only be in one of them.

If this is to fly (and I am not saying it cannot), then the 
"multiple despatch" issue needs to have a clean process so that 
it is clear what happens if someone writes:

  my $complex_rational = $urdu_string / sqrt(-$big_integer);

The string needs to get converted to a number knowing which characters
are digits and what the Urdu for 'i' is. The big integer needs to get 
negated (no sweat) then someone's sqrt() gets called and had better not 
barf on the -ve value, then complex_rational can do the right thing.

In other words - string ops on strings of uniform type, math ops on 
well understood hierarchies etc. are all easy enough - it is the 
combinations that get very messy very very quickly. 


 
   


-- 
Nick Ing-Simmons




Re: String representation

2000-12-18 Thread Nick Ing-Simmons

Nicholas Clark <[EMAIL PROTECTED]> writes:
>On Fri, Dec 15, 2000 at 11:18:00AM -0600, Jarkko Hietaniemi wrote:
>
>> As painful as it may sound (codingwise) I would urge to spare some
>> thought to using (internally) UTF-32 for those encodings for which
>> UTF-8 would be *longer* than the UTF-32 (mainly the Asian scripts).
>
>most CPUs can load a 32 bit quantity in 1 machine instruction
>most CPUs would take 2 or 3 machine instructions to load 2 or 3 bytes of
>variable length encoding, and I'd guess that on most RISC CPUs those
>three instructions take three times the space, 

Okay so far.

>(and take 3 times the
>single load instruction)

Almost certainly more than the single load, but much less than 3 
due to cache effects.

>And that's ignoring the code to bit shuffle those bytes that make up the
>character.
>
>So it may be more total space efficient to use 32 bits for data.
>And although it feels like we'll be shifting 32 bits of data round per
>character instead of 8-40 with an average less than 32, it might still take
>longer because we're doing it less efficiently.

My big worry is that such "strings" would fill the data cache much more quickly.


>
>Just a passing thought. Extrapolated up from 1 RISC CPU I know quite well.
>
>Nicholas Clark
-- 
Nick Ing-Simmons




Re: String representation

2000-12-18 Thread Nick Ing-Simmons

David Mitchell <[EMAIL PROTECTED]> writes:
>> Personally I would not use such a beast 
>
>But with different encodings implemented by different SV types - each with their
>own vtable - surely most of this will "come out in the wash", by the correct
>method automatically being called. I thought that was the big selling point
>of vtables :-)
>
>(Or to put it another way - is the debate about handling multiple string
>encodings really just the same debate as the handling of multiple numeric types
>(but harder...) ?)

It is exactly the same as the 
   enormous_int ** complex_rational  problem.

   if ("N{gamma}".title_case(join($klingon,@welsh)) =~ /$urdu/)

whose operators get called?


-- 
Nick Ing-Simmons




Re: String representation

2000-12-18 Thread Nick Ing-Simmons

Simon Cozens <[EMAIL PROTECTED]> writes:
>
>So, before we start even thinking about what we need, it's time to look at the
>vexed question of string representation. How do we do Unicode without getting
>into the horrendous non-Latin1 cockups we're seeing on p5p right now? 

Well - my theorist's answer is that everything is Unicode - like Java.
As I pointed out on p5p even EBCDIC machines can use that model - but 
the downside is that ord('A') == 65, which breaks backward compatibility
with EBCDIC scripts. 

If perl5.7+ EBCDIC continues down its alternate road
and we need to be able to translate perl5 -> perl6 I strongly suspect 
that perl6 cannot use the "java-oid" model either as the programmer's
intent will not be obvious enough to auto-translate.
I still haven't grasped what the current EBCDIC "model as seen by perl
programmer" _is_.

>Larry
>suggested aeons ago that everything is an array of numbers, and Perl shouldn't
>care what those numbers represent. But at some point, it has to, and that
>means things have to be tagged with their character repetoires and encodings.

Tagging a string with a repertoire and encoding is horrible - you are aware 
of the trickiness of even getting the SvUTF8 bit "right". To have 
a general representation carried around we need a pointer rather than just a bit
and we cannot say 
   if (SvUTF8(sv))

we have to say 

   if (SvENCODING(sv)->some_predicate)

e.g. 

   if (SvENCODING(sv_a) != SvENCODING(sv_b))
{
 if (SvENCODING(sv_a)->is_superset_of(SvENCODING(sv_b))
  {
   sv_upgrade_to(sv_b,SvENCODING(sv_a));
  }
 elsif if (SvENCODING(sv_b)->is_superset_of(SvENCODING(sv_a))
  {
   sv_upgrade_to(sv_a,SvENCODING(sv_b));
  }
 else
  {
   Encoding *x = find_superset_encoding(SvENCODING(sv_a),SvENCODING(sv_b))
   sv_upgrade_to(sv_a,x);
   sv_upgrade_to(sv_b,x);
  }
} 

Personally I would not use such a beast 

The only sane compromise I can imagine is close to what we have at the 
moment with maybe a few extra special cases in the "flags" bits:
   ASCII only   (0..7f)
   Native-single-byte   (iso8859-x, IBM1047)
   wchar_t 
   UTF-8
   UNICODE

There needs to be a hierarchy of _repertoires_ such that:

ASCII is subset of Native is subset of wchar_t is subset of UNICODE.


The "Native-single-byte" would have one - global-to-interpreter
encoding object - not just iso8859-1 - basically the one that LC_CTYPE
gives the "right answers for" - though how the  "£!$^¬!*% one is supposed 
to find that out is beyond me - so we would presumably invert that 
and use the Unicode CTYPE-oid stuff to do isALPHA() etc.

-- 
Nick Ing-Simmons <[EMAIL PROTECTED]>
Via, but not speaking for: Texas Instruments Ltd.




Re: String representation

2000-12-18 Thread Nick Ing-Simmons

David Mitchell <[EMAIL PROTECTED]> writes:
>
>Personally I feel that that string part of the SV API should include most
>(if not all) string functions, including regex matching and substitution.

What are string functions in your view?
  m//
  s///
  join()
  substr
  index
  lc, lcfirst, ...
  & | ~
  ++
  vec  
  '.'
  '.='

It rapidly gets out of hand.
  
Why not eval "$string" as well ? ;-)
then in the limit perl can just become eval scalar();

Seriously - I think we need to consider the original question 
"What is the representation" based on perl5 hindsight, then think what 
operations we want to perform on it, then divide those into the ones
which make sense to be "methods" (vtable entries) of string, 
those that are part of string API, and those which are just ops messing 
with strings.

>That way way there can be multiple regex implementations to handle different
>cases (eg  fast one(s) for fixed width ASCII, UTF-32 etc, and a slow horrible one
>for variable-length UTF-8, etc). Of course perl itself could provide a default regex
>engine usable by all string types, but implementors would then be free to add
>variants for custom string types.

I would argue one does that by making the regex API more modular.

-- 
Nick Ing-Simmons <[EMAIL PROTECTED]>
Via, but not speaking for: Texas Instruments Ltd.




Re: Opcodes (was Re: The external interface for the parser piece)

2000-12-12 Thread Nick Ing-Simmons

David Mitchell <[EMAIL PROTECTED]> writes:
>
>I think this this boils down to 2 important questions, and I'd be interested in
>hearing people's opinions of them.
>
>1. Does the Perl 6 language require some explicit syntax and/or semnatics to
>handle multiple and user-defined numeric types?
>Eg "my type $scalar",  "$i + integer($r1+$r2)" and so on.

That is a Language and not an internals issue - Larry will tell us.
But I suspect the answer is that it should "work" without any special 
stuff for simple perl5-ish types - because you need to be able to 
translate 98% of 98% of perl5 programs.

So we should start from the premise "no" and see where we get ...

>2. If the answer to (1) is yes, is it possible to decide what the numeric part of
>the vtable API should be until the details of (1) has been agreed on?
>
>I supect the answers are yes and no.

I suspect the answers are "no" and (2) is eliminated as "dead code" ;-)

>
>Dave.
-- 
Nick Ing-Simmons




Re: SvPV*

2000-11-28 Thread Nick Ing-Simmons

Dave Storrs <[EMAIL PROTECTED]> writes:
>On Tue, 21 Nov 2000, Jarkko Hietaniemi wrote:
>
>> Yet another bummer of the current SVs is that they poorly fit into
>> 'foreign memory' situations where the buffer is managed by something
>> else than Perl.  "No, thank you, Perl, keep your greedy fingers off
>> this chunk.  No, you may not play with it."
>
>
>   Out of curiousity, when might such a situation arise?  When you
>are embedding C in Perl, perhaps?

Or calling an external library which returns a pointer to data.
Right now we _have_ to copy it as there is no way to tell perl 
to (say) XFree() it rather than Safefree() it. Which is a pain when data
is big.

-- 
Nick Ing-Simmons <[EMAIL PROTECTED]>
Via, but not speaking for: Texas Instruments Ltd.




Re: Perl Implementation Language

2000-09-20 Thread Nick Ing-Simmons

Tom Hughes <[EMAIL PROTECTED]> writes:
>
>What I'd like to see us avoid is the current situation where trying
>to examine the value of an SV in the debugger is all but impossible
>for anybody other than a minor god.

What is so hard about:

gdb> call Perl_sv_dump(sv)

???


>
>Tom
-- 
Nick Ing-Simmons <[EMAIL PROTECTED]>
Via, but not speaking for: Texas Instruments Ltd.




Re: A tentative list of vtable functions

2000-09-14 Thread Nick Ing-Simmons

Nathan Torkington <[EMAIL PROTECTED]> writes:
>Dan Sugalski writes:
>> It's possible, for example, for a tied/overloaded/really-darned-strange 
>> variable to look true but still be false. If you do:
>> 
>>$foo = $bar || $baz;
>> 
>> and both $bar and $baz are objects, the 'naive' way is to make $foo be 
>> $bar. But it's distinctly possible that $bar really should be treated as a 
>> false value and $baz be used instead. Why? Dunno. Serious hand-waving here. 
>> (And yes, I know that's a danger sign... :) But I don't see any reason to 
>> preclude the possibility.
>
>You can do that right now in perl5, by using overload.pm and supplying
>a 'bool' method.

In practice both Damian and I have been bitten by the inability to overload || 
and && - you can indeed pick which side is kept but you cannot make 
it keep both. So "deferred" action is not possible.

I can make $a + $b return bless ['+',$a,$b],'OperatorNode' but you cannot
get $a && $b to produce bless ['&&',$a,$b],'OperatorNode' whatever you do. 

-- 
Nick Ing-Simmons




Re: A tentative list of vtable functions

2000-09-13 Thread Nick Ing-Simmons

Ken Fox <[EMAIL PROTECTED]> writes:
>Dan Sugalski wrote:
>> For something like:
>> 
>>@foo = @bar || @baz;
>> 
>> I have no problem with the call sequence looking like (pseudo-codish here):
>> 
>> set_context(ARRAY, ASSIGN);
>> foo->store(bar->log_or(bar, baz));
>
>But log_or must short circuit -- 

And what above suggests it does not?
It is up to bar's log_or not to evaluate baz if bar is considered true.

>I think we have to preserve that behavior
>for all types or the (hypothetical future) optimizer might break things.
>It might translate "if" statements into ||, or vice versa. It might do dead
>code elimination. It might do liveness analysis.

It already does and it is a pain when you are trying to give meaning
to && and || for overloaded objects.

I happened to have a 'need' for || / && which "short circuit later" i.e.

 my $result = $obj1 || $obj2;  # looks at both and builds "tree"

 # perhaps this $obj1->become_false; 

 if ($result)  { ... } # does $obj1->value || $obj2->value NOW

>
>IMHO syntax changes (like creating non-short circuiting logicals) 

Semantic not syntax. 

>need to
>be done above the opcode level before the optimizer or compiler sees anything.
>That allows the opcodes to have stable well-known semantics.

Agreed - but the vtable scheme above does not preclude that.

>
>- Ken
-- 
Nick Ing-Simmons <[EMAIL PROTECTED]>
Via, but not speaking for: Texas Instruments Ltd.




Re: one question about vtbls

2000-09-11 Thread Nick Ing-Simmons

Benjamin Stuhl <[EMAIL PROTECTED]> writes:
>I have one question about vtbls that I have not been able
>to figure out an answer to:
>
>  How does using a vtbl get rid of the switch(sv->sv_flags)
>with multi-valued scalars running around? That is, how does
>one write a vtbl function that can cope with the perl6
>equivalent of perl5's
>
>   sv_setiv(sv, 42);
>   sv_setpv(sv, 
> "The answer to Life, the Universe, and Everything");
>   SvIOK_on(sv);

More or less the same way you do in perl5 

- upgrade the sv to a dual-var type
- call the var's set_integer entry
- call the var's set_string entry

Then when you want the string form, call the var's get_string - it has to
check private internal flags I guess. The point is that 
even in perl these weird variables are rare. We have removed flag 
testing for the common cases and only have it where required. 

>
>?
>
>-- BKS
>
>__
>Do You Yahoo!?
>Yahoo! Mail - Free email you can access from anywhere!
>http://mail.yahoo.com/
-- 
Nick Ing-Simmons




Re: RFCs for thread models

2000-09-11 Thread Nick Ing-Simmons

Steven W McDougall <[EMAIL PROTECTED]> writes:
>
>My point is that we can't work with guesses and exercises.
>We need a specific, detailed proposal that we can discuss and
>evaluate. I'm hoping that someone will submit an RFC for one.

Start with perl5.6.0's ithreads model.

-- 
Nick Ing-Simmons




Re: RFCs for thread models

2000-09-11 Thread Nick Ing-Simmons

Steven W McDougall <[EMAIL PROTECTED]> writes:
>1. All threads execute the same op tree
>
>Consider an op, like
>
>   fetch(b)
>
>If you actually compile a Perl program, like
>
>   $a = $b
>   
>and then look at the op tree, you won't find the symbol "$b", or "b"
>anywhere in it. 

But it isn't very far away (at least for lexicals) ;-)

>The fetch() op does not have the name of the variable
>$b; rather, it holds a pointer to the value for $b.

It holds an index into the scratch-pad. Subs have scratch-pads 
which are cloned as needed during recursion etc. 

>
>If each thread is to have its own value for $b, then the fetch() op
>can't hold a pointer to *the* value. 

Each thread's view of the sub has its own scratch-pad - value is at same 
index in each.
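(Roughly, in C - all names below are hypothetical:)

    typedef struct sv SV;

    /* each thread owns its own clone of the sub's scratch-pad */
    typedef struct pad {
        SV **slots;
    } PAD;

    typedef struct thread_ctx {
        PAD *curpad;              /* this thread's pad for the current sub */
    } THREAD_CTX;

    /* the op stores only an index, never a pointer to the value itself */
    typedef struct op_padsv {
        int pad_ix;
    } OP_PADSV;

    static SV *fetch_lexical(const THREAD_CTX *t, const OP_PADSV *op)
    {
        return t->curpad->slots[op->pad_ix];   /* same index, per-thread value */
    }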

-- 
Nick Ing-Simmons




Re: A tentative list of vtable functions

2000-09-09 Thread Nick Ing-Simmons

Ken Fox <[EMAIL PROTECTED]> writes:
>Short
>circuiting should not be customizable by each type for example.

We are already having that argument^Wdiscussion elsewhere ;-)

But I agree variable vtables are not the place for that.

-- 
Nick Ing-Simmons




Re: RFC 178 (v2) Lightweight Threads

2000-09-09 Thread Nick Ing-Simmons

Chaim Frenkel <[EMAIL PROTECTED]> writes:
>
>NI> Indeed that is exactly how tied arrays work - they (automatically) add 
>NI> 'p' magic (internal tie) to their elements.
>
>Hmm, I always understood a tied array to be the _array_ not each individual
>element.

The perl level tie is on the array. That adds C level 'P' magic.
When you do an access to an array element the 'P' magic adds 'p' magic to 
a (proxy for) the element. The 'p' magic invokes FETCH or STORE.

It does not have to be that way - but it is in perl5.

>
>NI> Tk apps to this all the time :
>
>NI>  $parent->Lable(-textvariable => \$somehash{'Foo'});
>
>NI> The reference is just to get the actual element rather than a copy.
>NI> Tk then ties the actual element so it can see STORE ops and up date 
>NI> label.
>
>Would it be a loss to not allow the elements? The tie would then be
>to the aggregate.

Yes it would be a loss. If only one element in a 1,000 entry array 
is being watched by a widget that is a LOT of extra work checking 
accesses to the other 999 elements. You would also have to allow
1,000 ties on the SAME array so that 1,000 widgets could each watch 
one element. Or force higher level code to implement element watching
by watching the array and then re-despatching to the appropriate inner
"tie" - which is what we have now ;-)

>
>I might argue that under threading tieing to the aggregate may be 'more'
>correct for coherency (locking the aggregate before accessing.)

I don't disagree - the lock may well be best on the aggregate - but that does 
not mean the tie has to be. 

-- 
Nick Ing-Simmons




Re: Event model for Perl...

2000-09-09 Thread Nick Ing-Simmons

Grant M. <[EMAIL PROTECTED]> writes:
>I am reading various discussions regarding threads, shared objects,
>transaction rollbacks, etc., and was wondering if anyone here had any
>thoughts on instituting an event model for Perl6? I can see an event model
>allowing for some interesting solutions to some of the problems that are
>currently being discussed.

Yes - Uri has started [EMAIL PROTECTED] to discuss that stuff.


>Grant M.
-- 
Nick Ing-Simmons




Re: RFC 178 (v2) Lightweight Threads

2000-09-08 Thread Nick Ing-Simmons

Chaim Frenkel <[EMAIL PROTECTED]> writes:
>
>What tied scalar? All you can contain in an aggregate is a reference
>to a tied scalar. The bucket in the aggregate is a regular bucket. No?

A tied scalar is still a scalar and can be stored in an aggregate.

Well if you want to place that restriction on perl6 so be it but in perl5
I can say 

tie $a[4],'Something';

Indeed that is exactly how tied arrays work - they (automatically) add 
'p' magic (internal tie) to their elements.

Tk apps do this all the time:

 $parent->Label(-textvariable => \$somehash{'Foo'});

The reference is just to get the actual element rather than a copy.
Tk then ties the actual element so it can see STORE ops and update the
label.

-- 
Nick Ing-Simmons




Re: RFC 130 (v4) Transaction-enabled variables for Perl6

2000-09-08 Thread Nick Ing-Simmons

Bart Lateur <[EMAIL PROTECTED]> writes:
>On Wed, 06 Sep 2000 11:23:37 -0400, Dan Sugalski wrote:
>
>>>Here's some high-level emulation of what it should do.
>>>
>>> eval {
>>> my($_a, $_b, $c) = ($a, $b, $c);
>>> ...
>>> ($a, $b, $c) = ($_a, $_b, $_c);
>>> }
>>
>>Nope. That doesn't get you consistency. What you need is to make a local 
>>alias of $a and friends and use that.
>
>My example should have been clearer. I actually intended that $_a would
>be a variable of the same name as $a. It's a bit hard to write currently
>valid code that way. Second attempt:
>
>   eval {
>   ($a, $b, $c) = do {
>   local($a, $b, $c) = ($a, $b, $c); #or my(...)
>   ... # code which may fail
>   ($a, $b, $c);
>   };
>   };
>
>So the final assignment of the local values to the outer scoped
>variables will happen, and in one go, only if the whole block has been
>executed succesfully.

So what is wrong with (if you mean that) saying:

>>> eval {
>>> my($_a, $_b, $_c) = ($a, $b, $c);
>>> ...
lock $abc_guard;
>>> ($a, $b, $c) = ($_a, $_b, $_c);
>>> }

Then no one has to guess what is going on?

But what do you do if $b (say) is tied so that assign to it needs
a $abc_guard lock in another thread for assign to complete?
i.e. things get hairy in the "final assignment".

>
>I would simply block ALL other threads while the final group assignment
>is going on. This should finish typically in a few milliseconds.

So we "only" stall the other CPUs for a few million instructions each ;-)

>
>>It also means that if we're including *any* sort of external pieces (even 
>>files) in the transaction scheme we need to have some mechanism to roll 
>>back changes. If a transaction fails after truncating a 12G file and 
>>writing out 3G of data, what do we do?
>
>That does not belong in the kernel of a language. All that you may
>expect, is transactions on simple variables; plus maybe some hooks to
>attach external transaction code (transactions on files etc) to it. A
>simple "create a new file, and rename to the old filename when done"
>will usually do.

I am concerned that this is making "simple things easyish, BUT hard things 
impossible". i.e. we have a scheme which will be hard to explain, 
will only cover a few fairly uninteresting cases, and get in the 
way of doing it "properly".


-- 
Nick Ing-Simmons




Re: RFC 178 (v2) Lightweight Threads

2000-09-07 Thread Nick Ing-Simmons

Alan Burlison <[EMAIL PROTECTED]> writes:
>Nick Ing-Simmons wrote:
>
>> The tricky bit i.e. the _design_ - is to separate the op-ness from the
>> var-ness. I assume that there is something akin to hv_fetch_ent() which
>> takes a flag to say - by the way this is going to be stored ...
>
>I'm not entirely clear on what you mean here - is it something like
>this, where $a is shared and $b is unshared?
>
>   $a = $a + $b;
>
>because there is a potential race condition between the initial fetch of
>say $a and the assignment to it?  

>My response to this is simple - tough.  

That is mine too - I was trying to deduce why you thought op tree had to change.

I can make a weak case for 

   $a += $b;

Expanding to 

   a->vtable[STORE](DONE => 1) = a->vtable[FETCH](LVALUE => 1) + 
 b->vtable[FETCH](LVALUE => 0);
   
but that can still break easily if b turns out to be tied to something 
that also dorks with a.

-- 
Nick Ing-Simmons




Re: RFC 178 (v2) Lightweight Threads

2000-09-07 Thread Nick Ing-Simmons

Alan Burlison <[EMAIL PROTECTED]> writes:
>
>Another good reason for having separate interpreter instances for each
>thread is it will allow people to write non-threaded modules that can
>still be safely used inside a threaded program.  Let's not forget that
>the overwhelming bulk of CPAN modules will probably never be threaded. 
>By loading the unthreaded module inside a 'wrapper' thread in the
>program you can safely use an unthreaded module in a threaded program -
>as far as the module is concerned, the fact that there are multiple
>threads is invisible.  This will however require that different threads
>are allowed to have different optrees 

Why ? 

I assume because you need to use 'special ops' if the variables that 
are used happend to be 'shared'?

If so this is one area where I hope the vtable scheme is a clear win:
the 'op' does not need to know what sort of variable it is - it just 
calls the vtable entry - variable knows what sort it is and does the
right thing. 

The tricky bit i.e. the _design_ - is to separate the op-ness from the 
var-ness. I assume that there is something akin to hv_fetch_ent() which 
takes a flag to say - by the way this is going to be stored ...

>- perhaps some sort of 'copy on
>write' semantic should be used so that optrees can be shared cheaply for
>the cases where no changes are made to it.

I would really like to keep optrees (bytecode, IR, ...) readonly if
at all possible.

-- 
Nick Ing-Simmons <[EMAIL PROTECTED]>
Via, but not speaking for: Texas Instruments Ltd.




Re: RFC 178 (v2) Lightweight Threads

2000-09-07 Thread Nick Ing-Simmons

Alan Burlison <[EMAIL PROTECTED]> writes:
>Jarkko Hietaniemi wrote:
>
>> Multithreaded programming is hard and for a given program the only
>> person truly knowing how to keep the data consistent and threads not
>> strangling each other is the programmer.  Perl shouldn't try to be too
>> helpful and get in the way.  Just give user the bare minimum, the
>> basic synchronization primitives, and plenty of advice.
>
>Amen.  I've been watching the various thread discussions with increasing
>despair.  

I am glad it isn't just me !

And thanks for re-stating the interpreter-per-thread model.

>Most of the proposals have been so uninformed as to be
>laughable.  

-- 
Nick Ing-Simmons <[EMAIL PROTECTED]>
Via, but not speaking for: Texas Instruments Ltd.




Re: RFC 178 (v2) Lightweight Threads

2000-09-07 Thread Nick Ing-Simmons

Chaim Frenkel <[EMAIL PROTECTED]> writes:
>
>Some series of points (I can't remember what they are called in C)

Sequence points.

>where operations are consider to have completed will have to be
>defined, between these points operations will have to be atomic.

No, quite the reverse - absolutely no promises are made as to the state of
anything between sequence points - BUT - the state at the sequence 
points is _AS IF_ the operations between them had executed in sequence.

So it is not that the sub-operations _inside_ these points are atomic, but
rather that the whole sequence of operations between them behaves atomically.

The problem with big "atoms" is that if CPU A is doing a complex atomic
operation, CPU B has to stop working on perl and go find something else
to do until it finishes.
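(In C terms, a toy illustration:)

    void sequence_point_demo(void)
    {
        int b = 1;
        int a;

        /* a = b++ + b++;    <- undefined: no sequence point separates the
                                two modifications of b                      */

        a = b++;             /* each ';' ends a full expression, so ...     */
        a = a + b++;         /* ... the state here is as if the statements
                                ran strictly in order                       */
        (void) a;
    }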

>
>
-- 
Nick Ing-Simmons <[EMAIL PROTECTED]>
Via, but not speaking for: Texas Instruments Ltd.




Re: RFC 178 (v2) Lightweight Threads

2000-09-07 Thread Nick Ing-Simmons

Chaim Frenkel <[EMAIL PROTECTED]> writes:
>>>>>> "JH" == Jarkko Hietaniemi <[EMAIL PROTECTED]> writes:
>
>JH> Multithreaded programming is hard and for a given program the only
>JH> person truly knowing how to keep the data consistent and threads not
>JH> strangling each other is the programmer.  Perl shouldn't try to be too
>JH> helpful and get in the way.  Just give user the bare minimum, the
>JH> basic synchronization primitives, and plenty of advice.
>
>The problem I have with this plan, is reconciling the fact that a
>database update does all of this and more. And how to do it is a known
>problem, its been developed over and over again.

Yes - by the PROGRAMMER that does the database access code - that is far higher
level than typical perl code. 

That works if all your data lives in a database and you are prepared to lock
the database while you get/set it.

Sure we can apply that logic to making statements coherent in perl:

while (1)
 {
  lock PERL_LOCK; 
  do_statement;
  unlock PERL_LOCK;
 }

So ONLY 1 thread is ever _in_ perl at a time - easy!
But now _by constraint_ a threaded perl program can NEVER be a performance
win. 

The reason this isn't a pain for databases is they have other things
to do while they wait ...

-- 
Nick Ing-Simmons <[EMAIL PROTECTED]>
Via, but not speaking for: Texas Instruments Ltd.




Re: RFC 178 (v2) Lightweight Threads

2000-09-07 Thread Nick Ing-Simmons

Steven W McDougall <[EMAIL PROTECTED]> writes:
>> DS> Some things we can guarantee to be atomic. 
>
>> This is going to be tricky. A list of atomic guarentees by perl will be
>> needed.
>
>>From RFC 178
>
>...we have to decide which operations are [atomic]. As a starting
>point, we can take all the operators documented in C and
>all the functions documented in C as [atomic].

Presumably _ONLY_ in the absence of tie and overload:

use overload '.' => 'do_add';

sub do_add
{
 open(my $socket = "http://www. ...")
 ...
 
}

>
>
>- SWM
-- 
Nick Ing-Simmons <[EMAIL PROTECTED]>
Via, but not speaking for: Texas Instruments Ltd.




Re: RFC 130 (v4) Transaction-enabled variables for Perl6

2000-09-07 Thread Nick Ing-Simmons

Dlux <[EMAIL PROTECTED]> writes:
>| I've  deemed  to be  "too  complex".)  (Also  note  that I'm  not  a
>| database
>| guru, so  please bear with  me, and don't ask  me to write  the code
>| :-)
>
>Implementing threads  must be  done in  a very clever  way. It  may be
>put in  a shared library (mutex  handling code, locking, etc.),  but I
>think there  are more clevery  guys out  there who are  more competent
>in this, and I think it is covered with some other RFCs...

If amazingly clever threads handling is a requirement of this RFC 
then it is probably doomed. Multi-processing needs detailed explicit 
specifications to be done right - not vague requests.


>
>I also  don't like the overhead,  that's why I made  the "simple" mode
>default (look  at the "use  transaction" pragma again...).  This means
>NO  overhead,  

Not none, perhaps minimal ;-) - it has at least got to be looking 
at something the pragma can set.

>no  locking  between  threads:  this  can  be  used  in
>single-thread  or multi-process  environment. Other  modes CAN  switch
>on locking functions,  but this is not default! If  you implement that
>intelligently (separated .so  for the thread handling),  then it means
>minimal overhead (some more callback call, and that's all).

I would need to understand just where the thread hooks need to go.
So far my non-detailed reading suggests that the hooks are pretty 
fundamental.

-- 
Nick Ing-Simmons <[EMAIL PROTECTED]>
Via, but not speaking for: Texas Instruments Ltd.




Re: YAVTBL: yet another vtbl scheme

2000-09-06 Thread Nick Ing-Simmons

Benjamin Stuhl <[EMAIL PROTECTED]> writes:
>All -
>I fail to see the reason for imposing that all
>variables
>"know" how to perform ops upon themselves. An operation is 
>separate from the data it operates on. Therefore, I propose
>the following vtbl scheme, with two goals:
>  1. that the minimal vtbl be just that, minimal
>  2. that it be possible (convenient) to override ops as 
> needed

One of the goals vtables are supposed to serve is making overloaded
ops significantly faster than the perl5 'magic' scheme, so that 
PDL can overload matrices etc. and have them be "fast".

>First, a few basic types (these are sample only, and should
>be beaten on for cach-friendliness, etc. once a design is
>formalized).
>
>typedef struct _ovl {
>U32   ov_type;
>U32   ov_flags;
>void *ov_vtbl;
>void *ov_data;
>struct _ovl *ov_next;
>} OVERLOAD;
>
>typedef union {
>SCALAR_VTBL s;
>ARRAY_VTBL  a;
>HASH_VTBL   h;
>} SV_VTBL;
>
>typedef struct sv {
>void *sv_data;
>OVERLOAD *sv_magic;
>SV_VTBL  *sv_vtbl;
>U32  sv_flags; /* and type (SV, AV, HV) */
>(... GC stuff ... MT-safe stuff ...)
>} SV, *PMC;

That is perl5-ish - we now have to do an indirection via (slow) memory and a
chain search down the overload list.

>
>SV_VTBL, then, supports basic operations on perlish data
>types (get, store, and a few housekeeping things). Since
>noone (outside perl and libperl.so) should be directly
>calling vtbl functions, this makes it easy to put checks in
>that a variable is the appropriate type (ie, av_fetch will
>die if the variable is really a scalar).

The checks take time.

>
>SCALAR_VTBL:
>get_int
>get_string
>get_real
>get_ref
>num_sign /* positive or negative (or zero?)*/
>num_is_integral
>set_int
>set_string
>set_real
>set_ref
>set_multival /* == perl5ish 
>sv_setpv(sv...);
>sv_setiv(sv,...);
>SvPOK_on(sv); (esp this part)
>  */
>undef
>construct
>finalize
>
>
>In order to allow overriding of opcodes for, say, BigInts,
>several types of OVERLOAD are defined (4 basic types (flags
>in bottom byte of ov_type?) 

I would rather avoid these oh-so-clever squeeze-it-into-N-bits tricks
unless _WELL_ hidden, so we can change our mind when we find we need N+1.

>are defined, based on what
>flavor of vtbl is in ov_vtbl). These are OV_GET, OV_SET,
>OV_RANDOM, OV_OPS and are denoted in sv->sv_flags. The
>first three correspond to the perl5 GMG, SMG, and RMG. 

The snag with GET/SET is that ops like += or .= are forced
to do both rather than something efficient. 
And what exactly goes in the "random" bucket ? It is far from clear in perl5.

>The
>last marks that the vtbl is an overload of one or more
>opcodes. Every op checks to see if it is overloaded, and if
>it is, calls that. Some ops don't need to (ie, vec() can
>just do a set_string and add an OVERLOAD for the bitwise
>ops).
>
>If necessary, additional subclasses of OV_OPS may be
>defined (ie, OV_NUMERIC, OV_STRING, OV_IO).
>
>-- BKS
>
>__
>Do You Yahoo!?
>Yahoo! Mail - Free email you can access from anywhere!
>http://mail.yahoo.com/
-- 
Nick Ing-Simmons <[EMAIL PROTECTED]>
Via, but not speaking for: Texas Instruments Ltd.




Re: RFC 178 (v2) Lightweight Threads

2000-09-06 Thread Nick Ing-Simmons

Chaim Frenkel <[EMAIL PROTECTED]> writes:
>>>>>> "DS" == Dan Sugalski <[EMAIL PROTECTED]> writes:
>
>DS> I'd definitely rather perl not do any sort of explicit user-level locking. 
>DS> That's not our job, and there be dragons.
>
>Please explain how this is possible?
>
>Does this mean that without user specifying a lock, perl will allow
>a chaotic update pattern to be visible to the user?

If the user asks for chaos, chaos is what they get.

>
>   thread Athread B 
>   push(@foo, $bar);   ++$bar;
>
>or
>   $foo{$bar} = $baz;  delete $foo{$bar++};
>
>Will there be some sort of coherence here?

The snag with attempting to automate such things is illustrated by : 

thread Athread B 

$a = $a + $b++;   $b = $b + $a++;

So we need to 'lock' both $a and $b both sides.
So thread A will attempt to acquire locks on $a,$b (say)
and (in this case by symmetry but perhaps just by bad luck) thread B will 
go for locks on $b,$a - the opposite order. They then both get the 1st lock 
they wanted and stall waiting for the 2nd. We are then in 
a "classic" deadly embrace.

So the 'dragons' that Dan alludes to are those of intuiting the locks
and the sequence of the locks to acquire, deadlock detection and backoff, ...

-- 
Nick Ing-Simmons <[EMAIL PROTECTED]>
Via, but not speaking for: Texas Instruments Ltd.




Re: Profiling

2000-09-05 Thread Nick Ing-Simmons

<[EMAIL PROTECTED]> writes:
>> 
>> Anyone surprised by the top few entries:
>
>Nope. It looks close to what I saw when I profiled perl 5.004 and 5.005
>running over innlog.pl and cleanfeed. The only difference is the method
>stuff, since neither of those were OO apps. The current Perl seems to
>spend most of its time in the op dispatch loop and in dealing with
>internal data structures.

What initially surprised me is why the op-despatch loop spends so long in 'self' code
when there is so little of it. My assumption is this is where we see 
the "cache miss" time. 
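
For illustration, a toy model of an op-dispatch loop (all names invented;
this is not perl's actual runops code): the loop body is a single indirect
call, so almost any time a profiler charges to it is call overhead plus the
cache misses on the op tree and data the pp functions drag in.

#include <stdio.h>

typedef struct op op;
struct op {
    op *(*pp)(op *self);   /* a "pp" function returns the next op to run */
    op  *next;
    long arg;
};

static long accumulator;

static op *pp_add(op *self)  { accumulator += self->arg; return self->next; }
static op *pp_done(op *self) { (void)self; return NULL; }

/* The entire dispatch loop: there is almost no code of its own here. */
static void runops(op *o)
{
    while (o != NULL)
        o = o->pp(o);
}

int main(void)
{
    op done = { pp_done, NULL, 0 };
    op add2 = { pp_add, &done, 2 };
    op add1 = { pp_add, &add2, 40 };

    runops(&add1);
    printf("%ld\n", accumulator);   /* prints 42 */
    return 0;
}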

-- 
Nick Ing-Simmons <[EMAIL PROTECTED]>
Via, but not speaking for: Texas Instruments Ltd.




Re: "Counting the birds" :")

2000-09-04 Thread Nick Ing-Simmons

Raptor <[EMAIL PROTECTED]> writes:
>What is interesting to me :
>
>1. "push" is used more than any of the other array ops, even than "shift"
>2. "use" is very good candidate for speedup
>3. We still use very much "goto" :")
>4. "each" is used more than "values" and "keys"
>5. Things like "hex,chr,oct,atan2" are used very rarely
>6. "pack" and "unpack" are also used very rarely, "study" -
>the same number of times.
>
>We can make similar thing for the whole CPAN.
>What will this give to us :
>1. It will help us to decide which of the operators are mostly used
>   (CPAN is suitable for this) so then we can take care
>   to speed up only mostly used ops in the new Perl6 (or Perl5)
>   (current script doesn't care about the "weight" of the ops i.e.
>it doesn't count how many times any op will be used in REAL LIFE f.e
>some op may execute 10 times during the life of the module but other can
>be
>executed only once. They are both counted as "ONE time" execution)

This is the snag with this - the ratio may not be 1 : 10
but, as in the profile example I sent at the weekend, 1 : 5,833,600.

Thus although 'use' occurs a lot, as (by definition!) it is only executed once
it is less important than it looks.

-- 
Nick Ing-Simmons <[EMAIL PROTECTED]>
Via, but not speaking for: Texas Instruments Ltd.




Profiling

2000-09-02 Thread Nick Ing-Simmons


This is from a perl5.7.0 (well the current perforce depot) compiled
with -pg and then run on a smallish example of my heavy OO day job app.

The app reads 7300 lines of "verilog" and parses it with (tweaked) Parse-Yapp
into tree of perl objects, messes with the parse tree and then calls
a method to write verilog back out again.

It isn't your typical perl app but it is one I am interested in speeding 
up. (Maybe even in perl5.)  

Anyone surprised by the top few entries:

Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
  7.38  2.46 2.46    21899 0.11 0.35  Perl_runops_debug
  6.24  4.54 2.08  4236752 0.00 0.00  Perl_sv_setsv
  4.77  6.13 1.59  2504451 0.00 0.00  Perl_hv_fetch_ent
  4.77  7.72 1.59  1185614 0.00 0.00  Perl_pp_entersub
  4.29  9.15 1.43  7665411 0.00 0.00  Perl_pp_padsv
  3.54 10.33 1.18  3560912 0.00 0.00  Perl_sv_upgrade
  3.30 11.43 1.10  2957237 0.00 0.00  Perl_sv_clear
  2.79 12.36 0.93  2192866 0.00 0.00  Perl_leave_scope
  2.70 13.26 0.90  5833600 0.00 0.00  Perl_pp_nextstate
  2.49 14.09 0.83  7528270 0.00 0.00  Perl_sv_free
  2.28 14.85 0.76  1050070 0.00 0.00  S_method_common
  2.22 15.59 0.74  9310671 0.00 0.00  Perl_pad_sv
  2.01 16.26 0.67  1460489 0.00 0.00  Perl_pp_helem
  1.77 16.85 0.59  2083856 0.00 0.00  Perl_pp_rv2av
  1.50 17.35 0.50  1682859 0.00 0.00  Perl_pp_rv2hv
  1.32 17.79 0.44  3708512 0.00 0.00  Perl_pp_pushmark
  1.32 18.23 0.44   274452 0.00 0.00  S_regmatch
  1.29 18.66 0.43   445172 0.00 0.00  Perl_pp_return
  1.23 19.07 0.41   513143 0.00 0.00  Perl_pp_aassign
  1.20 19.47 0.40   876634 0.00 0.00  Perl_sv_2nv
  1.17 19.86 0.39   406441 0.00 0.00  Perl_pp_iter
  1.11 20.23 0.37   292595 0.00 0.00  Perl_pp_match
  0.99 20.56 0.33   777388 0.00 0.00  Perl_av_fetch
  0.93 20.87 0.31   946209 0.00 0.00  Perl_pp_rv2sv
  0.90 21.17 0.30   666441 0.00 0.00  Perl_pp_leavesub
  0.87 21.46 0.29   739312 0.00 0.00  Perl_pp_enter
  0.84 21.74 0.28  2552651 0.00 0.00  Perl_pp_const
  0.84 22.02 0.28   624328 0.00 0.00  Perl_pp_padav
  0.78 22.28 0.26   613095 0.00 0.00  Perl_pp_aelem
  0.78 22.54 0.26   611990 0.00 0.00  Perl_av_store
  0.78 22.80 0.26   271953 0.00 0.00  Perl_hv_exists_ent
  0.75 23.05 0.25   890560 0.00 0.00  Perl_sv_setiv
  0.75 23.30 0.25   352878 0.00 0.00  Perl_hv_fetch
  0.72 23.54 0.24   812630 0.00 0.00  Perl_av_shift
  0.69 23.77 0.23  1533815 0.00 0.00  Perl_sv_mortalcopy
  0.69 24.00 0.23  1077589 0.00 0.00  Perl_free_tmps
  0.66 24.22 0.22  1116474 0.00 0.00  Perl_pp_sassign
  0.63 24.43 0.21  2364080 0.00 0.00  Perl_push_scope
  0.63 24.64 0.21  1518141 0.00 0.00  Perl_pp_and
  0.63 24.85 0.21   973339 0.00 0.00  Perl_safesysmalloc
  0.63 25.06 0.21   714053 0.00 0.00  Perl_sv_grow
  0.60 25.26 0.20  1353534 0.00 0.00  Perl_pp_gv
  0.60 25.46 0.20   499163 0.00 0.00  Perl_pp_leave
  0.60 25.66 0.20   328362 0.00 0.00  Perl_sv_eq
  0.57 25.85 0.19  2034488 0.00 0.00  Perl_pop_scope
  0.57 26.04 0.19   812393 0.00 0.00  Perl_pp_shift
  0.57 26.23 0.19   635329 0.00 0.00  Perl_pp_or
  0.57 26.42 0.19   270181 0.00 0.00  Perl_regexec_flags
  0.54 26.60 0.18  1636750 0.00 0.00  Perl_save_clearsv
  0.54 26.78 0.18  1043850 0.00 0.00  Perl_pp_method_named
  0.54 26.96 0.18   404016 0.00 0.00  Perl_amagic_call
  0.51 27.13 0.17  1148905 0.00 0.00  Perl_newSV
  0.48 27.29 0.16  2159235 0.00 0.00  Perl_save_int
  0.48 27.45 0.16  2125024 0.00 0.00  Perl_vivify_ref
  0.48 27.61 0.16    65311 0.00 0.00  Perl_pp_enteriter
  0.45 27.76 0.15   387333 0.00 0.00  Perl_pp_push
  0.45 27.91 0.15   304028 0.00 0.00  Perl_sv_setpvn
  0.42 28.05 0.14    92967 0.00 0.00  Perl_share_hek
  0.39 28.18 0.13   777688 0.00 0.00  Perl_pp_cond_expr
  0.39 28.31 0.13   110898 0.00 0.00  Perl_gv_fetchpv
  0.36 28.43 0.12   239937 0.00 0.00  Perl_av_clear
  0.36 28.55 0.12    55061 0.00 0.00  Perl_pp_unpack



-- 
Nick Ing-Simmons




Re: A tentative list of vtable functions

2000-09-01 Thread Nick Ing-Simmons

Dan Sugalski <[EMAIL PROTECTED]> writes:
>is_equal (true if this thing is equal to the parameter thing)
>is_same (True if this thing is the same thing as the parameter thing)

is_equal in what sense? (String, Number, ...)

and how is is_same different from just comparing addresses of the things?

-- 
Nick Ing-Simmons




Re: A tentative list of vtable functions

2000-09-01 Thread Nick Ing-Simmons

David L . Nicol <[EMAIL PROTECTED]> writes:
>Dan Sugalski wrote:
>> 
>> Okay, here's a list of functions I think should go into variable vtables.
>
>All the math functions are in here.  Can the entries that my type does
>not use be replaced with other functions that my type does use?

NO ! 

-- 
Nick Ing-Simmons <[EMAIL PROTECTED]>
Via, but not speaking for: Texas Instruments Ltd.





Re: RFC 155 - Remove geometric functions from core

2000-08-30 Thread Nick Ing-Simmons

Dan Sugalski <[EMAIL PROTECTED]> writes:
>At 07:32 PM 8/29/00 +0000, Nick Ing-Simmons wrote:
>>David L . Nicol <[EMAIL PROTECTED]> writes:
>> >
>> >Did I not just describe how a .so or a DLL works currently?
>>
>>And behind the scenes that does something akin to:
>>
>>int fd = open("file_of_posn_independent_byte_code",O_RDONLY);
>>struct stat st;
>>fstat(fd,&st);
>>code_t *code = mmap(NULL,st.st_size,PROT_READ,MAP_SHARED,fd,0);
>>close(fd);
>
>Don't forget the fixup work that needs to be done afterwards. Loading the 
>library into memory's only the first part--after that the loader needs to 
>twiddle with transfer vectors and such so the unresolved calls into the 
>routines in the newly loaded library get resolved.

I finessed the "fixup work" by saying "position independent byte code".
The fixups break the shareability of the pages, which is why you compile
shared libs with -fPIC. So we should strive to have minimal fixups and
collect them in one place (which vtables do very nicely).


-- 
Nick Ing-Simmons




RE: how small is small? (was Re: RFC 146 (v1) Remove socket funct ions from core)

2000-08-30 Thread Nick Ing-Simmons

Garrett Goebel <[EMAIL PROTECTED]> writes:
>How small?
>
>I'd like to get barebones Perl and Linux on a 1.44MB floppy...
>
>If it is currently possible, 

It was possible. At the first Perl Conference there was a ZigZag app demoed
which came as console-only Linux + perl + app on one floppy.

-- 
Nick Ing-Simmons




Re: RFC 146 (v1) Remove socket functions from core

2000-08-30 Thread Nick Ing-Simmons

David L . Nicol <[EMAIL PROTECTED]> writes:
>Nick Ing-Simmons wrote:
>
>> We need to distinguish "module", "overlay", "loadable", ... if we are
>> going to get into this type of discussion. Here is my 2¢:
>> 
>> Module   - separately distributable Perl and/or C code.  (e.g. Tk800.022.tar.gz)
>> Loadable - OS loadable binary e.g. Tk.so or Tk.dll
>> Overlay  - Tightly coupled ancillary loadable which is no use without
>>its "base"  - e.g. Tk/Canvas.so which can only be used
>>when a particular Tk.so has already been loaded.
>
>I know I've got helium Karma around here these days but I don't like
>"overlay" it is reminiscent of old IBM machines swapping parts of the
>program out because there isn't enough core.  

Which is exactly why I chose it - the places these things make sense are
on little machines where memory is at a premium.

>Linux modules have
>dependencies on each other and sometimes you have to load the more basic
>ones first or else get symbol-undefined errors.  So why not follow
>that lead and call Overlays "dependent modules."

A. The name is too long.
B. That does not have the same "feel" as what we have.

>
>If a dependent module knows what it depends on, that module can be
>loaded on demand for the dependent one.

But - like old-style overlays - our add-ons are going to be loaded on demand
by the parent and only depend on the parent.

e.g. perl discovers it needs getpwuid(), so it loads the thing
that has those functions.

We are not going to be in the middle of getpwuid() and decide we need perl...
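
A minimal sketch of that demand-loading with plain dlopen()/dlsym() (the
"PwEnt.so" loadable and the "my_getpwuid" symbol are invented names, not an
actual perl interface):

#include <dlfcn.h>
#include <pwd.h>
#include <stdio.h>
#include <sys/types.h>

typedef struct passwd *(*getpwuid_fn)(uid_t);

/* First use: open the loadable that provides the function and cache the
 * pointer.  The loadable depends only on its parent, never the reverse. */
static getpwuid_fn load_getpwuid(void)
{
    static getpwuid_fn fn;
    if (fn == NULL) {
        void *handle = dlopen("PwEnt.so", RTLD_NOW);
        if (handle != NULL)
            fn = (getpwuid_fn)dlsym(handle, "my_getpwuid");
    }
    return fn;
}

int main(void)
{
    getpwuid_fn f = load_getpwuid();
    if (f != NULL) {
        struct passwd *pw = f(0);
        if (pw != NULL)
            printf("uid 0 is %s\n", pw->pw_name);
    } else {
        fprintf(stderr, "no loadable: %s\n", dlerror());
    }
    return 0;
}

(On most UNIXes this just needs -ldl or equivalent at link time; Win32 would
use LoadLibrary()/GetProcAddress() instead.)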


-- 
Nick Ing-Simmons <[EMAIL PROTECTED]>
Via, but not speaking for: Texas Instruments Ltd.




Re: RFC 155 - Remove geometric functions from core

2000-08-29 Thread Nick Ing-Simmons

Sam Tregar <[EMAIL PROTECTED]> writes:
>On Tue, 29 Aug 2000, Nick Ing-Simmons wrote:
>> David L . Nicol <[EMAIL PROTECTED]> writes:
>> >
>> >does sysV shm not support the equivalent security as the file system?
>> 
>> mmap() has the file system.
>
>I wasn't aware that mmap() was part of SysV shared memory. 

It is NOT. It is another (POSIX) way of getting shared memory between
processes. Even without MAP_SHARED the OS will share un-modified pages
between processes.

It happens to be the way modern UNIX implements "shared .text",
i.e. the ".text" part of the object file is mmap()'ed into
each process.

>My
>mistake?  It's not on the SysV IPC man pages on my Linux system.  The mmap
>manpage doesn't mention SysV IPC either.

SysV IPC is a mess IMHO. 

My point was that if the "file system" is considered
sufficient then mmap()ing file system objects will get you "shared code"
or "shared data" without any tedious reinventing of wheels.

-- 
Nick Ing-Simmons




Re: RFC 155 - Remove geometric functions from core

2000-08-29 Thread Nick Ing-Simmons

David L . Nicol <[EMAIL PROTECTED]> writes:
>
>does sysV shm not support the equivalent security as the file system?

mmap() has the file system.

>
>Did I not just describe how a .so or a DLL works currently?

And behind the scenes that does something akin to:

int fd = open("file_of_posn_independent_byte_code",O_RDONLY);
struct stat st;
fstat(fd,&st);
code_t *code = mmap(NULL,st.st_size,PROT_READ,MAP_SHARED,fd,0);
close(fd);

strace (linux) or truss (solaris) will show you what I mean.

And then trusts the OS to honour MAP_SHARED.  (mmap() is POSIX.)

Win32 has "something similar" but I don't remember the function names off
hand.

Or you can embed your bytecode in 

const char script[] = {...};

and link/dlopen() it and then you have classical shared text.



-- 
Nick Ing-Simmons




The evils of #define ...

2000-08-29 Thread Nick Ing-Simmons

Jarkko Hietaniemi <[EMAIL PROTECTED]> writes:
>On Tue, Aug 29, 2000 at 01:46:17AM -, [EMAIL PROTECTED] wrote:
>> 
>> This is a build failure report for perl from [EMAIL PROTECTED],
>> generated with the help of perlbug 1.32 running under perl v5.7.0.
>
>Now I tracked this one down (change #6891).  The hunt mainly consisted
>of debugging the following charming line :-)
>
>SV *perinterp_sv = * Perl_hv_fetch(((PerlInterpreter 
>*)pthread_getspecific((*Perl_Gthr_key_ptr(((void *)0) )) ) )  ,   
>(*Perl_Imodglobal_ptr(((PerlInterpreter 
>*)pthread_getspecific((*Perl_Gthr_key_ptr(((void *)0) )) ) )   ))  ,
>"Storable(" "0.703"  ")"  ,  sizeof("Storable(" "0.703"  ")" )-1 ,  (1)  )  ;
>stcxt_t *cxt  = ( stcxt_t * )(perinterp_sv && ((  perinterp_sv  )->sv_flags  & 
>0x0001 )? (  stcxt_t *  )(unsigned long )(  ((XPVIV*)  (  perinterp_sv  
>)->sv_any )->xiv_iv  )  : ((void *)0) )  ;  (  cxt  = (  stcxt_t 
>*)Perl_safesysmalloc  ((size_t  )((  1 )*sizeof(  stcxt_t , (__extension__ 
>(__builtin_constant_p (   (  1 )*sizeof(  stcxt_t )  ) && (   (  1 )*sizeof(  stcxt_t 
>)  ) <= 16? ((   (  1 )*sizeof(  stcxt_t )  ) == 1? ({ void *__s = (   
>(char*)(  cxt )   );   *((__uint8_t *) __s) = (__uint8_t)0  ; __s; })  : 
>({ void *__s = (   (char*)(  cxt )   );   union { unsigned int __ui;  
>unsigned short int __usi;   unsigned char __uc; } *__u = __s;   __uint8_t __c = (__uint8_t) (   0  );  switch ((unsigned int) ( (  1 )*sizeof(  stcxt_t ) 
>  )) {   case 15:__u->__ui = __c * 0x01010101;   __u = __extension__ 
>(void *)((char *) __u + 4); case 11:__u->__ui = __c * 0x01010101;   __u = 
>__extension__ (void *)((char *) __u + 4); case 7: __u->__ui = __c * 0x01010101;   __u 
>= __extension__ (void *)((char *) __u + 4); case 3: __u->__usi = (unsigned short int) 
>__c * 0x0101; __u = __extension__ (void *)((char *) __u + 2); __u->__uc = (unsigned 
>char) __c;break;  case 14:__u->__ui = __c * 0x01010101;   __u = 
>__extension__ (void *)((char *) __u + 4); case 10:__u->__ui = __c * 
>0x01010101;   __u = __extension__ (void *)((char *) __u + 4); case 6: __u->__ui = __c 
>* 0x01010101;   __u = __extension__ (void *)((char *) __u + 4); case 2: __u->__usi = 
>(unsigned short int) __c * 0x0101; break;  case 13:__u->__ui = __c * 
>0x01010101;   __u = __extension__ (void *)((char *) __u + 4); case 9: __u->__ui = __c 
>* 0x01010101;   __u = __extension__ (void *)((char *) __u + 4); case 5: __u->__ui = __c * 0x01010101;   __u = __extension__ (void *)((char *) __u + 
>4); case 1: __u->__uc = (unsigned char) __c;break;  case 16:__u->__ui 
>= __c * 0x01010101;   __u = __extension__ (void *)((char *) __u + 4); case 12:
>__u->__ui = __c * 0x01010101;   __u = __extension__ (void *)((char *) __u + 4); case 
>8: __u->__ui = __c * 0x01010101;   __u = __extension__ (void *)((char *) __u + 4); 
>case 4: __u->__ui = __c * 0x01010101;   case 0: break;  }   __s; }) )   : 
>(__builtin_constant_p ( 0 ) && ( 0 ) == '\0'  ? ({ void *__s = (  (char*)(  cxt )  ); 
>__builtin_memset ( __s , '\0',  (  1 )*sizeof(  stcxt_t )   ) ; __s; }) : 
>memset (  (char*)(  cxt )  ,  0 ,(  1 )*sizeof(  stcxt_t )     ;  
>Perl_sv_setiv(((PerlInterpreter *)pthread_getspecific((*Perl_Gthr_key_ptr(((void *)0) 
>)) ) )  ,   perinterp_sv ,  ( IV )(unsigned long )(  cxt  )   )  ;





-- 
Nick Ing-Simmons




Re: RFC 155 (v2) Remove mathematic and trigonomic functions from core binary

2000-08-28 Thread Nick Ing-Simmons

Tom Christiansen <[EMAIL PROTECTED]> writes:
>
>Explain why things like #ifdef HAS_SETLOCALE are not sufficient for
>this stated purpose.

Because the source has to have something like:

#ifdef HAS_SETLOCALE
 ...
 setlocale(...) 
 ...
#else
 
#endif 


That does not help someone who has a Locale_Set() with a different calling
sequence that could nonetheless mimic the effect.

Suppose there is no pipe() but there is SocketPair().
Suppose there is no fork() but there is CloneAddressSpace().

With the #ifdef scheme the one set of source gets cluttered with
an #ifdef forest where only ONE path through the forest applies at
a time and you cannot see it for the trees.

With the "loadable" scheme we conditionally compile the version of 
the loadable for the current platform and "link it in".
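
A minimal sketch of that scheme (all the names here - locale_vtbl,
platform_locale, Locale_Set - are invented, not an actual perl interface):
the core sees one small table of functions, and the build picks which
implementation file to compile and link, so the #ifdef forest lives in the
build system rather than in every caller.

#include <locale.h>

/* The one interface the core sees. */
typedef struct locale_vtbl {
    int (*set_locale)(const char *name);
} locale_vtbl;

/* --- locale_posix.c: built only where setlocale() exists --- */
static int posix_set_locale(const char *name)
{
    return setlocale(LC_ALL, name) != NULL;
}
const locale_vtbl platform_locale = { posix_set_locale };

/* --- locale_other.c: built instead on a platform that only has the
 *     hypothetical Locale_Set() with a different calling sequence ---
 *
 * extern int Locale_Set(int category_mask, const char *name);
 * static int other_set_locale(const char *name)
 * {
 *     return Locale_Set(~0, name);
 * }
 * const locale_vtbl platform_locale = { other_set_locale };
 */

/* --- core code: calls through the table, no #ifdef in sight --- */
int core_set_locale(const char *name)
{
    return platform_locale.set_locale(name);
}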

I think this is inappropriate for sin/cos/tan et al. and possibly even 
sockets (although Win32 sockets are weird enough that it would be worthwhile).

But for getpw* or shm/queue/msg or other may-not-be-there-but-we-can-
fake-it-if-you-REALLY-want-it stuff, it makes sense to move the faking out
into a loadable so we can fix it independently of perl.

-- 
Nick Ing-Simmons




RE: RFC 146 (v1) Remove socket functions from core

2000-08-28 Thread Nick Ing-Simmons

Fisher Mark <[EMAIL PROTECTED]> writes:
>>Leaping to conclusions based on no tests at all is even worse...
>>
>>Will anyone bite the bullet and write the "Internals Decisions should
>>be based on actual tests on multiple platforms" RFC ?
>
>BTW, I have access to Rational Software's Quantify (and PureCoverage and
>Purify) on WinNT and HP-UX 10.20 which I'd be glad to use for such tests.

If you want to get "in the mood" it would be good to fire it up on 
(say) perl5.6.0 and see where the hot-spots are.


>===
>Mark Leighton Fisher[EMAIL PROTECTED]
>Thomson Consumer ElectronicsIndianapolis IN
>"Display some adaptability." -- Doug Shaftoe, _Cryptonomicon_
-- 
Nick Ing-Simmons




Re: RFC 161 (v2) OO Integration/Migration Path

2000-08-28 Thread Nick Ing-Simmons

Nathan Torkington <[EMAIL PROTECTED]> writes:
>Dan Sugalski writes:
>> If the vtable stuff goes into the core perl engine (and it probably will,
>> barring performance issues), then what could happen in the
>
>I have a lot of questions.  Please point me to the appropriate place
>if they are answered elsewhere.
>
>vtables are tables of C functions?  

I am using them as tables of machine-code functions (compiled from C 
being the obvious but not the only way to create those).

>Perl functions?  

Not directly. But given a "C" API it is normally easy enough to 
wrap the perl function (e.g. the FETCH/GET tie methods layered under
"magic" in perl5).

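For example, a sketch of how a C-level vtable entry could be "implemented"
by a Perl method call, using perl5's documented perlcall interface
(call_method() and friends); the vtbl_get_int wrapper itself is hypothetical
and is shown as it would sit inside an extension, not as a standalone
program:

#include "EXTERN.h"
#include "perl.h"
#include "XSUB.h"

/* Call $obj->FETCH in scalar context and return its result as an IV,
 * much as tie's FETCH sits underneath "get" magic in perl5 today. */
static IV vtbl_get_int(SV *obj)
{
    dTHX;
    dSP;
    IV result = 0;
    int count;

    ENTER;
    SAVETMPS;

    PUSHMARK(SP);
    XPUSHs(obj);                /* the object the method is called on */
    PUTBACK;

    count = call_method("FETCH", G_SCALAR);

    SPAGAIN;
    if (count == 1)
        result = POPi;          /* take the single scalar return value */
    PUTBACK;

    FREETMPS;
    LEAVE;

    return result;
}
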
>Either?  How
>would you use them to handle overloading of operators?  One function
>in the vtable for every operation?  

If the table is with the data, yes; if the table is with the code,
one function for every type.

>How does that extend to
>user-defined operators?

Badly. But it makes user-defined implementations of existing operators easy.


>
>Nat
-- 
Nick Ing-Simmons




Re: RFC 146 (v1) Remove socket functions from core

2000-08-27 Thread Nick Ing-Simmons

Michael G Schwern <[EMAIL PROTECTED]> writes:
>Like all other optimizing attempts, the first step is analysis.
>People have to sit down and systematically go through and find out
>what parts of perl (and Perl) are eating up space and speed.  The
>results will be very surprising, I'm sure, but it will give us a
>concrete idea of what we can do to really help out perl's performance.
>
>There should probably be an RFC to this effect, and I'm just visiting
>here in perl6-language so I dump it on somebody else.

Alan Burlison <[EMAIL PROTECTED]> writes:

>Drawing conclusions based on a single test can be
>misleading.

Leaping to conclusions based on no tests at all is even worse...

Will anyone bite the bullet and write the "Internals Decisions should
be based on actual tests on multiple platforms" RFC ?

-- 
Nick Ing-Simmons




RE: RFC 146 (v1) Remove socket functions from core

2000-08-27 Thread Nick Ing-Simmons

Al Lipscomb <[EMAIL PROTECTED]> writes:
>I wonder if you could arrange things so that you could have statically
>linked and dynamic linked executable. Kind of like what they do with the
>Linux kernel. When your installation is configured in such a way as to make
>the dynamic linking a problem, just compile a version that has (almost)
>everything bolted in. Otherwise compile the features as modules.

If we make it possible to move socket or math functions out of the executable
into "overlays" then there will always be an option NOT to do that
and build one executable (and that will probably be the default!).

We need to distinguish "module", "overlay", "loadable", ... if we are 
going to get into this type of discussion. Here is my 2¢:

Module   - separately distributable Perl and/or C code.  (e.g. Tk800.022.tar.gz)
Loadable - OS loadable binary e.g. Tk.so or Tk.dll
Overlay  - Tightly coupled ancillary loadable which is no use without
   its "base"  - e.g. Tk/Canvas.so which can only be used 
   when a particular Tk.so has already been loaded.

Tk has these "overlays" - I think DBI has something similar. perl5 itself
does not as such (although POSIX.so is close).

_I_ would like to see RFC 146 mutate into or be replaced by an RFC 
which said perl should have a mechanism to allow parts of functionality
to be split out into separate binary (sharable) files.


-- 
Nick Ing-Simmons




Re: RFC 155 - Remove geometric functions from core

2000-08-27 Thread Nick Ing-Simmons

Jarkko Hietaniemi <[EMAIL PROTECTED]> writes:
>                                                              bytes
>
>microperl, which has almost nothing os dependent (*) in it  1212416
>shared libperl 1277952 bytes + perl 32768 bytes             1310720
>dynamically linked perl                                     1376256
>statically linked perl with all the core extensions         2129920
>
>  (*) I haven't tried building it in non-UNIX boxes, so I can't be certain
>  of how fastidiously features have been disabled.

"bytes" of what? - size of executable, size of .text, ???
If we are talking about the size of an executable built with -g then a lot
of that is symbol table, tedious repetition of "sv.h" & co. re-iterated in
each .o file.

But the basic point is that these things are small.

>
>So ripping all this 'cruft' would save us about 100-160 kB, still
>leaving us with well over a 1MB-plus executable.  It's Perl itself
>that's big, not the thin glue to the system functions.

My support for the idea is not to reduce the size of perl in the UNIX
case, but to allow replacement. I would also like to have the mechanism
worked out and "proven" on something that we know gets used so 
that we can have good solid testing of the mechanism. Then something 
less obvious (say Damian's any/all operators), which might be major
extra size and not of universal appeal, can use a well-tried mechanism,
and we can flip the default to re-link sockets or sin/cos/tan into the core.

-- 
Nick Ing-Simmons



