[PATCH] ARGV in P0
The patch below places the contents of argv into P0.  At the moment it
has the name of the script file in question in P0[0]; I haven't yet
decided if this is to be construed as a feature or a bug.  ;^)

A little test script to see that this is working right:

	set I0, P0
	set I1, 0
	print "start\n"
LOOP:	ge I1, I0, OUT
	set S0, P0, I1
	print " "
	print I1
	print ": "
	print S0
	print "\n"
	inc I1
	branch LOOP
OUT:	print "done\n"
	end

--Brent Dax
[EMAIL PROTECTED]
Parrot Configure pumpking and regex hacker
Check out the Parrot FAQ: http://www.panix.com/~ziggy/parrot.html (no, it's not mine)

<obra> . hawt sysadmin chx0rs
<lathos> This is sad. I know of *a* hawt sysamin chx0r.
<obra> I know more than a few.
<lathos> obra: There are two? Are you sure it's not the same one?

--- /parrot-cvs/embed.c	Wed Jan 30 06:41:34 2002
+++ /parrot/embed.c	Thu Jan 31 01:49:26 2002
@@ -139,6 +139,9 @@
 void
 Parrot_runcode(struct Parrot_Interp *interpreter, int argc, char *argv[])
 {
+    INTVAL i;
+    PMC* userargv;
+
     if(interpreter->flags & PARROT_DEBUG_FLAG) {
         fprintf(stderr, "Parrot VM: Debugging enabled.\n");
 
@@ -159,6 +162,24 @@
         exit(1);
     }
 #endif
+
+    if(interpreter->flags & PARROT_DEBUG_FLAG) {
+        fprintf(stderr, "Parrot VM: Setting up ARGV array in P0.  Current argc: %d\n", argc);
+    }
+
+    userargv=pmc_new(interpreter, enum_class_PerlArray);
+
+    for(i=0; i < argc; i++) {
+        if(interpreter->flags & PARROT_DEBUG_FLAG) {
+            fprintf(stderr, "\t%d: %s\n", i, argv[i]);
+        }
+
+        userargv->vtable->set_string_index(interpreter, userargv,
+            string_make(interpreter, argv[i], strlen(argv[i]), 0, 0, 0), i
+        );
+    }
+
+    interpreter->pmc_reg.registers[0]=userargv;
 
     runops(interpreter, interpreter->code, 0);
 
--- /parrot-cvs/test_main.c	Thu Jan 31 01:18:18 2002
+++ /parrot/test_main.c	Thu Jan 31 01:45:20 2002
@@ -96,7 +96,7 @@
     }
 
 OUT:
-    (*argc)--;
+
     return (*(argv++))[0];
 }
Re: Apoc 4: The skip keyword
> > skip was uncomfortable when I read it (I at first took it to mean skip
> > over the following rather than skip to the following), but I find
> > nobreak also a bit strange. How about proceed?
>
> If we mean fall-through, why invent a new term? Why not use the intent:
> C<fall_through>?

Wow, keyword with underscore. I like proceed much better.

Tomas.
Re: parrot rx engine
On Wed, 30 Jan 2002 17:45:58 +0000, Graham Barr wrote:
> On Wed, Jan 30, 2002 at 09:32:49AM -0800, Brent Dax wrote:
> > #     rx_setprops P0, "i", 2
> > #     branch $start0
> > # $advance:
> > #     rx_advance P0, $fail
> > # $start0:
> > #     rx_literal P0, "a", $advance
> > #
> > # First, we set the rx engine to case-insensitive. Why is that bad? It's
> > # setting a runtime property for what should be compile-time
> > # unicode-character-kung-fu. Assuming your CPU knows what the gritty
> > # details of unicode in the first place just feels wrong, but I digress.
> >
> > That i does a once-off case-folding operation on the target string.
> > All other input to the engine MUST already be case-folded for speed.
>
> Hm, is that going to work ? What about a rx like /^a(?i:b)C/ where the
> case insensitivity only applies to part of the pattern ?

Or worse, in /^a(b)c/i, where you want to capture the original character,
not the case-folded version?

-- 
	Peter Haworth	[EMAIL PROTECTED]
"The term `Internet' has the meaning given that term in section 230(f)(1)
 of the Communications Act of 1934."
		-- H.R. 3028, Trademark Cyberpiracy Prevention Act
strings: sequence-of-integer ... list of chunks
On Wed, Jan 30, 2002 at 10:47:36AM -0800, Larry Wall wrote:
>
> For various reasons, some of which relate to the sequence-of-integer
> abstraction, and some of which relate to infinite strings and arrays,
> I think Perl 6 strings are likely to be represented by a list of
> chunks, where each chunk is a sequence of integers of the same size or
> representation, but different chunks can have different integer sizes
> or representations. The abstract string interface must hide this from
> any module that wishes to work at the abstract string level. In
> particular, it must hide this from the regex engine, which works on
> pure sequences in the abstract.

I hope someone volunteers to start looking into implementing that
soon (if no one has already).

Tim.
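
As an illustration of the representation Larry describes, here is a minimal sketch in C, assuming a singly linked list of chunks. The names `Chunk` and `chunked_get` are invented for this example and are not part of Parrot; the only point is that an accessor can present one flat sequence of integers while each chunk stores its elements at its own width.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types for illustration only; not Parrot's string API. */
typedef struct Chunk {
    unsigned width;            /* bytes per element: 1, 2 or 4 */
    size_t length;             /* number of elements in this chunk */
    const void *data;          /* elements, all of the same width */
    const struct Chunk *next;  /* following chunk, or NULL */
} Chunk;

/* Return element `index` of the whole string as an integer, or -1 if
 * the index is past the end. Callers never see the chunk boundaries. */
int64_t chunked_get(const Chunk *s, size_t index)
{
    for (; s != NULL; s = s->next) {
        if (index < s->length) {
            switch (s->width) {
            case 1: return ((const uint8_t *)s->data)[index];
            case 2: return ((const uint16_t *)s->data)[index];
            case 4: return ((const uint32_t *)s->data)[index];
            default:
                assert(!"unsupported element width");
                return -1;
            }
        }
        index -= s->length;    /* skip this chunk and keep looking */
    }
    return -1;
}
```

A regex engine written against `chunked_get` alone would not need to know where one chunk ends and the next begins. A real implementation would presumably cache the current chunk instead of walking the list on every access.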
RE: parrot rx engine
Peter Haworth:
# On Wed, 30 Jan 2002 17:45:58 +0000, Graham Barr wrote:
# > On Wed, Jan 30, 2002 at 09:32:49AM -0800, Brent Dax wrote:
# > > #     rx_setprops P0, "i", 2
# > > #     branch $start0
# > > # $advance:
# > > #     rx_advance P0, $fail
# > > # $start0:
# > > #     rx_literal P0, "a", $advance
# > > #
# > > # First, we set the rx engine to case-insensitive. Why is that bad?
# > > # It's setting a runtime property for what should be compile-time
# > > # unicode-character-kung-fu. Assuming your CPU knows what the gritty
# > > # details of unicode in the first place just feels wrong, but I
# > > # digress.
# > >
# > > That i does a once-off case-folding operation on the target string.
# > > All other input to the engine MUST already be case-folded for speed.
# >
# > Hm, is that going to work ? What about a rx like /^a(?i:b)C/ where the
# > case insensitivity only applies to part of the pattern ?
#
# Or worse, in /^a(b)c/i, where you want to capture the original
# character, not the case-folded version?

Parentheses just record a pair of indices, not a string.

--Brent Dax
[EMAIL PROTECTED]
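
To make Brent's point concrete, here is a small sketch in C, assuming ASCII-only folding. The matcher is a hand-written stand-in for /^a(b)c/i and every name in it (`Capture`, `match_abc_folded`) is hypothetical. It matches against a folded copy but records the capture as a pair of indices, so the captured text can still be read from the original string.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Half-open range [start, end) into the subject string. */
typedef struct { size_t start, end; } Capture;

/* Hand-written equivalent of /^a(b)c/i for illustration.
 * Returns 1 on a match and fills in the capture indices. */
int match_abc_folded(const char *input, Capture *cap)
{
    char folded[3];
    size_t i;

    assert(input != NULL && cap != NULL);
    if (strlen(input) < 3)
        return 0;

    /* Fold only what the pattern looks at; ASCII folding keeps length. */
    for (i = 0; i < 3; i++)
        folded[i] = (char)tolower((unsigned char)input[i]);

    if (memcmp(folded, "abc", 3) != 0)
        return 0;

    cap->start = 1;   /* the parenthesised 'b' */
    cap->end = 2;
    return 1;
}
```

The indices are only valid in the original string because ASCII folding never changes a string's length. Full Unicode case folding can change the length (one character may fold to several), in which case the indices would have to be mapped back before use.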
Re: [PATCH] ARGV in P0
At 2:00 AM -0800 1/31/02, Brent Dax wrote:
>The patch below places the contents of argv into P0.  At the moment it
>has the name of the script file in question in P0[0]; I haven't yet
>decided if this is to be construed as a feature or a bug.  ;^)

Probably a bug, but in the specification.

-- 
                                        Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                     have teddy bears and even
                                      teddy bears get drunk
Re: strings: sequence-of-integer ... list of chunks
At 2:49 PM +0000 1/31/02, Tim Bunce wrote:
>On Wed, Jan 30, 2002 at 10:47:36AM -0800, Larry Wall wrote:
>>
>> For various reasons, some of which relate to the sequence-of-integer
>> abstraction, and some of which relate to infinite strings and arrays,
>> I think Perl 6 strings are likely to be represented by a list of
>> chunks, where each chunk is a sequence of integers of the same size or
>> representation, but different chunks can have different integer sizes
>> or representations. The abstract string interface must hide this from
>> any module that wishes to work at the abstract string level. In
>> particular, it must hide this from the regex engine, which works on
>> pure sequences in the abstract.
>
>I hope someone volunteers to start looking into implementing that
>soon (if no one has already).

Yup, in progress. There is an issue of time--what do we do, for
example, in the case:

    my $pi = Pi::Generate;
    if ($pi =~ /[a-z]/) {
      print "There's a letter in here!\n";
    }

if Pi::Generate returns a generator object that will calculate pi for
you to however far you want, that regex will run forever or until it
runs out of memory, whichever comes first.

-- 
                                        Dan
Re: parrot rx engine
On Thu, Jan 31, 2002 at 08:54:21AM -0800, Brent Dax wrote:
> Peter Haworth:
> # On Wed, 30 Jan 2002 17:45:58 +0000, Graham Barr wrote:
> # > On Wed, Jan 30, 2002 at 09:32:49AM -0800, Brent Dax wrote:
> # > > #     rx_setprops P0, "i", 2
> # > > #     branch $start0
> # > > # $advance:
> # > > #     rx_advance P0, $fail
> # > > # $start0:
> # > > #     rx_literal P0, "a", $advance
> # > > #
> # > > # First, we set the rx engine to case-insensitive. Why is that
> # > > # bad? It's setting a runtime property for what should be
> # > > # compile-time unicode-character-kung-fu. Assuming your CPU knows
> # > > # what the gritty details of unicode in the first place just
> # > > # feels wrong, but I digress.
> # > >
> # > > That i does a once-off case-folding operation on the target
> # > > string. All other input to the engine MUST already be
> # > > case-folded for speed.
> # >
> # > Hm, is that going to work ? What about a rx like /^a(?i:b)C/ where
> # > the case insensitivity only applies to part of the pattern ?
> #
> # Or worse, in /^a(b)c/i, where you want to capture the original
> # character, not the case-folded version?
>
> Parentheses just record a pair of indices, not a string.

Yes, I was assuming that. However what is to be gained by case
folding the input string ?

Because parts of an rx can be case-insensitive while other parts
are case-sensitive, we will probably need two sorts of ops anyway
(or a way to tell the op to be case-insensitive). And you will
only be able to do the case folding when the whole rx is
case-insensitive.

It also means creating a copy of the input string, which is something
the current rx engine in perl5 tries to avoid. And while I will agree
that it is often faster to do lc($str) =~ /.../ than $str =~ /.../i
that is normally only the case for small-ish strings.

Graham.
Re: Jit on Solaris: using dis instead of objdump?
On Wed, 30 Jan 2002, Jason Gloudon wrote:

> On Wed, Jan 30, 2002 at 03:27:18PM -0500, Andy Dougherty wrote:
>
> > objdump. Is anyone with a Solaris system familiar enough with jit
> > internals to have a go at adapting it to use dis instead of GNU
> > objdump?
>
> The difference was pretty minimal. It should work with 'dis'.

It doesn't. (If it had, I would have posted a patch allowing 'dis'
instead of 'objdump' instead of asking for help. Sorry I hadn't made it
clear originally that I had already tried the simple stuff.)

Today, I note that sun4-solaris.pm now always uses 'dis', and
sun4Generic.pm has been changed a bit. However, it still doesn't work.
I get

    perl-cc jit2h.pl sun4 > include/parrot/jit_struct.h
    /usr/ccs/bin/as: "t.s", line 2: error: invalid character (0x40)
    as t.s failed at lib/Parrot/Jit/sun4Generic.pm line 164, <IN> line 10.
    *** Error code 1

(Note that I'm using Sun's assembler. That may be the difference.)

-- 
    Andy Dougherty		[EMAIL PROTECTED]
Re: strings: sequence-of-integer ... list of chunks
On Thu, Jan 31, 2002 at 12:18:28PM -0500, Dan Sugalski wrote:
> At 2:49 PM +0000 1/31/02, Tim Bunce wrote:
> >On Wed, Jan 30, 2002 at 10:47:36AM -0800, Larry Wall wrote:
> >>
> >> For various reasons, some of which relate to the sequence-of-integer
> >> abstraction, and some of which relate to infinite strings and arrays,
> >> I think Perl 6 strings are likely to be represented by a list of
> >> chunks, where each chunk is a sequence of integers of the same size or
> >> representation, but different chunks can have different integer sizes
> >> or representations. The abstract string interface must hide this from
> >> any module that wishes to work at the abstract string level. In
> >> particular, it must hide this from the regex engine, which works on
> >> pure sequences in the abstract.
> >
> >I hope someone volunteers to start looking into implementing that
> >soon (if no one has already).
>
> Yup, in progress. There is an issue of time--what do we do, for
> example, in the case:
>
>    my $pi = Pi::Generate;
>    if ($pi =~ /[a-z]/) {
>      print "There's a letter in here!\n";
>    }
>
> if Pi::Generate returns a generator object that will calculate pi for
> you to however far you want, that regex will run forever or until it
> runs out of memory, whichever comes first.

Right. So don't do that. :-)

Tim.
Re: strings: sequence-of-integer ... list of chunks
At 5:34 PM +0000 1/31/02, Tim Bunce wrote:
>On Thu, Jan 31, 2002 at 12:18:28PM -0500, Dan Sugalski wrote:
>> At 2:49 PM +0000 1/31/02, Tim Bunce wrote:
>> >On Wed, Jan 30, 2002 at 10:47:36AM -0800, Larry Wall wrote:
>> >>
>> >> For various reasons, some of which relate to the sequence-of-integer
>> >> abstraction, and some of which relate to infinite strings and arrays,
>> >> I think Perl 6 strings are likely to be represented by a list of
>> >> chunks, where each chunk is a sequence of integers of the same size or
>> >> representation, but different chunks can have different integer sizes
>> >> or representations. The abstract string interface must hide this from
>> >> any module that wishes to work at the abstract string level. In
>> >> particular, it must hide this from the regex engine, which works on
>> >> pure sequences in the abstract.
>> >
>> >I hope someone volunteers to start looking into implementing that
>> >soon (if no one has already).
>>
>> Yup, in progress. There is an issue of time--what do we do, for
>> example, in the case:
>>
>>    my $pi = Pi::Generate;
>>    if ($pi =~ /[a-z]/) {
>>      print "There's a letter in here!\n";
>>    }
>>
>> if Pi::Generate returns a generator object that will calculate pi for
>> you to however far you want, that regex will run forever or until it
>> runs out of memory, whichever comes first.
>
>Right. So don't do that. :-)

Oh, sure, *be* sensible. Sheesh, some people... :)

-- 
                                        Dan
Re: strings: sequence-of-integer ... list of chunks
On Thu, 31 Jan 2002, Dan Sugalski wrote:

> At 2:49 PM +0000 1/31/02, Tim Bunce wrote:
> >On Wed, Jan 30, 2002 at 10:47:36AM -0800, Larry Wall wrote:
> >>
> >> For various reasons, some of which relate to the sequence-of-integer
> >> abstraction, and some of which relate to infinite strings and arrays,
> >
> >I hope someone volunteers to start looking into implementing that
> >soon (if no one has already).
>
> Yup, in progress. There is an issue of time--what do we do, for
> example, in the case:
>
>    my $pi = Pi::Generate;
>    if ($pi =~ /[a-z]/) {
>      print "There's a letter in here!\n";
>    }
>
> if Pi::Generate returns a generator object that will calculate pi for
> you to however far you want, that regex will run forever or until it
> runs out of memory, whichever comes first.

We simply guarantee that Perl will always give you enough rope to hang
yourself, you just need to ask nicely.

Alex Gough
Re: parrot rx engine
On Thu, Jan 31, 2002 at 05:15:49PM +0000, Graham Barr wrote:
>
> Yes, I was assuming that. However what is to be gained by case
> folding the input string ?
>
> Because parts of an rx can be case-insensitive while other parts
> are case-sensitive, we will probably need two sorts of ops anyway
> (or a way to tell the op to be case-insensitive). And you will
> only be able to do the case folding when the whole rx is
> case-insensitive.

(Two sorts of ops makes most sense to me as the case-insensitive op
will need to know about fiddly charset conversion stuff whereas the
case-sensitive can just work with the list-of-integers abstraction.)

> It also means creating a copy of the input string, which is something
> the current rx engine in perl5 tries to avoid. And while I will agree
> that it is often faster to do lc($str) =~ /.../ than $str =~ /.../i
> that is normally only the case for small-ish strings.

Agreed on all counts.

Especially as the perl6 rx engine will have to be able to work directly
on non-trivial things like streams and generators and suchlike.

Tim [who's not really paying attention].
Re: Jit on Solaris: using dis instead of objdump?
This should make solaris 'as' happy. There will be an assembler warning,
but it's harmless.

diff -r1.3 sun4Generic.pm
78c78
<     return Parrot::Jit->Assemble("ld [\%o0], \%o0\njmpl \%o0, \%g0\n");
---
>     return Parrot::Jit->Assemble("ld [\%o0], \%o0\njmpl \%o0, \%g0\nnop\n");
151c151
<         .type   main,@function
---
>         .type   main,#function

-- 
Jason
[PATCH] no need to rebuild everything all the time
Dependencies in the Makefile are currently too broad brush. I don't enjoy
waiting for everything to recompile every time I try to tweak the jit.
The only file that #includes jit_struct.h is jit.c, so I feel that the
Makefile dependencies should reflect this, and not cause a gratuitous
recompile of everything.

There are probably other auto-generated header files that world+dog should
not depend on.

Nicholas Clark
-- 
EMCFT http://www.ccl4.org/~nick/CV.html

--- Makefile.in~	Wed Jan 30 10:31:28 2002
+++ Makefile.in	Thu Jan 31 18:54:57 2002
@@ -57,13 +57,14 @@
 #
 ###
 
-H_FILES = $(INC)/config.h $(INC)/exceptions.h $(INC)/io.h $(INC)/op.h \
+GENERAL_H_FILES = $(INC)/config.h $(INC)/exceptions.h $(INC)/io.h $(INC)/op.h \
 $(INC)/register.h $(INC)/string.h $(INC)/events.h $(INC)/interpreter.h \
 $(INC)/memory.h $(INC)/parrot.h $(INC)/stacks.h $(INC)/packfile.h \
 $(INC)/global_setup.h $(INC)/vtable.h $(INC)/oplib/core_ops.h $(INC)/oplib/core_ops_prederef.h \
 $(INC)/runops_cores.h $(INC)/trace.h \
 $(INC)/pmc.h $(INC)/key.h $(INC)/resources.h $(INC)/platform.h \
-$(INC)/interp_guts.h ${jit_h} ${jit_struct_h} $(INC)/rx.h $(INC)/rxstacks.h $(INC)/embed.h
+$(INC)/interp_guts.h ${jit_h} $(INC)/rx.h $(INC)/rxstacks.h $(INC)/embed.h
+ALL_H_FILES = $(GENERAL_H_FILES) ${jit_struct_h}
 
 CLASS_O_FILES = classes/default$(O) classes/array$(O) \
 classes/perlint$(O) classes/perlstring$(O) classes/perlnum$(O) \
@@ -207,7 +208,7 @@
 #
 ###
 
-test_main$(O): test_main.c $(H_FILES)
+test_main$(O): test_main.c $(GENERAL_H_FILES)
 
 lib/Parrot/Jit.pm: lib/Parrot/Jit/${jitarchname}.pm lib/Parrot/Jit/${jitcpuarch}Generic.pm
 	$(PERL) -MFile::Copy=cp -e ${PQ}cp q|lib/Parrot/Jit/${jitarchname}.pm|, q|lib/Parrot/Jit.pm|${PQ}
@@ -261,70 +262,70 @@
 #
 ###
 
-global_setup$(O): $(H_FILES)
+global_setup$(O): $(GENERAL_H_FILES)
 
-pmc$(O): $(H_FILES)
+pmc$(O): $(GENERAL_H_FILES)
 
-jit$(O): $(H_FILES)
+jit$(O): $(GENERAL_H_FILES) ${jit_struct_h}
 
-key$(O): $(H_FILES)
+key$(O): $(GENERAL_H_FILES)
 
-resources$(O): $(H_FILES)
+resources$(O): $(GENERAL_H_FILES)
 
-platform$(O): $(H_FILES)
+platform$(O): $(GENERAL_H_FILES)
 
-string$(O): $(H_FILES)
+string$(O): $(GENERAL_H_FILES)
 
-chartype$(O): $(H_FILES)
+chartype$(O): $(GENERAL_H_FILES)
 
-encoding$(O): $(H_FILES)
+encoding$(O): $(GENERAL_H_FILES)
 
-chartype/usascii$(O): $(H_FILES)
+chartype/usascii$(O): $(GENERAL_H_FILES)
 
-chartype/unicode$(O): $(H_FILES)
+chartype/unicode$(O): $(GENERAL_H_FILES)
 
-exceptions$(O): $(H_FILES)
+exceptions$(O): $(GENERAL_H_FILES)
 
-encoding/singlebyte$(O): $(H_FILES)
+encoding/singlebyte$(O): $(GENERAL_H_FILES)
 
-encoding/utf8$(O): $(H_FILES)
+encoding/utf8$(O): $(GENERAL_H_FILES)
 
-encoding/utf16$(O): $(H_FILES)
+encoding/utf16$(O): $(GENERAL_H_FILES)
 
-encoding/utf32$(O): $(H_FILES)
+encoding/utf32$(O): $(GENERAL_H_FILES)
 
-interpreter$(O): interpreter.c $(H_FILES)
+interpreter$(O): interpreter.c $(GENERAL_H_FILES)
 
-io/io$(O): $(H_FILES)
+io/io$(O): $(GENERAL_H_FILES)
 
-io/io_stdio$(O): $(H_FILES)
+io/io_stdio$(O): $(GENERAL_H_FILES)
 
-io/io_unix$(O): $(H_FILES)
+io/io_unix$(O): $(GENERAL_H_FILES)
 
-io/io_win32$(O): $(H_FILES)
+io/io_win32$(O): $(GENERAL_H_FILES)
 
-memory$(O): $(H_FILES)
+memory$(O): $(GENERAL_H_FILES)
 
-packfile$(O): $(H_FILES)
+packfile$(O): $(GENERAL_H_FILES)
 
-parrot$(O): $(H_FILES)
+parrot$(O): $(GENERAL_H_FILES)
 
-register$(O): $(H_FILES)
+register$(O): $(GENERAL_H_FILES)
 
-rx$(O): $(H_FILES)
+rx$(O): $(GENERAL_H_FILES)
 
-rxstacks$(O): $(H_FILES)
+rxstacks$(O): $(GENERAL_H_FILES)
 
-stacks$(O): $(H_FILES)
+stacks$(O): $(GENERAL_H_FILES)
 
-embed$(O): $(H_FILES)
+embed$(O): $(GENERAL_H_FILES)
 
-core_ops$(O): $(H_FILES) core_ops.c
+core_ops$(O): $(GENERAL_H_FILES) core_ops.c
 
 core_ops.c $(INC)/oplib/core_ops.h: $(OPS_FILES) ops2c.pl lib/Parrot/OpsFile.pm lib/Parrot/Op.pm
 	$(PERL) ops2c.pl C $(OPS_FILES)
 
-core_ops_prederef$(O): $(H_FILES) core_ops_prederef.c
+core_ops_prederef$(O): $(GENERAL_H_FILES) core_ops_prederef.c
 
 core_ops_prederef.c $(INC)/oplib/core_ops_prederef.h: $(OPS_FILES) ops2c.pl lib/Parrot/OpsFile.pm lib/Parrot/Op.pm
 	$(PERL) ops2c.pl CPrederef $(OPS_FILES)
RE: parrot rx engine
> Because parts of an rx can be case-insensitive while other parts
> are case-sensitive, we will probably need two sorts of ops anyway
> (or a way to tell the op to be case-insensitive). And you will
> only be able to do the case folding when the whole rx is
> case-insensitive.

I don't like your suggestion. I think we should have one set of ops,
but two input strings: one is the original, the other is case-folded.
Rx chooses the right one depending on the current case-sensitivity.
Two regex opcodes will be used for this purpose, op-case-sensitive-start
and op-case-insensitive-start. The opcode will switch the string's
begins, ends, positions etc.

> It also means creating a copy of the input string, which is something
> the current rx engine in perl5 tries to avoid. And while I will agree
> that it is often faster to do lc($str) =~ /.../ than $str =~ /.../i
> that is normally only the case for small-ish strings.

I don't think the perl5 approach is the best choice. Unicode case
folding is much much more expensive than malloc/free. And we can always
use a per-thread free list; unless the regex is nested or the string is
very big, we don't need to allocate any memory.

Hong
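
A rough sketch in C of the two-buffer idea Hong outlines, assuming ASCII folding so that both buffers have the same length. The function is a hand-written stand-in for /^a(?i:b)C/ and its name is hypothetical; the two assignments to `cur` play the role of the proposed op-case-insensitive-start and op-case-sensitive-start opcodes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hand-written equivalent of /^a(?i:b)C/ over two parallel buffers:
 * `original` is the subject as given, `folded` is its case-folded copy.
 * Returns 1 on a match, 0 otherwise. */
int match_a_ib_C(const char *original, const char *folded)
{
    const char *cur;
    size_t pos = 0;

    assert(original != NULL && folded != NULL);
    assert(strlen(original) == strlen(folded));

    cur = original;                 /* case-sensitive part: /a/ */
    if (cur[pos] != 'a')
        return 0;
    pos++;

    cur = folded;                   /* "op-case-insensitive-start" */
    if (cur[pos] != 'b')            /* pattern literal is already folded */
        return 0;
    pos++;

    cur = original;                 /* "op-case-sensitive-start" */
    if (cur[pos] != 'C')
        return 0;

    return 1;
}
```

Only the buffer pointer changes between the sections; the position is shared. That sharing is what would break under a folding that changes string length, so a real engine would need a position map or a different scheme for full Unicode folding.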
Re: parrot rx engine
On Thu, Jan 31, 2002 at 11:18:58AM -0800, Hong Zhang wrote:
> > Because parts of an rx can be case-insensitive while other parts
> > are case-sensitive, we will probably need two sorts of ops anyway
> > (or a way to tell the op to be case-insensitive). And you will
> > only be able to do the case folding when the whole rx is
> > case-insensitive.
>
> I don't like your suggestion. I think we should have one set of ops,
> but two input strings: one is the original, the other is case-folded.
> Rx chooses the right one depending on the current case-sensitivity.
> Two regex opcodes will be used for this purpose, op-case-sensitive-start
> and op-case-insensitive-start. The opcode will switch the string's
> begins, ends, positions etc.
>
> > It also means creating a copy of the input string, which is something
> > the current rx engine in perl5 tries to avoid. And while I will agree
> > that it is often faster to do lc($str) =~ /.../ than $str =~ /.../i
> > that is normally only the case for small-ish strings.
>
> I don't think the perl5 approach is the best choice. Unicode case
> folding is much much more expensive than malloc/free. And we can always
> use a per-thread free list; unless the regex is nested or the string is
> very big, we don't need to allocate any memory.

But as you say, case folding is expensive. And with this approach you
are going to case-fold every string that is matched against an rx
that has some part of it that is case-insensitive.

The case-folding should be done in the rx itself, at compile time if
possible. Then it is only done once, which will save a lot of time if
the rx happens to be used in a loop or something.

Graham.
Re: [PATCH] no need to rebuild everything all the time [APPLIED]
At 7:04 PM +0000 1/31/02, Nicholas Clark wrote:
>Dependencies in the Makefile are currently too broad brush. I don't enjoy
>waiting for everything to recompile every time I try to tweak the jit.
>The only file that #includes jit_struct.h is jit.c, so I feel that the
>Makefile dependencies should reflect this, and not cause a gratuitous
>recompile of everything.
>
>There are probably other auto-generated header files that world+dog should
>not depend on.

Applied, thanks.

-- 
                                        Dan
RE: parrot rx engine
> But as you say, case folding is expensive. And with this approach you
> are going to case-fold every string that is matched against an rx
> that has some part of it that is case-insensitive.

That is correct in general. But the regex compiler can be smarter than
that. For example, rx should optimize /a+/i to /[aA]+/ to avoid
case-folding. If it is too difficult for rx to do case-folding, I think
it is better to use some normalizer to do full case-folding.

> The case-folding should be done in the rx itself, at compile time if
> possible. Then it is only done once, which will save a lot of time if
> the rx happens to be used in a loop or something.

The regular expression itself is case-folded at compile time. But I am
talking about the input string here, not the re.

Hong
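
The /a+/i to /[aA]+/ rewrite Hong mentions can be sketched in a few lines of C. This is an assumption-laden illustration, not a regex compiler: the hypothetical `expand_literal_ci` handles ASCII letters only, treats its input as a run of plain characters, and copies everything that is not a letter through unchanged.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Rewrite each ASCII letter in `literal` as a two-character class,
 * e.g. "a" becomes "[aA]", so the result can be matched case-sensitively.
 * Non-letters are copied as they are. Returns 1 on success, 0 if `out`
 * (of `out_size` bytes, including the terminating NUL) is too small. */
int expand_literal_ci(const char *literal, char *out, size_t out_size)
{
    size_t n = 0;

    assert(literal != NULL && out != NULL);
    for (; *literal != '\0'; literal++) {
        unsigned char c = (unsigned char)*literal;
        if (isalpha(c)) {
            if (n + 4 >= out_size)      /* 4 bytes plus room for NUL */
                return 0;
            out[n++] = '[';
            out[n++] = (char)tolower(c);
            out[n++] = (char)toupper(c);
            out[n++] = ']';
        } else {
            if (n + 1 >= out_size)      /* 1 byte plus room for NUL */
                return 0;
            out[n++] = (char)c;
        }
    }
    if (n >= out_size)
        return 0;
    out[n] = '\0';
    return 1;
}
```

Because the expansion happens once, when the pattern is compiled, the subject string never has to be folded or copied. The cost is a larger compiled pattern, and the approach gets harder for characters with more than two case variants.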
RE: parrot rx engine
Tim Bunce:
# On Thu, Jan 31, 2002 at 05:15:49PM +0000, Graham Barr wrote:
# >
# > Yes, I was assuming that. However what is to be gained by case
# > folding the input string ?
# >
# > Because parts of an rx can be case-insensitive while other parts
# > are case-sensitive, we will probably need two sorts of ops anyway
# > (or a way to tell the op to be case-insensitive). And you will
# > only be able to do the case folding when the whole rx is
# > case-insensitive.
#
# (Two sorts of ops makes most sense to me as the case-insensitive op
# will need to know about fiddly charset conversion stuff whereas the
# case-sensitive can just work with the list-of-integers abstraction.)
#
# > It also means creating a copy of the input string, which is
# > something the current rx engine in perl5 tries to avoid. And while I
# > will agree that it is often faster to do lc($str) =~ /.../ than
# > $str =~ /.../i that is normally only the case for small-ish strings.
#
# Agreed on all counts.
#
# Especially as the perl6 rx engine will have to be able to work
# directly on non-trivial things like streams and generators and
# suchlike.

I have a suggestion similar to the ops suggestion but more flexible:
Regex vtables. We'd probably need three:

	-normal text match
	-case-folded text match
	-generic sequence match (the stuff Larry's been talking about)

This could probably be implemented without too much difficulty. Let me
know if I'm brilliant, on crack, or both with this idea.

--Brent Dax
[EMAIL PROTECTED]
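
One way to picture the regex-vtable suggestion is the C sketch below. Everything in it is hypothetical (`RxVtable`, `RxSubject`, `rx_literal` here is not Parrot's op of that name): a single literal-matching routine is written once against a table of function pointers, and the three tables correspond to the three kinds of match listed above.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical subject: `items` is char[] for the text variants and
 * unsigned long[] for the generic sequence variant. */
typedef struct { const void *items; size_t length; } RxSubject;

typedef struct {
    const char *name;
    /* Does the item at `pos` equal `want`? Out of range never matches. */
    int (*match_item)(const RxSubject *s, size_t pos, unsigned long want);
} RxVtable;

static int exact_item(const RxSubject *s, size_t pos, unsigned long want)
{
    return pos < s->length &&
           (unsigned char)((const char *)s->items)[pos] == want;
}

/* `want` is assumed to be folded already, at pattern compile time. */
static int folded_item(const RxSubject *s, size_t pos, unsigned long want)
{
    return pos < s->length &&
           (unsigned long)tolower(
               (unsigned char)((const char *)s->items)[pos]) == want;
}

static int generic_item(const RxSubject *s, size_t pos, unsigned long want)
{
    return pos < s->length &&
           ((const unsigned long *)s->items)[pos] == want;
}

const RxVtable rx_exact   = { "normal text",      exact_item };
const RxVtable rx_folded  = { "case-folded text", folded_item };
const RxVtable rx_generic = { "generic sequence", generic_item };

/* One literal matcher, shared by all three kinds of subject. */
int rx_literal(const RxVtable *vt, const RxSubject *s, size_t pos,
               const unsigned long *literal, size_t n)
{
    size_t i;

    assert(vt != NULL && vt->match_item != NULL && s != NULL);
    for (i = 0; i < n; i++)
        if (!vt->match_item(s, pos + i, literal[i]))
            return 0;
    return 1;
}
```

The design trades an indirect call per item for not having to duplicate every op. Whether that is cheaper than separate op sets is exactly the kind of question the thread leaves open.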
[BUG] Makefile assumes . is in my PATH
$ echo $PATH
/home/nick/bin:/home/nick/bin:/bin:/usr/bin:/usr/local/bin:/usr/X11R6/bin:/usr/local/bin:/sbin:/usr/sbin:/usr/local/sbin
$ make mopstest
cd examples && cd assembly && make mops.pbc PERL=perl5.7.2-i386-freebsd && cd .. && cd ..
perl5.7.2-i386-freebsd -I../../lib ../../assemble.pl mops.pasm > mops.pbc
test_parrot examples/assembly/mops.pbc
test_parrot:No such file or directory
*** Error code 1

Stop in /stuff/parrot/play-jit.
$ ./test_parrot examples/assembly/mops.pbc
Iterations:    100000000
Estimated ops: 200000000
Elapsed time:  76.224550
M op/s:        2.623827

happy (but slow). Or for more speed:

$ ./test_parrot -j examples/assembly/mops.pbc
Iterations:    100000000
Estimated ops: 200000000
Elapsed time:  4.011099
M op/s:        49.861645

I can't work out a portable non-hacky way to add the ./ on Unix.
No, I'm not going to add . to my PATH.

Nicholas Clark
RE: parrot rx engine
--- Brent Dax [EMAIL PROTECTED] wrote:
> Tim Bunce:
> # On Thu, Jan 31, 2002 at 05:15:49PM +0000, Graham Barr wrote:
> #
> # Especially as the perl6 rx engine will have to be able to work
> # directly on non-trivial things like streams and generators and
> # suchlike.
>
> I have a suggestion similar to the ops suggestion but more flexible:
> Regex vtables. We'd probably need three:
>
> 	-normal text match
> 	-case-folded text match
> 	-generic sequence match (the stuff Larry's been talking about)

Hmm... based on what I've read in Larry's message and the unicode spec,
some of this could be spirited away into a customizable and/or chained
unicode string iterator.

For instance, it (the iterator) could return case-folded (or not)
characters, it could convert pairs into Ps/Pe quote pairs (for code
parsers) and remove comments (yay), and it could return locale-based
graphemes (I'm scared). Since graphemes at least will be multi-character
in some locales, I see how my objection to rx_literal was a Bad Thing.

And I expect to be able to write a grapheme-sending unicode string
iterator and plug it into a regex and have it DWIM in my Distant
$future, right?

Perhaps the backtracking mechanism should be *in* the iterator? Maybe
the iterator will be the home of some locale evil? Could we make it
handle locale character-ranges [a-o'] too? Okay, that last one's a bit
much. Still, Larry did mention that business with generalized
backtracking and bookkeeping... I can't wait for Apocalypse 5.

Is there a custom iterator syntax/convention in parrot?

I hope I don't give Larry any *new* scary ideas for the next
apocalypse. This is just for entertainment purposes, after all
</disclaimer>.

Ashley the Zealot
Re: parrot rx engine
On Thu, Jan 31, 2002 at 12:50:52PM -0800, Brent Dax wrote:
>
> Let me know if I'm brilliant, on crack, or both with this idea.

I've no idea :-)

Tim.
ARM JIT (just about)
This just about implements a jit for ARM. It doesn't actually do any ops in
assembler yet, except for end.

It's named on the basis that it's for v3 or later instructions. (I may have
all the names slightly wonky, but IIRC v3 is ARM600 and later cores.
StrongARM and ARM8 are v4, but the machine I've got has other hardware that
won't cope with the half word loads that v4 brings.)

Strictly it's something like "little endian, APCS 32, ARM v3" [is it even
APCS-R? (Arm Procedure Call Standard). As it's using a frame pointer does
that mean there's more that should be in the name? Not that gdb thinks that
I got the frame pointer correct]

Would it be useful for parrot to be able to use the 32*32=64 bit multiply
instructions that come in post ARM v3?

Problems that I remember that I encountered. (Comments in the code may
indicate more). Part of these were understanding things - it doesn't mean
that the current way is wrong, just that it wasn't obvious to me :-(

1: '}' is a necessary character in ARM assembler syntax, so jit2h.pl needs
   to be a bit smarter about deciding when to chop the end of a function

2: There is no terse way to load arbitrary 32 bit constants into a register
   with ARM instructions. There are 2 usual methods

   1: Put the constant in a constant pool within +- 4092 or so bytes of the
      PC, and load it with an offset from the PC.
   2: Make it with 1, 2 or 3 instructions. I believe that currently it is
      conjectured that it is possible to make any 32 bit value with 3 ARM
      instructions, and so far no-one has found any value that they
      couldn't make, but no-one has proved it possible and thereby made an
      algorithm that lets a program generate instructions to build a
      constant

   Either way, I found I was fighting the current jit which expects (at
   worst) to be able to split a 32 bit constant into 2 (possibly unequal)
   halves stored in two machine instructions.

   To be more flexible jit would need to know what some CPU registers
   contain (ie things like the current interpreter pointer), and be able to
   choose whether to get a value or pointer by arithmetic from a CPU
   register, by dereferencing a CPU register (possibly with offset) or by
   giving up and loading a constant

   This will make more sense to anyone who gets hold of an ARM machine and
   then tries to write ops :-)

3: I wanted to put the pointer to the current interpreter in r7. This made
   the default precompiled call function have its branch somewhere wonky.
   It seems to me that Parrot::Jit->call should be returning a 2 item list:
   the bytecode, and the offset of the branching instruction in there.

4: I think in a RISC way, so expect the offset to be of the start of the
   instruction that needs butchering, not the byte within it. (How the
   sparc position was expressed confused me for a while).

it's a slow beast, particularly with -g:

$ ./test_parrot examples/assembly/mops.pbc
Iterations:    100000000
Estimated ops: 200000000
Elapsed time:  109.129854
M op/s:        1.832679

This was the first working jit, with Fix_cpcf_call() as

	ldr	r0, [r0]
	mov	pc, r0

Iterations:    100000000
Estimated ops: 200000000
Elapsed time:  65.109552
M op/s:        3.071746

This is the slightly faster jit, with Fix_cpcf_call() as

	ldr	pc, [r0]

Iterations:    100000000
Estimated ops: 200000000
Elapsed time:  60.948834
M op/s:        3.281441
Segmentation fault

Which dmesg reports as:

test_parrot: unhandled page fault at pc=0x, lr=0x (bad address=0x, code 0)

and I think it may be the irritating hardware bug care of Digital's
engineers' mistake in the early StrongARMs which causes problems on page
faults that load PC.

Anyway, it's not very tested, but it seems that just binning the runops loop
gets a 75% speedup. :-)

**Beware** - I've no idea if loading the addresses of registers actually
works. The .pm code is still from sun4Generic.pm

Nicholas Clark

--- include/parrot/jit.h~	Tue Jan 29 14:05:45 2002
+++ include/parrot/jit.h	Thu Jan 31 16:52:40 2002
@@ -22,6 +22,10 @@
 static void write_32(char *instr_end, ptrcast_t value);
 typedef void (*jit_f)(void *int_reg, void *num_reg, void *str_reg);
 #endif
+#ifdef ARMV3
+typedef void (*jit_f)(void *int_reg, void *num_reg, void *str_reg,
+                      void *cur_interpreter);
+#endif
 
 #define MAX_SUBSTITUTION 3
 
--- /dev/null	Mon Jul 16 22:57:44 2001
+++ jit/armv3/core.jit	Thu Jan 31 16:53:24 2002
@@ -0,0 +1,15 @@
+;
+; armv3_core.jit
+;
+; $Id: $
+;
+
+Parrot_end {
+	ldmea	fp, {r4, r5, r6, r7, fp, sp, pc}
+}
+
+Parrot_noop {
+# Seems that as recognises this and assembles mov r0, r0 for a nop.
+	nop
+}
+
--- /dev/null	Mon Jul 16 22:57:44 2001
+++ lib/Parrot/Jit/armv3-linux.pm	Thu Jan 31 23:36:03 2002
@@ -0,0 +1,30 @@
+#
+# Parrot::Jit;
+#
+# $Id: $
+#
+
+package Parrot::Jit;
+
+use base qw(Parrot::Jit::armv3Generic);
+
Re: strings: sequence-of-integer ... list of chunks
On Thu, 31 Jan 2002, Dan Sugalski wrote:

> There is an issue of time--what do we do, for
> example, in the case:
>
>    my $pi = Pi::Generate;
>    if ($pi =~ /[a-z]/) {
>      print "There's a letter in here!\n";
>    }
>
> if Pi::Generate returns a generator object that will calculate pi for
> you to however far you want, that regex will run forever or until it
> runs out of memory, whichever comes first.
> --

	Just a thought...the following would be *really* cool:

my $pi = Pi::Generate;

# Check the first 200 characters only; halt w/success if NO match
print "There's a letter in here!\n" if ($pi =~ /[a-z]/h200t);

# Check the first 200 characters only; halt w/failure if NO match
print "There's a letter in here!\n" if ($pi =~ /[a-z]/h200f);

This would be useful for cases where you might be dealing with
infinite data, or when you are only going to need to use the first section
of a string.

Dave
Re: strings: sequence-of-integer ... list of chunks
On Thursday 31 January 2002 21:03, Dave Storrs wrote:
> 	Just a thought...the following would be *really* cool:
>
> my $pi = Pi::Generate;
>
> # Check the first 200 characters only; halt w/success if NO match
> print "There's a letter in here!\n" if ($pi =~ /[a-z]/h200t);

print "There's a letter in here!\n" if ($pi !~ /^.{0,199}?[a-z]/);

> # Check the first 200 characters only; halt w/failure if NO match
> print "There's a letter in here!\n" if ($pi =~ /[a-z]/h200f);

print "There's a letter in here!\n" if ($pi =~ /^.{0,199}?[a-z]/);

> This would be useful for cases where you might be dealing with
> infinite data, or when you are only going to need to use the first
> section of a string.

substr, maybe?

-- 
Bryan C. Warnock
[EMAIL PROTECTED]
Re: strings: sequence-of-integer ... list of chunks
On Thursday 31 January 2002 22:03, Bryan C. Warnock wrote:

junk.  Too tired, I missed the point entirely.

> On Thursday 31 January 2002 21:03, Dave Storrs wrote:
> > 	Just a thought...the following would be *really* cool:
> >
> > my $pi = Pi::Generate;
> >
> > # Check the first 200 characters only; halt w/success if NO match
> > print "There's a letter in here!\n" if ($pi =~ /[a-z]/h200t);
>
> print "There's a letter in here!\n" if ($pi !~ /^.{0,199}?[a-z]/);

print "There's a letter in here!\n" if (substr($pi, 0, 200) !~ /[a-z]/);

> > # Check the first 200 characters only; halt w/failure if NO match
> > print "There's a letter in here!\n" if ($pi =~ /[a-z]/h200f);
>
> print "There's a letter in here!\n" if ($pi =~ /^.{0,199}?[a-z]/);

print "There's a letter in here!\n" if (substr($pi, 0, 200) =~ /[a-z]/);

> > This would be useful for cases where you might be dealing with
> > infinite data, or when you are only going to need to use the first
> > section of a string.
>
> substr, maybe?

I said it, but didn't do it.  (My first examples weren't extensible past a
single letter.)

-- 
Bryan C. Warnock
[EMAIL PROTECTED]
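
The substr idea amounts to putting a hard limit on how much of the generator the match is allowed to consume. A minimal sketch in C, assuming a generator is just a function that yields the next character; `Generator`, `has_lower_within` and `endless_digits` are hypothetical names made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* A generator yields the next character, or -1 when it is exhausted. */
typedef int (*Generator)(void *state);

/* Look for [a-z] in at most the first `limit` items. Returns 1 if a
 * letter was found, 0 if the limit or the end was reached first. */
int has_lower_within(Generator next, void *state, size_t limit)
{
    size_t i;

    assert(next != NULL);
    for (i = 0; i < limit; i++) {
        int c = next(state);
        if (c < 0)
            return 0;               /* generator ran dry */
        if (c >= 'a' && c <= 'z')
            return 1;
    }
    return 0;                       /* gave up after `limit` items */
}

/* Toy endless generator: cycles through the digits '0'..'9' forever.
 * `state` points to an unsigned counter of items produced so far. */
int endless_digits(void *state)
{
    unsigned *produced = state;
    return '0' + (int)((*produced)++ % 10);
}
```

Calling `has_lower_within(endless_digits, &counter, 200)` returns after exactly 200 items, which is the behaviour the unbounded regex lacks. It corresponds to the `substr($pi, 0, 200) =~ /[a-z]/` form rather than to the proposed /h200t and /h200f modifiers.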
Re: [COMMIT] PerlArray fixes
> 2 - Add the PMC type to the array and hash indices

Poke poke. :)

This would be useful; is anyone working on this in the near term?

Also, just curious, how do we plan to unify the get_index_* stuff into one
function? By returning a PMC instead of a specific type?

-Melvin