from:"Brian Wheeler"

Re: segfault in parrot using pugs PIR output

2005-06-30 Thread Brian Wheeler

On Thu, 2005-06-30 at 19:36 +0200, Leopold Toetsch wrote:
 Brian Wheeler wrote:
  Its been a while since I tinkered with parrot so I thought I'd start
  playing again...but I've hit a segfault.
 
 Should of course not happen... But it seems that the codegen is mixing 
 old and new calling conventions.
 
 The trace:
 
8233 set I0, 1- I0=1,
8236 set I1, 0- I1=0,
8239 set I2, 0- I2=0,
8242 set I3, 0- I3=0,
8245 set I4, 0- I4=0,
8248 set S1,- S1=,
8251 set P0, PMC_C[733]   - P0=Sub=PMC(0x8af24b8 pc:8023),
8254 invokecc
 
 clearly shows some wrong function call, like another one I found in the 
 pir source:
 
  .namespace ['main']
 
  main()
 
 This shorthand call syntax doesn't work together (yet) with the new 
 set_args/get_params ... opcodes.
 


Ok, that seems reasonable, considering the newness of the new calling
conventions.


 So please verify that no function shortcut syntax is used in code 
 generation.
 
 leo
 

Thanks!

Not knowing Haskell (who said _perl_ looks like line noise?) this patch
to src/Pugs/CodeGen/PIR.hs gets it working to the point I can test an
optimizer I'm tinkering with:

@@ -320,6 +320,7 @@
 varInit ('':_) = text $ PerlScalar
 varInit x   = error $ invalid name:  ++ x

+
 {-| Compiles the current environment to PIR code. -}
 genPIR :: Eval Val
 genPIR = do
@@ -374,11 +375,17 @@
 -- XXX wrong, should be lexical
 , InsNew tempPMC PerlScalar
 , store_global.- [lit $_, tempPMC]
-]) ++ [ StmtRaw (text (name ++ ())) | PSub
name@('_':'_':_) _ _ _ - globPIL ] ++
-[ StmtRaw (text main())
-, StmtIns (exit .- [lit0])
-]
-, DeclSub main [SubANON] [ StmtRaw $ emit mainPIR ]
+   ]  ++
+{- ) ++ [ StmtRaw (text (name ++ ())) | PSub name@('_':'_':_) _ _ _
- globPIL ] ++ -}
+   [ InsNew tempPMC PerlScalar
+   , tempPMC -- find_name $ [lit __main]
+   , set_args.- [lit ()]
+   , InsNew tempPMC2 PerlScalar
+   , get_results .-[lit (0), tempPMC2]
+   , invokecc.-[tempPMC]
+   , exit.-[lit0]
+   ])
+, DeclSub __main {- [SubANON] -} [] [ StmtRaw $ emit
mainPIR ]
 ] ] ]
 where
 style = MkEvalStyle


Brian

Re: eof opcode

2004-11-29 Thread brian wheeler

Fair enough.  However, shouldn't the rest of the opcodes with an IO
object as their parameter be methods as well?  Its not a lot of ops, but
it would trim down the core a bit.

Brian


On Thu, 2004-11-25 at 08:00 +0100, Leopold Toetsch wrote:
 Brian Wheeler [EMAIL PROTECTED] wrote:
  I noticed a hole in the io.ops where the PIO stuff wasn't covered.  This
  patch creates an eof opcode which checks for end of file.
 
 Please just use the eof method of the PIO object:
 
   $I0 = $P0.eof()
 
 leo
-- 
brian wheeler [EMAIL PROTECTED]

Re: Problems with 0.1.1 release on x86-64

2004-10-19 Thread Brian Wheeler

Sigh.  I'll get this right sometime!

Brian

Index: config/auto/jit.pl
===
RCS file: /cvs/public/parrot/config/auto/jit.pl,v
retrieving revision 1.33
diff -u -r1.33 jit.pl
--- config/auto/jit.pl  8 Mar 2004 08:49:05 -   1.33
+++ config/auto/jit.pl  19 Oct 2004 18:38:41 -
@@ -171,9 +171,9 @@
   else {
 Configure::Data-set(
   jitarchname = 'nojit',
-  jitcpuarch  = 'i386',
-  jitcpu  = 'I386',
-  jitosname   = 'nojit',
+  jitcpuarch  = $cpuarch,
+  jitcpu  = $cpuarch,
+  jitosname   = $osname,
   jitcapable  = 0,
   execcapable = 0,
   cc_hasjit   = '',
Index: config/auto/memalign.pl
===
RCS file: /cvs/public/parrot/config/auto/memalign.pl,v
retrieving revision 1.10
diff -u -r1.10 memalign.pl
--- config/auto/memalign.pl 13 Oct 2004 14:37:59 -  1.10
+++ config/auto/memalign.pl 19 Oct 2004 18:38:41 -
@@ -42,6 +42,13 @@
Configure::Data-set('malloc_header', 'stdlib.h');
 }
 
+if (Configure::Data-get('ptrsize') == Configure::Data-get('intsize')) {
+   Configure::Data-set('ptrcast','int');
+  }
+else {
+   Configure::Data-set('ptrcast','long');
+  }
+
 cc_gen('config/auto/memalign/test_c.in');
 eval { cc_build(); };
 unless ($@ || cc_run_capture() !~ /ok/) {
Index: config/auto/memalign/test_c.in
===
RCS file: /cvs/public/parrot/config/auto/memalign/test_c.in,v
retrieving revision 1.4
diff -u -r1.4 test_c.in
--- config/auto/memalign/test_c.in  13 Jul 2003 18:52:37 -  1.4
+++ config/auto/memalign/test_c.in  19 Oct 2004 18:38:41 -
@@ -9,6 +9,6 @@
 
 int main(int argc, char **argv) {
void *ptr = memalign(256, 17);
-   puts(ptr  ((int)ptr  0xff) == 0 ? ok : nix);
+   puts(ptr  ((${ptrcast})ptr  0xff) == 0 ? ok : nix);
return 0;
 }
Index: config/auto/memalign/test_c2.in
===
RCS file: /cvs/public/parrot/config/auto/memalign/test_c2.in,v
retrieving revision 1.3
diff -u -r1.3 test_c2.in
--- config/auto/memalign/test_c2.in 13 Jul 2003 18:52:37 -  1.3
+++ config/auto/memalign/test_c2.in 19 Oct 2004 18:38:41 -
@@ -20,6 +20,6 @@
 *  arbitrary allocation size)
 */
int i = posix_memalign(p, s, 177);
-   puts(((int)p  0xff) == 0  i == 0 ? ok : nix);
+   puts(((${ptrcast})p  0xff) == 0  i == 0 ? ok : nix);
return i;
 }

Re: Problems with 0.1.1 release on x86-64

2004-10-18 Thread Brian Wheeler

Sorry for the delay...work interfered with my playing and I had to
transfer my CVS repo to my x86-64 machine.   I don't know if I'd
classify it as silence thereafter... as in the summary, but its pretty
close :)

Here's the diff against the current CVS.  It doesn't mess with the
default class that needs the split for the return  exception.

Brian Wheeler
[EMAIL PROTECTED]


Index: config/auto/jit.pl
===
RCS file: /cvs/public/parrot/config/auto/jit.pl,v
retrieving revision 1.33
diff -u -r1.33 jit.pl
--- config/auto/jit.pl  8 Mar 2004 08:49:05 -   1.33
+++ config/auto/jit.pl  18 Oct 2004 05:25:57 -
@@ -171,9 +171,9 @@
   else {
 Configure::Data-set(
   jitarchname = 'nojit',
-  jitcpuarch  = 'i386',
-  jitcpu  = 'I386',
-  jitosname   = 'nojit',
+  jitcpuarch  = $cpuarch,
+  jitcpu  = $cpuarch,
+  jitosname   = $osname,
   jitcapable  = 0,
   execcapable = 0,
   cc_hasjit   = '',
Index: config/auto/memalign.pl
===
RCS file: /cvs/public/parrot/config/auto/memalign.pl,v
retrieving revision 1.10
diff -u -r1.10 memalign.pl
--- config/auto/memalign.pl 13 Oct 2004 14:37:59 -  1.10
+++ config/auto/memalign.pl 18 Oct 2004 05:25:57 -
@@ -42,6 +42,13 @@
Configure::Data-set('malloc_header', 'stdlib.h');
 }
  
+if (Configure::Data-get('ptrsize') == Configure::Data-get('intsize')) {
+   Configure::Data-set('ptrcast','int');
+  }
+else {
+   Configure::Data-set('ptrcast','long');
+  }
+
 cc_gen('config/auto/mema

On Thu, 2004-10-14 at 06:37, Leopold Toetsch wrote:
 Brian Wheeler wrote:
  
  * cast warnings in default.pmc.  Changing static int cant_do_method to
  static long cant_do_method makes it compile without warnings, but its
  not the right fix. 
 
 Better would be to split the return statement and the exception in the 
 generated code.
 
  Below is a patch which fixes the first 3.
 
 Doesn't apply. Please rediff to current CVS and attach the patch.
 
 Thanks,
 leolign/test_c.in');
 eval { cc_build(); };
 unless ($@ || cc_run_capture() !~ /ok/) {
Index: config/auto/memalign/test_c.in
===
RCS file: /cvs/public/parrot/config/auto/memalign/test_c.in,v
retrieving revision 1.4
diff -u -r1.4 test_c.in
--- config/auto/memalign/test_c.in  13 Jul 2003 18:52:37 -  1.4
+++ config/auto/memalign/test_c.in  18 Oct 2004 05:25:57 -
@@ -9,6 +9,6 @@
  
 int main(int argc, char **argv) {
void *ptr = memalign(256, 17);
-   puts(ptr  ((int)ptr  0xff) == 0 ? ok : nix);
+   puts(ptr  ((${ptrcast})ptr  0xff) == 0 ? ok : nix);
return 0;
 }
Index: config/auto/memalign/test_c2.in
===
RCS file: /cvs/public/parrot/config/auto/memalign/test_c2.in,v
retrieving revision 1.3
diff -u -r1.3 test_c2.in
--- config/auto/memalign/test_c2.in 13 Jul 2003 18:52:37 -  1.3
+++ config/auto/memalign/test_c2.in 18 Oct 2004 05:25:57 -
@@ -20,6 +20,6 @@
 *  arbitrary allocation size)
 */
int i = posix_memalign(p, s, 177);
-   puts(((int)p  0xff) == 0  i == 0 ? ok : nix);
+   puts(((${ptrcast})p  0xff) == 0  i == 0 ? ok : nix);
return i;
 }

Problems with 0.1.1 release on x86-64

2004-10-14 Thread Brian Wheeler

* PARROT_CPU_ARCH is defined as i386 and PARROT_OS_NAME is nojit
when jit determination fails.  It now correctly (?) reports 'x86_64' and
'linux'.
  
* Memory alignment tests warn about a pointer size difference in cast. 
A new configuration setting ptrcast is either 'int' or 'long' depending
on what type a pointer is safe to cast to/from.  I suspect that all
cases will be 'long', but you never really know.

* cast warnings in default.pmc.  Changing static int cant_do_method to
static long cant_do_method makes it compile without warnings, but its
not the right fix. 

* Build fails with:

gcc -shared -fPIC  -g  \
-o runtime/parrot/dynext/libnci.so src/nci_test.o
/usr/bin/ld: src/nci_test.o: relocation R_X86_64_32 can not be used when
making a shared object; recompile with -fPIC
src/nci_test.o: could not read symbols: Bad value
collect2: ld returned 1 exit status
make: *** [runtime/parrot/dynext/libnci.so] Error 1

No idea how to fix this one...adding --ccflags=-fPIC doesn't help the
compile.

Below is a patch which fixes the first 3.

Brian Wheeler
[EMAIL PROTECTED]

diff -ur ../parrot-0.1.1/classes/default.pmc ./classes/default.pmc
--- ../parrot-0.1.1/classes/default.pmc 2004-10-06 10:55:36.0 -0500
+++ ./classes/default.pmc   2004-10-13 23:22:01.089477209 -0500
@@ -52,7 +52,7 @@
  
 */
  
-static int
+static long
 cant_do_method(Parrot_Interp interpreter, PMC * pmc, const char *methname)
 {
 internal_exception(ILL_INHERIT,
diff -ur ../parrot-0.1.1/config/auto/jit.pl ./config/auto/jit.pl
--- ../parrot-0.1.1/config/auto/jit.pl  2004-03-08 04:29:29.0 -0500
+++ ./config/auto/jit.pl2004-10-13 22:59:05.551927726 -0500
@@ -171,9 +171,9 @@
   else {
 Configure::Data-set(
   jitarchname = 'nojit',
-  jitcpuarch  = 'i386',
-  jitcpu  = 'I386',
-  jitosname   = 'nojit',
+  jitcpuarch  = $cpuarch,
+  jitcpu  = $cpuarch,
+  jitosname   = $osname,
   jitcapable  = 0,
   execcapable = 0,
   cc_hasjit   = '',
diff -ur ../parrot-0.1.1/config/auto/memalign/test_c2.in 
./config/auto/memalign/test_c2.in
--- ../parrot-0.1.1/config/auto/memalign/test_c2.in 2003-07-13 13:52:58.0 
-0500
+++ ./config/auto/memalign/test_c2.in   2004-10-13 23:09:54.87570 -0500
@@ -20,6 +20,6 @@
 *  arbitrary allocation size)
 */
int i = posix_memalign(p, s, 177);
-   puts(((int)p  0xff) == 0  i == 0 ? ok : nix);
+   puts(((${ptrcast})p  0xff) == 0  i == 0 ? ok : nix);
return i;
 }
diff -ur ../parrot-0.1.1/config/auto/memalign/test_c.in 
./config/auto/memalign/test_c.in
--- ../parrot-0.1.1/config/auto/memalign/test_c.in  2003-07-13 13:52:58.0 
-0500
+++ ./config/auto/memalign/test_c.in2004-10-13 23:09:59.903596577 -0500
@@ -9,6 +9,6 @@
  
 int main(int argc, char **argv) {
void *ptr = memalign(256, 17);
-   puts(ptr  ((int)ptr  0xff) == 0 ? ok : nix);
+   puts(ptr  ((${ptrcast})ptr  0xff) == 0 ? ok : nix);
return 0;
 }
diff -ur ../parrot-0.1.1/config/auto/memalign.pl ./config/auto/memalign.pl
--- ../parrot-0.1.1/config/auto/memalign.pl 2004-03-13 13:46:24.0 -0500
+++ ./config/auto/memalign.pl   2004-10-13 23:09:48.870037359 -0500
@@ -39,6 +39,14 @@
Configure::Data-set('malloc_header', 'stdlib.h');
 }
  
+if (Configure::Data-get('ptrsize') == Configure::Data-get('intsize')) {
+   Configure::Data-set('ptrcast','int');
+  }
+else {
+   Configure::Data-set('ptrcast','long');
+}
+
+
 cc_gen('config/auto/memalign/test_c.in');
 eval { cc_build(); };
 unless ($@ || cc_run_capture() !~ /ok/) {

Re: Python bytecode notes, part one

2004-07-06 Thread brian wheeler

On Mon, 2004-07-05 at 19:27, Dan Sugalski wrote:

UNARY_NEGATIVE: restore $Px; new $Py, Undef; $Py = $Px * -1; save $Py |
$Px = -1 * $Px

Wouldn't something this do what is desired?  I'm just guessing, though.

Brian

Current CVS broken?

2003-06-09 Thread Brian Wheeler

Its been a while since I've looked at parrot, so I did a cvs update
-d, perl Configure.pl, make clean, make and build failed:

$ make
gcc -o parrot -L/usr/local/lib   test_main.o blib/lib/libparrot.a -lnsl
-ldl -lm -lpthread -lcrypt -lutil
blib/lib/libparrot.a(jit_cpu.o)(.text+0x2ce0): In function
`Parrot_jit_restart_op':
: undefined reference to `Parrot_end_jit'
collect2: ld returned 1 exit status
make: *** [parrot] Error 1


It can't find Parrot_end_jit.  Sure enough, its not there:

$ nm blib/lib/libparrot.a | grep Parrot_end_jit
 U Parrot_end_jit
$ grep --recursive Parrot_end_jit *
include/parrot/jit_emit.h:static void Parrot_end_jit(Parrot_jit_info_t
*, struct Parrot_Interp * );
include/parrot/jit_emit.h:Parrot_end_jit(jit_info, interpreter);
jit/i386/jit_emit.h:static void Parrot_end_jit(Parrot_jit_info_t *,
struct Parrot_Interp * );
jit/i386/jit_emit.h:Parrot_end_jit(jit_info, interpreter);


Did I miss something obvious?

Brian Wheeler
[EMAIL PROTECTED]

Re: Current CVS broken?

2003-06-09 Thread Brian Wheeler

On Mon, 2003-06-09 at 15:23, Leopold Toetsch wrote:
 Brian Wheeler [EMAIL PROTECTED] wrote:
  Its been a while since I've looked at parrot, so I did a cvs update
  -d, perl Configure.pl, make clean, make and build failed:
 
 $ make realclean
 
 should help. We don't have all dependencies in generated files sorted out
 that far - sorry.
 

Doesn't, I'm afraid.  I removed everything but the CVS directory at the
top level, repopulated everything and tried to build it again with the
same results

Any thoughts?

Brian


  It can't find Parrot_end_jit.  Sure enough, its not there:
 
 Its in jit_cpu.c generated from core.jit.
 
  Brian Wheeler
 
 leo

RE: Parrot 0.0.9

2002-10-29 Thread brian wheeler

On Tue, 2002-10-29 at 11:48, Brent Dax wrote:
 [EMAIL PROTECTED]:
 #  Well, on thinking a bit about this, there's no reason that 
 # we have to 
 #  worry--it's perfectly OK for us to declare, unconditionally, that 
 #  segment 0 is always bytecode, 1 line number info, and so on, with 
 #  everything after position X (for some value of X) left up 
 # in the air. 
 #  A bit dodgy, true, as it means that any new known segment 
 # types we add 
 #  in will be floating, but I don't think we're going to end 
 # up with too 
 #  many performance-critical pieces in the bytecode. (Arguably 
 # it's just 
 #  the bytecode itself, the symbols, and the constants, as the 
 # rest are 
 #  looked at under exceptional circumstances or on (rare) demand)
 # 
 # *No*
 # 
 # This really kills extendability, or makes it at least very 
 # ugly. It needs to prealloc a certain number of segments. Each 
 # of this has a fixed semantic. Extending means consuming on of 
 # the preallocated fields, or using some segment beyond the 
 # preallocated area but then it needs a type field. In fact the 
 # preallocated segments also have a
 # type-field: the position in the packfile.
 
 How about this structure:
 
   HEADER
   SEGMENT 0
   CHUNK 0 (DIRECTORY)
   SIZE:
   DATA:
   CHUNK 0 ENTRY
   TYPE: DIRECTORY (type 0)
   OFFSET:

Is this really necessary?  Seems like a chicken-and-egg thing:  to know
which chuck the directory is in, you need to read the directory. 
However, since you've defined that the first chunk (0) is always the
directory, there's really no need to have it in the directory since you
know it has to be the first chunk.


Out of curiosity, would I need a separate segments if I was going to
have multiple versions of the program (say, one debugging and one
optimized) in the same file?  It looks that way.  Will the segment/chunk
ids's be consistent between builds  how do I know what they will be in
advance (for dynamically loading the 'debugging' version on demand) ?

Brian

Re: [netlabs #757] Problem mixing labels, comments and quote-marks

2002-07-13 Thread brian wheeler


On Sat, 2002-07-13 at 12:32, Tom Hughes wrote:
 In message 20020703012231$[EMAIL PROTECTED]
   Simon Glover (via RT) [EMAIL PROTECTED] wrote:
 
   This code:
  
A:# prints a
print a
end
  
doesn't assemble; the assembler dies with the error message:
  
  Use of uninitialized value in hash element at assemble.pl line 844.
  Couldn't find operator '' on line 1.
  
If you remove the s from the comment, it works fine. Likewise, if
you put the label, op and comment on the same line, ie:
  
 A: print a   # prints a
   end
  
then it assembles and runs OK.
 
 Here's a patch that will fix this. I havn't committed it because I'm
 not sure why the assember wasn't dropping comments that included quotes
 so I'm giving people who know more about the assembler than me a chance
 to comment first...


I believe it wasn't dropping the comments with quotes as a side effect
of not wanting to break things like:
print #

which breaks with the included patch.  I basically had the same patch
you do, but wasn't able to figure out how to handle the above case *and*
do the right thing with  # prints a


Brian



 
 Tom
 
 -- 
 Tom Hughes ([EMAIL PROTECTED])
 http://www.compton.nu/
 
 

 Index: assemble.pl
 ===
 RCS file: /cvs/public/parrot/assemble.pl,v
 retrieving revision 1.77
 diff -u -r1.77 assemble.pl
 --- assemble.pl   4 Jul 2002 18:36:17 -   1.77
 +++ assemble.pl   13 Jul 2002 17:30:48 -
 @@ -433,7 +433,7 @@
  
$self-{pc}++;
return if $line=~/^\s*$/ or $line=~/^\s*#/; # Filter out the comments and blank 
lines
 -  $line=~s/#[^']+$//;   # Remove trailing comments
 +  $line=~s/#.*$//;   # Remove trailing comments
$line=~s/(^\s+|\s+$)//g;   # Remove leading and trailing whitespace
#
# Accumulate lines that only have labels until an instruction is found..

[Patch] typeof op

2002-06-29 Thread brian wheeler


This patch implements a typeof op which returns the integer type or
string type of a PMC.

The test I used is:

new P0,.PerlInt
typeof S0,P0
eq S0,PerlInt,OK_1
print  not 
OK_1:
print  ok 1\\n
typeof I0,P0
eq I0,.PerlInt,OK_2
print  not 
OK_2:
print  ok 2\\n
end


This should allow a program to know what kind of PMC they're playing
with.

Brian


diff -u -r1.174 core.ops
--- core.ops28 Jun 2002 18:50:05 -  1.174
+++ core.ops29 Jun 2002 18:49:32 -
 -3884,6 +3884,27 
   goto NEXT();
 }
 
+
+=item Btypeof(out STR, in PMC)
+
+=item Btypeof(out INT, in PMC)
+
+Return the type of PMC in $2.
+
+=cut
+
+inline op typeof (out STR, in PMC) {
+  PMC *p = $2;
+  $1 = (p-vtable-name(interpreter, p));
+  goto NEXT();
+}
+
+inline op typeof (out INT, in PMC) {
+  PMC *p = $2;
+  $1 = (p-vtable-type(interpreter, p));
+  goto NEXT();
+}
+
 
 
 =item Bfind_type(out INT, in STR)

Re: .include directive for new assembler

2002-06-22 Thread brian wheeler


On Sat, 2002-06-22 at 13:06, Jeff wrote:
 Clinton A Pierce wrote:
  
  At 09:37 PM 6/21/2002 -0500, brian wheeler wrote:
  I've implemented a .include directive for the new assembler.  It
  basically changes the preprocessor to shift through the source file, and
  when an include is found, the included file is unshifted to the
  beginning.
  
  To the beginning?  Do we have any pre-processor directives (.constant,
  etc...) that are sensitive to where they're done?
  
  Either way, I'm fine with it.  As soon as it's in, I'll fix BASIC to remove
  the hand-rolled .include stuff it has now.
 
 I hadn't read the 'beginning' part, unfortunately. That actually could
 change a lot of things. If it's unshifted onto the beginning of the
 file, then each new file will become the main as it's unshifted onto the
 beginning. For example, any code outside of a macro declaration would
 get run before the main file, and potentially initialize in the wrong
 order.
 
 To be fair, this is more of a fault with the current assembler not
 having built-in support for subroutine boundaries. Correct me please,
 Brian, but I'm envisioning a situation like this:
 
  .include foo.pasm
  .include bar.pasm
 
 MAIN:
   print hi
   end
 
 Expanding to:
 
 Initialize_Bar:
   set I0,32
   branch END_OF_BAR # This is, of course, something of a bogus mechanism
 to begin with, but it should make my point.
  .constant IO_VECTOR S31
 # Some subroutines
 END_OF_BAR:
 Initialize_Foo:
   set I1,42
   branch END_OF_FOO # See above comment
  .constant IO_VECTOR S30
 END_OF_FOO:
 
 MAIN:
   print hi
   end
 
 The code is admittedly highly contrived, but note that initialization is
 being done in the wrong order with respect to the files (foo.pasm's init
 should be run then bar.pasm), and IO_VECTOR is defined to be S30 rather
 than S31. The Initialize_Bar and Initialize_Foo functions have just been
 called in the wrong order.
 
 Now, I realize that modifying that bit of the assembler to count its
 position in the array of lines is non-trivial, and that's entirely my
 fault. However, having it attach files to the beginning of the list
 would violate the principle of least surprise, and could cause problems
 with conditional macros, which I've designed but haven't seen a need to
 implement yet.
 
 If you wouldn't mind rewriting the patch so that it substitutes files
 inline at the point of .include'ing I'll be happy to revert the old
 patch and put your new one in. Sorry about the seeming change of mind,
 but this is what I get for affirming stuff at obscenely late hours.


Its not backwards, it does the right thing.  The at the beginning part
is correct since the source is treated like a queue, and the first item
is shifted off.  When an .include is found, the file is inserted at the
beginning of the queue, since all of the code prior to the .include have
been removed (shifted) off.

So this (also contrived) example works as expected:

test.pasm
--
print yo!\n
..include foo.pasm
..include foo2.pasm
print you are here\n
end
--

foo.pasm
--
print hello 
--

foo2.pasm
--
print world\n
--

Does that make more sense?

Brian




 --
 Jeff [EMAIL PROTECTED]

Re: .include directive for new assembler

2002-06-22 Thread brian wheeler


On Sat, 2002-06-22 at 20:12, Jeff wrote:
 brian wheeler wrote:
  
  Its not backwards, it does the right thing.
 
 Okay, I believe you now :) I was thinking that the insert was done at
 the beginning of the -file-, not the insertion point of the file. If you
 haven't committed, feel free to do so. I shouldn't have stuck my nose in
 :)
 


no problem! I have a tendency to not actually test my code thoroughly!

Brian



   The at the beginning part
  is correct since the source is treated like a queue, and the first item
  is shifted off.  When an .include is found, the file is inserted at the
  beginning of the queue, since all of the code prior to the .include have
  been removed (shifted) off.
  
  So this (also contrived) example works as expected:
  
  test.pasm
  --
  print yo!\n
  .include foo.pasm
  .include foo2.pasm
  print you are here\n
  end
  --
  
  foo.pasm
  --
  print hello 
  --
  
  foo2.pasm
  --
  print world\n
  --
  
  Does that make more sense?
  
  Brian
  
   --
   Jeff [EMAIL PROTECTED]

.include directive for new assembler

2002-06-21 Thread brian wheeler


I've implemented a .include directive for the new assembler.  It
basically changes the preprocessor to shift through the source file, and
when an include is found, the included file is unshifted to the
beginning.


Should I commit it?

Brian

--- assemble.pl 17 Jun 2002 03:18:17 -  1.74
+++ assemble.pl 22 Jun 2002 02:39:15 -
 -227,7 +227,9 
   my $reg_re   = qr([INSP]\d+);
   my $num_re   = qr([-+]?\d+(\.\d+([eE][-+]?\d+)?)?);
 
-  for({$self-{cur_contents}}) {
+  my todo=@{$self-{cur_contents}};
+  while(scalar(todo)) {
+$_=shift(todo);
 $line++;
 
 #
 -263,16 +265,19 
 elsif(/^\.include \s+
([^]+)
   /x) {# .include {file}
-#  if(-e $1) {
-#open FOO, $1;
-#while(FOO) {
-#  chomp;
-#}
-#close FOO;
-#  }
-#  else {
-#print STDERR Couldn't open '$1' for inclusion at line $line: $!.\n;
-#  }
+  if(-e $1) {
+open FOO, $1;
+my include;
+while(FOO) {
+  chomp;
+  push(include,$_);
+}
+unshift(todo,include);
+close FOO;
+  }
+  else {
+print STDERR Couldn't open '$1' for inclusion at line $line: $!.\n;
+  }
 }
 elsif(/^\.macro\s+
($label_re) \s*

Re: quicksort in pasm

2002-05-25 Thread brian wheeler


On Fri, 2002-05-24 at 15:10, Sean O'Rourke wrote:
  I was starting with a very simple test to decide how to determine where the
  memory overuse was coming from,
 
 I'm actually looking at this now as well, though with zip2.pasm instead of
 quicksort.  What I've found is that because zip constructs the result with
 incremental packing, and since the 3-argument pack is implemented through
 string_concat, which allocates a new string every time, it chews through
 vast quantities of memory.  At some point, these buffers (which are now
 ~45k each) stop getting freed, and Parrot grows from  10M to about 200M
 before my laptop dies a horrible death.
 
 Sort does things much differently, but I wouldn't be surprised if it's
 hittng the same things-not-being-reclaimed bug as zip.
 
  [interesting sorting results]
  Am I missing something somewhere?
 
 You've probably got it in EBCDIC mode or something.  That or it's a
 randomized quicksort, which is supposed to be faster.
 
 /s

Actually I was getting the wrong results because I was using ge  le to
test the bounds when I should have used gt  lt.

The attached quicksort fixes it.  Its still _huge_ when the memory runs.


Looking at the profile data, it looks like Parrot_pushs is the big time
user...
  %   cumulative   self  self total   
 time   seconds   secondscalls  ns/call  ns/call  name
 66.67  0.02 0.02 singlebyte_decode
 33.33  0.03 0.01  646 15479.88 15479.88  Parrot_pushs


Of course, if I understood the calling convention, I might get away with
less pushs's :)

Brian

Re: quicksort in pasm

2002-05-25 Thread brian wheeler


Crud, I forgot to attach the quicksort in the last one...

Brian


On Sat, 2002-05-25 at 21:17, brian wheeler wrote:
 On Fri, 2002-05-24 at 15:10, Sean O'Rourke wrote:
   I was starting with a very simple test to decide how to determine where the
   memory overuse was coming from,
  
  I'm actually looking at this now as well, though with zip2.pasm instead of
  quicksort.  What I've found is that because zip constructs the result with
  incremental packing, and since the 3-argument pack is implemented through
  string_concat, which allocates a new string every time, it chews through
  vast quantities of memory.  At some point, these buffers (which are now
  ~45k each) stop getting freed, and Parrot grows from  10M to about 200M
  before my laptop dies a horrible death.
  
  Sort does things much differently, but I wouldn't be surprised if it's
  hittng the same things-not-being-reclaimed bug as zip.
  
   [interesting sorting results]
   Am I missing something somewhere?
  
  You've probably got it in EBCDIC mode or something.  That or it's a
  randomized quicksort, which is supposed to be faster.
  
  /s
 
 Actually I was getting the wrong results because I was using ge  le to
 test the bounds when I should have used gt  lt.
 
 The attached quicksort fixes it.  Its still _huge_ when the memory runs.
 
 
 Looking at the profile data, it looks like Parrot_pushs is the big time
 user...
   %   cumulative   self  self total   
  time   seconds   secondscalls  ns/call  ns/call  name
  66.67  0.02 0.02 singlebyte_decode
  33.33  0.03 0.01  646 15479.88 15479.88  Parrot_pushs
 
 
 Of course, if I understood the calling convention, I might get away with
 less pushs's :)
 
 Brian
 
 
 
 



#
# quicksort.pasm
#
# Author: Brian Wheeler ([EMAIL PROTECTED])
#
# Usage:
#./parrot quicksort.pbc  /file/to/quicksort
#   
main:
new P0,PerlArray
set I1,0
m1: readlineS0,0
length  I0,S0
eq  I0,0,m2
chopn   S0,1
set_keyed   P0,I1,S0
inc I1
branch  m1

m2: set I0,0
set I1,P0
dec I1
bsr quicksort

m3: get_keyed   S0,P0,I0
print   S0
print   \n
inc I0
le  I0,I1,m3

end

#
# quicksort
#  params:
#   I0  low
#   I1  high
#   P0  array to sort
#  returns:
#   nothing
quicksort:
pushi
print   Starting quicksort with 
print   I0
printand 
print   I1
print   \n
set I11,I1  # High=high; 
set I10,I0  # Low=low;
le  I1,I0,q1# if(highlow) {
bsr partition   #  pivot=partition(p,low,high);

set I0,I10  #  quicksort(p,Low,pivot-1)
set I1,I2
dec I1
bsr quicksort

set I0,I2   #  quicksort(p,pivot+1,High)
inc I0
set I1,I11
bsr quicksort 

q1: popi# }
ret


#
# partition
#  params:
#   I0  low
#   I1  high
#   P0  array to sort
#  returns:
#   I2  pivot
partition:
pushi
pushs
get_keyed   S2,P0,I0# S2 is pivot_item
set I2,I0   # I2 is pivot index
set I10,I0  # I10 is left
set I11,I1  # I11 is right
p1: ge  I10,I11,p1e # while(left  right) {
p2: get_keyed   S0,P0,I10   #   while(p[left] = pivot_item 
gt  S0,S2,p2e
gt  I10,I11,p2e #   left = right) {
inc I10 # left++; 
branch  p2  #   }
p2e:
p3: get_keyed   S1,P0,I11   #   while(p[right]  pivot_item
le  S1,S2,p3e
lt  I11,I10,p3e #right = left) {
dec I11 # right--; 
branch  p3  #   }
p3e:
ge  I10,I11,p4  #   if(left  right) {
get_keyed   S3,P0,I10   # swap(p,left,right); 
get_keyed   S4,P0,I11
set_keyed   P0,I11,S3
set_keyed   P0,I10,S4
p4: #   }
branch  p1  # }
p1e:get_keyed   S3,P0,I11   # p[low]=p[right]; 
set_keyed   P0,I0,S3
set_keyed   P0,I11,S2   # p[right

quicksort in pasm

2002-05-24 Thread brian wheeler

1 0.00 0.02  mem_setup_allocator [41]
  0.0   0.35 0.001 0.00 0.00  mem_sys_realloc [125]
  0.0   0.35 0.001 0.00 0.02  new_hash [44]
  0.0   0.35 0.001 0.00 0.00  new_sized_resource_pool [60]
  0.0   0.35 0.001 0.00 0.01  new_tracked_header [55]
  0.0   0.35 0.001 0.00 0.00  parseflags [126]
  0.0   0.35 0.001 0.00   349.75  runops [3]
  0.0   0.35 0.001 0.00   349.75  runops_cgoto_core [4]
  0.0   0.35 0.001 0.00   349.75  runops_generic [5]
  0.0   0.35 0.001 0.00 0.00  sized_index [127]
  0.0   0.35 0.001 0.00 0.00  string_grow [64]
  0.0   0.35 0.001 0.00 0.00  string_init [128]


I don't see anything that really stands out (unlike earlier builds where
string_make was taking the most time)


Any thoughts?
Brian


#
# quicksort.pasm
#
# Author: Brian Wheeler ([EMAIL PROTECTED])
#
# Usage:
#./parrot quicksort.pbc  /file/to/quicksort
#   
main:
new P0,PerlArray
set I1,0
m1: readlineS0,0
length  I0,S0
eq  I0,0,m2
chopn   S0,1
set_keyed   P0,I1,S0
inc I1
branch  m1

m2: set I0,0
set I1,P0
dec I1
bsr quicksort

m3: get_keyed   S0,P0,I0
print   S0
print   \n
inc I0
le  I0,I1,m3

end

#
# quicksort
#  params:
#   I0  low
#   I1  high
#   P0  array to sort
#  returns:
#   nothing
quicksort:
pushi
print   Starting quicksort with 
print   I0
printand 
print   I1
print   \n
set I11,I1  # High=high; 
set I10,I0  # Low=low;
le  I1,I0,q1# if(highlow) {
bsr partition   #  pivot=partition(p,low,high);

set I0,I10  #  quicksort(p,Low,pivot-1)
set I1,I2
dec I1
bsr quicksort

set I0,I2   #  quicksort(p,pivot+1,High)
inc I0
set I1,I11
bsr quicksort 

q1: popi# }
ret


#
# partition
#  params:
#   I0  low
#   I1  high
#   P0  array to sort
#  returns:
#   I2  pivot
partition:
pushi
pushs
get_keyed   S2,P0,I0# S2 is pivot_item
set I2,I0   # I2 is pivot index
set I10,I0  # I10 is left
set I11,I1  # I11 is right
p1: ge  I10,I11,p1e # while(left  right) {
p2: get_keyed   S0,P0,I10   #   while(p[left] = pivot_item 
gt  S0,S2,p2e
ge  I10,I11,p2e #   left  right) {
inc I10 # left++; 
branch  p2  #   }
p2e:
p3: get_keyed   S1,P0,I11   #   while(p[right]  pivot_item
le  S1,S2,p3e
le  I11,I10,p3e #right  left) {
dec I11 # right--; 
branch  p3  #   }
p3e:
ge  I10,I11,p4  #   if(left  right) {
get_keyed   S3,P0,I10   # swap(p,left,right); 
get_keyed   S4,P0,I11
set_keyed   P0,I11,S3
set_keyed   P0,I10,S4
p4: #   }
branch  p1  # }
p1e:get_keyed   S3,P0,I11   # p[low]=p[right]; 
set_keyed   P0,I0,S3
set_keyed   P0,I11,S2   # p[right]=pivot_item; 
saveI11
pops
popi
restore I2  # return right;
ret

Non-vtable functionality on PMCs?

2002-05-20 Thread brian wheeler


I've been trying to catch up with parrot again (darn it, babies take
more time than I thought :) and I've come up with a question... how do
you do other things to PMCs that aren't normal ops?  In particular, I
was wondering about shift/unshift, push/pop on the PerlArray PMC. Am I
missing something obvious?

Brian

Re: .NET CLR and Parrot

2002-02-23 Thread brian wheeler


On Sat, 2002-02-23 at 13:12, Dan Sugalski wrote:
 At 12:22 PM -0500 2/23/02, Melvin Smith wrote:
 At 11:53 AM 2/23/2002 +, Simon Cozens wrote:
 I was very lucky recently to attend a talk by Ganesh Sittampalam
 introducing Microsoft .NET and the Common Language Runtime. A lot of
 what CLR is trying to do is quite similar to what we're doing with
 Parrot, so I thought it would be a good idea to briefly recap what's
 going on with CLR.
 
 I've read several specs on CLR and ILAsm; there are some good ideas.
 
 rant
 I wouldn't want Parrot to look like ILA but I wish we would at least
 have more dialogue on things like pbc format, adding directives to
 the language, etc.
 
 Then lets start. I've not been giving the assembler much attention 
 mainly because I've been assuming that it'll be marginalized quickly, 
 but I think that assumption's a bad one.
 
 We also need a PDD for the bytecode file format.
 
 We need notation for globals, notation for metadata about objects and
 types, etc. etc. but frankly the last time I looked at the assembler I sort
 of got lost.
 
 You're not alone here. The assembler gives me headaches, and I think 
 it's stymied Simon recently as well.
 

;) me too  [uh oh, I wrote bits of it!]

I've been playing around with a new assembler that might be a bit better
laid out.  

Here's what I've got done so far:
* object oriented:
* you can load more than one assembler at once
* assembly can be done in parts, i.e.
$a=new Assembler;
$a-assemble(set I0,3);
$a-assemble(set I1,4);
  is the same as:
$a=new Assembler;
$a-assemble(set I0,3\nset I1,4);
* the preprocessor is a separate routine
* directives must be in upper case (per PDD)

* local labels, plus the ability to use local labels from outside the
scope:  global_label.local_label from anywhere will find the right one.
* argument types are fully determined prior to looking up signatures


I've not got these things finished
* label arithmetic [is this really needed?]
* macros
* actual bytecode output 


I just kind of set it aside because I wasn't sure if it was really
needed.  I can dig it back up if there is interest...



Brian

Re: We have PMCs. Time to start work.

2001-11-24 Thread brian wheeler


On Fri, 2001-11-23 at 13:41, Simon Cozens wrote:
 On Fri, Nov 23, 2001 at 06:04:29PM +, Simon Cozens wrote:
  * Rewrite mops.pasm to use integer PMCs, and compare the speeds.
 
 I couldn't wait. :)
 
  % ../../test_prog mops.pbc
 Iterations:1
 Estimated ops: 2
 Elapsed time:  9.948440
 M op/s:20.103654
 
  % ../../test_prog mops_p.pbc
 Iterations:1
 Estimated ops: 2
 done
 Elapsed time:  20.994231
 M op/s:9.526427
 

I don't get it.  What kind of machine was that on?  Here are my numbers:


[bdwheele@thor parrot]$ ./test_prog mops.pbc 
Iterations:1
Estimated ops: 2
Elapsed time:  11.016154
M op/s:18.155156
[bdwheele@thor parrot]$ ./test_prog mops_p.pbc 
Iterations:1
Estimated ops: 2
done
Elapsed time:  14.288639
M op/s:13.997134

My I-regs one is slower than yours, and my P-regs one is faster...this
is on a P4/1.7G under Linux 2.4.15pre8

Any ideas?
Brian

Re: sizeof(INTVAL), sizeof(void*), sizeof(opcode_t)

2001-11-20 Thread Brian Wheeler


On Tue, 2001-11-20 at 12:19, Ken Fox wrote:
 James Mastros wrote:
  In byteswapping the bytecode ...
  
  I propose that we make INTVAL and opcode_t the same size, and gaurrenteed
  to be able to hold a void*.
 
 It sounds like you want portable byte code. Is that a goal? It seems like
 we can have either mmap'able byte code or portable byte code, but not both.
 Personally, I'd rather have portable byte code because memory is cheap
 and self-modifiying byte code opens up a lot of optimization potential. I
 know others disagree.

Hmm. It wouldn't necessarily be portable, though it probably would be on
machines with the same size  endianness.

So, on an alpha, you'd have:
sizeof(INTVAL)=sizeof(opcode_t)=sizeof(void *)=64 bits
whereas on x86 (and other 32 bit machines)
sizeof(INTVAL)=sizeof(opcode_t)=sizeof(void *)=32 bits
and on the Parrot/C-64 VM you'd have
sizeof(INTVAL)=sizeof(opcode_t)=sizeof(void *)=16 bits

Is that right?
Brian
 
 
 Are we looking at two different byte code formats? Dan?
 
 - Ken

RE: Size of integer register vs sizeof(void *)

2001-11-19 Thread Brian Wheeler


On Mon, 2001-11-19 at 12:43, Hong Zhang wrote:
  Are there any cases where a void * cannot be placed into an integer
  register?  It seems like it shouldn't happen, especially since jump and
  jsr are supposed to take an integer register and they point to a
  host-machine-address...
 
 What register are you talking about? The 16-bit x86 has 16-bit integer
 registers, but uses segment:offset as address. So the sizeof(void*) may
 be 32-bit, plus you have to deal with far, near, huge, ... (total 6 types)
 of pointers.
 
 Hong


Sorry:  should be size of parrot integer vs sizeof(void *)

Should have clarified that :)

Brian

Re: sizeof(INTVAL), sizeof(void*), sizeof(opcode_t)

2001-11-19 Thread brian wheeler


On Mon, 2001-11-19 at 19:59, James Mastros wrote:
 Hey all.
   In parellel to splitting out features (yeah, I like that better then
 platforms too) (which is going well this time, I think (I'm being a lot
 better about checking against clean checkouts, but having problems
 thinking of a good generic interface for open() and friends), I'm thinking
 about a new packfile format.
   I'm comeing across a common problem to both of them.  That is, that the
 sizeof things don't come in nice easy units.
   In byteswapping the bytecode, it's a Very Good Thing if you can just
 treat it as an array of some specifc type, so you don't have to figure out
 how man args each opcode takes (this may, in fact, be impossible, if
 oplibs are dynamicly loadable during the bytecode's runtime).
   In the feature-splitting, for example in file opening, some interfaces
 (POSIX open(), for example) want to return an integer (the fd), and some
 (win32 CreateFile(), for example) want to return a void* (the file
 handle).  (This is a pointer to a kernel-allocated structure that will
 cause a segfault to directly access, BTW, so no GC problems here.)
 
 I propose that we make INTVAL and opcode_t the same size, and gaurrenteed
 to be able to hold a void*.
 

Seems reasonable to me, since jsr and jump are slated to use an I
register to jump to a host-machine-address and start interpreting there.

Brian

Size of integer register vs sizeof(void *)

2001-11-17 Thread brian wheeler


Are there any cases where a void * cannot be placed into an integer
register?  It seems like it shouldn't happen, especially since jump and
jsr are supposed to take an integer register and they point to a
host-machine-address...

Brian

Ooops, sorry for that blank log message.

2001-10-25 Thread Brian Wheeler


Darn it, I fat fingered the log message.

This is a fix which changes the way op variants are handled.  The old
method forgot the last variant, so thing(i,i|ic,i|ic) would
generate:
thing(i,i,i)
thing(i,i,ic)
thing(i,ic,i)

but not

thing(i,ic,ic)

The new one does.

Brian

[patch] changes to trace

2001-10-17 Thread Brian Wheeler


This patch makes trace a little more useful.  It prints the constant
referred to, as well as the value of the register being accessed.

This string reverse program 

trace   1
set S0,Hello world
set S1,
set S2,
length  I0,S0
dec I0
$loop:  substr  S2,S0,I0,1
concat  S1,S2
dec I0
ge  I0,0,$loop
set S0,S1
end

produces this output (I removed some spaces on PC=16 to stop the word
wrap in my mailer):

PC=2; OP=9 (set_s_sc); ARGS=(S0=(null), Hello world)
PC=5; OP=9 (set_s_sc); ARGS=(S1=(null), )
PC=8; OP=9 (set_s_sc); ARGS=(S2=(null), )
PC=11; OP=115 (length_i_s); ARGS=(I0=0, S0=Hello world)
PC=14; OP=66 (dec_i); ARGS=(I0=11)
PC=16; OP=117 (substr_s_s_i_ic); ARGS=(S2=, S0=Hello world, I0=10,1)
PC=21; OP=110 (concat_s_s); ARGS=(S1=, S2=d)
PC=24; OP=66 (dec_i); ARGS=(I0=10)
PC=26; OP=43 (ge_i_ic_ic); ARGS=(I0=9, 0, -10)
PC=16; OP=117 (substr_s_s_i_ic); ARGS=(S2=d, S0=Hello world, I0=9,1)
PC=21; OP=110 (concat_s_s); ARGS=(S1=d, S2=l)
PC=24; OP=66 (dec_i); ARGS=(I0=9)
PC=26; OP=43 (ge_i_ic_ic); ARGS=(I0=8, 0, -10)
PC=16; OP=117 (substr_s_s_i_ic); ARGS=(S2=l, S0=Hello world, I0=8,1)
PC=21; OP=110 (concat_s_s); ARGS=(S1=dl, S2=r)
PC=24; OP=66 (dec_i); ARGS=(I0=8)
PC=26; OP=43 (ge_i_ic_ic); ARGS=(I0=7, 0, -10)
PC=16; OP=117 (substr_s_s_i_ic); ARGS=(S2=r, S0=Hello world, I0=7,1)
PC=21; OP=110 (concat_s_s); ARGS=(S1=dlr, S2=o)
PC=24; OP=66 (dec_i); ARGS=(I0=7)
PC=26; OP=43 (ge_i_ic_ic); ARGS=(I0=6, 0, -10)
PC=16; OP=117 (substr_s_s_i_ic); ARGS=(S2=o, S0=Hello world, I0=6,1)
PC=21; OP=110 (concat_s_s); ARGS=(S1=dlro, S2=w)
PC=24; OP=66 (dec_i); ARGS=(I0=6)
PC=26; OP=43 (ge_i_ic_ic); ARGS=(I0=5, 0, -10)
PC=16; OP=117 (substr_s_s_i_ic); ARGS=(S2=w, S0=Hello world, I0=5,1)
PC=21; OP=110 (concat_s_s); ARGS=(S1=dlrow, S2= )
PC=24; OP=66 (dec_i); ARGS=(I0=5)
PC=26; OP=43 (ge_i_ic_ic); ARGS=(I0=4, 0, -10)
PC=16; OP=117 (substr_s_s_i_ic); ARGS=(S2= , S0=Hello world, I0=4,1)
PC=21; OP=110 (concat_s_s); ARGS=(S1=dlrow , S2=o)
PC=24; OP=66 (dec_i); ARGS=(I0=4)
PC=26; OP=43 (ge_i_ic_ic); ARGS=(I0=3, 0, -10)
PC=16; OP=117 (substr_s_s_i_ic); ARGS=(S2=o, S0=Hello world, I0=3 1)
PC=21; OP=110 (concat_s_s); ARGS=(S1=dlrow o, S2=l)
PC=24; OP=66 (dec_i); ARGS=(I0=3)
PC=26; OP=43 (ge_i_ic_ic); ARGS=(I0=2, 0, -10)
PC=16; OP=117 (substr_s_s_i_ic); ARGS=(S2=l, S0=Hello world, I0=2,1)
PC=21; OP=110 (concat_s_s); ARGS=(S1=dlrow ol, S2=l)
PC=24; OP=66 (dec_i); ARGS=(I0=2)
PC=26; OP=43 (ge_i_ic_ic); ARGS=(I0=1, 0, -10)
PC=16; OP=117 (substr_s_s_i_ic); ARGS=(S2=l, S0=Hello world, I0=1,1)
PC=21; OP=110 (concat_s_s); ARGS=(S1=dlrow oll, S2=e)
PC=24; OP=66 (dec_i); ARGS=(I0=1)
PC=26; OP=43 (ge_i_ic_ic); ARGS=(I0=0, 0, -10)
PC=16; OP=117 (substr_s_s_i_ic); ARGS=(S2=e, S0=Hello world, I0=0,1)
PC=21; OP=110 (concat_s_s); ARGS=(S1=dlrow olle, S2=H)
PC=24; OP=66 (dec_i); ARGS=(I0=0)
PC=26; OP=43 (ge_i_ic_ic); ARGS=(I0=-1, 0, -10)
PC=30; OP=8 (set_s_s); ARGS=(S0=Hello world, S1=dlrow olleH)
PC=33; OP=0 (end)


If it looks reasonable, I'll commit it.

Brian



Index: interpreter.c
===
RCS file: /home/perlcvs/parrot/interpreter.c,v
retrieving revision 1.29
diff -u -r1.29 interpreter.c
--- interpreter.c   2001/10/14 23:47:39 1.29
+++ interpreter.c   2001/10/17 20:15:42
@@ -81,7 +81,38 @@
 fprintf(stderr, ; ARGS=();
 for(i = 1; i  interpreter-opcode_info[*pc].arg_count; i++) {
 if (i  1) { fprintf(stderr, , ); }
-fprintf(stderr, %ld, (long) *(pc + i));
+switch(interpreter-opcode_info[*pc].types[i]) {
+case PARROT_ARG_IC:
+fprintf(stderr, %ld, (long) *(pc + i));
+break;
+case PARROT_ARG_NC:
+fprintf(stderr, %f, interpreter-code-const_table-constants[*(pc 
++ i)]-number);
+break;
+case PARROT_ARG_PC:
+/* what is a PMC constant look like? */
+fprintf(stderr, %ld, (long) *(pc + i));
+break;
+case PARROT_ARG_SC:
+fprintf(stderr, \%s\, 
+interpreter-code-const_table-constants[*(pc + i)]-string-bufstart);
+break;
+case PARROT_ARG_I:
+fprintf(stderr, I%ld=%ld, (long) *(pc + i), (long) 
+interpreter-int_reg-registers[*(pc + i)]);
+break;
+case PARROT_ARG_N:
+fprintf(stderr, N%ld=%f, (long) *(pc + i), 
+interpreter-num_reg-registers[*(pc + i)]);
+break;
+case PARROT_ARG_P:
+/* what does a PMC constant look like? */
+fprintf(stderr, P%ld=???, (long) *(pc + i));
+break;
+case PARROT_ARG_S:
+if(interpreter-string_reg-registers[*(pc + i)]) {
+fprintf(stderr, S%ld=\%s\, (long) *(pc + i), 
+interpreter-string_reg-registers[*(pc + i)]-bufstart);
+

Re: string weirdness

2001-10-16 Thread Brian Wheeler


On Mon, 2001-10-15 at 21:12, Dan Sugalski wrote:
 On 15 Oct 2001, Brian Wheeler wrote:
 
  With the addition of clone, I started writing some generic routines
  which might be useful (index,lc,uc,reverse,abs,tr,etc)...and I came
  across some weirdness:
  
  doing:
 save  S0
 restore S1
  
  (since there's no set S1,S0)
  
  binds the registers together, so a change to one is a change to
  both...which doesn't happen on int registers.
 
 Right. Save on a string register pushes the pointer to the string
 structure in the register onto the stack. The same thing happens with
 PMCs, or will when they're implemented.
 
 The assumption is that, when you push a register onto the stack, you'll
 then stomp on the contents of the register. (Rather than what the register
 points to...) Otherwise a push would need to create a copy of the string
 structure and a copy of the string contents.
 
   Dan

Ok, that's fair...

With that in mind I've implemented set Sx,Sy which does a string
copy.  This is what I was after in the first place, but then noticed the
unexpected behavior of save/restore of S registers.  It makes sense now
:)

Brian

missing opcodes

2001-10-16 Thread Brian Wheeler


After writing a couple of library functions, I realized that we have to
do alot of data shuffling to do common tasks.  Reserving a register to 
hold 0 or 1 and/or filling up registers with constants just takes up cpu
time and could better be handled if the opcodes took constants directly
as well as registers.

substr was really annoying in this regard:

set I0,0
set I1,1
substr  S0,S1,I0,I1

pre-allocating the registers (I.e. I31=1, I30=0) helped a bit, but it
took away from the size of the general purpose register pool, and
obscured the code.

So, with that in mind, here are some opcodes to fill in the blanks.  I
tried to make sure that none of the opcodes below could be computed in
advance (i.e. there's no add(i,ic,ic)) and when the order of the args
doesn't matter (in add, for example), the constant comes last.

** Suggested Opcodes to minimize data shuffling **
-- arithmetic operations

[unification of add  inc, except in the 'inc I0' case]
add(i,i,ic)
add(n,n,nc)

cmod(i,i,ic)
cmod(i,ic,i)
cmod(n,n,nc)
cmod(n,nc,n)

div(i,i,ic)
div(i,ic,i)
div(n,n,nc)
div(n,nc,n)

mod(i,i,ic)
mod(i,ic,i)
mod(n,n,nc)
mod(n,n,nc)

mul(i,i,ic)
mul(n,n,nc)

pow(n,i,ic)
pow(n,ic,i)
pow(n,i,nc)
pow(n,ic,n)
pow(n,n,ic)
pow(n,nc,i)
pow(n,n,nc)
pow(n,nc,n)

[unification of sub  dec, except in the 'dec I0' case]
sub(i,i,ic)
sub(n,n,nc)


-- string operations

chopn(s,i)

concat(s,sc)

substr(s,s,i,ic)
substr(s,s,ic,i)
substr(s,s,ic,ic)
substr(s,sc,i,i)
substr(s,sc,i,ic)
substr(s,sc,ic,i)

-- trancendental operations
atan(n,i,ic)
atan(n,ic,i)
atan(n,i,nc)
atan(n,ic,n)
atan(n,n,ic)
atan(n,nc,i)
atan(n,n,nc)
atan(n,nc,n)

-- bitwise logical operations
and(i,i,ic)
or(i,i,ic)
shl(i,i,i)
shr(i,i,i)
xor(i,i,ic)




WHEW!  That's alot of opcodes.  However, it would lead to more compact
(and probably faster) bytecode to be able to avoid having to load
registers all the time with values you know in advance.

Thoughts?

Brian

[PATCH] missing opcodes

2001-10-16 Thread Brian Wheeler


Here's a patch which adds the 'missing' opcodes from the earlier email. 
It also adds the 3 arg variant of concat.

Dan/Simon/Anyone, if it seems ok, I'll commit it, but since it adds 52 
op variants, I wasn't sure if it would be ok.

Brian


Ops follow
+AUTO_OP add(i, i, ic) {
+AUTO_OP add(n, n, nc) {
+AUTO_OP cmod(i, i, ic) {
+AUTO_OP cmod(i, ic, i) {
+AUTO_OP cmod(n, n, nc) {
+AUTO_OP cmod(n, nc, n) {
+AUTO_OP div(i, i, ic) {
+AUTO_OP div(i, ic, i) {
+AUTO_OP div(n, n, nc) {
+AUTO_OP div(n, nc, n) {
+AUTO_OP mod(i, i, ic) {
+AUTO_OP mod(i, ic, i) {
+AUTO_OP mod(n, n, nc) {
+AUTO_OP mod(n, nc, n) {
+AUTO_OP mul(i, i, ic) {
+AUTO_OP mul(n, n, nc) {
+AUTO_OP pow(n, i, ic) {
+AUTO_OP pow(n, ic, i) {
+AUTO_OP pow(n, i, nc) {
+AUTO_OP pow(n, ic, n) {
+AUTO_OP pow(n, n, ic) {
+AUTO_OP pow(n, nc, i) {
+AUTO_OP pow(n, n, nc) {
+AUTO_OP pow(n, nc, n) {
+AUTO_OP sub(i, i, ic) {
+AUTO_OP sub(i, ic, i) {
+AUTO_OP sub(n, n, nc) {
+AUTO_OP sub(n, nc, n) {
+AUTO_OP chopn(s, i) {
+AUTO_OP concat(s, sc) {
+AUTO_OP concat(s, s, s) {
+AUTO_OP concat(s, s, sc) {
+AUTO_OP concat(s, sc, s) {
+AUTO_OP substr(s, s, i, ic) {
+AUTO_OP substr(s, s, ic, i) {
+AUTO_OP substr(s, s, ic, ic) {
+AUTO_OP substr(s, sc, i, i) {
+AUTO_OP substr(s, sc, i, ic) {
+AUTO_OP substr(s, sc, ic, i) {
+AUTO_OP atan(n, i, ic) {
+AUTO_OP atan(n, ic, i) {
+AUTO_OP atan(n, i, nc) {
+AUTO_OP atan(n, ic, n) {
+AUTO_OP atan(n, n, ic) {
+AUTO_OP atan(n, nc, i) {
+AUTO_OP atan(n, n, nc) {
+AUTO_OP atan(n, nc, n) {
+AUTO_OP and(i, i, ic) {
+AUTO_OP or(i, i, ic) {
+AUTO_OP shl(i, i, i) {
+AUTO_OP shr(i, i, i) {
+AUTO_OP xor(i, i, ic) {




Index: core.ops
===
RCS file: /home/perlcvs/parrot/core.ops,v
retrieving revision 1.8
diff -u -r1.8 core.ops
--- core.ops2001/10/16 14:12:41 1.8
+++ core.ops2001/10/16 15:35:36
@@ -661,8 +661,12 @@
 
 =item Badd(i, i, i)
 
+=item Badd(i, i, ic)
+
 =item Badd(n, n, n)
 
+=item Badd(n, n, nc)
+
 Set $1 to the sum of $2 and $3.
 
 =cut
@@ -670,16 +674,28 @@
 AUTO_OP add(i, i, i) {
   $1 = $2 + $3;
 }
+
+AUTO_OP add(i, i, ic) {
+  $1 = $2 + $3;
+}
   
 AUTO_OP add(n, n, n) {
   $1 = $2 + $3;
 }
 
+AUTO_OP add(n, n, nc) {
+  $1 = $2 + $3;
+}
 
+
 
 
 =item Bcmod(i, i, i)
 
+=item Bcmod(i, i, ic)
+
+=item Bcmod(i, ic, i)
+
 NOTE: This uncorrected mod algorithm uses the C language's built-in
 mod operator (x % y), which is
 
@@ -716,11 +732,23 @@
   $1 = $2 % $3;
 }
 
+AUTO_OP cmod(i, i, ic) {
+  $1 = $2 % $3;
+}
+
+AUTO_OP cmod(i, ic, i) {
+  $1 = $2 % $3;
+}
 
+
 
 
 =item Bcmod(n, n, n)
 
+=item Bcmod(n, n, nc)
+
+=item Bcmod(n, nc, n)
+
 NOTE: This uncorrected mod algorithm uses the built-in C math library's
 fmod() function, which computes
 
@@ -752,7 +780,15 @@
   $1 = fmod($2, $3);
 }
 
+AUTO_OP cmod(n, n, nc) {
+  $1 = fmod($2, $3);
+}
 
+AUTO_OP cmod(n, nc, n) {
+  $1 = fmod($2, $3);
+}
+
+
 
 
 =item Bdec(i)
@@ -795,8 +831,16 @@
 
 =item Bdiv(i, i, i)
 
+=item Bdiv(i, i, ic)
+
+=item Bdiv(i, ic, i)
+
 =item Bdiv(n, n, n)
 
+=item Bdiv(n, n, nc)
+
+=item Bdiv(n, nc, n)
+
 Set $1 to the quotient of $2 divided by $3. In the case of INTVAL division, the
 result is truncated (NOT rounded or floored).
 
@@ -806,11 +850,27 @@
   $1 = $2 / $3;
 }
 
+AUTO_OP div(i, i, ic) {
+  $1 = $2 / $3;
+}
+
+AUTO_OP div(i, ic, i) {
+  $1 = $2 / $3;
+}
+
 AUTO_OP div(n, n, n) {
   $1 = $2 / $3;
 }
 
+AUTO_OP div(n, n, nc) {
+  $1 = $2 / $3;
+}
+
+AUTO_OP div(n, nc, n) {
+  $1 = $2 / $3;
+}
 
+
 
 
 =item Binc(i)
@@ -853,6 +913,10 @@
 
 =item Bmod(i, i, i)
 
+=item Bmod(i, i, ic)
+
+=item Bmod(i, ic, i)
+
 NOTE: This corrected mod algorithm is based on the C code on page 70
 of [1]. Assuming correct behavior of C's built-in mod operator (%) with
 positive arguments, this algorithm implements a mathematically convenient
@@ -899,11 +963,67 @@
   $1 = r;
 }
 
+AUTO_OP mod(i, i, ic) {
+  INTVAL y = $2;
+  INTVAL z = $3;
+  INTVAL s = 0;
+  INTVAL r;
 
+  if (z == 0) {
+r = y;
+  }
+  else {
+if (y  0) { s += 2; y = -y; }
+if (z  0) { s += 1; z = -z; }
+
+r = y % z;
+
+switch (s) {
+  case 0 + 0:break;
+  case 0 + 1: r = r - z; break;
+  case 2 + 0: r = z - r; break;
+  case 2 + 1: r = -r;break;
+}
+  }
+
+  $1 = r;
+}
+
+AUTO_OP mod(i, ic, i) {
+  INTVAL y = $2;
+  INTVAL z = $3;
+  INTVAL s = 0;
+  INTVAL r;
+
+  if (z == 0) {
+r = y;
+  }
+  else {
+if (y  0) { s += 2; y = -y; }
+if (z  0) { s += 1; z = -z; }
+
+r = y % z;
+
+switch (s) {
+  case 0 + 0:break;
+  case 0 + 1: r = r - z; break;
+  case 2 + 0: r = z - r; break;
+  case 2 + 1: r = -r;break;
+}
+  }
+
+  $1 = r;
+}
+
+
 
 
 =item Bmod(n, n, n)
 
+=item Bmod(n, n, nc)
+
+=item Bmod(n, nc, n)
+
 NOTE:

Re: missing opcodes

2001-10-16 Thread Brian Wheeler


Heheh, I should read all of my mail before I send new ones.  I'll commit
it shortly.

Brian

On Tue, 2001-10-16 at 10:36, Dan Sugalski wrote:
 At 10:04 AM 10/16/2001 -0500, Brian Wheeler wrote:
 Thoughts?
 
 Go for it. This sort of thing's just fine. I know I made a NO NEW OPCODES 
 WITHOUT CLEARANCE statement a while ago, but that really needs clarification.
 
 The thing that's a no-no is new 'high-level' opcodes--basically anything 
 that'd require a new entry in parrot_assembly.pod. Opcode variants--tan 
 which takes a constant, say--are fine. We may, at some point, prune out the 
 less-used opcodes so we don't have a huge table of 'em if that turns out to 
 be a problem (like if there's a performance knee at 255/256 or 511/512 base 
 opcodes in the switch) but for the moment I'm not overly worried about the 
 excess of opcodes if it turns out that having special ones is worth it.
 
 It may, of course, turn out that:
 
 set N0, 12.4
 tan N1, N0
 
 ends up being faster than
 
 tan N1, 12.4
 
 in some cases (like if we almost never use tan of a constant but do use tan 
 of a register a lot, thus having to fault in the code for the constant tan 
 every time we use it) but that's something that'll have to do some 
 performance testing on to know for sure.
 
   Dan
 
 --it's like this---
 Dan Sugalski  even samurai
 [EMAIL PROTECTED] have teddy bears and even
   teddy bears get drunk

substr broken?

2001-10-16 Thread Brian Wheeler


I'm getting some weird results when using substr.  Here's my test
program:

set S0,Hello world
print   Arg to Reverse: 
print   S0
print   \n
set S1,
set S2,
length  I0,S0
dec I0
$loop:  substr  S2,S0,I0,1
print   string so far:
print   S1
print   , 
print   S2
print   \n
concat  S1,S2
dec I0
ge  I0,0,$loop
set S0,S1
end


Here's the output:
Arg to Reverse: Hello world
string so far:d, d
string so far:l, l
string so far:r, r
string so far:o, o
string so far:w, w
string so far: ,  
string so far:o, o
string so far:l, l
string so far:l, l
string so far:e, e
string so far:H, H

Why is S1 set on the first iteration, before the concat?

Brian

Re: substr broken?

2001-10-16 Thread Brian Wheeler


On Tue, 2001-10-16 at 13:04, Alex Gough wrote:
 On Tue, 16 Oct 2001, Alex Gough wrote:
  On 16 Oct 2001, Brian Wheeler wrote:
  
   I'm getting some weird results when using substr.  Here's my test
   program:
  
  It's probably something wrong with the constant table or the assembly
  phase, if the script is changed so that S1 is set to -, say, it does
  more what I expect.
 
 Indeed it is.  My CVS access isn't working (ask??) so here's a new test
 to fail:

Actually its the set s,sc operation.  I've modified it to use
string_copy.   I also added your test and committed them a few minutes
ago.

Thanks!  This one had been really stumping me!

Brian


 
 Alex Gough
 
 Index: parrot/t/op/string.t
 ===
 RCS file: /home/perlcvs/parrot/t/op/string.t,v
 retrieving revision 1.11
 diff -u -r1.11 string.t
 --- parrot/t/op/string.t  2001/10/14 00:43:50 1.11
 +++ parrot/t/op/string.t  2001/10/16 18:01:34
 @@ -1,6 +1,6 @@
  #! perl -w
  
 -use Parrot::Test tests = 23;
 +use Parrot::Test tests = 24;
  
  output_is( 'CODE', OUTPUT, set_s_sc );
   set S4, JAPH\n
 @@ -296,6 +296,21 @@
  ok
  OUTPUT
  
 +output_is('CODE', OUTPUT, same constant twice bug);
 + set S0, 
 + set S1, 
 + set S2, foo
 + concat  S1,S1,S2
 + print   S1
 + print   S0
 + print   \n
 +CODE
 +foo
 +OUTPUT
 +
 +
 +#  Helper subs
 +
  # Set all string registers to values given by $_[0](reg num)
  sub set_str_regs {
my $code = shift;

sample code

2001-10-16 Thread Brian Wheeler



I've written a library of sorts which contains useful routines such as:

_absi - absolute value of I0
_absn - absolute value of I0
_chomp - chomp a string (S0) with a trailing newline
_chr - create a string (S0) with the ascii value of I0
_exit - terminate with a return code of I0
_hex - return in I0 the decimal value of the hex string in S0
_index - return the index (in I0) of S1 in S0 starting with I0
_lc - lowercase S0
_lcfirst - lowercase the first character in S0
_ord - return (in I0) the ascii value of the first character in S0
_reverse - reverse the string in S0
_rindex - do a reverse index of S1 in S0, returing the value in I0
_tr - transliterate S0 with S1 as the source and S2 as the dest
_uc - uppercase S0
_ucfirst - uppercase the first char in S0

Here's an example which would uppercase only the last character of a
string:

set S0,Hello world
bsr _lc
bsr _reverse
bsr _ucfirst
bsr _reverse
print   This should have only the 'd' uppercase: 
print   S0
print   \n


Do you think this library (called utils.pasm) should be included in
the examples directory?

Brian


#
# utils.pasm
#
# Copyright (C) 2001 Yet Another Society. All rights reserved.
# This program is free software. It is subject to the same
# license as The Parrot Interpreter.
#
# $Id$
#
# This contains a bunch of routines which can be used in other programs by
# adding a line like this:
#
#  include utils.pasm
#
# Unfortunately, since we don't have real libraries, you get the whole thing
# whether you want it or not, but its a start.
#
#
# The calling convention is register based, and starts with I0 for the first
# integer argument, S0 for the first string argument, etc.  Return value(s)
# are placed in register 0 of whatever type is being returned.

# _absi - the absolute value of I0
_absi:  gt  I0,0,$done
mul I0,I0,-1
$done:  ret


# _absn - the absolute value of N0
_absn:  gt  N0,0,$done
mul N0,N0,-1.0
$done:  ret


# _chomp - remove the last character of S0, if it is a newline
_chomp: clones
pushi
length  I0,S0
dec I0
substr  S1,S0,I0,1
ne  S1,\n,$done   
substr  S0,S0,0,I0
$done:  saveS0
popi
pops
restore S0
ret


# _chr - return the character S0 for the argument I0
_chr:   clonei
pushs
set S0,
set 
S1,\x0\x1\x2\x3\x4\x5\x6\x7\x8\x9\xA\xB\xC\xD\xE\xF\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F\x20\x21\x22\x23\x24\x25\x26\x27\x28\x29\x2A\x2B\x2C\x2D\x2E\x2F\x30\x31\x32\x33\x34\x35\x36\x37\x38\x39\x3A\x3B\x3C\x3D\x3E\x3F\x40\x41\x42\x43\x44\x45\x46\x47\x48\x49\x4A\x4B\x4C\x4D\x4E\x4F\x50\x51\x52\x53\x54\x55\x56\x57\x58\x59\x5A\x5B\x5C\x5D\x5E\x5F\x6A\x6B\x6C\x6D\x6E\x6F\x70\x71\x72\x73\x74\x75\x76\x77\x78\x79\x7A\x7B\x7C\x7D\x7E\x7F\x80\x81\x82\x83\x84\x85\x86\x87\x88\x89\x8A\x8B\x8C\x8D\x8E\x8F\x90\x91\x92\x93\x94\x95\x96\x97\x98\x99\x9A\x9B\x9C\x9D\x9E\x9F\xA0\xA1\xA2\xA3\xA4\xA5\xA6\xA7\xA8\xA9\xAA\xAB\xAC\xAD\xAE\xAF\xB0\xB1\xB2\xB3\xB4\xB5\xB6\xB7\xB8\xB9\xBA\xBB\xBC\xBD\xBE\xBF\xC0\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xCA\xCB\xCC\xCD\xCE\xCF\xD0\xD1\xD2\xD3\xD4\xD5\xD6\xD7\xD8\xD9\xDA\xDB\xDC\xDD\xDE\xD\xE0\xE1\xE2\xE3\xE4\xE5\xE6\xE7\xE8\xE9\xEA\xEB\xEC\xED\xEE\xEF\xF0\xF1\xF2\xF3\xF4\xF5\xF6\xF7\xF8\xF9\xFA\xFB\xFC\xFD\xFE\xFF
substr  S0,S1,I0,1
saveS0
pops
popi
restore S0
ret


# _exit - exit the program, indicating the status of I0
_exit:
print   Program terminated with result code 
print   I0
print   \n
end


# _hex - return the decimal value of the hex argument S0
_hex:   clones
pushi
bsr _reverse
bsr _uc
length  I28,S0
set S2,S0
set I29,0
set I30,0
set I31,1
set S0,0123456789ABCDEF
set S1,
$loop:  substr  S1,S2,I30,1
bsr _index
mul I0,I0,I31
add I29,I29,I0
inc I30
mul I31,I31,16
lt  I30,I28,$loop
saveI29
popi
pops
restore I0
ret

# _index - return the index of the substring S1 in the string S0, starting
#  with position I0.  -1 is returned if it is not found
_index: clones
clonei
set S2,
length  I31,S1
length  I30,S0
$loop:  substr  S2,S0,I0,I31
eq  S2,S1,$done
inc I0
lt  I0,I30,$loop
set I0,-1
$done:  saveI0
popi
pops
restore I0
ret


# _lc - lowercase the string S0
_lc:clones
set S2,abcdefghijklmnopqrstuvwxyz
set S1,ABCDEFGHIJKLMNOPQRSTUVWXYZ
bsr _tr
saveS0
pops
restore S0
ret


# _lcfirst - lowercase the first character of S0
_lcfirst:

Re: RAM?

2001-10-16 Thread Brian Wheeler


On Tue, 2001-10-16 at 15:02, James Mastros wrote:
 On Tue, 16 Oct 2001, Dan Sugalski wrote:
  That's one way to do it, sure. You can always look at a string as a bounded
  byte buffer. One of the core 'string' types is series of 8-bit bytes. We
  couldn't manage JPEG images too well without that. ;)
 Hm.  How do you convert the bytes into integers so you can do arithmetic
 with them?  (In other words, how do you write unpack('c') and friends.)

I wrote a library in parrot which implemented ord() to do it.  I was
going to do pack/unpack but I haven't had a chance yet :)


 
  Seriously, though, what are you trying to accomplish with providing generic
  access to memory? Perhaps if you had a solid application in mind it'd make
  working out what would be needed to support it easier.
 I'm thinking of porting GCC, of course G.  However, I'm thinking that
 pretty much any c-like language is going to want somthing like this.
 

You are sick. :)   Maybe you could compile the linux kernel down into
parrot.  I did think about writing a kernel module for writing drivers
in parrot, but considering the response that the last person got for
writing a script interface module to the kernel, I put the thought away
:)

Brian

Re: sample code

2001-10-16 Thread Brian Wheeler


On Tue, 2001-10-16 at 15:58, James Mastros wrote:
 On Tue, 16 Oct 2001, Brian Wheeler wrote:
  _chr - create a string (S0) with the ascii value of I0
  _ord - return (in I0) the ascii value of the first character in S0
 There's /got/ to be a better way to write these.  I propose making opcodes
 specificly to do these.  This is an inner-loop kind of thing.

Yeah, probably, but not with the opcodes we have so far.  They were just
kind of afterthoughts and to see if its actually possible to do it with
the ops we have now :)

Brian

clone operators

2001-10-15 Thread Brian Wheeler


Because I was bored this evening, I implemented the clone operators.

Dan?

Brian




# compute the factorialrecursively!
# lets do it for the numbers 0 to 6

main:
set I1,0

$loop:
print   fact of 
print   I1
printis: 
set I0,I1
bsr fact
print   I0
print   \n
inc I1,1
eq  I1,7,$done
branch  $loop
$done:
end

# I0 is the number to compute
fact:
clonei
lt  I0,2,$is_one
set I1,I0
dec I0,1
bsr fact
mul I0,I0,I1
saveI0
branch  $done
$is_one:
set I0,1
saveI0
$done:
popi
restore I0
ret


? test.pasm
? test.inc
? test.list
? fact.pasm
? interp_guts.c
? config.h
? basic_opcodes.c
? DCn.diff
? macro.pbc
? macro.list
? local_label.pbc
? test.pbc
? euclid.pbc
? macro.pasm
? test2.pbc
? test3.pbc
? mytest.pasm
? cequ.code
? assemble.pl-new_macros
? mytest.pbc
? op_info.c
? frame_test.pbc
? fact.pbc
? frame_test.pasm
? clone.diff
? include/parrot/op_info.h
? t/test.pbc
? t/test1.c
? t/test1
Index: core.ops
===
RCS file: /home/perlcvs/parrot/core.ops,v
retrieving revision 1.4
diff -u -r1.4 core.ops
--- core.ops2001/10/15 21:18:42 1.4
+++ core.ops2001/10/16 00:29:05
@@ -1678,6 +1678,35 @@
   Parrot_push_p(interpreter);
 }
 
+
+
+=item Bclonei()
+
+=item Bclonen()
+
+=item Bclonep()
+
+=item Bclones()
+
+Save all the registers of the type indicated in the name of the operation.
+
+=cut
+
+AUTO_OP clonei() {
+  Parrot_clone_i(interpreter);
+}
+
+AUTO_OP clonen() {
+  Parrot_clone_n(interpreter);
+}
+
+AUTO_OP clones() {
+  Parrot_clone_s(interpreter);
+}
+
+AUTO_OP clonep() {
+  Parrot_clone_p(interpreter);
+}
 
 =back
 
Index: register.c
===
RCS file: /home/perlcvs/parrot/register.c,v
retrieving revision 1.10
diff -u -r1.10 register.c
--- register.c  2001/10/02 14:01:30 1.10
+++ register.c  2001/10/16 00:29:06
@@ -38,6 +38,39 @@
 }
 }
 
+/*=for api register Parrot_clone_i
+  pushes a new integer register frame on the frame stack and
+  copies the last frame to the current frame
+*/
+void
+Parrot_clone_i(struct Parrot_Interp *interpreter) {
+struct IRegChunk *chunk_base;
+
+chunk_base = CHUNK_BASE(interpreter-int_reg);
+/* Do we have any slots left in the current chunk? */
+if (chunk_base-free) {
+interpreter-int_reg = chunk_base-IReg[chunk_base-used++];
+chunk_base-free--;
+mem_sys_memcopy(chunk_base-IReg[chunk_base-used-1],
+chunk_base-IReg[chunk_base-used-2],
+sizeof(struct IReg));
+}
+/* Nope, so plan B time. Allocate a new chunk of integer register frames */
+else {
+struct IRegChunk *new_chunk;
+new_chunk = mem_allocate_aligned(sizeof(struct IRegChunk));
+new_chunk-used = 1;
+new_chunk-free = FRAMES_PER_INT_REG_CHUNK - 1;
+new_chunk-next = NULL;
+new_chunk-prev = chunk_base;
+chunk_base-next = new_chunk;
+mem_sys_memcopy(new_chunk-IReg[0],
+chunk_base-IReg[chunk_base-used-1],
+sizeof(struct IReg));
+interpreter-int_reg = new_chunk-IReg[0];
+}
+}
+
 /*=for api register Parrot_pop_i
   pops an integer register frame off of the frame stack
 */
@@ -107,6 +140,42 @@
 }
 }
 
+/*=for api register Parrot_clone_s
+  pushes a new string register frame on the frame stack and
+  copies the last frame to the current frame
+*/
+void
+Parrot_clone_s(struct Parrot_Interp *interpreter) {
+struct SRegChunk *chunk_base;
+
+chunk_base = CHUNK_BASE(interpreter-string_reg);
+/* Do we have any slots left in the current chunk? */
+if (chunk_base-free) {
+interpreter-string_reg = chunk_base-SReg[chunk_base-used++];
+chunk_base-free--;
+mem_sys_memcopy(chunk_base-SReg[chunk_base-used-1],
+chunk_base-SReg[chunk_base-used-2],
+sizeof(struct SReg));
+}
+/* Nope, so plan B time. Allocate a new chunk of string register frames */
+else {
+struct SRegChunk *new_chunk;
+new_chunk = mem_allocate_aligned(sizeof(struct SRegChunk));
+new_chunk-used = 1;
+new_chunk-free = FRAMES_PER_STR_REG_CHUNK - 1;
+new_chunk-next = NULL;
+new_chunk-prev = chunk_base;
+chunk_base-next = new_chunk;
+mem_sys_memcopy(new_chunk-SReg[0],
+chunk_base-SReg[chunk_base-used-1],
+sizeof(struct SReg));
+interpreter-string_reg = new_chunk-SReg[0];
+/* Gotta NULL them out as some string
+   functions depend on NULL strings */
+Parrot_clear_s(interpreter);
+}
+}

string weirdness

2001-10-15 Thread Brian Wheeler


With the addition of clone, I started writing some generic routines
which might be useful (index,lc,uc,reverse,abs,tr,etc)...and I came
across some weirdness:

doing:
   save  S0
   restore S1

(since there's no set S1,S0)

binds the registers together, so a change to one is a change to
both...which doesn't happen on int registers.

In addition, I've got a test program which I cannot figure out why it
fails...and I believe its related to the above problem.  Here's the test
code (sorry for the size):


set S0,Hello World
bsr _uc
print   UC is: 
print   S0
print   \n


bsr _lc
print   LC is: 
print   S0
print   \n

end

# index - return the position (I0) of a substring (S1) within a string
(S0)
_index:
clones
pushi
set I0,0
length  I1,S1
length  I2,S0
$loop:  substr  S2,S0,I0,I1
eq  S1,S2,$done
inc I0,1
lt  I0,I2,$loop
set I0,-1
branch  $done
$done:  saveI0
popi
pops
restore I0
ret


# tr - convert the string S0 by replacing chars in S1 with those in S2.
_tr:
clones
pushi
length  I22,S2
length  I21,S1
length  I20,S0
set I0,0
substr  S22,S2,I0,I22 # no set s,s
substr  S21,S1,I0,I21
substr  S20,S0,I0,I20

set S28, # our result string.
set I29,0 # zero
set I28,1 # one
set I27,0
$loop:  substr  S1,S20,I27,I28
substr  S0,S21,I29,I21
bsr _index
ne  I0,-1,$found
concat  S28,S1
branch  $next
$found: substr  S1,S22,I0,I28
concat  S28,S1
$next:  print   building string: 
print   S28
print   , I27= 
print   I27
print   , I20= 
print   I20
print   \n
inc I27,1
lt  I27,I20,$loop
saveS28
popi
pops
restore S0
ret


_lc:clones
set S1,ABCDEFGHIJKLMNOPQRSTUVWXYZ
set S2,abcdefghijklmnopqrstuvwxyz
bsr _tr
saveS0
pops
restore S0
ret 


_uc:clones
set S2,ABCDEFGHIJKLMNOPQRSTUVWXYZ
set S1,abcdefghijklmnopqrstuvwxyz
bsr _tr
saveS0
pops
restore S0
ret 



Here's what it outputs:
building string: H, I27= 0, I20= 11
building string: HE, I27= 1, I20= 11
building string: HEL, I27= 2, I20= 11
building string: HELL, I27= 3, I20= 11
building string: HELLO, I27= 4, I20= 11
building string: HELLO , I27= 5, I20= 11
building string: HELLO W, I27= 6, I20= 11
building string: HELLO WO, I27= 7, I20= 11
building string: HELLO WOR, I27= 8, I20= 11
building string: HELLO WORL, I27= 9, I20= 11
building string: HELLO WORLD, I27= 10, I20= 11
UC is:  HELLO WORLD
building string: dH, I27= 0, I20= 11
building string: dE, I27= 1, I20= 11
building string: dL, I27= 2, I20= 11
building string: dL, I27= 3, I20= 11
building string: dO, I27= 4, I20= 11
building string: d , I27= 5, I20= 11
building string: dW, I27= 6, I20= 11
building string: dO, I27= 7, I20= 11
building string: dR, I27= 8, I20= 11
building string: dL, I27= 9, I20= 11
building string: dD, I27= 10, I20= 11
LC is:  dD

S28 is somehow not growing the 2nd time a routine using _tr is
called...which leads me to believe that some binding of S28 has
occurred.


Maybe its a dumb typo.  I don't know, and I'm stumped.

Anyone have any hints?

Brian

push* change [was: simple subs...]

2001-10-13 Thread Brian Wheeler


Here's a small change to push* which copies the last context
automatically.  I was thinking about Dan's clone opcode, when I
realized that most of the time you're going to want the values, and if
you don't, you can always issue a clear.  It passes all of the make
tests, and it allows recursive programs to be written, such as the
factorial below.

Dan, what do you think?

Brian


# compute the factorialrecursively!
# lets do it for the numbers 0 to 6

main:
set I1,0

$loop:
print   fact of 
print   I1
printis: 
set I0,I1
bsr fact
print   I0
print   \n
inc I1,1
eq  I1,7,$done
branch  $loop
$done:
end

# I0 is the number to compute
fact:
pushi
lt  I0,2,$is_one
set I1,I0
dec I0,1
bsr fact
mul I0,I0,I1
saveI0
branch  $done
$is_one:
set I0,1
saveI0
$done:
popi
restore I0
ret


Index: register.c
===
RCS file: /home/perlcvs/parrot/register.c,v
retrieving revision 1.10
diff -u -r1.10 register.c
--- register.c  2001/10/02 14:01:30 1.10
+++ register.c  2001/10/14 01:28:43
@@ -24,6 +24,9 @@
 if (chunk_base-free) {
 interpreter-int_reg = chunk_base-IReg[chunk_base-used++];
 chunk_base-free--;
+mem_sys_memcopy(chunk_base-IReg[chunk_base-used-1],
+chunk_base-IReg[chunk_base-used-2],
+sizeof(struct IReg));
 }
 /* Nope, so plan B time. Allocate a new chunk of integer register frames */ 
else {
@@ -34,6 +37,9 @@
 new_chunk-next = NULL;
 new_chunk-prev = chunk_base;
 chunk_base-next = new_chunk;
+mem_sys_memcopy(new_chunk-IReg[0],
+chunk_base-IReg[chunk_base-used-1],
+sizeof(struct IReg));
 interpreter-int_reg = new_chunk-IReg[0];
 }
 }
@@ -90,6 +96,9 @@
 if (chunk_base-free) {
 interpreter-string_reg = chunk_base-SReg[chunk_base-used++];
 chunk_base-free--;
+mem_sys_memcopy(chunk_base-SReg[chunk_base-used-1],
+chunk_base-SReg[chunk_base-used-2],
+sizeof(struct SReg));
 }
 /* Nope, so plan B time. Allocate a new chunk of string register frames */
 else {
@@ -100,6 +109,9 @@
 new_chunk-next = NULL;
 new_chunk-prev = chunk_base;
 chunk_base-next = new_chunk;
+mem_sys_memcopy(new_chunk-SReg[0],
+chunk_base-SReg[chunk_base-used-1],
+sizeof(struct SReg));
 interpreter-string_reg = new_chunk-SReg[0];
 /* Gotta NULL them out as some string
functions depend on NULL strings */
@@ -159,6 +171,9 @@
 if (chunk_base-free) {
 interpreter-num_reg = chunk_base-NReg[chunk_base-used++];
 chunk_base-free--;
+mem_sys_memcopy(chunk_base-NReg[chunk_base-used-1],
+chunk_base-NReg[chunk_base-used-2],
+sizeof(struct NReg));
 }
 /* Nope, so plan B time. Allocate a new chunk of float register frames */
 else {
@@ -169,6 +184,9 @@
 new_chunk-next = NULL;
 new_chunk-prev = chunk_base;
 chunk_base-next = new_chunk;
+mem_sys_memcopy(new_chunk-NReg[0],
+chunk_base-NReg[chunk_base-used-1],
+sizeof(struct NReg));
 interpreter-num_reg = new_chunk-NReg[0];
 }
 }
@@ -225,6 +243,9 @@
 if (chunk_base-free) {
 interpreter-pmc_reg = chunk_base-PReg[chunk_base-used++];
 chunk_base-free--;
+mem_sys_memcopy(chunk_base-PReg[chunk_base-used-1],
+chunk_base-PReg[chunk_base-used-2],
+sizeof(struct PReg));
 }
 /* Nope, so plan B time. Allocate a new chunk of float register frames */
 else {
@@ -235,6 +256,9 @@
 new_chunk-next = NULL;
 new_chunk-prev = chunk_base;
 chunk_base-next = new_chunk;
+mem_sys_memcopy(new_chunk-PReg[0],
+chunk_base-PReg[chunk_base-used-1],
+sizeof(struct PReg));
 interpreter-pmc_reg = new_chunk-PReg[0];
 /* Gotta NULL them out or we might GC Wrong things later */
 Parrot_clear_p(interpreter);

Re: Fetching the PC?

2001-10-11 Thread Brian Wheeler


On Thu, 2001-10-11 at 20:49, Dan Sugalski wrote:
 At 08:25 PM 10/11/2001 -0500, Brian Wheeler wrote:
 Since we're passing guilt around, there's an equate of '*' which is the
 current PC...and I didn't document it.  You can do
set  I1,*
 and it will set I1 to the current PC.  It doesn't allow any math,
 though.  I thought about hooking up eval to various brackets but I never
 got the time before my job got busy...
 
 Absolute or relative PC?

Well, its relative to the start of the bytecode...which I suppose would
be absolute...unless multiple bytecode chunks are placed in the same
memory block, in which case it'd be relative.  Now I'm confused. :)



 
 Though I like Gregor's way of doing it:  we know the addresses (more or
 less) at compile time, so we might as well not waste ops doing
 arithmetic that we know in advance...
 
 Fair enough, though we don't really know the absolute PC at assembly time, 
 as we're all position independent. Thinking further, having the getpc 
 opcode take an offset would let us do something like:
 
getpc I0, FOO
 
 to put the absolute address of FOO into I0, suitable for jumps and jsrs.
 

This just comes out as a specialized add, right?  In fact, isn't it
this:
set I0,*
inc I0,FOO  -- assuming the assembler knew that this is an address


How are multiple bytecode chunks (i.e. libraries) going to be handled? 
Are they going to be contiguous?  Are they going to be segmented somehow
so there's a far jump which takes us out of the current block?


Brian

Re: Fetching the PC? [RESEND]

2001-10-11 Thread Brian Wheeler


Argh, my mailer crashed as I sent this, so I don't know if it went out.



On Thu, 2001-10-11 at 21:23, Dan Sugalski wrote:
 At 09:12 PM 10/11/2001 -0500, Brian Wheeler wrote:
 On Thu, 2001-10-11 at 20:49, Dan Sugalski wrote:
   At 08:25 PM 10/11/2001 -0500, Brian Wheeler wrote:
   Since we're passing guilt around, there's an equate of '*' which
is the
   current PC...and I didn't document it.  You can do
  set  I1,*
   and it will set I1 to the current PC.  It doesn't allow any math,
   though.  I thought about hooking up eval to various brackets but
I never
   got the time before my job got busy...
  
   Absolute or relative PC?
 
 Well, its relative to the start of the bytecode...which I suppose
would
 be absolute...unless multiple bytecode chunks are placed in the same
 memory block, in which case it'd be relative.  Now I'm confused. :)
 
 Absolute addresses are, well, absolute addresses. Relative addresses
are 
 offsets from the current location.
 
 I think the confusion's because the jump opcode's broken. When you say
 
jump 12
 
 It should jump to absolute address 12, not 12 bytes/words/opcodes from
the 
 current position.
 

Ok, fair enough...but should it be jumping to:
(char *)byte_code+12
or
(char *)12

If its the former, it seems there will be some problems jumping to other
bytecode blocks, and if its the latter, at the very least someone can do
a jump 0 and coredump the process.

I could be missing stuff, because I've not had a chance to review the
changes over the last few weeks




   Though I like Gregor's way of doing it:  we know the addresses
(more or
   less) at compile time, so we might as well not waste ops doing
   arithmetic that we know in advance...
  
   Fair enough, though we don't really know the absolute PC at
assembly time,
   as we're all position independent. Thinking further, having the
getpc
   opcode take an offset would let us do something like:
  
  getpc I0, FOO
  
   to put the absolute address of FOO into I0, suitable for jumps and
jsrs.
  
 
 This just comes out as a specialized add, right?  In fact, isn't it
 this:
  set I0,*
  inc I0,FOO  -- assuming the assembler knew that this is an
address
 
 Yup. Only set doesn't take * as a parameter--it can't, because there's
no 
 way to know at assembly time what the real PC will be.
 

Yeah, I thought about that afterwards (the condition where real PC is
actually a  memory address)


 How are multiple bytecode chunks (i.e. libraries) going to be
handled?
 
 They're just going to get mmapped in wherever the system puts 'em.
 
 Are they going to be contiguous?
 
 Nope.
 
 Are they going to be segmented somehow
 so there's a far jump which takes us out of the current block?
 
 Nope. Jumps and jsrs take absolute addresses, so they can go anywhere.
 Branches are relative so fixing them up to bounce between segments
would be 
 tough, but we're not going to do that. :)
 
 

Fair enough :)

Brian




   Dan
 
 --it's like
this---
 Dan Sugalski  even samurai
 [EMAIL PROTECTED] have teddy bears and even
   teddy bears get drunk

Re: [BUG] Mandlebrot core

2001-10-03 Thread Brian Wheeler


On Wed, 2001-10-03 at 14:43, Leon Brocard wrote:
 Leon Brocard sent the following bits through the ether:
 
  It's all greek to me, Leon
 
 The following bytecode:
 
   0075 [01d4]:  0032  0008  000a  0008
gt_nc_ic N8, [nc:10], L2
 
 is wrong. It shouldn't be [nc:10]. 10 is the numeric constant which
 should have been in the constant table ([nc:10] refers to the 10th
 constant, there are only constants 0-6 hence the crash), so I point
 the finger at the assembler (of which I little). But I know more about
 debugging coredumps now, honest.

Its sort of the assembler, but an easy fix is to make the constant
10.0 instead of 10. 

Brian






 
 Leon
 -- 
 Leon Brocard.http://www.astray.com/
 Nanoware...http://www.nanoware.org/
 
 ... I tried to think but nothing happened!

Re: Manifest constants?

2001-10-03 Thread Brian Wheeler


On Tue, 2001-10-02 at 08:23, Dan Sugalski wrote:
 Is 'pi' a string to be looked up in a table at assemble time and
 converted to an intrinsic constant table index
 
 Yes. At some point the assembler needs to have a way to declare named 
 constants, we just haven't gotten there yet.
 


How about the attached patch?  It adds these directives: DCI, DCN, DCP,
DCS for Define Constant (Integer, Number, PMC, String)

It (currently) works by abusing the equates.  In the cases of Integer
and PMC, it is exactly equivalent.  In the case of Number and String,
the value of the equate is [sc:#] or [nc:#] which is the index into the
look up table.  It could easily be expanded to store these constant
names somewhere other than the equate hash.

Since we're anticipating the release of 0.02, I didn't apply it, and I'd
like feedback on it.

Brian 



Index: Parrot/Assembler.pm
===
RCS file: /home/perlcvs/parrot/Parrot/Assembler.pm,v
retrieving revision 1.3
diff -u -r1.3 Assembler.pm
--- Parrot/Assembler.pm 2001/10/03 23:58:41 1.3
+++ Parrot/Assembler.pm 2001/10/04 00:29:23
@@ -670,7 +670,8 @@
 
 sub has_asm_directive {
   return $_[0] =~ /^[_a-zA-Z]\w*\s+macro\s+.+$/i ||
- $_[0] =~ /^[_a-zA-Z]\w*\s+equ\s+.+$/i;
+ $_[0] =~ /^[_a-zA-Z]\w*\s+equ\s+.+$/i ||
+$_[0] =~ /^[_a-zA-Z]\w*\s+dc[inps]\s+.+$/i ;
 }
 
 
@@ -678,8 +679,10 @@
 
 =head2 handle_asm_directive
 
-Processes macros and equ directives. equ directives get stored in an equ hash.
-Macros store all program lines in an array.
+Processes macro definitions, dc* declarations, and equ directives. equ 
+directives get stored in an equ hash.  Macros store all program lines in an 
+array.  dc* declarations create a constant and an equ which refers to the 
+constant.
 
 NOTE: This function modifies @program.
 
@@ -691,6 +694,26 @@
 my( $name, $data ) = ($1, $2);
 $equate{$name} = $data;
 return 1;
+  }
+  elsif( $line =~ /^([_a-zA-Z]\w*)\s+dc([inps])\s+(.+)$/i ) {
+  my( $name, $type, $data ) = ($1, $2, $3);
+  if( $type eq i ) {
+ # TODO: we cheat on integers, since they really don't go in the
+ #   constants table.
+ $equate{$name} = $data;
+  } 
+  elsif( $type eq n ) {
+ $equate{$name} = constantize_number($data);
+  } 
+  elsif( $type eq p) {
+ # TODO: what does a PMC constant look like?
+ $equate{$name} = $data;
+  }
+  elsif( $type eq s) {
+ $data=~s/^(.+)$/$1/;
+ $equate{$name} = constantize_string($data);
+  }
+  return 1;
   }
   elsif( $line =~ /^([_a-zA-Z]\w*)\s+macro\s+(.+)$/i ) {
 # a macro definition

Re: instructions per second benchmark (in parrot ;)

2001-09-20 Thread Brian Wheeler


On Thu, 2001-09-20 at 16:46, Dan Sugalski wrote:
 At 04:54 PM 9/20/2001 -0500, Brian Wheeler wrote:
 Since all benchmarks are crap anyway, I've written a test which tells
 the average number of instructions per second.  On my athlon 700 I get
 3966053 instructions per second and on my PIII 866 I get 5081485
 instructions per second.  Do those sound like reasonable numbers?  Of
 course, since time_i is one of the opcodes looped, it probably brings
 the numbers down.
 
 That's actually what test.pasm tests. :) I just checked in a new version 
 that prints labels.

Yeah, I realized that _as soon as I posted it_.  Doh!  Heheh, for the
longest time I thought test.pasm just did stuff for a while.  Anyway,
it started out as an implementation of the bogomips.c program and slowly
became this.

 
 FWIW, my 600MHz Alpha clocks in at around 23M ops/sec. Nyah! ;-P
 

I get 10M ops/sec on the Athlon 700 using test.pasm.  time_i is a
killer, especially judging by the difference in cpu usage (test.pasm is
nearly 100% user, and mine is 75/25 user/system)

The 866 machine gets 15M.  Maybe when/if I get my 1.26GHz machine at
work I'll be able to match you :)

Brian




   Dan
 
 --it's like this---
 Dan Sugalski  even samurai
 [EMAIL PROTECTED] have teddy bears and even
   teddy bears get drunk

Changes to assemble.pl: Includes and Macros

2001-09-19 Thread Brian Wheeler


Please test this out to make sure I haven't done anything stupid!

The syntax for including another file is:
include 'filename'
or
include filename

The file will be included as-is at that spot.  The listing will reflect
that a series of lines was included by printing 
# Start of filename 

# End of filename 

at the beginning and end of the included file.



Macros are defined like:

println macro   thing
print   thing
print   \n
endm

and invoked like

println My happy string

in the listing, expanded macros appear with  in front of the code.


You should be able to include up to any depth, but I didn't test it.  In
addition, macros should be able to use other macros.  I did some basic
tests, but nothing extensive.

The only incompatibility I've introduced is now assemble.pl won't read
from stdin...you have to give it a filename.  Patches welcome!


Enjoy!
Brian

Re: Wanted: Subroutine call example

2001-09-17 Thread Brian Wheeler


On Sun, 2001-09-16 at 14:26, Gregor N. Purdy wrote:
 Brian --
 
  Its not going to work, if I understand it correctly.  I tried doing it
  (even set up the symbol '*' to mean the current PC) and do it, but it
  seems the ops take a relative offset.  Take jump_i, for example:
 
 Taking this into account, I modivied jump.pasm and fixed the offsets.
 I tested it this way...
 
   $ assemble.pl --output t/jump.pbc --listing t/jump.list t/jump.pasm
   $ test_prog t/jump.pbc
   Jump test.
   Jumping to subroutine...
   $ perl disassemble.pl t/jump.pbc

 
 I also did 'vi t/jump.list'. With these hard-coded relative offsets, I
 think the program should be producing the desired output, but it
 doesn't. If it did, I could do some tricks in jakoc and/or assemble.pl
 to create a pseudo-op for jumping to any label, and another for
 setting up the return-offset based on the end-label of the destination
 block and the label we want it to return to. It sure would be nice,
 though, to have a real 'absjump_i' opcode (really start-of-bytecode-
 relative rather than current-program-counter relative).
 

This won't be needed once jsr/ret become available.  ret is (really) the
absjump_i you're looking for since its only purpose is to return to a
previous place.  Everything else can be setup as relative to the PC.


 It would also be helpful to have jump_ic and absjump_ic, rather than
 having to load these things into registers.

You can use branch_ic for this.  absjump is probably a bad idea since it
appears the return value of the opcode function is really a memory
address of the host machine.  If an absolute value (relative to the
beginning of the opcode stream) was used, then it would have to be
relocated at load-time and wouldn't be able to remain read-only.

 
 Anyway, I'm attaching the revised jump.pasm example. Maybe I still
 don't have the offsets calculated right, but I stared at jump.list
 and the disassembly listing for a while and I *think* they are
 right.
 
 I wish I had a tracing mode where I could watch exactly where it
 was jumping to and what ops it was executing. I still wonder if
 somehow its jumping out of the bytecode and landing on an op zero
 (end). Jumping out of the bytecode *should* raise an error, though,
 no?
 
 If I'm not using this right, I sure would like to see an example of
 the correct use of jump_i...

I've not figured out what jump_i is for :)  I suspect its for calculated
jumps (i.e. the switch statement) rather than for subroutines.


Overall, I think you're barking up the wrong tree.  jsr/ret are for
subroutines.  Your subroutine isn't going to work for more than one
caller, since the return offset is hardcoded, you might as well write
jump.pasm as:

MAIN:   print Jump test.\n
print Jumping to subroutine...\n
branch SUB

RET:print Returned from subroutine!\n
end

SUB:print Entered subroutine...\n
branch RET

because the bytecodes dumped out are virtually identical.

We're going to have to wait for Dan (or, you could manually apply the
patch I posted earlier which adds a simple jsr/ret implementation)


Brian

Re: Wanted: Subroutine call example

2001-09-17 Thread Brian Wheeler


On Mon, 2001-09-17 at 11:20, Gregor N. Purdy wrote:
 
 I agree that jsr/ret are what I really want, but I'm dying to play
 with baby subroutines in jako, and I think I could play enough games
 with a properly understood jump_i and some assembler magic to make
 them work. I now have jump.pasm working right (the key was to think
 in terms of word offsets, not byte offsets).

:) I think I read somewhere that impatience was a good thing...




 
   It would also be helpful to have jump_ic and absjump_ic, rather than
   having to load these things into registers.
  
  You can use branch_ic for this.  absjump is probably a bad idea since it
  appears the return value of the opcode function is really a memory
  address of the host machine.  If an absolute value (relative to the
  beginning of the opcode stream) was used, then it would have to be
  relocated at load-time and wouldn't be able to remain read-only.
 
 My mental model for this thing is Parrot is a CPU architecture and
 a lot of what I'm trying to make it do is behave as much as possible
 in that way. So, if we are talking about a CPU architecture (ISA),
 lacking the full complement of relative and absolute jumps, register
 and constant comparisons, etc. is a bummer.

I think we're ok on relative ops, but you're right, we're short of
absolute.


 
 Mucking with things like memory addresses of host machines, etc. is
 important, but only at the interpreter level, not at the ISA design
 level (IMHO).
 
 I don't see a relocation issue anyway, since the way I look at the
 interpreter code is:
 
 PC === (code - start_code)
 
 So if an instruction says  absjump 43, then I know that I need
 to do:
 
 code = (start_code + 43)
 
 and if an instruction says (rel)jump 43 then I know that I need
 to do:
 
 code += 43
 
 neither of these is a problem in my eyes.
 

Ok, they're not really problems, per se, but the way the interpreter is
constructed, the ops *always* return the offset to the next PC.


  Overall, I think you're barking up the wrong tree.  jsr/ret are for
  subroutines.  Your subroutine isn't going to work for more than one
  caller, since the return offset is hardcoded, you might as well write
  jump.pasm as:
 
 Not quite. First, recall jump.pasm (attached). It does work now that
 I've got the offsets correct. Now, there is nothing to stop us from
 moving the setting of I2 to earlier (jump2.pasm, attached). With
 appropriate labels and address arithmetic (jump3.pasm, attached) it
 is even almost readable.
 
 Now, if we also have good macros and '* == PC' and calculated labels,
 we can have some real fun (jump4.pasm, attached).
 

Ok, fair enough.  I've implemented '*' as the PC in the assembler.  I've
not done macros yet, but I remember seeing a patch somewhere.

 Of course, this doesn't allow us to do recursion, wastes registers,
 etc. But, it would allow me to play with the syntax of the language
 with a simple implementation until cooler ops are available.
 

:)  there are real architectures that couldn't (easily, if at all) do
recursion.  I think the PDP-8 was one of them.

  We're going to have to wait for Dan (or, you could manually apply the
  patch I posted earlier which adds a simple jsr/ret implementation)
 
 Probably it would be best to wait. But my Impatience is showing...
 

Like I said...I think I've heard its a good thing :)

Brian

Re: Assembler missing byte code segment header?

2001-09-17 Thread Brian Wheeler


As near as I can tell, its always been just the bytecode following
without a length specifier.  I was going to play with it, but since
we're still deciding on the file format, I thought I'd leave it alone.

Brian


On Mon, 2001-09-17 at 16:06, Gregor N. Purdy wrote:
 All --
 
 I'm not certain about this, but it sure doesn't look right to me.
 (note I did this after turning off fingerprint saving in the
 assembler for clarity):
 
   $ perl -e 'print set_i I0, 4\nend\n' | assemble.pl | od -x   
   000 55a1 0131     0002 
   020   0004   
   034
 
  * The first four bytes are the magic number
  * The next four bytes are all zero for no fixup table here
  * The next four bytes are all zero for no const table here
  * The next four bytes should be pack('l', 16) for 16 bytes
of byte code follow (NOTE: THESE SEEM TO BE MISSING)
 
 Now, the disassembler and interpreter don't seem to care, but I
 thought the current format involved 3 or 4 segments in the
 length-payload pair format. The fourth (optional) segment according
 to the docs is a place to store the source code.
 
 
 Regards,
 
 -- Gregor
  _ 
 / perl -e 'srand(-2091643526); print chr rand 90 for (0..4)'  \
 
Gregor N. Purdy  [EMAIL PROTECTED]
Focus Research, Inc.http://www.focusresearch.com/
8080 Beckett Center Drive #203   513-860-3570 vox
West Chester, OH 45069   513-860-3579 fax
 \_/

Local labels in assemble.pl

2001-09-16 Thread Brian Wheeler



I've commited a change which allows local labels to be used in parrot.
The labels are local relative to the last non-local label defined (i.e.
local labels are forgotten when a non-local is defined).

Here's my test program:
main:   print   test 1\n
branch  $ok
$ng:print   ng 1\n
branch  test2
$ok:print   ok 1\n

test2:  print   test 2\n
branch  $ok
$ng:print   ng 2\n
branch  done
$ok:print   ok 2\n

done:   end


It should make automatic code generation easier for jako.  If a listing
is produced, local labels are _not_ shown.

Brian

RFC: bytecode file format (again)

2001-09-16 Thread Brian Wheeler


After reading various posts about the bytcode file format it occurred to
me that we need to determine what we need :)   

Meta-information:
* Magic cookie
* version 
* endian/size markers
* index of all chunks for fast lookup

Things we need to store:
* bytecode 
* external symbols/list of modules required
* public symbols
* source code 
* raw data (think: predefined PMCs, DATA sections ,etc)
* other stuff :)


So, in the spirit of KISS, what about this format:

Offset Size Description
0 4 Magic Cookie 'PAR0'
4 8 Endian mark (0x0123456789ABCDEF)
12 4 Version (Major * 0x1 + Minor)
16 4 Size Marker
20 12 Padding/Reserved for future use
32 n Index (n=# of entries * 16)
32+n m Data blocks, stored sequentially


Each index entry 16 bytes, and is laid out like::
Offset Size Description
0 4 Type (bytecode, comment, source, fixup, etc)
4 4 ID (to differentiate blocks of the same type)
8 4 Offset (offset from beginning of file of data)
12 4 Length (in bytes)


Block IDs are used when we need to differentiate between different
blocks of the same type, i.e. constant data or multiple sets of bytecode
in the same file as in libraries.


Given this overall format, the format of each data block varies
depending on its type.

Type_id Name
 Ignore/Empty/Invalid - This block is ignored
0001 Bytecode
0002 String Constants
0003 External Symbols
0004 Public Symbols
0005 Source Code
0006 Debugging Info


Thoughts on *this* can of worms?

Brian

segfaults in today's cvs

2001-09-14 Thread Brian Wheeler


I've tracked it down to string problems.

Looks like:
init_bytecode(program_code);
calls
read_constants_table(program_code);
calls
Parrot_string_constants[i++] = string_make(*program_code /* ouch */,
buflen, encoding, flags, type);
calls
string_compute_strlen(s);

which segfaults.

Any ideas what's going on?

Brian

Re: Calls for a Macro Assembler Guy

2001-09-14 Thread Brian Wheeler


On Fri, 2001-09-14 at 10:20, Dan Sugalski wrote:
 Okay, we've had a number of people in favor of a good macro assembler for 
 Parrot. Given that, do we have anyone who'll volunteer to define, maintain, 
 and extend the thing? (Presuming we jump off of the current assembler, 
 which seems reasonable)
 
 There probably isn't a huge amount to do with the thing--maintain macro 
 substitutions, handle local labels, manage sub definitions, and suchlike 
 things.
 

Wouldn't it largely be just filtering the input through cpp?

Brian

 Anyone?
 
   Dan
 
 --it's like this---
 Dan Sugalski  even samurai
 [EMAIL PROTECTED] have teddy bears and even
   teddy bears get drunk

RFC: Bytecode file format

2001-09-14 Thread Brian Wheeler


I've been thinking alot about the bytecode file format lately.  Its
going to get really gross really fast when we start adding other
(optional) sections to the code.

So, with that in mind, here's what I propose:

* All data sizes are in longwords (4 bytes) because that's just the way
things are :)

* The file is composed of a header (which is really just a magic 
cookie) , a series of data chunks, and a directory (of sorts)


Offset  Length  Description
0   1   Magic Cookie (0x013155a1)
1   n   Data
n+1 m   Directory Table
m+n+1   1   Offset of beginning of directory table (i.e. n+1)

The directory is after the data so offsets can be determined as the data
is written.  The directory offset is at the very end, so it can be
determined before the directory is written, and easily found by loaders.


Each Directory Entry consists of 3 longs:

Offset  Length  Description
0   1   ID/Type of chunk
1   1   Size of chunk
2   1   Offset from beginning of file


Chunk types we've seen so far are: Bytecode, Fixup, String Table. Just
guessing, but we'll probably need things likes symbol tables, source
chunks, notes, etc.

What do you guys think? 

Brian

Re: RFC: Bytecode file format

2001-09-14 Thread Brian Wheeler


On Fri, 2001-09-14 at 15:44, Buddha Buck wrote:
 At 03:10 PM 09-14-2001 -0500, Brian Wheeler wrote:
 I've been thinking alot about the bytecode file format lately.  Its
 going to get really gross really fast when we start adding other
 (optional) sections to the code.
 
 So, with that in mind, here's what I propose:
 
 snip
 
 
 What do you guys think?
 
 Have you taken a look at the old Amiga IFF format?  It consisted mainly of 
 chunks identified by a 32-bit type code and  a chunk-length code.  While 
 most implementations were for specific multi-media applications (chunks 
 defining sound formats, chunks defining image formats, etc), the standard 
 itself was data-neutral.

:) That was one of the references I used.  I looked up these formats on
wotsit before proposing:
* IFF 
Consistant, but you've got to scan the whole file to find out ifa
chunk is even there.
* Doom WAD
Clean format, mini-directory at the end.
* ELF
Overkill, I think, but does give some ideas of the types of sections
we'll be needing.


 
 I believe that Microsoft is using a derivative of that format for some of 
 its files, and I think that TIFF files are another instantiation.
 
 It may be worth looking at to avoid re-inventing wheels.
 
 

Agreed.  

Brian

Re: RFC: Bytecode file format

2001-09-14 Thread Brian Wheeler


On Fri, 2001-09-14 at 15:42, Dan Sugalski wrote:
 At 03:10 PM 9/14/2001 -0500, Brian Wheeler wrote:
 I've been thinking alot about the bytecode file format lately.  Its
 going to get really gross really fast when we start adding other
 (optional) sections to the code.
 
 So, with that in mind, here's what I propose:
 
 * All data sizes are in longwords (4 bytes) because that's just the way
 things are :)
 
 Nope. At the very least, a bytecode file needs to start with:
 
 8-byte word:endianness (magic value 0x123456789abcdef0)
 byte:   word size
 byte[7]:empty
 word:   major version
 word:   minor version
 
 Where all word values are as big as the word size says they are.
 
 The magic value can be something else, but it should byteswap such that if 
 you read it in you can tell whether it was a big-endian write or a 
 little-endian write.
 
 Followed by the offsets to the various segments in the file. The directory 
 (basically the offsets) *must* be at the very beginning, right after the 
 header.
 
 Also, don't choose a stream format that makes writing the file easy--choose 
 one that makes reading easy and fast. Two or three passes before writing 
 are fine.
 


And...in another message

 Heh. Doing our bytecode files in IFF format would be really nifty. We 
 could even see about raiding the old Fred Fish library for C source to 
 read and write 'em... :)

Fred Fish...that's a name I've not heard in a long time.  I still have a
floppy or two with fish disk ### stuck on them.

Ok, I'll admit I was initially considering IFF but it seemed too
scary...but then it could be residual feelings from when I was a
youngster and tried to write them with BASIC :)

Ok, what if we did IFF with these caveats:
* all chunks must be padded to 4 bytes (instead of IFF's 2)
* no nesting of FORMs 

Chunks we'd need are:

Name:   'PINF' - Parrot Information
Size:   28 bytes + size of directory
Optional:   No
Data:
longmagic cookie (or will PINF) be enough?
8-byte word:endianness (magic value 0x123456789abcdef0)
byte:   word size
byte[7]:empty
word:   major version
word:   minor version
longcount of directory entries
--- directory goes here ---
-- each entry as follows --
longtype of chunk
longoffset


Name:   'PBYT' - Parrot Bytecode
Size:   Varies
Optional:   Sure. :)
Data:
bytes of the bytecode


Name:   'PSTR' - Parrot String Table 
Size:   Varies
Optional:   Yes
Data:
longCount of string entries
--- each string as follows ---
longbyte length
n bytes + pad   string data


Name:   'PFIX' - Parrot Fixup Table
Size:   Varies
Optional:   Yes
Data:
--- beats me...how are we doing fixups? ---


Name:   'PNOT' - Parrot Notes Block
Size:   Varies
Optional:   Yes
Data:
free-form text for 'notes' about the file.




How's this?

Brian

Re: pasm.pl: a different parrot assembler

2001-09-13 Thread Brian Wheeler


On Thu, 2001-09-13 at 02:23, Simon Cozens wrote:
 On Wed, Sep 12, 2001 at 11:23:27PM -0500, Brian Wheeler wrote:
  I've been having tons of problems with labels in the current
  assembler...so I wrote my own.  It should provide all of the features
  that the current assembler has.  I'ved tested and ran all of the current
  t/*.pasm files.
  
  Here it is...feedback is always welcome.
 
 Cool. Now, if it only handled abbreviated op names, I'd apply it... :)
 
 Simon


You, sir, are a very picky man. :)

Here it is...I tested it on this euclid.pasm:

MAIN:   set I1, 96
set I2, 64
set I3, 0
set S1, Algorithm E (Euclid's algorithm)
print S1
E1: mod I4, I1, I2
E2: eq I4, I3, DONE, E3
E3: set I1, I2
set I2, I4
branch E1
DONE:   print I2
end


Enjoy!
Brian



#! /usr/bin/perl -w
#
# pasm.pl - take a parrot assembly file and spit out a bytecode file
#   This is based heavily on assemble.pl
# Brian Wheeler ([EMAIL PROTECTED])

use strict;

my $opt_c;
if (@ARGV and $ARGV[0] eq -c) {
shift @ARGV;
$opt_c = 1;
}


# define data types
my(%pack_type)=('i'='l','n'='d');
my(%real_type)=('I'='i','i'='i',
'N'='i','n'='n',
'S'='i','s'='i',
'D'='i');

# compute sizes
my(%sizeof);
foreach (keys(%real_type)) {
$sizeof{$_}=length(pack($pack_type{$real_type{$_}},0));
}


# get opcodes from guts.
open GUTS, interp_guts.h;
my %opcodes;
while (GUTS) {
next unless /\tx\[(\d+)\] = ([a-z_]+);/;
$opcodes{$2}{CODE} = $1;
}
close GUTS;

# get opcodes and their arg lists
open OPCODES, opcode_table or die Can't get opcode table, $!/$^E;
while (OPCODES) {
next if /^\s*#/;
chomp;
s/^\s+//;
next unless $_;
my ($name, $args, @types) = split /\s+/, $_;
my @rtypes=@types;
@types=map { $_ = $real_type{$_}} @types;
$opcodes{$name}{ARGS} = $args;
$opcodes{$name}{TYPES} = [@types];
$opcodes{$name}{RTYPES}=[@rtypes];
}
close OPCODES;


# read source and assemble
my $pc=0; my $op_pc=0;
my ($bytecode,%label,%fixup,%constants,@constants);
my $line=0;
while() {
$line++;
chomp;
s/^\s*//;
s/\s*$//;
next if(/^\#/ || $_ eq );
if(m/^((\S+):)?\s*(.+)?/) {
my($label,$code)=($2,$3);
if(defined($label)  $label ne ) {
if(exists($label{$label})) {
error('$label' already defined!);
}
if(exists($fixup{$label})) {
# backpatch everything with this PC.
while(scalar(@{$fixup{$label}})) {
my $op_pc=shift(@{$fixup{$label}});
my $offset=shift(@{$fixup{$label}});
substr($bytecode,$offset,4)=pack('l',($pc-$op_pc)/4);
}
delete($fixup{$label});  
}
$label{$label}=$pc; # store it.
}
next if(!defined($code));
1 while $code=~s/\([^\\\]*(?:\\.[^\\\]*)*)\/constantize($1)/eg;
$code=~s/,/ /g;
my($opcode,@args)=split(/\s+/,$code);
	$opcode=lc($opcode);
if (!exists $opcodes{$opcode}) {
	# try to determine _real_ opcode.
	my @arg_t=();
	foreach (@args) {
		if(m/^([INPS])\d+$/) {
		# a register.
		push @arg_t,lc($1);
		} elsif(m/^\d+$/) {
		# a constant of some sort
		push @arg_t,'(ic|nc|sc)';
		} else {
		# a label
		push @arg_t,'ic';
		}
	}
	my $test;
	my($first,$last)=($arg_t[0],$arg_t[-1]);
	if($first ne $last) {
		$test=$opcode\_$first\_$last;
	} else {
		$test=$opcode\_$first;
	}
	my($found_op)=0;
	foreach my $op (grep($_=~/^$opcode/,keys(%opcodes))) {
		if($op=~/$test/) {
		$opcode=$op;
		$found_op=1;
		last;
		}
	}
error(No opcode $opcode in $_) if(!$found_op);
}
if (@args != $opcodes{$opcode}{ARGS}) {
error(Wrong arg count--got .scalar(@args). needed
.$opcodes{$opcode}{ARGS});
}
$bytecode .= pack l, $opcodes{$opcode}{CODE};
$op_pc=$pc;
$pc+=4;

foreach (0..$#args) {
my($rtype)=$opcodes{$opcode}{RTYPES}[$_];
my($type)=$opcodes{$opcode}{TYPES}[$_];
if($rtype eq I || $rtype eq N || $rtype eq P || $rtype eq
S) {
# its a register argument
$args[$_]=~s/^[INPS](\d+)$/$1/i;
$pc+=$sizeof{$rtype}
} elsif($rtype eq D) {
# a destination
if(!exists($label{$args[$_]})) {
# we have not seen it yet...put it on the fixup list
push(@{$fixup{$args[$_]}},$op_pc,$pc);
$args[$_]=0x;
} else {
$args[$_]=($label{$args[$_]}-$op_pc)/4;
}
$pc+=$sizeof{$rtype};
} else {
$args[$_]=oct($args[$_]) if($args[$_]=~/^0/);
$pc+=$sizeof{$rtype

patch: assemble.pl choosing wrong op sometimes...

2001-09-13 Thread Brian Wheeler


I caught it trying to use inc_i_ic instead of inc_i in a test program I
was running.  this patch fixes it.

Brian


Index: assemble.pl
===
RCS file: /home/perlcvs/parrot/assemble.pl,v
retrieving revision 1.12
diff -r1.12 assemble.pl
112c112
   if($op=~/$test/) {
---
   if($op=~/^$test$/) {

Re: patch: assemble.pl choosing wrong op sometimes...

2001-09-13 Thread Brian Wheeler


On Thu, 2001-09-13 at 09:52, Jarkko Hietaniemi wrote:
 On Thu, Sep 13, 2001 at 09:54:35AM -0500, Brian Wheeler wrote:
  I caught it trying to use inc_i_ic instead of inc_i in a test program I
  was running.  this patch fixes it.
  
  Brian
  
  
  Index: assemble.pl
  ===
  RCS file: /home/perlcvs/parrot/assemble.pl,v
  retrieving revision 1.12
  diff -r1.12 assemble.pl
  112c112
 if($op=~/$test/) {
  ---
 if($op=~/^$test$/) {
 
 Unless $test is really a pattern here this is $op eq $test...
 

It is a pattern.  If you have a constant, it doesn't know what kind, so
the pattern it creates is (ic|nc|sc)

Brian

patch: print op cleanups, new ops, assembler support of \(whatever)

2001-09-13 Thread Brian Wheeler


This patch gives the assembler support of '\a','\n','\r','\t', and '\\'
in string constants.

In addition, it changes (for all registers) I reg %li is ... to just
the value of the register.  Printing constants is also supported, but
alas, you have to specify the type (print_sc, print_ic, print_nc).

Brian




? pasm.pl
? patch
? test2.pbc
? test3.pbc
? euclid0.pbc
? euclid1.pbc
? euclid.pbc
? print_cleanups.diff
? bitops.pbc
? bitops+assembler.patch
? t/bitops.pasm
Index: assemble.pl
===
RCS file: /home/perlcvs/parrot/assemble.pl,v
retrieving revision 1.12
diff -r1.12 assemble.pl
112c112
   if($op=~/$test/) {
---
   if($op=~/^$test$/) {
121,122c121
 error(Wrong arg count--got .scalar(@args). needed
 .$opcodes{$opcode}{ARGS});
---
 error(Wrong arg count--got .scalar(@args). needed 
.$opcodes{$opcode}{ARGS});
131,132c130
 if($rtype eq I || $rtype eq N || $rtype eq P || $rtype eq
 S) {
---
 if($rtype eq I || $rtype eq N || $rtype eq P || $rtype eq S) {
219a218,220
 # handle \ characters in the constant
 my %escape = ('a'=\a,'n'=\n,'r'=\r,'t'=\t,'\\'='\\',);
 $s=~s/\\([anrt\\])/$escape{$1}/g;
223a225
 
Index: basic_opcodes.ops
===
RCS file: /home/perlcvs/parrot/basic_opcodes.ops,v
retrieving revision 1.13
diff -r1.13 basic_opcodes.ops
120c120
   printf(I reg %li is %li\n, P1, INT_REG(P1));
---
   printf(%li, INT_REG(P1));
121a122,127
 
 /* PRINT ic */
 AUTO_OP print_ic {
   printf(%li, P1);
 }
 
212c218
   printf(N reg %li is %f\n, P1, NUM_REG(P1));
---
   printf(%f, NUM_REG(P1));
214a221,225
 /* PRINT nc */
 AUTO_OP print_nc {
   printf(%f, P1);
 }
 
317c328,334
printf(S reg %li is %.*s\n, P1, (int) string_length(s), (char *) s-bufstart);
---
   printf(%.*s,(int)string_length(s),(char *) s-bufstart);
 }
 
 /* PRINT sc */
 AUTO_OP print_sc {
   STRING *s = Parrot_string_constants[P1];
   printf(%.*s,(int)string_length(s),(char *) s-bufstart);
318a336
 
Index: opcode_table
===
RCS file: /home/perlcvs/parrot/opcode_table,v
retrieving revision 1.12
diff -r1.12 opcode_table
56a57
 print_sc  1   s
86a88
 print_ic   1  i
88a91
 print_nc  1   n

Yet another patch: assemble.pl - better opcode guessing

2001-09-13 Thread Brian Wheeler


This does a better job at guessing the correct opcode:  the constant is
compared to a regex and determined which kind it is, instead of saying
its just some sort of constant.  This fixes the guessing problems with
my (print_ic print_sc print_nc) patch.

Brian




Index: assemble.pl
===
RCS file: /home/perlcvs/parrot/assemble.pl,v
retrieving revision 1.12
diff -r1.12 assemble.pl
95,97d94
   } elsif(m/^\d+$/) {
   # a constant of some sort
   push @arg_t,'(ic|nc|sc)';
99,100c96,109
   # a label
   push @arg_t,'ic';
---
   # a constant of some sort
   if(m/^\[(\d+)\]$/) {
   # string
   push @arg_t,'sc';
   } elsif(m/^((-?\d+)|(0b[01]+)|(0x[0-9a-f]+))$/i) {
   # integer
   push @arg_t,'ic';
   } elsif(m/^[a-z][\w]*$/i) {
   # label
   push @arg_t,'ic';
   } else {
   # numeric
   push @arg_t,'nc';
   }
109a119
   print STDERR test: $test\n;
112c122
   if($op=~/$test/) {
---
   if($op eq $test) {
121,122c131
 error(Wrong arg count--got .scalar(@args). needed
 .$opcodes{$opcode}{ARGS});
---
 error(Wrong arg count--got .scalar(@args). needed 
.$opcodes{$opcode}{ARGS});
131,132c140
 if($rtype eq I || $rtype eq N || $rtype eq P || $rtype eq
 S) {
---
 if($rtype eq I || $rtype eq N || $rtype eq P || $rtype eq S) {
145a154,156
   } elsif($rtype eq 's') {
   $args[$_]=~s/[\[\]]//g;
   $pc+=$sizeof{$rtype};   
220,222c231,238
 return $constants{$s} if exists $constants{$s};
 push @constants, $s;
 return $constants{$s} = $#constants;
---
 # handle \ characters in the constant
 my %escape = ('a'=\a,'n'=\n,'r'=\r,'t'=\t,'\\'='\\',);
 $s=~s/\\([anrt\\])/$escape{$1}/g;
 if(!exists($constants{$s})) {
   push(@constants,$s);
   $constants{$s}=$#constants;
 }
 return [.$constants{$s}.];
223a240

patch: bitops with constants

2001-09-13 Thread Brian Wheeler


This patch allows you to do thingies like:

and I1,I2,0x

'and', 'or', and 'xor' have been adapted to use this.

Also, shl and shr can take an integer register as the amount to shift.

Brian






? pasm.pl
? patch
? test2.pbc
? test3.pbc
? euclid0.pbc
? euclid1.pbc
? euclid.pbc
? const_bitops.diff
? bitops.pbc
? t/bitops.pasm
Index: basic_opcodes.ops
===
RCS file: /home/perlcvs/parrot/basic_opcodes.ops,v
retrieving revision 1.14
diff -u -r1.14 basic_opcodes.ops
--- basic_opcodes.ops   2001/09/13 16:16:38 1.14
+++ basic_opcodes.ops   2001/09/13 19:13:43
@@ -562,6 +562,12 @@
   INT_REG(P1) = INT_REG(P2)  INT_REG(P3);
 }
 
+/* AND_i_ic */
+AUTO_OP and_i_ic {
+  INT_REG(P1) = INT_REG(P2)  P3;
+}
+
+
 /* NOT_i */
 AUTO_OP not_i {
   INT_REG(P1) = ! INT_REG(P2);
@@ -572,17 +578,37 @@
   INT_REG(P1) = INT_REG(P2) | INT_REG(P3);
 }
 
+/* OR_i_ic */
+AUTO_OP or_i_ic {
+  INT_REG(P1) = INT_REG(P2) | P3;
+}
+
 /* SHL_i_ic */
 AUTO_OP shl_i_ic {
   INT_REG(P1) = INT_REG(P2)  P3;
 }
 
+/* SHL_i */
+AUTO_OP shl_i {
+  INT_REG(P1) = INT_REG(P2)  INT_REG(P3);
+}
+
 /* SHR_i_ic */
 AUTO_OP shr_i_ic {
   INT_REG(P1) = INT_REG(P2)  P3;
 }
 
+/* SHR_i */
+AUTO_OP shr_i {
+  INT_REG(P1) = INT_REG(P2)  INT_REG(P3);
+}
+
 /* XOR_i */
 AUTO_OP xor_i {
   INT_REG(P1) = INT_REG(P2) ^ INT_REG(P3);
+}
+
+/* XOR_i_ic */
+AUTO_OP xor_i_ic {
+  INT_REG(P1) = INT_REG(P2) ^ P3;
 }
Index: opcode_table
===
RCS file: /home/perlcvs/parrot/opcode_table,v
retrieving revision 1.13
diff -u -r1.13 opcode_table
--- opcode_table2001/09/13 16:16:38 1.13
+++ opcode_table2001/09/13 19:13:44
@@ -152,9 +152,15 @@
 # Bitwise Ops
 
 and_i 3   I I I
+and_i_ic   3   I I i
 not_i 2   I I
 or_i  3   I I I
+or_i_ic3   I I i
+shl_i  3   I I I
 shl_i_ic 3I I i
+shr_i  3   I I I
 shr_i_ic 3I I i
 xor_i 3   I I I
+xor_i_ic   3   I I i
+

patch: assembly listings from assembler

2001-09-13 Thread Brian Wheeler


This patch does a couple of things:
* uses Getopt::Long for options.  -c is now --checksyntax.  I wasn't
sure how to keep compatible (patches welcome!)
* options include:
--help
--version
--verbose
--output=file
--listing=file
--checksyntax
* produces verbose listing of what the assembler saw :)  Only one
nitpick with it:  unknown symbols are given as 0x,
unfortunately, this includes symbols which may be defined later in the
file (i.e. forward jumps).

Brian


Index: assemble.pl
===
RCS file: /home/perlcvs/parrot/assemble.pl,v
retrieving revision 1.14
diff -r1.14 assemble.pl
7a8
 use Getopt::Long;
9,12c10,33
 my $opt_c;
 if (@ARGV and $ARGV[0] eq -c) {
 shift @ARGV;
 $opt_c = 1;
---
 my %options;
 GetOptions(\%options,('checksyntax',
 'help',
 'version',
 'verbose',
 'output=s',
 'listing=s'));
 
 if($options{'version'}) {
 print $0,'Version $Id$ ',\n;
 exit;
 }
 
 if($options{'help'}) {
 print $0 - Parrot Assembler
 Options:
--checksyntaxCheck assembler syntax only, no output
--help   This text
--listingDump assembly listing to file
--output File to dump bytecode into
--verboseShow what's going on
--versionShow assembler version
 ;
 exit;
14a36,44
 if(exists($options{'output'})  $options{'output'} eq ) {
 print STDERR You must provide a file with --output flag!\n;
 exit;
 }
 
 if(exists($options{'listing'})  $options{'listing'} eq ) {
 print STDERR You must provide a file with --listing flag!\n;
 exit;
 }
54a85,87
 my $listing=PARROT ASSEMBLY LISTING - .scalar(localtime).\n\n;
 
 
62a96
 my $sline=$_;
65c99,104
 next if(/^\#/ || $_ eq );
---
 if(/^\#/ || $_ eq ) {
   if($options{'listing'}) {
   $listing.=sprintf(%4d %08x %-44s %s\n, $line, $op_pc, '',$sline);
   }
   next;
 }
121a161
   log_message(substituting $op for $opcode);
161a202,209
   if($options{'listing'}) {
   # add line to listing.
   my $odata;
   foreach (unpack('l*',substr($bytecode,$op_pc))) {
   $odata.=sprintf(%08x ,$_);
   }
   $listing.=sprintf(%4d %08x %-44s %s\n, $line, $op_pc, $odata,$sline);
   }
163a212
 $listing.=\n if($options{'listing'});
173c222,229
 # FIXUP
---
 # FIXUP (also, dump listing symbols)
 if($options{'listing'}) {
 $listing.=DEFINED SYMBOLS:\n;
 foreach (sort(keys(%label))) {
   $listing.=sprintf(\t%08x   %s\n,$label{$_},$_);
 }
 }
 
184c240,244
 exit;
---
 $listing.=\nUNDEFINED SYMBOLS:\n;
 foreach (sort(keys(%fixup))) {
   $listing.=\t$_\n;
 }
 exit;  # some day, unresolved symbols won't be an error!
195a256,258
 if($options{'listing'}) {
   $listing.=\nSTRING CONSTANTS\n;
 }
196a260
 my $counter=0;
204c268,271
}
---
   }
   $listing.=sprintf(\t%04x %08x [[%s]]\n,$counter,length($_),$_) 
   if($options{'listing'});
   $counter++;
217,218c284,298
 if(!$opt_c) {
 print $output;
---
 if(!$options{'checksyntax'}) {
 if($options{'output'} ne ) {
   open O,$options{'output'} || die $!;
   print O $output;
   close O;
 } else {
   print $output;
 }
 }
 
 
 if($options{'listing'}) {
 open L,$options{'listing'} || die $!;
 print L $listing;
 close L;
224a305,311
 }
 
 sub log_message {
 my($message)=@_;
 if($options{'verbose'}) {
   print STDERR INFO ($line): $message\n;
 }

patches to assembler opcode_table

2001-09-12 Thread Brian Wheeler


The assembler patches:
* handle blank lines containing a label
* handle constants in decimal,octal, or hex.

Opcode table patch (and basic_opcodes.ops):
* adds and, or, not, xor, shl, and shr.

 Builds ok, but coredumps in the interpreter.  Any hints on what I
did wrong welcome! 

Attached is also a test program for the new ops (which coredumps).


Brian




? op.h
? config.h
? patch
? interp_guts.h
? basic_opcodes.c
? test_prog
? test2.pbc
? test3.pbc
? euclid.pbc
? bitops.pbc
? bitops+assembler.patch
? t/bitops.pasm
Index: assemble.pl
===
RCS file: /home/perlcvs/parrot/assemble.pl,v
retrieving revision 1.8
diff -u -r1.8 assemble.pl
--- assemble.pl 2001/09/12 09:54:46 1.8
+++ assemble.pl 2001/09/12 20:24:07
@@ -62,19 +62,17 @@
 push @code, $_;
 $pc += 1+@args;
 }
-
 emit_magic();
 emit_fixup_section();
 emit_constants_section();
-
 # Now assemble
 $pc = 0;
 my $line = 0;
-while ($_ = shift @code) {
+foreach (@code) {
 $line++;
 chomp;
 s/,/ /g;
-
+next if(/^\s*$/);
 my ($opcode, @args) = split /\s+/, $_;
 
 if (!exists $opcodes{lc $opcode}) {
@@ -93,6 +91,8 @@
} elsif($rtype eq D) {
# a destination
$args[$_]=fixup($args[$_]);
+   } else {
+  $args[$_]=oct($args[$_]) if($args[$_]=~/^0/);
}
$output .= pack $type, $args[$_];
 }
Index: basic_opcodes.ops
===
RCS file: /home/perlcvs/parrot/basic_opcodes.ops,v
retrieving revision 1.8
diff -u -r1.8 basic_opcodes.ops
--- basic_opcodes.ops   2001/09/12 18:39:12 1.8
+++ basic_opcodes.ops   2001/09/12 20:24:07
@@ -329,3 +329,34 @@
 // NOOP
 AUTO_OP noop {
 }
+
+// AND_I
+AUTO_OP and_i {
+  INT_REG(P1)=INT_REG(P2)  INT_REG(P3);
+}
+
+// NOT_I
+AUTO_OP not_i {
+  INT_REG(P1)=!INT_REG(P2);
+}
+
+// OR_I
+AUTO_OP or_i {
+  INT_REG(P1)=INT_REG(P2) | INT_REG(P3);
+}
+
+// SHL_I
+AUTO_OP shl_i_ic {
+  INT_REG(P1)=INT_REG(P2)  P3;
+}
+
+
+// SHR_I
+AUTO_OP shr_i_ic {
+  INT_REG(P1)=INT_REG(P2)  P3;
+}
+
+// XOR
+AUTO_OP xor_i {
+  INT_REG(P1)=INT_REG(P2) ^ INT_REG(P3);
+}
Index: opcode_table
===
RCS file: /home/perlcvs/parrot/opcode_table,v
retrieving revision 1.9
diff -u -r1.9 opcode_table
--- opcode_table2001/09/12 18:39:12 1.9
+++ opcode_table2001/09/12 20:24:07
@@ -102,3 +102,10 @@
 clear_n0
 clear_p0
 
+# Bitops
+and_i  3   I I I
+not_i  2   I I
+or_i   3   I I I
+shl_i_ic 3 I I i
+shr_i_ic 3 I I i
+xor_i  3   I I I


# bitops.pasm : test bitops.
#   and, not, or, shl, shr, xor
# Brian Wheeler ([EMAIL PROTECTED])

MAIN:   set_i_icI1,0b
set_i_icI2,0b
set_i_icI3,0b10101010
set_i_icI4,0b01010101
set_i_icI5,0
set_s_scS1,OK
set_s_scS2,NG
set_s_scS3,Test Number

# test 'and'
set_s_scS5,AND Test
print_s S5

AND_T1: 
inc_i   I5
print_s S3
print_i I5  
and_i   I6,I1,I2
print_i I6
eq_i_ic I6,I2,A1_OK,A1_NG

A1_OK:  print_s S1
branch_ic   AND_T2

A1_NG:  print_s S2
branch_ic   AND_T2



AND_T2:
inc_i   I5
print_s S3
print_i I5
and_i   I6,I1,I3
eq_i_ic I6,I3,A2_OK,A3_NG

A2_OK:  print_s S1
branch_ic   AND_T3

A3_NG:  print_s S2
branch_ic   AND_T3

AND_T3: inc_i   I5
print_s S3
print_i I5
and_i   I6,I3,I4
eq_i_ic I6,I2,A3_OK,A3_NG

A3_OK:  print_s S1
branch_ic   OR_T

A3_NG:  print_s S2
branch_ic   OR_T




# test or
OR_T:
set_s_scS5,OR Test
print_s S5
set_i_icI5,0

end

Resend: working patch for bitops

2001-09-12 Thread Brian Wheeler


This seems to be working:
* fixes for label-only lines in assembler
* recognition of 0x, 0b, etc in constants
* and, not, or, shl, shr, xor

Enjoy!
Brian




Patch follows


Index: assemble.pl
===
RCS file: /home/perlcvs/parrot/assemble.pl,v
retrieving revision 1.8
diff -u -r1.8 assemble.pl
--- assemble.pl 2001/09/12 09:54:46 1.8
+++ assemble.pl 2001/09/13 00:46:50
@@ -70,11 +70,12 @@
 # Now assemble
 $pc = 0;
 my $line = 0;
-while ($_ = shift @code) {
+foreach (@code) {
 $line++;
 chomp;
+next if(m/^\s*$/); # blank lines
 s/,/ /g;
-
+
 my ($opcode, @args) = split /\s+/, $_;
 
 if (!exists $opcodes{lc $opcode}) {
@@ -93,6 +94,8 @@
} elsif($rtype eq D) {
# a destination
$args[$_]=fixup($args[$_]);
+   } else {
+  $args[$_]=oct($args[$_]) if($args[$_]=~/^0/);
}
$output .= pack $type, $args[$_];
 }
Index: basic_opcodes.ops
===
RCS file: /home/perlcvs/parrot/basic_opcodes.ops,v
retrieving revision 1.8
diff -u -r1.8 basic_opcodes.ops
--- basic_opcodes.ops   2001/09/12 18:39:12 1.8
+++ basic_opcodes.ops   2001/09/13 00:46:51
@@ -329,3 +329,33 @@
 // NOOP
 AUTO_OP noop {
 }
+
+// AND_i
+AUTO_OP and_i {
+  INT_REG(P1) = INT_REG(P2)  INT_REG(P3);
+}
+
+// NOT_i
+AUTO_OP not_i {
+  INT_REG(P1) = ! INT_REG(P2);
+}
+
+// OR_i
+AUTO_OP or_i {
+  INT_REG(P1) = INT_REG(P2) | INT_REG(P3);
+}
+
+// SHL_i_ic
+AUTO_OP shl_i_ic {
+  INT_REG(P1) = INT_REG(P2)  P3;
+}
+
+// SHR_i_ic
+AUTO_OP shr_i_ic {
+  INT_REG(P1) = INT_REG(P2)  P3;
+}
+
+// XOR_i
+AUTO_OP xor_i {
+  INT_REG(P1) = INT_REG(P2) ^ INT_REG(P3);
+}
Index: opcode_table
===
RCS file: /home/perlcvs/parrot/opcode_table,v
retrieving revision 1.9
diff -u -r1.9 opcode_table
--- opcode_table2001/09/12 18:39:12 1.9
+++ opcode_table2001/09/13 00:46:51
@@ -102,3 +102,9 @@
 clear_n0
 clear_p0
 
+and_i  3   I I I
+not_i  2   I I
+or_i   3   I I I
+shl_i_ic 3 I I i
+shr_i_ic 3 I I i
+xor_i  3   I I I

pasm.pl: a different parrot assembler

2001-09-12 Thread Brian Wheeler



I've been having tons of problems with labels in the current
assembler...so I wrote my own.  It should provide all of the features
that the current assembler has.  I'ved tested and ran all of the current
t/*.pasm files.

Here it is...feedback is always welcome.

Brian


#! /usr/bin/perl -w
#
# pasm.pl - take a parrot assembly file and spit out a bytecode file
#   This is based heavily on assemble.pl
# Brian Wheeler ([EMAIL PROTECTED])

use strict;

my $opt_c;
if (@ARGV and $ARGV[0] eq -c) {
shift @ARGV;
$opt_c = 1;
}


# define data types
my(%pack_type)=('i'='l','n'='d');
my(%real_type)=('I'='i','i'='i',
'N'='i','n'='n',
'S'='i','s'='i',
'D'='i');

# compute sizes
my(%sizeof);
foreach (keys(%real_type)) {
$sizeof{$_}=length(pack($pack_type{$real_type{$_}},0));
}


# get opcodes from guts.
open GUTS, interp_guts.h;
my %opcodes;
while (GUTS) {
next unless /\tx\[(\d+)\] = ([a-z_]+);/;
$opcodes{$2}{CODE} = $1;
}
close GUTS;

# get opcodes and their arg lists
open OPCODES, opcode_table or die Can't get opcode table, $!/$^E;
while (OPCODES) {
next if /^\s*#/;
chomp;
s/^\s+//;
next unless $_;
my ($name, $args, @types) = split /\s+/, $_;
my @rtypes=@types;
@types=map { $_ = $real_type{$_}} @types;
$opcodes{$name}{ARGS} = $args;
$opcodes{$name}{TYPES} = [@types];
$opcodes{$name}{RTYPES}=[@rtypes];
}
close OPCODES;


# read source and assemble
my $pc=0; my $op_pc=0;
my ($bytecode,%label,%fixup,%constants,@constants);
my $line=0;
while() {
$line++;
chomp;
s/^\s*//;
s/\s*$//;
next if(/^\#/ || $_ eq );
if(m/^((\S+):)?\s*(.+)?/) {
my($label,$code)=($2,$3);
if(defined($label)  $label ne ) {
if(exists($label{$label})) {
error('$label' already defined!);
}
if(exists($fixup{$label})) {
# backpatch everything with this PC.
while(scalar(@{$fixup{$label}})) {
my $op_pc=shift(@{$fixup{$label}});
my $offset=shift(@{$fixup{$label}});
substr($bytecode,$offset,4)=pack('l',($pc-$op_pc)/4);
}
delete($fixup{$label});  
}
$label{$label}=$pc; # store it.
}
next if(!defined($code));
1 while $code=~s/\([^\\\]*(?:\\.[^\\\]*)*)\/constantize($1)/eg;
$code=~s/,/ /g;
my($opcode,@args)=split(/\s+/,$code);

if (!exists $opcodes{lc $opcode}) {
error(No opcode $opcode in $_);
}
if (@args != $opcodes{$opcode}{ARGS}) {
error(Wrong arg count--got .scalar(@args). needed
.$opcodes{$opcode}{ARGS});
}
$bytecode .= pack l, $opcodes{$opcode}{CODE};
$op_pc=$pc;
$pc+=4;

foreach (0..$#args) {
my($rtype)=$opcodes{$opcode}{RTYPES}[$_];
my($type)=$opcodes{$opcode}{TYPES}[$_];
if($rtype eq I || $rtype eq N || $rtype eq P || $rtype eq
S) {
# its a register argument
$args[$_]=~s/^[INPS](\d+)$/$1/i;
$pc+=$sizeof{$rtype}
} elsif($rtype eq D) {
# a destination
if(!exists($label{$args[$_]})) {
# we have not seen it yet...put it on the fixup list
push(@{$fixup{$args[$_]}},$op_pc,$pc);
$args[$_]=0x;
} else {
$args[$_]=($label{$args[$_]}-$op_pc)/4;
}
$pc+=$sizeof{$rtype};
} else {
$args[$_]=oct($args[$_]) if($args[$_]=~/^0/);
$pc+=$sizeof{$rtype};   
}
$bytecode .= pack $type, $args[$_];
}
}
}

my $output;

# build file in memory

# MAGIC COOKIE
$output=pack($pack_type{i},0x13155a1);


# FIXUP
if(keys(%fixup)) {
print STDERR SQUAWK!  These symbols were referenced but not
defined:\n;
foreach (sort(keys(%fixup))) {
print STDERR \t$_ at pc: ;
foreach my $pc (@{$fixup{$_}}) {
print STDERR sprintf(%08x ,$pc);
}
print STDERR \n;
}
exit;
} else {
# dump empty header
$output.=pack($pack_type{i},0);
}

# CONSTANTS
if(@constants) {
my($const);
# Then spit out how many constants there are, so we can allocate
$const .= pack($pack_type{i}, scalar @constants);

# Now emit each constant
for (@constants) {
$const .= pack($pack_type{i},0) x 3; # Flags, encoding, type
$const .= pack($pack_type{i},length($_)); # Strlen followed by
that many bytes.
$const .= $_;
my $pad=(length($_) % $sizeof{i});
if($pad) {
$const .= \0 x ($sizeof{i}-(length($_) % $sizeof{i})); # Padding;
   }
}

$output.=pack($pack_type{i},length($const));
$output.=$const;
} else {
# no constants, dump empty header

jsr_ic ret support

2001-09-12 Thread Brian Wheeler


This diff adds  jsr_ic and ret to the interpreter.  I don't know if my
way of returning is legal, and I know there's probably issues with 64
bit machines, but it works...and that's the important part :)

Right now it only has a depth of 32 and no bounds checking, but its
enough to get started.

Brian


Index: basic_opcodes.ops
===
RCS file: /home/perlcvs/parrot/basic_opcodes.ops,v
retrieving revision 1.8
diff -r1.8 basic_opcodes.ops
331a332,345
 
 // JSR
 MANUAL_OP jsr_ic {
   interpreter-call_stack[interpreter-call_sp]=cur_opcode+2;
   interpreter-call_sp++;
   RETURN(P1);
 }
 
 // RET
 MANUAL_OP ret {
   interpreter-call_sp--;
   // probably not legal :)
   return interpreter-call_stack[(interpreter-call_sp)];
 }
Index: interpreter.h
===
RCS file: /home/perlcvs/parrot/interpreter.h,v
retrieving revision 1.3
diff -r1.3 interpreter.h
31a32,34
 
   unsigned int call_sp;
   unsigned long call_stack[32];
Index: opcode_table
===
RCS file: /home/perlcvs/parrot/opcode_table,v
retrieving revision 1.9
diff -r1.9 opcode_table
75a76,77
 jsr_ic1   D
 ret   0

Re: Math functions? (Particularly transcendental ones)

2001-09-10 Thread Brian Wheeler


On Mon, 2001-09-10 at 08:47, Dan Sugalski wrote:
 At 08:07 PM 9/9/2001 -0400, Uri Guttman wrote:
   DS == Dan Sugalski [EMAIL PROTECTED] writes:
 
DS Yeah, I can't think of a good reason for a noop. We might have one
DS anyway, though, just in case one comes along anyway.
 
 in a hardware cpu they were commonly used to fill an instruction slot to
 keep a pipeline filled, or to follow a branch decision, or to pad a long
 running op.
 
 Yup, I realize that. I wasn't sure that we might not have some sort of 
 in-memory opcode whiteout thing we need to do, in which case it'd be useful 
 and potentially faster than recalculating a bunch of jump addresses.
 
 Here's a dumb question:  will parrot allow bytecode which is stored in a
 perl scalar to be executed?
 
DS Yup, in a restricted sandbox too, if you want. That way we'll be
DS able to serialize code to bytestreams, spit them across the 'net,
DS and execute them on the other end.
 
 will the op code table need to be sent over if it is code from a module
 which defines new op codes?
 
 Basically we'll build a small freeze to disk section and send it over the 
 wire instead of freezing to disk. It'll have all the standard stuff--fixup 
 section, constants section, and code.
 

I was thinking about NOP this morning, and I realized that it might very
well be necessary.  If someone was writing a simple assembler for
parrot, it might be useful for padding.


Brian

 
   Dan
 
 --it's like this---
 Dan Sugalski  even samurai
 [EMAIL PROTECTED] have teddy bears and even
   teddy bears get drunk

Re: Math functions? (Particularly transcendental ones)

2001-09-10 Thread Brian Wheeler


On Mon, 2001-09-10 at 09:16, Bryan C. Warnock wrote:
 On Monday 10 September 2001 10:28 am, Brian Wheeler wrote:
 
  I was thinking about NOP this morning, and I realized that it might very
  well be necessary.  If someone was writing a simple assembler for
  parrot, it might be useful for padding.
 
 Pad what?
 

How about preserving offsets during an optimization phase:

add i3,i1,1
add i3,i3,8

could become

add i3,i1,9
nop

without having to recompute offsets for later bytecode.

In the same way, you could also use it for reserving space for things
like debugging code, like adding 10 nops if debugging is turned off, and
using those 10 instructions for debugging if it is turned
onmaintaining the relative addresses of things.  Of course, one
could just recompile using the parrot assembler, so this would only be
for those tinkering with their own assembler, I suppose.

Honestly, I don't care either way, since add i0,i0,0 is the same
(basically) as a nop, but takes a little more cpu.  One could always
#define nop add i0,i0,0
:)

Brian

Patch to assembler/disassembler + parrot asm inconsistancies

2001-09-10 Thread Brian Wheeler


First off, here's an inconsistancy I found:  In test.pasm

REDO:   eq_i_ic I2, I4, DONE, NEXT

appears.  Shouldn't this be comparing to a constant, not a register?  It
became a little obvious when I made a few changes to the
assembler/disassembler to give more details about the data (and to allow
shortcuts like add I1,I2,I3 to go to add_i I1,I2,I3, etc)

There's 3 pieces:  opcode_table, disassemble.pl and assemble.pl

The opcode_table patch changes the argument encoding to use these terms:
#   i   Integer constant
#   I   Integer register
#   n   Numeric constant
#   N   Numeric register
#   s   String constant?
#   S   String register
#   D   Destination 

The opcodes definitions were changed accordingly.


Disassemble.pl takes the new definitions and prints things out a little
prettier (test.pbc):
 time_i I1
0008 set_i_ic I2 0
0014 set_i_ic I3 1
0020 set_i_ic I4 1000
002c eq_i_ic I2 4 0058 0040
0040 add_i I2 I2 I3
0050 branch_ic 002c
0058 time_i I5
0060 print_i I1
0068 print_i I5
0070 print_i I2
0078 sub_i I2 I5 I1
0088 print_i I2
0090 set_i_ic I1 3
009c mul_i I4 I4 I1
00ac iton_n_i N1 I4
00b8 iton_n_i N2 I2
00c4 print_i I4
00cc print_n N1
00d4 print_i I2
00dc print_n N2
00e4 div_n N1 N1 N2
00f4 print_n N1

It also skips the magic number, and skips (but doesn't handle) the
constant data.  String registers aren't handled either...yet

assemble.pl:  this just contains workarounds to the new opcode_table
format.

Brian


Lastly, here's the patch:
Index: assemble.pl
===
RCS file: /home/perlcvs/parrot/assemble.pl,v
retrieving revision 1.5
diff -u -r1.5 assemble.pl
--- assemble.pl 2001/09/10 17:30:29 1.5
+++ assemble.pl 2001/09/10 22:04:29
@@ -10,6 +10,15 @@
 %pack_type = (i = 'l',
  n = 'd',
  );
+my %real_type=('i'='i',
+  'n'='n',
+  'N'='i',
+  'I'='i',
+  'S'='i',
+  's'='i',
+  'D'='i');
+
+
 my $sizeof_packi = length(pack($pack_type{i},1024));
 
 
@@ -20,6 +29,7 @@
 s/^\s+//;
 next unless $_;
 my ($code, $name, $args, @types) = split /\s+/, $_;
+@types=map { $_ = $real_type{$_}} @types;
 $opcodes{$name} = {CODE = $code,
   ARGS = $args,
   TYPES = [@types]
Index: disassemble.pl
===
RCS file: /home/perlcvs/parrot/disassemble.pl,v
retrieving revision 1.1
diff -u -r1.1 disassemble.pl
--- disassemble.pl  2001/08/29 12:07:02 1.1
+++ disassemble.pl  2001/09/10 22:04:29
@@ -7,10 +7,20 @@
 
 my %unpack_type;
 %unpack_type = (i = 'l',
+   I = 'l',
n = 'd',
+   N = 'l',
+   D = 'l',
+   S = 'l',
+   s = 'l',
);
 my %unpack_size = (i = 4,
   n = 8,
+  I = 4,
+  N = 4,
+  D = 4,
+  S = 4,
+  s = 4,
   );
 
 open OPCODES, opcode_table or die Can't get opcode table, $!/$^E;
@@ -28,16 +38,34 @@
   }
 }
 
+
 $/ = \4;
+my $magic=;
+my $constants=;
+ $constants=;
+
+my $offset=0;
 while () {
 $code = unpack 'l', $_;
 $args = $opcodes[$code]{ARGS};
-print $opcodes[$code]{NAME};
+my $op_offset=$offset;
+print sprintf(%08x,$offset), ,$opcodes[$code]{NAME};
+$offset+=4;
+
 if ($args) {
foreach (1..$args) {
local $/ = \$unpack_size{$opcodes[$code]{TYPES}[$_-1]};
$data = ;
-   print  , unpack $unpack_type{$opcodes[$code]{TYPES}[$_-1]},
$data;+ $offset+=$ {$/ };
+   if($opcodes[$code]{TYPES}[$_-1] eq N ||
+  $opcodes[$code]{TYPES}[$_-1] eq I) {
+   print  ,$opcodes[$code]{TYPES}[$_-1],unpack
$unpack_type{$opcodes[$code]{TYPES}[$_-1]}, $data;
+   } elsif($opcodes[$code]{TYPES}[$_-1] eq D) {
+   # handle destination
+   print 
,sprintf(%08x,$op_offset+unpack($unpack_type{$opcodes[$code]{TYPES}[$_-1]},$data)*4);
+   } else {
+   print  , unpack $unpack_type{$opcodes[$code]{TYPES}[$_-1]}, $data;
+   }
}
 }
 print \n;
Index: opcode_table
===
RCS file: /home/perlcvs/parrot/opcode_table,v
retrieving revision 1.5
diff -u -r1.5 opcode_table
--- opcode_table2001/09/10 15:48:36 1.5
+++ opcode_table2001/09/10 22:04:30
@@ -9,61 +9,70 @@
 # not the type of the register or anything. So N3 is still an i, since
that
 # 3 specifying the register should be packed as an integer.
 
+# Revised arg types:
+#  i   Integer constant
+#  I   Integer register
+#  n   Numeric constant
+#

Re: Patch to assembler/disassembler + parrot asm

2001-09-10 Thread Brian Wheeler


 
 At 05:23 PM 9/10/2001 -0500, Brian Wheeler wrote:
 First off, here's an inconsistancy I found:  In test.pasm
 
 REDO:   eq_i_ic I2, I4, DONE, NEXT
 
 appears.  Shouldn't this be comparing to a constant, not a register?
 
 Nope, though if I let you in on the actual secret it's help.
 
 That should really be eq_i_ic_ic. (Well, actually there should be only one 
 label, and we fall through otherwise. It's a bug in implementation and 
 assembly, not opcode name... :) The intention is the last _x covers the 
 last arg, the next to last covers the next to last arg, and so on. When we 
 run out, we repeat the innermost type.
 
 The ultimate intention is that you'd write that as a plain:
 
  eq I2, I4, DONE, NEXT
 
 or probably
 
  eq I2, I4, DONE
 
 and either way the assembler would know DONE was a constant and we needed i 
 registers since that was specified, and emit the eq_i_ic opcode.

BUT, I'm more confused now :)

If eq_i_ic is really treated as /eq(_i)+_ic/ then this code still
doesn't work:

eq_i_ic I1,I2,NEXT,DONE

because that'd be like eq_i_i_ic_ic, right?

I assume that opcodes aren't going to have variable arguments at this level,
so there should be a one-to-one mapping between function and opcode, right?

A thought (though gross):  if we restrict mneumonics to not use the underscore,
then anything after _ can be the op signature.

The opcode_table could use these characters for different data types:
integer i
integer constantj
numeric n
numeric constanto
address a
string  s
string constant t

The file could be reorganized as:

set 2   i   j
set 2   i   i
set 2   n   o
set 2   s   t


The perl scripts which create the interfaces (process_opfunc.pl, etc) could
use this information to create 4 opcodes:
set_ij
set_ii
set_no
set_st

When the assembler comes across 'set I1,I2', It knows the set_ii form is
the one to use.  The disassembler can dump it as 'set_ii I1,I2' or
(I suppose) as 'set I1,I2'

Also, doing it this way takes out the special cases for the comparison and
jump ops:  the fixups are known to be done with things that have type 'a'

  


 
 It
 became a little obvious when I made a few changes to the
 assembler/disassembler to give more details about the data (and to allow
 shortcuts like add I1,I2,I3 to go to add_i I1,I2,I3, etc)
 
 
 The opcode_table patch changes the argument encoding to use these terms:
 #   i   Integer constant
 #   I   Integer register
 #   n   Numeric constant
 #   N   Numeric register
 #   s   String constant?
 #   S   String register
 #   D   Destination
 
 I was using a trailing c to note a constant since we're using the opcode 
 name as a C function name, and we're not counting on case-sensitivity in 
 symbols.
 

fair enough... 

 Other than that (well, and Simon has a patch in to the repository to yank 
 out the opcode numbers entirely from opcode_table) it looks keen.
 

I'll take a peek and see what all depends on the opcode_table file...

Brian

   Dan
 
 --it's like this---
 Dan Sugalski  even samurai
 [EMAIL PROTECTED] have teddy bears and even
   teddy bears get drunk

Re: Patch to assembler/disassembler + parrot asm

2001-09-10 Thread Brian Wheeler



another thought...

 
 A thought (though gross):  if we restrict mneumonics to not use the underscore,
 then anything after _ can be the op signature.
 
 The opcode_table could use these characters for different data types:
 integer   i
 integer constant  j
 numeric   n
 numeric constant  o
 address   a
 strings
 string constant   t
 
 The file could be reorganized as:
 
 set   2   i   j
 set   2   i   i
 set   2   n   o
 set   2   s   t
 

what if the table had another column (optional) at the end:

set 2   i   j   set_i

which gave the name of the C function which implemented it.  that way the
assembly ops are independant of the C function names and multiple ops could
map to a single C routine (if needed) 

I've got to know...what's the significance of the magic number? :)

Brian

Re: Patch to assembler/disassembler + parrot asm

2001-09-10 Thread Brian Wheeler


On Mon, 2001-09-10 at 19:54, Dan Sugalski wrote:
 At 07:45 PM 9/10/2001 -0500, Brian Wheeler wrote:
 If eq_i_ic is really treated as /eq(_i)+_ic/ then this code still
 doesn't work:
 
 eq_i_ic I1,I2,NEXT,DONE
 
 because that'd be like eq_i_i_ic_ic, right?
 
 Right. But don't forget, I screwed up the eq op--it ought to have a single 
 destination. :)
 

DOH!  That's the trick :)



 I assume that opcodes aren't going to have variable arguments at this level,
 so there should be a one-to-one mapping between function and opcode, right?
 
 Each opcode number has a single function, yes. The same high-level 
 opcode, for example eq or add, might map to two or more different 'real' 
 opcodes based on the types of the args. There won't be any runtime 
 morphing--it's more The assembler sees the first arg of foo as a numberic 
 register and the second as a constant, so it must be foo_n_nc.
 
 A thought (though gross):  if we restrict mneumonics to not use the 
 underscore,
 then anything after _ can be the op signature.
 
 Too gross. We don't need to go there. :)
 

Just checking.


 Also, doing it this way takes out the special cases for the comparison and
 jump ops:  the fixups are known to be done with things that have type 'a'
 
 The jump ops will be easy to figure--either they'll take a register, a 
 constant number, or a label. We don't allow labels that could be confused 
 with registers. (No I0: anywhere...)
 
 

I've had more thoughts about my first patch.  The case issue isn't an
issue since it only touches the generation tools, not the C code that's
generated (at least, not directly).


It also provides the additional information needed to let the assembler
choose the correct opcode, and the disassembler to dump things nicely :)

I've also fixed up the supporting tools.  As a test case, I rebuilt
test_prog, assembled test.parm, ran it and dissassembled it.  Looks for
for that one, at least :)

Please consider this new patch.

Brian


Index: assemble.pl
===
RCS file: /home/perlcvs/parrot/assemble.pl,v
retrieving revision 1.6
diff -u -r1.6 assemble.pl
--- assemble.pl 2001/09/10 21:26:08 1.6
+++ assemble.pl 2001/09/11 02:02:15
@@ -9,7 +9,16 @@
 my %pack_type;
 %pack_type = (i = 'l',
  n = 'd',
- );
+ );
+
+my %real_type=('i'='i',
+  'n'='n',
+  'N'='i',
+  'I'='i',
+  'S'='i',
+  's'='i',
+  'D'='i');
+
 my $sizeof_packi = length(pack($pack_type{i},1024));
 
 open GUTS, interp_guts.h;
@@ -26,8 +35,11 @@
 s/^\s+//;
 next unless $_;
 my ($name, $args, @types) = split /\s+/, $_;
+my @rtypes=@types;
+@types=map { $_ = $real_type{$_}} @types;
 $opcodes{$name}{ARGS} = $args;
 $opcodes{$name}{TYPES} = [@types];
+$opcodes{$name}{RTYPES}=[@rtypes];
 }
 
 my $pc = 0;
@@ -65,23 +77,17 @@
die wrong arg count--got . scalar @args.  needed  .
$opcodes{$opcode}{ARGS};
 }
 
-$args[0] = fixup($args[0])
-if $opcode eq branch_ic and $args[0] =~ /[a-zA-Z]/;
-
-#if ($opcode eq eq_i_ic or $opcode eq lt_i_ic) {
-if ($opcode =~ /^(eq|ne|lt|le|gt|ge)_i_ic$/) {
-$args[2] = fixup($args[2]) if $args[2] =~ /[a-zA-Z]/;
-$args[3] = fixup($args[3]) if $args[3] =~ /[a-zA-Z]/;
-}
-if ($opcode eq if_i_ic) {
-$args[1] = fixup($args[1]) if $args[1] =~ /[a-zA-Z]/;
-$args[2] = fixup($args[2]) if $args[2] =~ /[a-zA-Z]/;
-}
-
 print pack l, $opcodes{$opcode}{CODE};
 foreach (0..$#args) {
-   $args[$_] =~ s/^[INPS]?(\d+)$/$1/i;
-   my $type = $pack_type{$opcodes{$opcode}{TYPES}[$_]};
+   my($rtype)=$opcodes{$opcode}{RTYPES}[$_];
+   my($type)=$opcodes{$opcode}{TYPES}[$_];
+   if($rtype eq I || $rtype eq N || $rtype eq P || $rtype eq S) {
+   # its a register argument
+   $args[$_]=~s/^[INPS](\d+)$/$1/i;
+   } elsif($rtype eq D) {
+   # a destination
+   $args[$_]=fixup($args[$_]);
+   }
print pack $type, $args[$_];
 }
 $pc += 1+@args;
Index: disassemble.pl
===
RCS file: /home/perlcvs/parrot/disassemble.pl,v
retrieving revision 1.3
diff -u -r1.3 disassemble.pl
--- disassemble.pl  2001/09/10 21:45:33 1.3
+++ disassemble.pl  2001/09/11 02:02:16
@@ -8,14 +8,25 @@
 
 my(%opcodes, @opcodes);
 
-my %unpack_type;
-%unpack_type = (i = 'l',
-   n = 'd',
-   );
+
+my %unpack_type = (i = 'l',
+  I = 'l',
+  n = 'd',
+  N = 'l',
+  D = 'l',
+  S = 'l',
+  s = 'l',
+  );
 my %unpack_size = (i = 4,
   n = 8,
+  I = 4,
+  N = 4,
+  D = 4,
+  S = 4,
+  s = 4,
   );
 
+
 open GUTS

Re: Patch to assembler/disassembler + parrot asm inconsistancies

2001-09-10 Thread Brian Wheeler


On Mon, 2001-09-10 at 20:52, Dan Sugalski wrote:
 At 07:25 PM 9/10/2001 -0400, Bryan C. Warnock wrote:
 I think Dan mentioned this,  but it looks like the suffixes can be derived
 from the args being passed in.  That would greatly simply the assembler to
 just the function names: set, eq, add, branch.
 
 Were there problems with the scheme, is someone working on it, or did it
 fall through the cracks?  (I'm very much in favor of such a change, and will
 pick it up if no one else is working on it.)
 
 No, I dont' think so, and yes, respectively. (Or, rather, we did the easy 
 literal stuff first and planned on smartening up the assembler later. It's 
 in the TODO even... :)
 
 

Hint, Nudge, Wink:  the last patch I sent that hits on the assembler
should make it very easy to add better assembly checking

For example, register checking could be done like this:

if($rtype eq I || $rtype eq N || $rtype eq P || $rtype eq S) {
# its a register argument
if($args[$_]=~m/^[INPS](\d+)$/) {
my($arg_num)=$1;
if($arg_num  32) {
# bad register number
} else {
$args[$_]=$arg_num;
}
} else {
# non-register being used!
}
} ...

Brian

Another Patch...

2001-09-10 Thread Brian Wheeler


This patch (which is pretty big) does:

* Changes the opcode_table file to provide additional information about
the operands.  Case shouldn't be a problem since that data never becomes
a C symbol [this is pretty much as before]

* Padding errors solved:  assemble.pl and bytecode.c were padding the
constants incorrectly.  It should have been 4-(size % 4), not just (size
% 4).  It is now fixed in both places.

* assembler has less special cases, and should be easier to hang error
checking on

* disassembler dumps constant table and the format is a bit prettier,
including register names, etc.  Test2.pbc dumps as this:

# Constants: 1 entries (32 bytes)
# ID  FlagsEncoding Type Size Data
:    000b Hello World
# Code Section
:  set_i_ic I2, 1
000c:  set_i_ic I1, 0
0018:  set_s_sc S1, [string ]
0024:  eq_i_ic  I1, I2, 0060, 0038
0038:  length_s_i   S1, I1
0044:  print_s  S1
004c:  chopn_s_ic   S1, 1
0058:  branch_ic0024
0060:  end


Let me know what you guys think!
Brian

[Crap, there's some wordwrapping below.  Too bad you can plug emacs into
evolution :) ]


Index: assemble.pl
===
RCS file: /home/perlcvs/parrot/assemble.pl,v
retrieving revision 1.6
diff -u -r1.6 assemble.pl
--- assemble.pl 2001/09/10 21:26:08 1.6
+++ assemble.pl 2001/09/11 03:14:32
@@ -9,7 +9,16 @@
 my %pack_type;
 %pack_type = (i = 'l',
  n = 'd',
- );
+ );
+
+my %real_type=('i'='i',
+  'n'='n',
+  'N'='i',
+  'I'='i',
+  'S'='i',
+  's'='i',
+  'D'='i');
+
 my $sizeof_packi = length(pack($pack_type{i},1024));
 
 open GUTS, interp_guts.h;
@@ -26,8 +35,11 @@
 s/^\s+//;
 next unless $_;
 my ($name, $args, @types) = split /\s+/, $_;
+my @rtypes=@types;
+@types=map { $_ = $real_type{$_}} @types;
 $opcodes{$name}{ARGS} = $args;
 $opcodes{$name}{TYPES} = [@types];
+$opcodes{$name}{RTYPES}=[@rtypes];
 }
 
 my $pc = 0;
@@ -65,23 +77,17 @@
die wrong arg count--got . scalar @args.  needed  .
$opcodes{$opcode}{ARGS};
 }
 
-$args[0] = fixup($args[0])
-if $opcode eq branch_ic and $args[0] =~ /[a-zA-Z]/;
-
-#if ($opcode eq eq_i_ic or $opcode eq lt_i_ic) {
-if ($opcode =~ /^(eq|ne|lt|le|gt|ge)_i_ic$/) {
-$args[2] = fixup($args[2]) if $args[2] =~ /[a-zA-Z]/;
-$args[3] = fixup($args[3]) if $args[3] =~ /[a-zA-Z]/;
-}
-if ($opcode eq if_i_ic) {
-$args[1] = fixup($args[1]) if $args[1] =~ /[a-zA-Z]/;
-$args[2] = fixup($args[2]) if $args[2] =~ /[a-zA-Z]/;
-}
-
 print pack l, $opcodes{$opcode}{CODE};
 foreach (0..$#args) {
-   $args[$_] =~ s/^[INPS]?(\d+)$/$1/i;
-   my $type = $pack_type{$opcodes{$opcode}{TYPES}[$_]};
+   my($rtype)=$opcodes{$opcode}{RTYPES}[$_];
+   my($type)=$opcodes{$opcode}{TYPES}[$_];
+   if($rtype eq I || $rtype eq N || $rtype eq P || $rtype eq S) {
+   # its a register argument
+   $args[$_]=~s/^[INPS](\d+)$/$1/i;
+   } elsif($rtype eq D) {
+   # a destination
+   $args[$_]=fixup($args[$_]);
+   }
print pack $type, $args[$_];
 }
 $pc += 1+@args;
@@ -112,7 +118,10 @@
 for (@constants) {
 $size += 4*$sizeof_packi;
 $size += length($_);
-$size += length($_) % $sizeof_packi; # Padding
+   my($pad)=length($_) % $sizeof_packi;
+   if($pad) {
+   $size+=$sizeof_packi-$pad;
+   }
 }
 
 $size += $sizeof_packi if @constants; # That's for the number of
constants
@@ -127,6 +136,9 @@
 print pack($pack_type{i},0) x 3; # Flags, encoding, type
 print pack($pack_type{i},length($_)); # Strlen followed by that
many bytes.
 print $_;
-print \0 x (length($_) % $sizeof_packi); # Padding;
+   my $pad=(length($_) % $sizeof_packi);
+   if($pad) {
+   print \0 x ($sizeof_packi-(length($_) % $sizeof_packi)); #
Padding;
+   }
 }
 }
Index: bytecode.c
===
RCS file: /home/perlcvs/parrot/bytecode.c,v
retrieving revision 1.4
diff -u -r1.4 bytecode.c
--- bytecode.c  2001/09/10 21:47:26 1.4
+++ bytecode.c  2001/09/11 03:14:33
@@ -79,6 +79,7 @@
 IV encoding = GRAB_IV(program_code);
 IV type = GRAB_IV(program_code);
 IV buflen   = GRAB_IV(program_code);
+   int pad;
 
 len -= 4 * sizeof(IV);
 
@@ -87,9 +88,11 @@
 len -= buflen;
 
 /* Padding */
-if (buflen % sizeof(IV)) {
-len -= buflen % sizeof(IV);
-(char*)*program_code += buflen % sizeof(IV);
+   pad=buflen % sizeof(IV);
+   if(pad) {
+ pad=sizeof(IV)-pad;
+ len -= pad;
+ (char*)*program_code += pad;
 }
 num--;
 if (len  0 ||

parrot question

2001-09-07 Thread Brian Wheeler


While waiting for Parrot (dammit, I took the wrong week off), I've been
scanning the various documents and samples which have been floating
around on the list.  Is there a document describing Parrot syntax yet?
Or is that a will be released on monday thing as well?


Brian Wheeler
[EMAIL PROTECTED]

74 matches

Mail list logo