In message <[EMAIL PROTECTED]>
          Dan Sugalski <[EMAIL PROTECTED]> wrote:

> utf8 and utf16 are both variable length encodings for space reasons.
> There's not much reason to space-compact something then expand the heck out
> of it. On the other hand, I'd really, *really* rather not have Unicode
> constants in anything other than UTF-32, so I'd as soon we chopped out the
> utf-8 and utf-16 constant support from this.
>
> A should be the prefix for US-ASCII characters.
> U should be the prefix for Unicode characters
> N should be the prefix for the native character set (and the default)
>
> Beyond that I'm not sure what, if anything, we should accommodate in the
> assembler.

Attached is a patch to drop the U8, U16 and U32 prefixes and
add U and N prefixes.

I haven't added the A prefix because I'm still not clear what
encoding those are supposed to map to. I can understand the
following mappings:

  N => enc_native
  U => enc_utf32

but what is A supposed to map to exactly? Or is the assembler
supposed to mangle an A string into an N or U string and then
put it in the bytecode in one of those formats?
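
To make the mapping concrete, here is a minimal standalone sketch of how
the patched pattern picks up the prefixes. The %encodings table and the
regex are taken from the patch below; describe_constant is a hypothetical
stand-in for constantize_string (it only reports the encoding number and
does not build a constant table), so treat this as an illustration rather
than what the assembler actually emits.

```perl
use strict;
use warnings;

# Prefix-to-encoding table as in the patch: no prefix and N both map to
# native (0); U maps to UTF-32 (3).
my %encodings = ('' => 0, 'N' => 0, 'U' => 3);

# Hypothetical stand-in for constantize_string: reports which encoding
# number a string constant would be tagged with.
sub describe_constant {
    my ($string, $prefix) = @_;
    $prefix = '' unless defined $prefix;   # no prefix means native
    return "[enc=$encodings{$prefix}:$string]";
}

sub replace_string_constants {
    my $code = shift;
    # Same pattern as the patched Assembler.pm: an optional N or U prefix,
    # then a double-quoted string that may contain backslash escapes.
    $code =~ s/([NU])?\"([^\\\"]*(?:\\.[^\\\"]*)*)\"/describe_constant($2,$1)/eg;
    return $code;
}

print replace_string_constants('set S0, "plain"'), "\n";
print replace_string_constants('set S1, N"native"'), "\n";
print replace_string_constants('set S2, U"unicode"'), "\n";
```

An A prefix would need both a new entry in %encodings and an extra
letter in the character class, which is why the question of what A maps
to has to be settled first.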

Tom

-- 
Tom Hughes ([EMAIL PROTECTED])
http://www.compton.nu/
Index: Assembler.pm
===================================================================
RCS file: /home/perlcvs/parrot/Parrot/Assembler.pm,v
retrieving revision 1.8
diff -u -w -r1.8 Assembler.pm
--- Assembler.pm        2001/10/09 02:45:36     1.8
+++ Assembler.pm        2001/10/09 21:25:28
@@ -279,7 +279,7 @@
 
 =cut
 
-my %encodings=('' => 0, 'U8' => 1, 'U16' => 2, 'U32' => 3);
+my %encodings=('' => 0, 'N' => 0, 'U' => 3);
 
 my %opcodes = Parrot::Opcode::read_ops( -f "../opcode_table" ? "../opcode_table" : "opcode_table" );
 
@@ -662,7 +662,7 @@
 
 sub replace_string_constants {
   my $code = shift;
-  $code =~ s/(U(?:8|16|32))?\"([^\\\"]*(?:\\.[^\\\"]*)*)\"/constantize_string($2,$1)/eg;
+  $code =~ s/([NU])?\"([^\\\"]*(?:\\.[^\\\"]*)*)\"/constantize_string($2,$1)/eg;
   return $code;
 }
 
