>
> So it seems to me that the "obvious" way to go is to have all bit-string
> operations first convert to raw bytes (possibly throwing an exception)
> and then proceed to do their work.
If these conversions croak when there are code points beyond \x{ff}, I'm
fine with it. But trying to mix in \x{100} or higher just leads to silly
discontinuities (basically we would need to decide on a word width, and
I think that would be a silly move).
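
To put numbers on that discontinuity, here's a quick Perl 5 sketch (the
16- and 32-bit pack templates are arbitrary choices, which is exactly
the problem):

    use strict;
    use warnings;

    # Two arbitrary word widths for the code point 0x100:
    my $w16 = pack 'n', 0x100;   # "\x01\x00"          16-bit big-endian
    my $w32 = pack 'N', 0x100;   # "\x00\x00\x01\x00"  32-bit big-endian

    # The "same" bit-or gives different bytes depending on the width:
    print unpack('H*', $w16 | "\x02"), "\n";   # 0300
    print unpack('H*', $w32 | "\x02"), "\n";   # 02000100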
> This means that UTF-8 strings will be handled just fine, and (as I
> understand it) some subset of Unicode-at-large will be handled as well.
Please don't mix up encodings and code points. That strings might be
serialized or stored as UTF-8 should have no bearing on bitops.
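
A quick Perl 5 sketch of the distinction (Encode is core): the single
code point U+00E9 serializes to two UTF-8 bytes, so a bitop over the
serialized form answers a different question than one over the code
point:

    use strict;
    use warnings;
    use Encode qw(encode);

    my $str   = "\x{e9}";                # one code point, U+00E9
    my $bytes = encode('UTF-8', $str);   # two bytes, 0xc3 0xa9

    print length($str),   "\n";                  # 1 (code points)
    print length($bytes), "\n";                  # 2 (bytes)
    print unpack('H*', $bytes | "\x10"), "\n";   # d3a9, not f9 (= 0xe9|0x10)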
> In other words, the burden goes on the conversion functions, not on the
> bit ops.
>
> It's not that it's going to be meaningful in the general case, but if
I'd rather have meaningful results.
> you have code like:
>
> sub foo() { return "\x01"+|"\x02" }
Please consider what happens when the operands have code points beyond 0xff.
> I would expect to get the bit-string "\x03" back even though strings
> may default to Unicode in Perl 6.
Of course. But I would expect a horrible flaming death for
"\x{100}" +| "\x02".
> You could put this on the shoulders of the client language (by saying
> that the operands must be pre-converted), but that seems contrary
> to Parrot's usual MO.
>
> Let me know. I'm happy to do it either way, and I'll look at modifying
> the other bit-string operators if they don't conform to the decision.
>
--
Jarkko Hietaniemi <[EMAIL PROTECTED]> http://www.iki.fi/jhi/ "There is this special
biologist word we use for 'stable'. It is 'dead'." -- Jack Cohen