I'm sorry, John. I was talking figuratively. I didn't mean real bullets.

How come Perl sees "C2 A0" whenever HexEdit sees "CA" and vice versa? I don't care what kind of characters we are talking about here. To paraphrase Gertrude Stein, "a byte is a byte is a byte." At least that's what I thought until now.

Regards,

Vic


At 5:33 PM +0000 1/10/04, John Delacour wrote:
At 11:22 am -0500 10/1/04, Vic Norton wrote:

What is going on here? HexEdit sees one byte for each bullet and perl
sees two. I thought hex stuff was unambiguous, but, as a mathematician,
I am pretty certain that 1 is not equal to 2.

Perl talks UTF-8. The bullet in utf-8 is chr (8226) "\x{2022}"
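
A minimal sketch of why the byte counts differ, assuming the file was saved as MacRoman and that the core Encode module (shipped with Perl 5.8) is available. The helper name macroman_to_utf8_hex is made up for this illustration. It decodes one MacRoman byte to a character and then shows the bytes that character occupies in UTF-8.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: take a single MacRoman byte and return the
# hex of the UTF-8 bytes that represent the same character.
sub macroman_to_utf8_hex {
    my ($byte) = @_;
    my $char = decode('MacRoman', $byte);   # bytes -> Perl character
    my $utf8 = encode('UTF-8', $char);      # character -> UTF-8 bytes
    return join ' ', map { sprintf '%02X', ord } split //, $utf8;
}

# MacRoman 0xA5 is the bullet (U+2022); 0xCA is the no-break space (U+00A0).
for my $byte ("\xA5", "\xCA") {
    my $char = decode('MacRoman', $byte);
    printf "MacRoman %s = U+%04X = UTF-8 %s\n",
        uc(unpack('H*', $byte)), ord($char), macroman_to_utf8_hex($byte);
}
# prints:
# MacRoman A5 = U+2022 = UTF-8 E2 80 A2
# MacRoman CA = U+00A0 = UTF-8 C2 A0
```

So HexEdit and perl can both be right: the same character is one byte in MacRoman and two or three in UTF-8, and "CA" versus "C2 A0" is the no-break space in each encoding.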


>>>> perldoc -X encoding | more


TMTOWTDI but it sounds as though you'd like to work as though Unicode didn't exist and something like this might be simplest.



binmode(STDOUT => ':encoding(MacRoman)');
my $display_in_dumb_editor = 1;
my $f = '/tmp/bullet.txt';
open F, ">$f";
print F "Here's a bullet •\r";
`open -a 'simpletext' $f` if $display_in_dumb_editor;
close F;
open F, $f;
for (<F>) { /•/ and print "Got one !" or print " :-< " }

PS. For anyone rash enough, like me, to have installed 5.8.3 and having problems finding ConfigLocal.pm, this will solve the problem:

enc2xs -C


JD



