Hi Greg,

On Mar 27, 2009, at 6:47 AM, Greg Aiken wrote:
> the problem here is that the ‘msinfo.txt’ file is not written in  
> (single byte per character, ascii) format.  instead the first two  
> bytes of the file happen to be (hexFF)(hexFE).  Beyond the first two  
> bytes, each human readable ascii character is represented with TWO  
> BYTES, (hex-ascii character value)(hex00)
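(hexFF)(hexFE) is the Byte-Order Mark
(http://en.wikipedia.org/wiki/Byte-order_mark), so yes, definitely
Unicode, and - if I'm reading the Wikipedia article correctly -
definitely either UTF-16 or UTF-32, in little-endian byte order. Two
bytes per character, as you describe, points to UTF-16; UTF-32 would
use four.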
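
If you'd rather check the BOM from Perl than by eye, something along
these lines should do it. This is only a sketch: the sub names
guess_encoding_from_bom and read_head are made up for illustration, and
the byte patterns are the ones from the table in the Wikipedia article.

```perl
use strict;
use warnings;

# Map the leading bytes of a file to the encoding its BOM implies.
# The four-byte BOMs are tested first, because the UTF-32LE BOM
# (FF FE 00 00) begins with the UTF-16LE BOM (FF FE).
sub guess_encoding_from_bom {
    my ($head) = @_;    # first four (or fewer) raw bytes of the file
    return 'UTF-32LE' if $head =~ /\A\xFF\xFE\x00\x00/;
    return 'UTF-32BE' if $head =~ /\A\x00\x00\xFE\xFF/;
    return 'UTF-8'    if $head =~ /\A\xEF\xBB\xBF/;
    return 'UTF-16LE' if $head =~ /\A\xFF\xFE/;
    return 'UTF-16BE' if $head =~ /\A\xFE\xFF/;
    return;             # no BOM recognised
}

# Hypothetical helper: fetch the first four bytes of a file in binary mode.
sub read_head {
    my ($path) = @_;
    open(my $fh, '<:raw', $path) or die "Could not open '$path': $!";
    my $head = '';
    read($fh, $head, 4);
    close($fh);
    return $head;
}

# Bytes like the ones Greg describes: the BOM, then "A" stored as 41 00.
print guess_encoding_from_bom("\xFF\xFE\x41\x00"), "\n";    # UTF-16LE
```

For a real file you would call guess_encoding_from_bom(read_head('msinfo.txt')).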

> in addition, if anyone knows how to modify the following block so  
> that I can effectively, read the records of this file, and convert  
> the read record into ‘plain old ascii’ encoding – I would be most  
> appreciative.
>
> open (IN, ‘infile.txt’);
> while ($rec = <IN>) {
>             convert_$rec_from_its_current_encoding? _to_simple_ascii_encoding; <<<<<<<<<< the magic code would go here
>                         print $rec;
> }

Okay, here's my understanding of what's going on: unless you ask for an
encoding layer, Perl 5.8 and above reads the file as raw bytes and does
no decoding for you. The file you're trying to open appears to be in
UTF-16 or UTF-32 (you can use the table in the Wikipedia article above
to figure out which one it is). Searching at http://perldoc.perl.org/
brought me to http://perldoc.perl.org/Encode/Unicode.html, which seems
to be Perl's way of handling Unicode which isn't UTF-8. Since it's part
of the Encode module, you should be able to use:
        open(IN, '<:encoding(UTF-16)', 'infile.txt')
            or die "Could not open 'infile.txt': $!";
to tell Perl to translate that file from UTF-16 into its internal
string format while reading (swap in UTF-32 if that is what the BOM
says). Similarly, to write out without changing the file's
UTF-16/32ishness, you can use:
        open(OUT, '>:encoding(UTF-16)', 'outfile.txt')
            or die "Could not open 'outfile.txt' for writing: $!";
so Perl converts its internal strings into UTF-16 on output.
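
Putting that together with your loop, here is a rough sketch of the
"magic code". It assumes the file really is UTF-16 with a BOM; the sub
name utf16_file_to_ascii_lines and the sample file name are invented
for the example, and I've used lexical filehandles rather than IN/OUT.
The :raw layer is there so that, on Windows, no byte-level CRLF
translation happens underneath the decoder.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Decode a UTF-16 file (BOM required) line by line and return the
# lines as plain ASCII strings, without their line endings.
sub utf16_file_to_ascii_lines {
    my ($path) = @_;
    open(my $in, '<:raw:encoding(UTF-16)', $path)
        or die "Could not open '$path': $!";
    my @lines;
    while (my $rec = <$in>) {
        $rec =~ s/\r?\n\z//;                   # strip CRLF or LF
        push @lines, encode('ascii', $rec);    # non-ASCII becomes '?'
    }
    close($in);
    return @lines;
}

# Build a small sample file shaped like the one described above:
# FF FE, then two bytes per character, with CRLF line endings.
my $sample = 'sample_utf16.txt';
open(my $out, '>:raw', $sample) or die "Could not open '$sample': $!";
print {$out} "\xFF\xFE",
    encode('UTF-16LE', "OS Name\tWindows\r\nCaf\x{e9}\r\n");
close($out);

print "$_\n" for utf16_file_to_ascii_lines($sample);
unlink $sample;
```

Characters with no ASCII equivalent come out as '?' here, because that
is what encode() substitutes by default; if you would rather fail
loudly, Encode's CHECK argument (e.g. Encode::FB_CROAK) is the place
to look.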

The Perl Cookbook backs me up on this [1] :-).

Once you've figured this out, let us know how you did it - I think  
it'll make a nice page for the Perl Win32 wiki (http://win32.perl.org/).

cheers,
Gaurav

[1] http://books.google.com/books?id=IzdJIax6J5oC&pg=PA335&lpg=PA335&dq=perl+opening+UTF-32&source=bl&ots=z6zl7q9efS&sig=HdQeMKL8NHjc5pi6gE5jAonqdCw&hl=en&ei=dEHMSeyOEZCw6wPtodCbBw&sa=X&oi=book_result&resnum=7&ct=result
_______________________________________________
Perl-Win32-Users mailing list
Perl-Win32-Users@listserv.ActiveState.com
To unsubscribe: http://listserv.ActiveState.com/mailman/mysubs
