On Mon, Apr 5, 2010, 2:56 PM, "Doug Cacialli"
<doug.cacia...@gmail.com> scribbled:

> I sincerely appreciate the tips on improving my code; I implement (or
> at least take strong note) of all the suggestions I receive.  In the
> code I posted, however, I'm primarily interested in learning if
> there's a way to avoid opening the file to determine the character
> encoding, and then opening it again with the character encoding
> specified.  In the second block of code that I originally posted, I
> only open the file once but I'm consistently encountering the "UTF16:
> Unrecognised BOM" error that I mentioned.

I don't work with UTF-encoded files in Perl, so I am not going to be able to
answer your question definitively.

In general, the operating system does not maintain the encoding of text
files (there may be exceptions I don't know about). The information you seek
is in the file itself. There is nothing wrong with opening a file twice, if
that is what it takes. Data read from disk, including directory information,
is usually cached, so the second open should occur very quickly, and may not
require access to the drive itself.
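
That said, if the files begin with a byte-order mark (BOM), it should be possible to get by with a single open: read the first few bytes in raw mode, work out the encoding from them, then switch the same handle to the matching decoding layer. Here is a minimal sketch of that idea. The sub name open_with_bom_detection and the UTF-8 fallback are my own assumptions, not anything from your code, and it will not help with files that have no BOM at all.

```perl
use strict;
use warnings;

# Hypothetical helper: open a file once, sniff its byte-order mark (BOM),
# then switch the same handle to the matching decoding layer.
sub open_with_bom_detection {
    my ($path, $default) = @_;
    $default //= 'UTF-8';    # assumption: fallback when no BOM is present

    open my $fh, '<:raw', $path or die "Cannot open $path: $!";

    # Read up to four bytes; short files simply yield fewer bytes.
    my $head = '';
    read($fh, $head, 4) // die "Cannot read $path: $!";

    # Check the UTF-32 marks before the UTF-16 ones, because the UTF-32LE
    # BOM begins with the same two bytes as the UTF-16LE BOM.
    my @boms = (
        [ "\x00\x00\xFE\xFF", 'UTF-32BE' ],
        [ "\xFF\xFE\x00\x00", 'UTF-32LE' ],
        [ "\xEF\xBB\xBF",     'UTF-8'    ],
        [ "\xFE\xFF",         'UTF-16BE' ],
        [ "\xFF\xFE",         'UTF-16LE' ],
    );

    my ($encoding, $skip) = ($default, 0);
    for my $pair (@boms) {
        my ($bom, $name) = @$pair;
        if (index($head, $bom) == 0) {
            ($encoding, $skip) = ($name, length $bom);
            last;
        }
    }

    # Rewind to just past the BOM (or to the start if none was found),
    # then decode everything that follows with the chosen encoding.
    seek($fh, $skip, 0) or die "Cannot seek in $path: $!";
    binmode($fh, ":encoding($encoding)") or die "binmode failed on $path: $!";

    return ($fh, $encoding);
}
```

You would call it as my ($fh, $encoding) = open_with_bom_detection('some-file.txt'); and then read from $fh as usual. Because the sub names the byte order explicitly (UTF-16LE or UTF-16BE) and skips the BOM itself, the decoder never has to look for one. One known ambiguity: a UTF-16LE file whose first character is U+0000 looks the same as a UTF-32LE BOM.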

If you really want to know how a file is encoded without opening it, you can
maintain the information outside the file itself. One suggestion would
be to always use the same encoding. Another suggestion would be to use a
file-naming convention: e.g., include '-utf16' in the file name if that is
the encoding. You could also create an index file that specifies the
encoding for each file.
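
The file-naming and index-file ideas could be combined along these lines. This is only a sketch: the sub name encoding_from_name, the hash used as the index, the exact tag format ('-utf8', '-utf16', '-utf16le' and so on) and the UTF-8 default are all assumptions made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: decide a file's encoding from information kept
# outside the file. The index hash and the tag format are illustrative.
sub encoding_from_name {
    my ($path, $index, $default) = @_;
    $index   //= {};         # optional hash ref: file name => encoding
    $default //= 'UTF-8';    # assumption: used when nothing else matches

    # 1. An explicit index entry wins.
    return $index->{$path} if exists $index->{$path};

    # 2. Otherwise look for a tag such as '-utf16' or '-utf16le' in the
    #    name, followed by '.', '-' or the end of the name.
    if ($path =~ /-utf(8|16|32)(le|be)?(?=[.-]|$)/i) {
        my $name = "UTF-$1";
        $name .= uc $2 if defined $2;
        return $name;
    }

    # 3. Fall back to the default.
    return $default;
}
```

The result can go straight into the open call, for example open my $fh, "<:encoding($enc)", $path. Keep in mind that a bare 'UTF-16' (no LE or BE) still expects the file to start with a BOM, so a tag that spells out the byte order is the safer convention.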

All of these involve extra work on your part, so avoiding opening the file
twice may not be worth it.

> Does anyone have any ideas how I can make the second block of code
> work?  Or otherwise accomplish the task without opening the .txt file
> twice?

Even if I could solve your problem, I don't have your "second block of code"
on this system. It is always a good idea to include a short program that
demonstrates the problem you are having with each post.

Maybe somebody smarter than me has better suggestions.

Good luck.


