On Jan 25, 2008 10:06 AM, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
snip
> Great! both worked. The thing I still don't understand is that in the
> file the BOM is FFFE not FEFF
snip

This is because the file is little-endian; if it were a big-endian file
the bytes would be FE FF. The character is the same, but the order of
its bytes changes with the endianness of the file. The BOM isn't a
marker whose value declares the file to be one endianness or the other.
It is a single character, known in advance, and the order in which its
bytes appear tells you which endianness the file uses.

snip
> so I have already tried to use s/
> ^x{FFFE}//; with no success but your feedback worked with the s/
> ^{FEFF}//; it is in reverse order for some reason.
snip

Perl uses the Unicode character number for "\x{}", so ZERO WIDTH
NO-BREAK SPACE is "\x{FEFF}" even if it is written to the file as the
little-endian bytes FF FE. Avoid confusing the encoding of Unicode with
Unicode itself. For instance, the UTF-8 encoding of "\x{FEFF}" is
EF BB BF.

snip
> Now I need to read
> further into "zero-width no-break space", not sure that I understand
> why it is called that and not BOM. Dealing with unicode at the moment
> is over my head a bit so thanks very much for the fix to what was a
> simple change. Off to find more material to read about this subject
> matter, thanks again!
snip

From http://en.wikipedia.org/wiki/Byte_Order_Mark:

    In most character encodings the BOM is a pattern which is unlikely
    to be seen in other contexts (it would usually look like a sequence
    of obscure control codes). If a BOM is misinterpreted as an actual
    character within Unicode text then it will generally be invisible
    due to the fact it is a zero-width no-break space. Use of the
    U+FEFF character for non-BOM purposes has been deprecated in
    Unicode 3.2 (which provides an alternative, U+2060, for those other
    purposes), allowing U+FEFF to be used solely with the semantic of
    BOM.

Also, there is a nice chart here:
http://www.websina.com/bugzero/kb/unicode-bom.html
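
If it helps to see the character-versus-bytes distinction in action, here is a minimal sketch, assuming the core Encode module. The helper names bom_hex and strip_bom are made up for this example, and the sample bytes are invented (a BOM followed by "hi" in UTF-16LE), not taken from your file.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical helper: encode the single character U+FEFF in the given
# encoding and return its bytes as space-separated uppercase hex.
sub bom_hex {
    my ($encoding) = @_;
    my $bytes = encode($encoding, "\x{FEFF}");
    return uc(join(' ', unpack('(H2)*', $bytes)));
}

# Hypothetical helper: remove a leading U+FEFF from already-decoded text.
sub strip_bom {
    my ($text) = @_;
    $text =~ s/^\x{FEFF}//;
    return $text;
}

# Same character, three different byte sequences.
for my $encoding (qw(UTF-16LE UTF-16BE UTF-8)) {
    printf "%-8s %s\n", $encoding, bom_hex($encoding);
}

# Invented sample data: the raw bytes FF FE followed by "hi" in UTF-16LE.
my $raw  = "\xFF\xFE" . "h\x00i\x00";

# Decoding as UTF-16LE turns the bytes FF FE into the character U+FEFF,
# which is why the regex has to say \x{FEFF} and not \x{FFFE}.
my $text = decode('UTF-16LE', $raw);
print strip_bom($text), "\n";
```

The loop prints FF FE, FE FF and EF BB BF for the three encodings, and the last line prints "hi". The regex only ever sees decoded characters, so the byte order on disk never shows up in it.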