On 12/17/2010 3:25 PM, Mike Cowlishaw wrote:
>
>> I have a Rexx program that merges several small files onto
>> one large one. As it turned out a few of the small files were
>> prefixed with a UTF8 BOM, |0xEFBBBF|. Should the BOM have
>> been recognized and discarded?
> How could Rexx (or any other processor) decide that some particular
> prefix/content/suffix of a file is worthless and should be discarded?
>
> ("darn it, this file ends in 'ILY'; delete that!").
Rexx would handle it as any other text processor does: open the file
and read the first three or four bytes. If no BOM is present, reposition
to the beginning; otherwise position to the first character after the
BOM.
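
As a rough sketch of that check (written in Python rather than Rexx,
purely for illustration; the function name open_skipping_bom is
hypothetical and only the three-byte UTF-8 BOM is handled):

```python
import codecs

def open_skipping_bom(path):
    """Open path in binary mode, positioned just past a UTF-8 BOM if present."""
    f = open(path, "rb")
    head = f.read(3)               # the UTF-8 BOM is exactly three bytes: EF BB BF
    if head != codecs.BOM_UTF8:
        f.seek(0)                  # no BOM: reposition to the beginning
    return f                       # caller reads from here and closes the file
```

A merge loop could then copy everything read from the returned file
object into the output file, so no BOM ends up in the middle of the
merged result.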

I realize that Rexx cannot handle wide characters, that use of the
UTF-8 BOM is discouraged, and that on *ix systems at least it can lead
to problems with some apps. But the use of UTF-8 is not forbidden. So
when processing text files, it seems to me that a BOM should be checked
for, even if it is then ignored, or an error issued for an unsupported
encoding. For UTF-8 I would ignore the BOM and process the file as
ASCII.
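
One way that policy might look, again as a hedged Python sketch rather
than anything Rexx does today: a UTF-8 BOM is dropped silently, while a
UTF-16 or UTF-32 BOM raises an error. The function name
strip_or_reject_bom and the error text are assumptions made up for this
example.

```python
import codecs

# Longer BOMs are listed first: the UTF-32 LE BOM (FF FE 00 00) begins
# with the same two bytes as the UTF-16 LE BOM (FF FE).
_UNSUPPORTED_BOMS = [
    (codecs.BOM_UTF32_LE, "UTF-32 LE"),
    (codecs.BOM_UTF32_BE, "UTF-32 BE"),
    (codecs.BOM_UTF16_LE, "UTF-16 LE"),
    (codecs.BOM_UTF16_BE, "UTF-16 BE"),
]

def strip_or_reject_bom(data):
    """Return data without a leading UTF-8 BOM; reject UTF-16/32 BOMs."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]   # ignore the BOM, keep the rest
    for bom, name in _UNSUPPORTED_BOMS:
        if data.startswith(bom):
            raise ValueError("unsupported encoding: " + name)
    return data                               # no BOM at all: pass through
```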


James Johnson



_______________________________________________
Oorexx-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/oorexx-devel
