[
https://issues.apache.org/jira/browse/IO-341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13430967#comment-13430967
]
Yaniv Kunda edited comment on IO-341 at 8/8/12 9:23 AM:
--------------------------------------------------------
You might know the charset in advance, but if you're not using BOMInputStream
or the like (e.g. you have a Reader or simply a String passed to you), you
can't know whether a BOM was present unless you check the first char and
compare it to '\uFEFF':
{code:java}
public String stripBom(@Nonnull String s) {
    return s.isEmpty() || s.charAt(0) != ByteOrderMark.BOM_CHAR
            ? s
            : s.substring(1);
}
{code}
I think BOM_CHAR is the better name: ByteOrderMark already means BOM, so
ByteOrderMark.BOM would read as redundantly as "salsa sauce".
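For the Reader case mentioned above, the same first-char check could be sketched roughly as follows. This is only an illustration: the BomStripping class and its local BOM_CHAR are hypothetical stand-ins for the constant proposed in this issue, and PushbackReader is just one possible way to peek at the first char.

```java
import java.io.IOException;
import java.io.PushbackReader;
import java.io.Reader;

// Hypothetical helper, not part of Commons IO.
final class BomStripping {

    // Local stand-in for the proposed ByteOrderMark.BOM_CHAR (U+FEFF).
    static final char BOM_CHAR = '\uFEFF';

    // Returns s without a leading BOM character, or s unchanged if there is none.
    static String stripBom(String s) {
        return !s.isEmpty() && s.charAt(0) == BOM_CHAR ? s.substring(1) : s;
    }

    // Wraps the reader so that a leading BOM character, if present, is consumed.
    static Reader skipBom(Reader in) throws IOException {
        PushbackReader pushback = new PushbackReader(in, 1);
        int first = pushback.read();
        // Push back anything that is a real char but not a BOM.
        if (first != -1 && first != BOM_CHAR) {
            pushback.unread(first);
        }
        return pushback;
    }
}
```

The reader returned by skipBom(...) should be used in place of the original one, since the first char may already have been read from the underlying stream.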
> A constant for holding the BOM character (U+FEFF)
> --------------------------------------------------
>
> Key: IO-341
> URL: https://issues.apache.org/jira/browse/IO-341
> Project: Commons IO
> Issue Type: Improvement
> Components: Streams/Writers
> Reporter: Yaniv Kunda
> Priority: Minor
> Attachments: ByteOrderMark-char.patch
>
>
> This can be useful when working with readers/writers -
> can be put as a constant in ByteOrderMark, for example.