On 19 June 2014 15:00, Gary Gregory <[email protected]> wrote:
> To support https://issues.apache.org/jira/browse/CSV-107, it would make
> life easy to depend on Commons IO to use BOMInputStream and the classes it
> depends on instead of copying them to [csv].
>
> I think we need to deal with BOMs in [csv] because casual users may not
> recognize the problem and using a BOMInputStream or other workaround is not
> trivial to find. First recognizing that the stream has a thing called a BOM
> and secondly finding a clean way to deal with said BOM. In addition, there
> are different kinds of BOMs with different sizes to deal with.

This seems out of scope for CSV to me.

However, if it is decided to add this, then I think it needs to be
added to ALL file readers.
Otherwise, the user will need to know about the BOM in advance.
In which case they can add their own code plus dependency to handle it
(as is done in the CSVParserTest#testBOMInputStream() method now).

What appears to happen with the test case is that the BOM is included
as part of the first column header name, so the testBOM() unit test
fails because the "Date" column is not present.
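
To illustrate the failure mode, here is a minimal sketch using only the JDK.
It is not the [csv] parser: headerMap() and sampleWithBom() are hypothetical
helpers, and the "Date,Value" header line is an assumed sample. It relies on
the fact that Java's UTF-8 decoder does not strip a leading BOM, so U+FEFF
ends up glued to the first header name.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

final class BomHeaderDemo {

    // Hypothetical helper: naive header-name -> column-index mapping.
    // This only mimics what a header lookup does; it is not Commons CSV code.
    static Map<String, Integer> headerMap(byte[] firstLine) {
        // The JDK UTF-8 decoder keeps a leading BOM as the character U+FEFF.
        String line = new String(firstLine, StandardCharsets.UTF_8);
        Map<String, Integer> map = new LinkedHashMap<>();
        String[] names = line.split(",");
        for (int i = 0; i < names.length; i++) {
            map.put(names[i], i);
        }
        return map;
    }

    // Assumed sample input: UTF-8 BOM (EF BB BF) followed by "Date,Value".
    static byte[] sampleWithBom() {
        byte[] text = "Date,Value".getBytes(StandardCharsets.UTF_8);
        byte[] out = new byte[text.length + 3];
        out[0] = (byte) 0xEF;
        out[1] = (byte) 0xBB;
        out[2] = (byte) 0xBF;
        System.arraycopy(text, 0, out, 3, text.length);
        return out;
    }

    public static void main(String[] args) {
        Map<String, Integer> headers = headerMap(sampleWithBom());
        System.out.println(headers.containsKey("Date"));       // false
        System.out.println(headers.containsKey("\uFEFFDate")); // true
        System.out.println(headers.containsKey("Value"));      // true
    }
}
```

Only the first column is affected, which is why the problem is easy to
overlook until a lookup by header name fails.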


> I am fine with adding this dependency. Other Commons component depend on
> others.

I don't think the use-case warrants adding the dependency.

I would prefer CSV to report some kind of error if a BOM is seen in
the input file.

AFAICT the BOM is stored at the start of the first record, so it
should be possible to detect a BOM input file by looking for the
relevant bytes.
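
A rough sketch of that detection, again JDK-only and not a proposal for the
actual [csv] API: BomSniffer, detect() and peekBom() are hypothetical names.
It checks the leading bytes against the standard Unicode BOM signatures and
pushes them back so the caller still sees the whole stream. The caller is
assumed to create the PushbackInputStream with a buffer of at least 4 bytes.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;

final class BomSniffer {

    // UTF-32 entries come first: the UTF-32LE BOM (FF FE 00 00) starts with
    // the UTF-16LE BOM (FF FE), so the longer signature must be tried first.
    private static final String[] NAMES = {
        "UTF-32BE", "UTF-32LE", "UTF-8", "UTF-16BE", "UTF-16LE"
    };
    private static final byte[][] BOMS = {
        {0x00, 0x00, (byte) 0xFE, (byte) 0xFF},
        {(byte) 0xFF, (byte) 0xFE, 0x00, 0x00},
        {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF},
        {(byte) 0xFE, (byte) 0xFF},
        {(byte) 0xFF, (byte) 0xFE},
    };

    // Returns the encoding name implied by a leading BOM, or null if the
    // first len bytes of head do not start with a known BOM.
    static String detect(byte[] head, int len) {
        for (int i = 0; i < BOMS.length; i++) {
            byte[] bom = BOMS[i];
            if (len < bom.length) {
                continue;
            }
            boolean match = true;
            for (int j = 0; j < bom.length; j++) {
                if (head[j] != bom[j]) {
                    match = false;
                    break;
                }
            }
            if (match) {
                return NAMES[i];
            }
        }
        return null;
    }

    // Reads up to 4 bytes, pushes them back, and reports any BOM found.
    // Assumes 'in' was constructed with a pushback buffer of at least 4 bytes.
    static String peekBom(PushbackInputStream in) throws IOException {
        byte[] buf = new byte[4];
        int n = 0;
        while (n < buf.length) {
            int r = in.read(buf, n, buf.length - n);
            if (r < 0) {
                break;
            }
            n += r;
        }
        if (n > 0) {
            in.unread(buf, 0, n);
        }
        return detect(buf, n);
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'D', 'a', 't', 'e'};
        PushbackInputStream in =
            new PushbackInputStream(new ByteArrayInputStream(data), 4);
        String bom = peekBom(in);
        if (bom != null) {
            // This is where a parser could report an error instead of
            // silently treating the BOM as data.
            System.out.println("BOM detected: " + bom);
        }
    }
}
```

Note that this works on raw bytes, so it would only help where [csv] is
handed a stream or file; with a Reader the BOM has already been decoded and
the check would have to look for a leading U+FEFF character instead.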

> We can then also talk about whether ExtendedBufferedReader is generic
> enough to move to [io].
>
> Gary
>
> --
> E-Mail: [email protected] | [email protected]
> Java Persistence with Hibernate, Second Edition
> <http://www.manning.com/bauer3/>
> JUnit in Action, Second Edition <http://www.manning.com/tahchiev/>
> Spring Batch in Action <http://www.manning.com/templier/>
> Blog: http://garygregory.wordpress.com
> Home: http://garygregory.com/
> Tweet! http://twitter.com/GaryGregory
