Hi, I agree with sebb.

Users can still create a BOMInputStream, wrap it into a reader and pass it
to CSVParser.parse(final Reader reader, final CSVFormat format).

I'm open to moving ExtendedBufferedReader to IO, if it fits. But in that
case I would probably use the shade plugin to get it into CSV again.

Just my 2 cents,
Benedikt

2014-06-19 18:00 GMT+02:00 sebb <[email protected]>:

> On 19 June 2014 15:00, Gary Gregory <[email protected]> wrote:
> > To support https://issues.apache.org/jira/browse/CSV-107, it would make
> > life easy to depend on Commons IO to use BOMInputStream and the classes it
> > depends on instead of copying them to [csv].
> >
> > I think we need to deal with BOMs in [csv] because casual users may not
> > recognize the problem and using a BOMInputStream or other workaround is not
> > trivial to find. First recognizing that the stream has a thing called a BOM
> > and secondly finding a clean way to deal with said BOM. In addition, there
> > are different kinds of BOMs with different sizes to deal with.
>
> This seems out of scope for CSV to me.
>
> However, if it is decided to add this, then I think it needs to be
> added to ALL file readers.
> Otherwise, the user will need to know about the BOM in advance.
> In which case they can add their own code plus dependency to handle it
> (as is done in the CSVParserTest#testBOMInputStream() method now.
>
> What appears to happen with the test case is that the BOM is included
> as part of the first column header name, so the testBOM() unit test
> fails because the "Date" column is not present.
>
> >
> > I am fine with adding this dependency. Other Commons component depend on
> > others.
>
> I don't think the use-case warrants adding the dependency.
>
> I would prefer CSV to report some kind of error if a BOM is seen in
> the input file.
>
> AFAICT the BOM is stored at the start of the first record, so it
> should be possible to detect a BOM input file by looking for the
> relevant bytes.
>
> > We can then also talk about whether ExtendedBufferedReader is generic
> > enough to move to [io].
> >
> > Gary
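
For reference, the idea of detecting a BOM "by looking for the relevant
bytes" can be sketched with the JDK alone. This is only a minimal sketch
under stated assumptions: the input is UTF-8, only the three-byte UTF-8 BOM
is handled (not the UTF-16/UTF-32 variants mentioned in the thread), and the
names BomSkipSketch, openSkippingUtf8Bom and firstLine are hypothetical. It
is not part of [csv] or [io] and is not how BOMInputStream is implemented.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PushbackInputStream;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
class BomSkipSketch {

    // UTF-8 encoding of U+FEFF; the only BOM this sketch recognizes.
    private static final int[] UTF8_BOM = {0xEF, 0xBB, 0xBF};

    /** Returns a UTF-8 reader positioned after a leading UTF-8 BOM, if one is present. */
    static Reader openSkippingUtf8Bom(InputStream in) {
        try {
            PushbackInputStream pushback = new PushbackInputStream(in, UTF8_BOM.length);
            byte[] head = new byte[UTF8_BOM.length];
            int n = 0;
            // A single read() may return fewer bytes than requested, so loop.
            while (n < head.length) {
                int r = pushback.read(head, n, head.length - n);
                if (r < 0) {
                    break; // end of stream before three bytes were available
                }
                n += r;
            }
            boolean hasBom = n == UTF8_BOM.length;
            for (int i = 0; hasBom && i < UTF8_BOM.length; i++) {
                hasBom = (head[i] & 0xFF) == UTF8_BOM[i];
            }
            if (!hasBom && n > 0) {
                // Not a BOM: push the bytes back so the reader sees them.
                pushback.unread(head, 0, n);
            }
            return new InputStreamReader(pushback, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the first line from the reader, or null if the input is empty. */
    static String firstLine(Reader reader) {
        try {
            return new BufferedReader(reader).readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = "\uFEFFDate,Value\n1,2\n".getBytes(StandardCharsets.UTF_8);
        // Without skipping, U+FEFF would stay glued to the "Date" header.
        System.out.println(firstLine(openSkippingUtf8Bom(new ByteArrayInputStream(data))));
    }
}
```

The Reader returned here could then be handed to
CSVParser.parse(final Reader reader, final CSVFormat format) in the same way
as a reader wrapped around a BOMInputStream.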
