[
https://issues.apache.org/jira/browse/CAMEL-12698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16565421#comment-16565421
]
Jason Black commented on CAMEL-12698:
-------------------------------------
That's right [~davsclaus]; I'm finalizing it now and hope to have a PR before
the end of the week.
> Unmarshaling a CSV file with the NEL (next line) character will cause Bindy
> to misread the entire file
> ------------------------------------------------------------------------------------------------------
>
> Key: CAMEL-12698
> URL: https://issues.apache.org/jira/browse/CAMEL-12698
> Project: Camel
> Issue Type: Bug
> Components: camel-bindy
> Affects Versions: 2.22.0
> Reporter: Jason Black
> Priority: Major
>
> I am using Apache Camel to process a lot of large CSV files, and relying on
> Bindy to assist with unmarshalling them into POJOs.
> We have an upstream data bug which causes a record of ours to contain the
> Unicode character
> [NEL|http://www.fileformat.info/info/unicode/char/85/index.htm], but while
> we're working through the cause of that, I found it curious as to what Bindy
> is actually doing with it. We rely on the unmarshal process to perform a
> batch insert, and because our POJO is missing certain fields, we started
> observing that the
> Bindy relies on Scanner to read lines in a large file; however, Scanner's
> line handling treats the NEL character as a line terminator, so a record
> containing NEL is split at that point. The modern Files API does not do
> this and breaks lines only on a newline designation (e.g., \n, \r, or
> \r\n).
> There are two ways to fix this from what I've been able to smoke test:
> * Configure the Scanner to use only the more traditional newline
> characters as its delimiter
> * Use Java 8's Files API and stream the file in
> I would personally want to use the Files API to handle this since it's more
> robust and capable of higher performance, but I'll explore both approaches
> and see where I end up.
>
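For anyone who wants to reproduce the difference outside of Camel, here is a
minimal standalone sketch of the behaviour described above. It is not the
Bindy code or the eventual patch; the class name, method names, and sample
data are made up for illustration. It compares Scanner.nextLine(), a Scanner
with an explicit newline-only delimiter, and BufferedReader.lines() (which,
as far as I know, is what Files.lines() delegates to) on a record containing
NEL.

```java
import java.io.BufferedReader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.stream.Collectors;

// Hypothetical demo class; not part of camel-bindy.
class NelLineSplitDemo {

    // Two CSV records; the first one contains NEL (U+0085) inside a field.
    static final String CSV = "1,foo\u0085bar\n2,baz\n";

    // Scanner.nextLine() also treats NEL as a line separator.
    static List<String> withScannerNextLine(String text) {
        List<String> lines = new ArrayList<>();
        try (Scanner scanner = new Scanner(text)) {
            while (scanner.hasNextLine()) {
                lines.add(scanner.nextLine());
            }
        }
        return lines;
    }

    // First option from the issue: restrict the delimiter to CRLF, LF, or CR.
    static List<String> withScannerDelimiter(String text) {
        List<String> lines = new ArrayList<>();
        try (Scanner scanner = new Scanner(text).useDelimiter("\r\n|\n|\r")) {
            while (scanner.hasNext()) {
                lines.add(scanner.next());
            }
        }
        return lines;
    }

    // Second option: BufferedReader-based streaming, which only breaks on
    // LF, CR, or CRLF.
    static List<String> withBufferedReader(String text) {
        return new BufferedReader(new StringReader(text))
                .lines()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println("nextLine():      " + withScannerNextLine(CSV).size() + " lines");
        System.out.println("useDelimiter():  " + withScannerDelimiter(CSV).size() + " lines");
        System.out.println("BufferedReader:  " + withBufferedReader(CSV).size() + " lines");
    }
}
```

With nextLine() the first record comes back as two separate lines, while the
other two approaches keep it intact. One caveat on the delimiter variant: it
handles blank lines differently from nextLine(), so that would need checking
against real input before relying on it.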
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)