[
https://issues.apache.org/jira/browse/FLUME-2646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14491133#comment-14491133
]
Joey Echeverria commented on FLUME-2646:
----------------------------------------
Some feedback on {{CSVParser}}:
Should the {{parseHeader}} method use the configured delimiter rather than
always using {{,}}?
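A minimal sketch of what a delimiter-aware header split could look like. The
{{HeaderSplitter}} class and its constructor are hypothetical and are not taken
from the attached patch. It also ignores quoted fields, which a real CSV header
parser would need to handle.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not the code in FLUME-2646.1.diff.
final class HeaderSplitter {
    private final String delimiter;

    HeaderSplitter(String delimiter) {
        this.delimiter = delimiter;
    }

    // Splits a header line on the configured delimiter instead of a fixed ",".
    List<String> parseHeader(String headerLine) {
        // Pattern.quote keeps delimiters such as "|" or "." from being read as
        // regex syntax. The -1 limit keeps trailing empty column names.
        String[] names = headerLine.split(Pattern.quote(delimiter), -1);
        for (int i = 0; i < names.length; i++) {
            names[i] = names[i].trim();
        }
        return Arrays.asList(names);
    }
}
```

With this shape, {{new HeaderSplitter("|").parseHeader("id|name|email")}} yields
the three column names, where a hard-coded {{,}} would yield a single column.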
When loading a parser based on an event header with the CSV header string,
should it also read the other parsing properties from event headers rather than
using the ones configured in the {{flume.properties}} file?
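One way this could work, sketched under assumptions: per-event headers override
the configured values, and the configured values remain the fallback. The
{{CsvSettings}} class and the header names {{csv.delimiter}} and
{{csv.charset}} are invented for illustration. The only thing taken from Flume
is that event headers are a {{Map<String, String>}}.

```java
import java.util.Map;

// Hypothetical settings holder; the header key names below are assumptions.
final class CsvSettings {
    static final String DELIMITER_HEADER = "csv.delimiter"; // assumed name
    static final String CHARSET_HEADER = "csv.charset";     // assumed name

    final String delimiter;
    final String charsetName;

    CsvSettings(String delimiter, String charsetName) {
        this.delimiter = delimiter;
        this.charsetName = charsetName;
    }

    // Returns settings where values present in the event headers win and
    // the configured values are used for anything the event does not set.
    CsvSettings overrideWith(Map<String, String> eventHeaders) {
        return new CsvSettings(
            eventHeaders.getOrDefault(DELIMITER_HEADER, delimiter),
            eventHeaders.getOrDefault(CHARSET_HEADER, charsetName));
    }
}
```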
I don't think the error handling in {{#parse()}} is correct. If we can't create
a parser because we can't determine a schema from the header, then I don't
think that's recoverable and it should throw {{NonRecoverableEventException}}.
The same is true of a {{CharacterCodingException}}, since it implies the
string wasn't encoded with the expected {{Charset}}. You should also turn
{{RuntimeException}}s into {{NonRecoverableEventException}}s, as the
{{AvroParser}} does.
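The error handling suggested here could be sketched roughly as follows. The
{{NonRecoverableEventException}} below is a local stand-in so the example
compiles on its own, and {{StrictDecodingParser}} and {{toRecord}} are
hypothetical names. This is not the {{AvroParser}} code, only the same
translation pattern.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;

// Local stand-in for the exception named in the comment above.
class NonRecoverableEventException extends Exception {
    NonRecoverableEventException(String message, Throwable cause) {
        super(message, cause);
    }
}

// Hypothetical parser showing only the exception translation.
final class StrictDecodingParser {
    private final Charset charset;

    StrictDecodingParser(Charset charset) {
        this.charset = charset;
    }

    String parse(byte[] body) throws NonRecoverableEventException {
        try {
            // REPORT makes the decoder throw instead of silently substituting
            // replacement characters for bytes that do not fit the charset.
            String text = charset.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(body))
                .toString();
            return toRecord(text);
        } catch (CharacterCodingException e) {
            // Retrying cannot fix a body written in the wrong encoding.
            throw new NonRecoverableEventException(
                "Event body is not valid " + charset, e);
        } catch (RuntimeException e) {
            // Unexpected parse failures are treated as bad input as well.
            throw new NonRecoverableEventException(
                "Cannot parse event body", e);
        }
    }

    // Placeholder for building a real record from the decoded text.
    private String toRecord(String text) {
        if (text.isEmpty()) {
            throw new IllegalArgumentException("empty record");
        }
        return text;
    }
}
```

The point of the pattern is that the sink sees a single exception type for
input that will never parse, however the failure surfaced inside the parser.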
Some feedback on {{JSONParser}}:
In {{#parse()}}, a {{CharacterCodingException}} implies the string wasn't
encoded with the expected {{Charset}}, so the method should throw
{{NonRecoverableEventException}}. You should also turn {{RuntimeException}}s
into {{NonRecoverableEventException}}s, as the {{AvroParser}} does.
It would also be good to add tests similar to
{{TestDatasetSink#testIncompatibleSchemas()}} to check the error handling when
you get a bad JSON or CSV record.
> Add JSON and CSV entity parsers to DatasetSink
> ----------------------------------------------
>
> Key: FLUME-2646
> URL: https://issues.apache.org/jira/browse/FLUME-2646
> Project: Flume
> Issue Type: Bug
> Components: Sinks+Sources
> Reporter: Ryan Blue
> Assignee: Ryan Blue
> Attachments: FLUME-2646.1.diff
>
>
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)