[
https://issues.apache.org/jira/browse/NIFI-551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14523664#comment-14523664
]
Ricky Saltzer commented on NIFI-551:
------------------------------------
Hey [~markap14] -
Unfortunately the API that this processor leverages does not have a way to
obtain the actual record it failed to parse. To enable this, we'd have to make
a change to the underlying Kite SDK library. The goal here is to give the user
at least something to go on when a record fails to parse, beyond a counter
that says "Hey we failed to parse _n_ records". I wouldn't be opposed to adding
a third relationship, called something along the lines of "errors" or
"conversion errors", and using it to send the parse errors.
The reason I didn't just send the parse errors to the bulletin board is that
if a file has a ton (thousands to millions) of invalid records, there would be
an overwhelming number of error messages. One alternative approach could be to
de-dupe the conversion failures by field name, so I only alert on one
conversion failure per field. I could hold off until the end of processing to
alert, and say something along the lines of "<conversion failure> <_n_ other
failures like this one>", or something similar.
For example:
{code}
Cannot convert field id [Cannot convert to long: "120V"] (322 similar failures)
Cannot convert field color [Cannot convert to string: 15.23] (933 similar failures)
{code}
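A minimal sketch of the de-dupe-by-field idea, in plain Java. The class
ConversionFailureSummary and its methods are hypothetical and are not part of
NiFi or the Kite SDK. It also assumes the processor can get a field name and a
failure message for each failed record. It keeps the first message seen per
field, counts the rest, and builds one summary line per field at the end of
processing:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration; not an existing NiFi or Kite class. */
class ConversionFailureSummary {
    // First failure message seen for each field, in the order fields first failed.
    private final Map<String, String> firstMessage = new LinkedHashMap<>();
    // Total number of failures recorded for each field.
    private final Map<String, Integer> counts = new LinkedHashMap<>();

    /** Record one conversion failure for the given field. */
    void record(String field, String message) {
        firstMessage.putIfAbsent(field, message);
        counts.merge(field, 1, Integer::sum);
    }

    /** Build one line per field: the first failure plus a count of the others. */
    List<String> summarize() {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, String> e : firstMessage.entrySet()) {
            int similar = counts.get(e.getKey()) - 1;
            String line = "Cannot convert field " + e.getKey()
                    + " [" + e.getValue() + "]";
            if (similar > 0) {
                line += " (" + similar + " similar failures)";
            }
            lines.add(line);
        }
        return lines;
    }
}
```

With this shape, memory use grows with the number of distinct failing fields
rather than the number of bad records, so a file with millions of invalid
records would still yield only a handful of alerts.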
Thoughts?
Ricky
> Improve error handling for ConvertJSONToAvro processor
> ------------------------------------------------------
>
> Key: NIFI-551
> URL: https://issues.apache.org/jira/browse/NIFI-551
> Project: Apache NiFi
> Issue Type: Improvement
> Components: Extensions
> Reporter: Ricky Saltzer
> Labels: patch, review
> Fix For: 0.1.0
>
> Attachments: NIFI-551.1.patch, NIFI-551.2.patch
>
>
> Currently, if the ConvertJSONToAvro processor fails to process an individual
> record, a counter is incremented, but no alerts are produced. It would be
> better to notify the bulletin board that we've failed to process some records
> for a flowfile. Further, we should stream the records we fail to process down
> the failure relationship for further inspection.