[
https://issues.apache.org/jira/browse/NIFI-738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14604480#comment-14604480
]
ASF GitHub Bot commented on NIFI-738:
-------------------------------------
Github user joewitt commented on the pull request:
https://github.com/apache/incubator-nifi/pull/69#issuecomment-116186741
Builds cleanly including the contribution check for licenses and check
style.
ConvertCSVToAvro
- Line 258 should use parameters passed in via object[] rather than string
concatenation.
- What happens if you have processed an entire input and produced no
successful results due to a DatasetRecordExceptions? I think it would forward
an empty flow file since nothing would have been written to it. Is that what
you want? Such a case seems like a FAILURE to me.
- Consider removing the incompatible relationship and simply adding an
attribute to the success flow file which lists the reasons or add an attribute
per reason for difficulty converting the original. If you will keep the
incompatible relationship then I’d recommend sending it a clone of the original
item and then adding the relationship. That would make troubleshooting or
analysis of that object much easier. In any case be very careful about how
large these attributes could become. If you’re storing a series of stack
traces this could become unwieldy and memory inefficient really fast.
ConvertJSONToAvro
- Line 160 should use parameters passed in via object[] rather than string
concatenation.
- Same comments as Convert CSV to Avro.
FailureTracker
- This class is a cool idea. My concern with it is that it appears to
store all the raw reasons which could be quite significant in large files with
*many* thousands of lines. This could create issues with the heap, cause
excessive GC, etc. Consider refactoring this class to both immediately store
the summarized representation and to bound the total number of reasons it will
store. It can just ignore additional errors as they arrive. This will allow
it to behave well in terms of memory consumption even in the presence of
massive failures as everything will be bounded.
Thanks
Joe
> Do not write conversion error messages to flow file content
> -----------------------------------------------------------
>
> Key: NIFI-738
> URL: https://issues.apache.org/jira/browse/NIFI-738
> Project: Apache NiFi
> Issue Type: Bug
> Components: Extensions
> Affects Versions: 0.1.0
> Reporter: Ryan Blue
> Assignee: Ryan Blue
> Fix For: 0.2.0
>
>
> NIFI-551 extended the error handling provided by the ConvertJSONToAvro
> processor, but wrote error messages as the content of a file sent on the
> failure relationship. I think the right thing to do is to output the bad
> records as the file content and put the error messages in the outgoing
> attributes.
> NIFI-551 wasn't included in 0.1.0, so changing this behavior is safe.
> Consequently, I'd like to get this fix into 0.2.0.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)