[ https://issues.apache.org/jira/browse/NIFI-12745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17815794#comment-17815794 ]

ASF subversion and git services commented on NIFI-12745:
--------------------------------------------------------

Commit ba2e24b68f036363f333e216ee968d999a59e268 in nifi's branch 
refs/heads/support/nifi-1.x from Rajmund Takacs
[ https://gitbox.apache.org/repos/asf?p=nifi.git;h=ba2e24b68f ]

NIFI-12745: Fix AvroReader silently dropping malformed records

This closes #8361.

Signed-off-by: Tamas Palfy <tpa...@apache.org>


> AvroReader silently drops record if it's malformed
> --------------------------------------------------
>
>                 Key: NIFI-12745
>                 URL: https://issues.apache.org/jira/browse/NIFI-12745
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Core Framework
>    Affects Versions: 2.0.0-M1, 1.18.0, 1.19.0, 1.20.0, 1.19.1, 1.21.0, 
> 1.22.0, 1.23.0, 1.24.0, 1.23.1, 1.23.2, 1.25.0, 2.0.0-M2
>            Reporter: Rajmund Takacs
>            Assignee: Rajmund Takacs
>            Priority: Major
>         Attachments: ValidateRecord.json
>
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> See the attached example flow. It reproduces the issue reliably.
> {{GenerateFlowFile}} is set to generate the following JSON:
> {code:json}
> [{
>   "field_1" : 123456789,
>   "field_2" : "44",
>   "field_3" : 5
> }] 
> {code}
> This input is converted to Avro format using the {{ConvertRecord}} 
> processor. The 'Schema Write Strategy' of {{AvroRecordSetWriter}} is set to 
> anything other than 'Embed Avro Schema'.
> Then, the resulting FlowFile is routed to a processor that uses an 
> {{AvroReader}} to work on the records. The reader is set to use a predefined, 
> fixed schema that does not match the input Avro file: it contains at least 
> one extra field. It does not matter whether that field has a default value.
> {code:json}
> {
>   "type":"record",
>   "name":"message_name",
>   "namespace":"message_namespace",
>   "fields":[
>     {
>       "name":"field_1",
>       "type":["long"]
>     },
>     {
>       "name":"field_2",
>       "type":["string"]
>     },
>     {
>       "name":"field_3",
>       "type":["int"]
>     },
>     {
>       "name":"extra_field",
>       "type":["string"],
>       "default":"empty"
>     }
>   ]
> }
> {code}
> When this processor consumes the input, the reader silently drops the record 
> without even logging an error. At the processor level, this is 
> equivalent to having no records to process, so nothing happens. The user 
> won't notice that there is a misconfiguration somewhere until they start 
> noticing the missing FlowFiles.
> The expected behavior would be for the processors to route the malformed 
> input FlowFile to their failure relationship and report an error on their 
> bulletins.
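
For illustration only, the failure mode described above can be sketched outside NiFi in stdlib-only Python. This is not NiFi's or Avro's code. The helper names (encode_record, decode_record, next_record_lenient, next_record_strict) are hypothetical, the writer schema is assumed to be the reader schema minus extra_field, and the "lenient" reader is only one plausible way a record could end up dropped silently. The sketch hand-encodes the example record in Avro binary form (zig-zag varints, length-prefixed strings, a branch index before each union value) with no embedded schema, then decodes it with the four-field schema. Since the reader has no writer schema to resolve against, the default for extra_field is never applied and the decoder simply runs out of bytes.

```python
import io

# Assumed writer schema: the three fields present in the JSON input.
WRITER_FIELDS = [("field_1", "long"), ("field_2", "string"), ("field_3", "int")]
# Reader schema from the issue: the same fields plus one extra string field.
READER_FIELDS = WRITER_FIELDS + [("extra_field", "string")]
RECORD = {"field_1": 123456789, "field_2": "44", "field_3": 5}


def zigzag_encode(n):
    """Encode an integer as an Avro zig-zag varint (used for int and long)."""
    z = (n << 1) ^ (n >> 63)
    out = bytearray()
    while True:
        byte = z & 0x7F
        z >>= 7
        if z:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def read_varint(stream):
    """Decode one zig-zag varint; raise EOFError if the stream ends early."""
    shift = 0
    result = 0
    while True:
        chunk = stream.read(1)
        if not chunk:
            raise EOFError("ran out of bytes while decoding")
        result |= (chunk[0] & 0x7F) << shift
        if not chunk[0] & 0x80:
            return (result >> 1) ^ -(result & 1)
        shift += 7


def encode_record(record, fields):
    """Write each field as a single-branch union: branch index 0, then value."""
    out = bytearray()
    for name, ftype in fields:
        out += zigzag_encode(0)
        if ftype == "string":
            data = record[name].encode("utf-8")
            out += zigzag_encode(len(data)) + data
        else:
            out += zigzag_encode(record[name])
    return bytes(out)


def decode_record(stream, fields):
    """Decode one record, trusting `fields` to describe the bytes exactly."""
    record = {}
    for name, ftype in fields:
        read_varint(stream)  # union branch index
        if ftype == "string":
            length = read_varint(stream)
            data = stream.read(length)
            if len(data) < length:
                raise EOFError("string truncated")
            record[name] = data.decode("utf-8")
        else:
            record[name] = read_varint(stream)
    return record


def next_record_lenient(stream, fields):
    """Treat any EOF as 'no more records'; a malformed record vanishes."""
    try:
        return decode_record(stream, fields)
    except EOFError:
        return None


def next_record_strict(stream, fields):
    """Return None only at a clean end of stream; fail on a partial record."""
    position = stream.tell()
    if not stream.read(1):
        return None  # nothing left: genuine end of input
    stream.seek(position)
    try:
        return decode_record(stream, fields)
    except EOFError as exc:
        raise ValueError("malformed record: data does not match schema") from exc


PAYLOAD = encode_record(RECORD, WRITER_FIELDS)
```

With these definitions, next_record_lenient(io.BytesIO(PAYLOAD), READER_FIELDS) returns None, which a caller cannot tell apart from an empty input, whereas next_record_strict raises ValueError and so gives the caller something it can route to a failure relationship.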



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
