David Doran created NIFI-4510:
---------------------------------

             Summary: ValidateRecord does not work properly with AvroRecordSetWriter
                 Key: NIFI-4510
                 URL: https://issues.apache.org/jira/browse/NIFI-4510
             Project: Apache NiFi
          Issue Type: Bug
          Components: Extensions
    Affects Versions: 1.4.0
         Environment: Hortonworks HDF Sandbox with inbuilt NiFi 1.2 disabled, and NiFi 1.4 downloaded & running
            Reporter: David Doran
         Attachments: ValidateRecordTest.xml

When using CSVReader and JsonRecordSetWriter, the ValidateRecord processor 
works as expected: Valid records are emitted as a flowfile on the valid queue, 
invalid ones on the invalid queue.

However, when using CSVReader and AvroRecordSetWriter, the presence of an 
invalid record causes the ValidateRecord processor to fail: nothing is emitted 
on any of the downstream relationships (failure, invalid or valid). Instead, the 
session is rolled back and the input flowfile is left in the upstream queue.

Here's the simple schema I've been using:
        {
         "type": "record",
         "name": "test",
         "fields": [
          {
           "name": "Key",
           "type": "string"
          },
          {
           "name": "ShouldBeLong",
           "type": "long"
          }]
        }

And here's some sample CSV data:
        TheKey,123
        TheKey,456
        TheKey,NotALong1
        TheKey,NotALong2
        TheKey,NotALong3
        TheKey,321
        TheKey,654

Using CSVReader->JsonRecordSetWriter results in a flowfile in the valid path:
        [ {
          "Key" : "TheKey",
          "ShouldBeLong" : 123
        }, {
          "Key" : "TheKey",
          "ShouldBeLong" : 456
        }, {
          "Key" : "TheKey",
          "ShouldBeLong" : 321
        }, {
          "Key" : "TheKey",
          "ShouldBeLong" : 654
        } ]

and one in the invalid path:
        [ {
          "Key" : "TheKey",
          "ShouldBeLong" : "NotALong1"
        }, {
          "Key" : "TheKey",
          "ShouldBeLong" : "NotALong2"
        }, {
          "Key" : "TheKey",
          "ShouldBeLong" : "NotALong3"
        } ]

… as expected.
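
For reference, the split I expect can be sketched in a few lines of Python. This only illustrates the expected behaviour for the schema and sample data above; it is not NiFi code, and `is_long` and `split_records` are made-up helper names.

```python
import csv
import io

# Sample data from this report.
SAMPLE_CSV = """TheKey,123
TheKey,456
TheKey,NotALong1
TheKey,NotALong2
TheKey,NotALong3
TheKey,321
TheKey,654
"""


def is_long(text):
    """Return True if text parses as a 64-bit signed integer."""
    try:
        value = int(text)
    except ValueError:
        return False
    return -2**63 <= value < 2**63


def split_records(csv_text):
    """Partition rows into (valid, invalid) against the Key/ShouldBeLong schema."""
    valid, invalid = [], []
    for key, raw in csv.reader(io.StringIO(csv_text)):
        if is_long(raw):
            valid.append({"Key": key, "ShouldBeLong": int(raw)})
        else:
            # Keep the offending value as a string, as in the JSON output above.
            invalid.append({"Key": key, "ShouldBeLong": raw})
    return valid, invalid


valid, invalid = split_records(SAMPLE_CSV)
```

Here `valid` holds the four numeric rows and `invalid` the three `NotALong` rows, matching the two JSON flowfiles shown above.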

With CSVReader->AvroRecordSetWriter, the ValidateRecord processor repeatedly 
raises error bulletins (because it keeps retrying) and the incoming flowfile 
remains in the input queue:
        22:40:22 UTC ERROR 015f100a-3b6f-1638-43d1-143f4ca4a816
        ValidateRecord[id=015f100a-3b6f-1638-43d1-143f4ca4a816] ValidateRecord[id=015f100a-3b6f-1638-43d1-143f4ca4a816] failed to process due to java.lang.NumberFormatException: For input string: "NotALong1"; rolling back session: For input string: "NotALong1"

        22:40:22 UTC ERROR 015f100a-3b6f-1638-43d1-143f4ca4a816
        ValidateRecord[id=015f100a-3b6f-1638-43d1-143f4ca4a816] ValidateRecord[id=015f100a-3b6f-1638-43d1-143f4ca4a816] failed to process session due to java.lang.NumberFormatException: For input string: "NotALong1": For input string: "NotALong1"
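
The exception text suggests that something tries to parse the string "NotALong1" as a long while handling the invalid record. I have not traced this in the NiFi source, so the following is only a guess at the failure mode, sketched in Python (where `ValueError` plays the role of Java's `NumberFormatException`). `write_strict` is a hypothetical stand-in for a writer that must coerce every field to its schema type, not a NiFi API.

```python
def write_strict(record):
    """Coerce a record to the schema types (Key: string, ShouldBeLong: long)."""
    return {
        "Key": str(record["Key"]),
        "ShouldBeLong": int(record["ShouldBeLong"]),  # raises on "NotALong1"
    }


invalid_record = {"Key": "TheKey", "ShouldBeLong": "NotALong1"}

try:
    write_strict(invalid_record)
    error = None
except ValueError as exc:
    # The conversion failure escapes instead of the record being routed anywhere.
    error = str(exc)
```

If that reading is right, a text-based writer such as JSON can emit the invalid value as a plain string, whereas a writer bound to the typed schema cannot, which would explain why only the Avro case fails.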

Thanks,
Dave.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
