David Hicks created NIFI-2841:
---------------------------------

             Summary: SplitAvro Processor is Broken
                 Key: NIFI-2841
                 URL: https://issues.apache.org/jira/browse/NIFI-2841
             Project: Apache NiFi
          Issue Type: Bug
            Reporter: David Hicks
            Priority: Critical


This is largely the fault of the Avro DataFileStream reader, but it makes 
the processor unusable.  The problem appears to occur on the following 
series of calls (which results from the splitSize comparison):
reader.next() -> returns last element
reader.hasNext() -> returns false
reader.hasNext() -> returns true
reader.next() -> EOFException

This should be reproducible with any Avro file.
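
For illustration only, here is a minimal sketch of the call pattern above and one way a split loop could avoid it. FlakyReader is a hypothetical stand-in that mimics the reported hasNext()/next() behavior; it is not the Avro DataFileStream API, and SplitSketch is not the SplitAvro source, which is not shown in this report. The assumption is that treating the first false from hasNext() as final, and never asking again, sidesteps the EOFException.

```java
import java.io.EOFException;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Hypothetical reader that mimics the behavior reported above (not Avro code). */
class FlakyReader {
    private final Iterator<String> records;
    private boolean reportedEnd = false;

    FlakyReader(List<String> records) {
        this.records = records.iterator();
    }

    boolean hasNext() {
        if (records.hasNext()) {
            return true;
        }
        if (!reportedEnd) {
            reportedEnd = true;
            return false; // first call after the last record: correct answer
        }
        return true; // any later call: wrongly reports more data
    }

    String next() throws IOException {
        if (!records.hasNext()) {
            throw new EOFException(); // what the caller sees after the bogus "true"
        }
        return records.next();
    }
}

/** Hypothetical split loop that checks hasNext() once per iteration. */
class SplitSketch {
    static List<List<String>> split(FlakyReader reader, int splitSize) throws IOException {
        List<List<String>> splits = new ArrayList<>();
        List<String> current = new ArrayList<>();
        // The loop exits on the first false and never calls hasNext() again.
        while (reader.hasNext()) {
            current.add(reader.next());
            if (current.size() == splitSize) {
                splits.add(current);
                current = new ArrayList<>();
            }
        }
        if (!current.isEmpty()) {
            splits.add(current); // trailing partial split
        }
        return splits;
    }

    public static void main(String[] args) throws IOException {
        FlakyReader reader = new FlakyReader(Arrays.asList("a", "b", "c"));
        System.out.println(split(reader, 2)); // prints [[a, b], [c]]
    }
}
```

The point of the sketch is that the split-size check and the end-of-data check share a single hasNext() call per record, so the reader is never queried a second time after it has reported the end.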

org.apache.nifi.processor.exception.ProcessException: IOException thrown from SplitAvro[id=22e03ca4-0151-4474-92fc-040e1fe12ab9]: java.io.EOFException
        at org.apache.nifi.controller.repository.StandardProcessSession.write(StandardProcessSession.java:2013) ~[na:na]
        at org.apache.nifi.processors.avro.SplitAvro$RecordSplitter$1.process(SplitAvro.java:250) ~[nifi-avro-processors-0.7.0.jar:0.7.0]
        at org.apache.nifi.controller.repository.StandardProcessSession.read(StandardProcessSession.java:1851) ~[na:na]
        at org.apache.nifi.controller.repository.StandardProcessSession.read(StandardProcessSession.java:1822) ~[na:na]
        at org.apache.nifi.processors.avro.SplitAvro$RecordSplitter.split(SplitAvro.java:236) ~[nifi-avro-processors-0.7.0.jar:0.7.0]
        at org.apache.nifi.processors.avro.SplitAvro.onTrigger(SplitAvro.java:202) ~[nifi-avro-processors-0.7.0.jar:0.7.0]
        at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27) [nifi-api-0.7.0.jar:0.7.0]
        at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1054) [nifi-framework-core-0.7.0.jar:0.7.0]
        at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:136) [nifi-framework-core-0.7.0.jar:0.7.0]
        at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47) [nifi-framework-core-0.7.0.jar:0.7.0]
        at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:127) [nifi-framework-core-0.7.0.jar:0.7.0]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_101]
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [na:1.8.0_101]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_101]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [na:1.8.0_101]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_101]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_101]
        at java.lang.Thread.run(Thread.java:745) [na:1.8.0_101]
Caused by: java.io.EOFException: null
        at org.apache.avro.io.BinaryDecoder.ensureBounds(BinaryDecoder.java:473) ~[avro-1.7.7.jar:1.7.7]
        at org.apache.avro.io.BinaryDecoder.readInt(BinaryDecoder.java:128) ~[avro-1.7.7.jar:1.7.7]
        at org.apache.avro.io.BinaryDecoder.readString(BinaryDecoder.java:259) ~[avro-1.7.7.jar:1.7.7]
        at org.apache.avro.io.ResolvingDecoder.readString(ResolvingDecoder.java:201) ~[avro-1.7.7.jar:1.7.7]
        at org.apache.avro.generic.GenericDatumReader.readString(GenericDatumReader.java:363) ~[avro-1.7.7.jar:1.7.7]
        at org.apache.avro.generic.GenericDatumReader.readString(GenericDatumReader.java:355) ~[avro-1.7.7.jar:1.7.7]
        at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:157) ~[avro-1.7.7.jar:1.7.7]
        at org.apache.avro.generic.GenericDatumReader.readField(GenericDatumReader.java:193) ~[avro-1.7.7.jar:1.7.7]
        at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:183) ~[avro-1.7.7.jar:1.7.7]
        at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:151) ~[avro-1.7.7.jar:1.7.7]
        at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:142) ~[avro-1.7.7.jar:1.7.7]
        at org.apache.avro.file.DataFileStream.next(DataFileStream.java:233) ~[avro-1.7.7.jar:1.7.7]
        at org.apache.nifi.processors.avro.SplitAvro$RecordSplitter$1$1.process(SplitAvro.java:259) ~[nifi-avro-processors-0.7.0.jar:0.7.0]
        at org.apache.nifi.controller.repository.StandardProcessSession.write(StandardProcessSession.java:1998) ~[na:na]
        ... 17 common frames omitted




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
