David Hicks created NIFI-2841:
---------------------------------

             Summary: SplitAvro Processor is Broken
                 Key: NIFI-2841
                 URL: https://issues.apache.org/jira/browse/NIFI-2841
             Project: Apache NiFi
          Issue Type: Bug
            Reporter: David Hicks
            Priority: Critical

This is largely the fault of the Avro DataFileStream reader, but it's making the processor unusable. The problem appears to occur when you make the following series of calls (which happens because of the splitSize comparison):

reader.next() -> returns last element
reader.hasNext() -> returns false
reader.hasNext() -> returns true
reader.next() -> EOFException

This should be reproducible with any and all Avro files.

org.apache.nifi.processor.exception.ProcessException: IOException thrown from SplitAvro[id=22e03ca4-0151-4474-92fc-040e1fe12ab9]: java.io.EOFException
	at org.apache.nifi.controller.repository.StandardProcessSession.write(StandardProcessSession.java:2013) ~[na:na]
	at org.apache.nifi.processors.avro.SplitAvro$RecordSplitter$1.process(SplitAvro.java:250) ~[nifi-avro-processors-0.7.0.jar:0.7.0]
	at org.apache.nifi.controller.repository.StandardProcessSession.read(StandardProcessSession.java:1851) ~[na:na]
	at org.apache.nifi.controller.repository.StandardProcessSession.read(StandardProcessSession.java:1822) ~[na:na]
	at org.apache.nifi.processors.avro.SplitAvro$RecordSplitter.split(SplitAvro.java:236) ~[nifi-avro-processors-0.7.0.jar:0.7.0]
	at org.apache.nifi.processors.avro.SplitAvro.onTrigger(SplitAvro.java:202) ~[nifi-avro-processors-0.7.0.jar:0.7.0]
	at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27) [nifi-api-0.7.0.jar:0.7.0]
	at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1054) [nifi-framework-core-0.7.0.jar:0.7.0]
	at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:136) [nifi-framework-core-0.7.0.jar:0.7.0]
	at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47) [nifi-framework-core-0.7.0.jar:0.7.0]
	at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:127) [nifi-framework-core-0.7.0.jar:0.7.0]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_101]
	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [na:1.8.0_101]
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_101]
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [na:1.8.0_101]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_101]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_101]
	at java.lang.Thread.run(Thread.java:745) [na:1.8.0_101]
Caused by: java.io.EOFException: null
	at org.apache.avro.io.BinaryDecoder.ensureBounds(BinaryDecoder.java:473) ~[avro-1.7.7.jar:1.7.7]
	at org.apache.avro.io.BinaryDecoder.readInt(BinaryDecoder.java:128) ~[avro-1.7.7.jar:1.7.7]
	at org.apache.avro.io.BinaryDecoder.readString(BinaryDecoder.java:259) ~[avro-1.7.7.jar:1.7.7]
	at org.apache.avro.io.ResolvingDecoder.readString(ResolvingDecoder.java:201) ~[avro-1.7.7.jar:1.7.7]
	at org.apache.avro.generic.GenericDatumReader.readString(GenericDatumReader.java:363) ~[avro-1.7.7.jar:1.7.7]
	at org.apache.avro.generic.GenericDatumReader.readString(GenericDatumReader.java:355) ~[avro-1.7.7.jar:1.7.7]
	at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:157) ~[avro-1.7.7.jar:1.7.7]
	at org.apache.avro.generic.GenericDatumReader.readField(GenericDatumReader.java:193) ~[avro-1.7.7.jar:1.7.7]
	at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:183) ~[avro-1.7.7.jar:1.7.7]
	at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:151) ~[avro-1.7.7.jar:1.7.7]
	at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:142) ~[avro-1.7.7.jar:1.7.7]
	at org.apache.avro.file.DataFileStream.next(DataFileStream.java:233) ~[avro-1.7.7.jar:1.7.7]
	at org.apache.nifi.processors.avro.SplitAvro$RecordSplitter$1$1.process(SplitAvro.java:259) ~[nifi-avro-processors-0.7.0.jar:0.7.0]
	at org.apache.nifi.controller.repository.StandardProcessSession.write(StandardProcessSession.java:1998) ~[na:na]
	... 17 common frames omitted

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
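
One possible way to sidestep the call sequence in the description is to ask hasNext() exactly once per record and reuse the cached answer for the splitSize comparison. The sketch below is only an illustration under assumptions: FlakyReader is a hypothetical stand-in that imitates the reported behaviour (it is not Avro's DataFileStream), and SplitSketch and split are invented names, not the code in SplitAvro or any committed fix.

```java
import java.io.EOFException;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SplitSketch {

    /** Hypothetical reader that imitates the behaviour reported above. */
    static class FlakyReader {
        private final List<String> records;
        private int pos = 0;
        private int hasNextCallsAtEnd = 0;

        FlakyReader(List<String> records) {
            this.records = records;
        }

        boolean hasNext() {
            if (pos < records.size()) {
                return true;
            }
            // At end of data: the first call answers false,
            // every later call wrongly answers true.
            return hasNextCallsAtEnd++ > 0;
        }

        String next() throws IOException {
            if (pos >= records.size()) {
                throw new EOFException();
            }
            return records.get(pos++);
        }
    }

    /** Groups records into splits of at most splitSize records. */
    static List<List<String>> split(FlakyReader reader, int splitSize) throws IOException {
        List<List<String>> splits = new ArrayList<>();
        List<String> current = new ArrayList<>();
        boolean more = reader.hasNext();
        while (more) {
            current.add(reader.next());
            // Query hasNext() once after each read and cache the answer,
            // so the split-size check never triggers a second query.
            more = reader.hasNext();
            if (current.size() == splitSize || !more) {
                splits.add(current);
                current = new ArrayList<>();
            }
        }
        return splits;
    }

    public static void main(String[] args) throws IOException {
        FlakyReader reader = new FlakyReader(Arrays.asList("a", "b", "c"));
        System.out.println(split(reader, 2));
    }
}
```

With three records and a split size of 2, the loop closes the final, shorter split as soon as the cached flag turns false, so next() is never called past the last element.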