[jira] [Updated] (NIFI-2841) SplitAvro Processor is Broken
     [ https://issues.apache.org/jira/browse/NIFI-2841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Oleg Zhurakousky updated NIFI-2841:
-----------------------------------
    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

> SplitAvro Processor is Broken
> -----------------------------
>
>                 Key: NIFI-2841
>                 URL: https://issues.apache.org/jira/browse/NIFI-2841
>             Project: Apache NiFi
>          Issue Type: Bug
>            Reporter: David Hicks
>            Assignee: Bryan Bende
>            Priority: Critical
>             Fix For: 1.1.0
>
>         Attachments: NIFI-2841.patch
>
>
> This is largely the fault of the Avro DataFileStream reader, but it's making
> the processor unusable. The problem appears to occur when you make the
> following series of calls (which happens because of the splitSize comparison):
> reader.next() -> returns last element
> reader.hasNext() -> returns false
> reader.hasNext() -> returns true
> reader.next() -> EOFException
>
> org.apache.nifi.processor.exception.ProcessException: IOException thrown from SplitAvro[id=22e03ca4-0151-4474-92fc-040e1fe12ab9]: java.io.EOFException
>     at org.apache.nifi.controller.repository.StandardProcessSession.write(StandardProcessSession.java:2013) ~[na:na]
>     at org.apache.nifi.processors.avro.SplitAvro$RecordSplitter$1.process(SplitAvro.java:250) ~[nifi-avro-processors-0.7.0.jar:0.7.0]
>     at org.apache.nifi.controller.repository.StandardProcessSession.read(StandardProcessSession.java:1851) ~[na:na]
>     at org.apache.nifi.controller.repository.StandardProcessSession.read(StandardProcessSession.java:1822) ~[na:na]
>     at org.apache.nifi.processors.avro.SplitAvro$RecordSplitter.split(SplitAvro.java:236) ~[nifi-avro-processors-0.7.0.jar:0.7.0]
>     at org.apache.nifi.processors.avro.SplitAvro.onTrigger(SplitAvro.java:202) ~[nifi-avro-processors-0.7.0.jar:0.7.0]
>     at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27) [nifi-api-0.7.0.jar:0.7.0]
>     at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1054) [nifi-framework-core-0.7.0.jar:0.7.0]
>     at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:136) [nifi-framework-core-0.7.0.jar:0.7.0]
>     at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47) [nifi-framework-core-0.7.0.jar:0.7.0]
>     at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:127) [nifi-framework-core-0.7.0.jar:0.7.0]
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_101]
>     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [na:1.8.0_101]
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_101]
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [na:1.8.0_101]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_101]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_101]
>     at java.lang.Thread.run(Thread.java:745) [na:1.8.0_101]
> Caused by: java.io.EOFException: null
>     at org.apache.avro.io.BinaryDecoder.ensureBounds(BinaryDecoder.java:473) ~[avro-1.7.7.jar:1.7.7]
>     at org.apache.avro.io.BinaryDecoder.readInt(BinaryDecoder.java:128) ~[avro-1.7.7.jar:1.7.7]
>     at org.apache.avro.io.BinaryDecoder.readString(BinaryDecoder.java:259) ~[avro-1.7.7.jar:1.7.7]
>     at org.apache.avro.io.ResolvingDecoder.readString(ResolvingDecoder.java:201) ~[avro-1.7.7.jar:1.7.7]
>     at org.apache.avro.generic.GenericDatumReader.readString(GenericDatumReader.java:363) ~[avro-1.7.7.jar:1.7.7]
>     at org.apache.avro.generic.GenericDatumReader.readString(GenericDatumReader.java:355) ~[avro-1.7.7.jar:1.7.7]
>     at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:157) ~[avro-1.7.7.jar:1.7.7]
>     at org.apache.avro.generic.GenericDatumReader.readField(GenericDatumReader.java:193) ~[avro-1.7.7.jar:1.7.7]
>     at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:183) ~[avro-1.7.7.jar:1.7.7]
>     at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:151) ~[avro-1.7.7.jar:1.7.7]
>     at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:142) ~[avro-1.7.7.jar:1.7.7]
>     at org.apache.avro.file.DataFileStream.next(DataFileStream.java:233) ~[avro-1.7.7.jar:1.7.7]
>     at
[jira] [Updated] (NIFI-2841) SplitAvro Processor is Broken
     [ https://issues.apache.org/jira/browse/NIFI-2841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bryan Bende updated NIFI-2841:
------------------------------
    Fix Version/s: 1.1.0

> SplitAvro Processor is Broken
> -----------------------------
>
>                 Key: NIFI-2841
>                 URL: https://issues.apache.org/jira/browse/NIFI-2841
>             Project: Apache NiFi
>          Issue Type: Bug
>            Reporter: David Hicks
>            Priority: Critical
>             Fix For: 1.1.0
>
>         Attachments: NIFI-2841.patch
>
[jira] [Updated] (NIFI-2841) SplitAvro Processor is Broken
     [ https://issues.apache.org/jira/browse/NIFI-2841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bryan Bende updated NIFI-2841:
------------------------------
    Attachment: NIFI-2841.patch

The attached patch should avoid making two consecutive calls to reader.hasNext(), although I would really like to reproduce the problem to know for sure that this solves it. Existing unit tests still pass.

I know there has been one other change to SplitAvro after the 1.0.0 release, so if you wanted to try this out you could apply this patch to master and do a build of that, and then take the updated Avro NAR.

> SplitAvro Processor is Broken
> -----------------------------
>
>                 Key: NIFI-2841
>                 URL: https://issues.apache.org/jira/browse/NIFI-2841
>             Project: Apache NiFi
>          Issue Type: Bug
>            Reporter: David Hicks
>            Priority: Critical
>         Attachments: NIFI-2841.patch
>
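
For readers following the thread, here is a minimal sketch of the idea described in the comment above: check hasNext() exactly once after each next() and reuse the cached result for both the split-size test and the loop condition. This is not the content of NIFI-2841.patch and does not use Avro. The class SplitSketch, the FlakyReader stand-in (which only simulates the misbehavior reported in the ticket description) and the methods naiveSplit and split are hypothetical names made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

class SplitSketch {

    /**
     * Hypothetical stand-in for the reader in the ticket. Once the last record
     * has been returned, the first hasNext() answers false, any later
     * hasNext() answers true, and next() then fails.
     */
    static final class FlakyReader implements Iterator<String> {
        private final List<String> records;
        private int position = 0;
        private int checksAtEnd = 0;

        FlakyReader(List<String> records) {
            this.records = records;
        }

        @Override
        public boolean hasNext() {
            if (position < records.size()) {
                return true;
            }
            checksAtEnd++;
            return checksAtEnd > 1; // simulated misbehavior on repeated checks
        }

        @Override
        public String next() {
            if (position >= records.size()) {
                throw new NoSuchElementException("simulated end of stream");
            }
            return records.get(position++);
        }
    }

    /** Illustrative loop that can call hasNext() twice in a row at the end of the input. */
    static List<List<String>> naiveSplit(Iterator<String> reader, int splitSize) {
        List<List<String>> splits = new ArrayList<>();
        while (reader.hasNext()) {
            List<String> current = new ArrayList<>();
            while (current.size() < splitSize && reader.hasNext()) {
                current.add(reader.next());
            }
            splits.add(current);
        }
        return splits;
    }

    /** Same grouping, but hasNext() is called once per record and its result is cached. */
    static List<List<String>> split(Iterator<String> reader, int splitSize) {
        List<List<String>> splits = new ArrayList<>();
        boolean more = reader.hasNext();
        while (more) {
            List<String> current = new ArrayList<>();
            while (more && current.size() < splitSize) {
                current.add(reader.next());
                more = reader.hasNext(); // the only check after each next()
            }
            splits.add(current);
        }
        return splits;
    }

    public static void main(String[] args) {
        List<String> records = Arrays.asList("a", "b", "c");
        System.out.println(split(new FlakyReader(records), 2));
        try {
            naiveSplit(new FlakyReader(records), 2);
        } catch (NoSuchElementException e) {
            System.out.println("naive loop failed: " + e.getMessage());
        }
    }
}
```

With three records and a split size of two, the naive loop reaches the end of the input while the current split is not yet full, so the inner and outer conditions both call hasNext() back to back. The cached variant never asks the reader twice without an intervening next().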
[jira] [Updated] (NIFI-2841) SplitAvro Processor is Broken
     [ https://issues.apache.org/jira/browse/NIFI-2841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Hicks updated NIFI-2841:
------------------------------
    Description: 
This is largely the fault of the Avro DataFileStream reader, but it's making the processor unusable. The problem appears to occur when you make the following series of calls (which happens because of the splitSize comparison):
reader.next() -> returns last element
reader.hasNext() -> returns false
reader.hasNext() -> returns true
reader.next() -> EOFException

org.apache.nifi.processor.exception.ProcessException: IOException thrown from SplitAvro[id=22e03ca4-0151-4474-92fc-040e1fe12ab9]: java.io.EOFException
    at org.apache.nifi.controller.repository.StandardProcessSession.write(StandardProcessSession.java:2013) ~[na:na]
    at org.apache.nifi.processors.avro.SplitAvro$RecordSplitter$1.process(SplitAvro.java:250) ~[nifi-avro-processors-0.7.0.jar:0.7.0]
    at org.apache.nifi.controller.repository.StandardProcessSession.read(StandardProcessSession.java:1851) ~[na:na]
    at org.apache.nifi.controller.repository.StandardProcessSession.read(StandardProcessSession.java:1822) ~[na:na]
    at org.apache.nifi.processors.avro.SplitAvro$RecordSplitter.split(SplitAvro.java:236) ~[nifi-avro-processors-0.7.0.jar:0.7.0]
    at org.apache.nifi.processors.avro.SplitAvro.onTrigger(SplitAvro.java:202) ~[nifi-avro-processors-0.7.0.jar:0.7.0]
    at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27) [nifi-api-0.7.0.jar:0.7.0]
    at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1054) [nifi-framework-core-0.7.0.jar:0.7.0]
    at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:136) [nifi-framework-core-0.7.0.jar:0.7.0]
    at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47) [nifi-framework-core-0.7.0.jar:0.7.0]
    at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:127) [nifi-framework-core-0.7.0.jar:0.7.0]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_101]
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [na:1.8.0_101]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_101]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [na:1.8.0_101]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_101]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_101]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_101]
Caused by: java.io.EOFException: null
    at org.apache.avro.io.BinaryDecoder.ensureBounds(BinaryDecoder.java:473) ~[avro-1.7.7.jar:1.7.7]
    at org.apache.avro.io.BinaryDecoder.readInt(BinaryDecoder.java:128) ~[avro-1.7.7.jar:1.7.7]
    at org.apache.avro.io.BinaryDecoder.readString(BinaryDecoder.java:259) ~[avro-1.7.7.jar:1.7.7]
    at org.apache.avro.io.ResolvingDecoder.readString(ResolvingDecoder.java:201) ~[avro-1.7.7.jar:1.7.7]
    at org.apache.avro.generic.GenericDatumReader.readString(GenericDatumReader.java:363) ~[avro-1.7.7.jar:1.7.7]
    at org.apache.avro.generic.GenericDatumReader.readString(GenericDatumReader.java:355) ~[avro-1.7.7.jar:1.7.7]
    at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:157) ~[avro-1.7.7.jar:1.7.7]
    at org.apache.avro.generic.GenericDatumReader.readField(GenericDatumReader.java:193) ~[avro-1.7.7.jar:1.7.7]
    at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:183) ~[avro-1.7.7.jar:1.7.7]
    at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:151) ~[avro-1.7.7.jar:1.7.7]
    at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:142) ~[avro-1.7.7.jar:1.7.7]
    at org.apache.avro.file.DataFileStream.next(DataFileStream.java:233) ~[avro-1.7.7.jar:1.7.7]
    at org.apache.nifi.processors.avro.SplitAvro$RecordSplitter$1$1.process(SplitAvro.java:259) ~[nifi-avro-processors-0.7.0.jar:0.7.0]
    at org.apache.nifi.controller.repository.StandardProcessSession.write(StandardProcessSession.java:1998) ~[na:na]
    ... 17 common frames omitted

  was:
This is largely the fault of the Avro DataFileStream reader, but it's making the processor unusable. The problem appears to occur when you make the following series of calls (which happens because of the splitSize comparison):
reader.next() -> returns last element
reader.hasNext() -> returns false
reader.hasNext() ->