Alexey Romanenko created BEAM-4798:
--------------------------------------

             Summary: IndexOutOfBoundsException when Flink parallelism > 1
                 Key: BEAM-4798
                 URL: https://issues.apache.org/jira/browse/BEAM-4798
             Project: Beam
          Issue Type: Bug
          Components: runner-flink
    Affects Versions: 2.5.0
            Reporter: Alexey Romanenko
            Assignee: Alexey Romanenko


Running a job on Flink in streaming mode that reads data from a Kafka topic 
with parallelism > 1 causes an exception:

{noformat}
Caused by: java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
        at java.util.ArrayList.rangeCheck(ArrayList.java:657)
        at java.util.ArrayList.get(ArrayList.java:433)
        at org.apache.beam.runners.flink.translation.wrappers.streaming.io.UnboundedSourceWrapper.run(UnboundedSourceWrapper.java:277)
        at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:87)
        at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:56)
        at org.apache.flink.streaming.runtime.tasks.SourceStreamTask.run(SourceStreamTask.java:99)
        at org.apache.flink.streaming.runtime.tasks.StoppableSourceStreamTask.run(StoppableSourceStreamTask.java:45)
        at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:306)
        at org.apache.flink.runtime.taskmanager.Task.run(Task.java:703)
        at java.lang.Thread.run(Thread.java:748)

{noformat}

It happens when the number of Kafka topic partitions is less than the 
parallelism (number of task slots).
A workaround for now is to set parallelism <= number of topic partitions; 
for example, if parallelism=2, the topic needs at least 2 partitions.
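
To illustrate the suspected failure mode, here is a minimal standalone sketch. 
It is not the actual UnboundedSourceWrapper code: the class name, the assign() 
helper, and the round-robin distribution are assumptions made only for 
illustration. It shows how a subtask can end up with an empty list of splits 
when there are fewer partitions than parallel subtasks, in which case an 
unguarded get(0) on that list would throw IndexOutOfBoundsException.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; names and the round-robin scheme are assumptions,
// not taken from the Beam Flink runner sources.
final class SplitAssignmentSketch {

    // Returns the indices of the splits (e.g. Kafka partitions) that the
    // given subtask would own under a simple round-robin distribution.
    static List<Integer> assign(int numSplits, int parallelism, int subtaskIndex) {
        List<Integer> local = new ArrayList<>();
        for (int i = 0; i < numSplits; i++) {
            if (i % parallelism == subtaskIndex) {
                local.add(i);
            }
        }
        return local;
    }

    public static void main(String[] args) {
        int partitions = 1;   // topic with a single partition
        int parallelism = 2;  // two parallel source subtasks

        for (int subtask = 0; subtask < parallelism; subtask++) {
            List<Integer> local = assign(partitions, parallelism, subtask);
            if (local.isEmpty()) {
                // Guard: without this check, local.get(0) would throw.
                System.out.println("subtask " + subtask + ": no splits assigned");
            } else {
                System.out.println("subtask " + subtask + ": first split = " + local.get(0));
            }
        }
    }
}
```

With partitions=1 and parallelism=2, subtask 1 receives no splits in this 
sketch, which matches the condition described above. Keeping parallelism <= 
number of partitions ensures every subtask owns at least one split.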



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
