[
https://issues.apache.org/jira/browse/BEAM-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16699726#comment-16699726
]
Ankur Goenka commented on BEAM-6060:
------------------------------------
This issue was mitigated by setting
{code:java}
taskmanager.memory.segment-size: 1048576
{code}
cc: [~mxm]
> Tensor flow Chicago taxi failing for 117MB data with IndexOutOfBoundException
> on Flink MemorySegment
> ----------------------------------------------------------------------------------------------------
>
> Key: BEAM-6060
> URL: https://issues.apache.org/jira/browse/BEAM-6060
> Project: Beam
> Issue Type: Bug
> Components: java-fn-execution, runner-flink
> Reporter: Ankur Goenka
> Priority: Major
>
> Relevant Stack trace:
>
> {code:java}
> [flink-runner-job-server] ERROR
> org.apache.beam.runners.flink.FlinkJobInvocation - Error during job
> invocation
> BeamApp-goenka-1114034940-664c45b0_33f1a6cc-adad-47ec-8742-0fbb2617d705.
> org.apache.flink.client.program.ProgramInvocationException: Job
> f80127168dd5f8eea7c071539e642cc8 failed. at
> org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:265)
> at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:464)
> at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:452)
> at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:427)
> at
> org.apache.flink.client.RemoteExecutor.executePlanWithJars(RemoteExecutor.java:216)
> at
> org.apache.flink.client.RemoteExecutor.executePlan(RemoteExecutor.java:193)
> at
> org.apache.flink.api.java.RemoteEnvironment.execute(RemoteEnvironment.java:173)
> at
> org.apache.beam.runners.flink.FlinkJobInvocation.runPipeline(FlinkJobInvocation.java:117)
> at
> org.apache.beam.repackaged.beam_runners_flink_2.11.com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:111)
> at
> org.apache.beam.repackaged.beam_runners_flink_2.11.com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:58)
> at
> org.apache.beam.repackaged.beam_runners_flink_2.11.com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:75)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748) Caused by:
> org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
> at
> org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:146)
> at
> org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:262)
> ... 13 more Caused by: java.lang.Exception: The data preparation for task
> 'GroupReduce (GroupReduce at
> GenerateStatistics/RunStatsGenerators/CommonStatsGenerator/CombinePerKey/GroupByKey)'
> , caused an error: Error obtaining the sorted input: Thread 'SortMerger
> Reading Thread' terminated due to an exception: null at
> org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:479) at
> org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368) at
> org.apache.flink.runtime.taskmanager.Task.run(Task.java:712) ... 1 more
> Caused by: java.lang.RuntimeException: Error obtaining the sorted input:
> Thread 'SortMerger Reading Thread' terminated due to an exception: null at
> org.apache.flink.runtime.operators.sort.UnilateralSortMerger.getIterator(UnilateralSortMerger.java:650)
> at
> org.apache.flink.runtime.operators.BatchTask.getInput(BatchTask.java:1108) at
> org.apache.flink.runtime.operators.GroupReduceDriver.prepare(GroupReduceDriver.java:99)
> at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:473) ...
> 3 more Caused by: java.io.IOException: Thread 'SortMerger Reading Thread'
> terminated due to an exception: null at
> org.apache.flink.runtime.operators.sort.UnilateralSortMerger$ThreadBase.run(UnilateralSortMerger.java:831)
> Caused by: java.lang.IndexOutOfBoundsException at
> org.apache.flink.core.memory.HybridMemorySegment.put(HybridMemorySegment.java:232)
> at
> org.apache.flink.runtime.memory.AbstractPagedOutputView.write(AbstractPagedOutputView.java:192)
> at
> org.apache.beam.runners.flink.translation.wrappers.DataOutputViewWrapper.write(DataOutputViewWrapper.java:48)
> at java.io.DataOutputStream.write(DataOutputStream.java:107) at
> java.io.ByteArrayOutputStream.writeTo(ByteArrayOutputStream.java:167) at
> org.apache.beam.sdk.coders.LengthPrefixCoder.encode(LengthPrefixCoder.java:58)
> at
> org.apache.beam.sdk.coders.IterableLikeCoder.encode(IterableLikeCoder.java:98)
> at
> org.apache.beam.sdk.coders.IterableLikeCoder.encode(IterableLikeCoder.java:60)
> at org.apache.beam.sdk.coders.Coder.encode(Coder.java:136) at
> org.apache.beam.sdk.coders.KvCoder.encode(KvCoder.java:71) at
> org.apache.beam.sdk.coders.KvCoder.encode(KvCoder.java:36) at
> org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.encode(WindowedValue.java:529)
> at
> org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.encode(WindowedValue.java:520)
> at
> org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.encode(WindowedValue.java:480)
> at
> org.apache.beam.runners.flink.translation.types.CoderTypeSerializer.serialize(CoderTypeSerializer.java:83)
> at
> org.apache.flink.api.java.typeutils.runtime.TupleSerializer.serialize(TupleSerializer.java:125)
> at
> org.apache.flink.api.java.typeutils.runtime.TupleSerializer.serialize(TupleSerializer.java:30)
> at
> org.apache.flink.runtime.operators.sort.LargeRecordHandler.addRecord(LargeRecordHandler.java:214)
> at
> org.apache.flink.runtime.operators.sort.UnilateralSortMerger$ReadingThread.go(UnilateralSortMerger.java:982)
> at
> org.apache.flink.runtime.operators.sort.UnilateralSortMerger$ThreadBase.run(UnilateralSortMerger.java:827)
> [grpc-default-executor-21] INFO
> org.apache.beam.runners.flink.FlinkJobInvoker - Invoking job
> BeamApp-goenka-1114035621-e0778462_36ecd5e4-ead8-4ef0-a689-4b82e7b0194a
> {code}
>
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)