[
https://issues.apache.org/jira/browse/BEAM-6051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ankur Goenka updated BEAM-6051:
-------------------------------
Description:
The Chicago taxi example fails when run on a 1-node cluster with 256MB of input data: the Flink temp directory fills the disk.
Disk usage:
{code:java}
goenka@goenka:/home/build$ ls -lha
/tmp/flink-io-b9f13afc-0c5a-40ab-9a29-037d35068c1c/
total 8.5G
drwxr-xr-x 2 goenka primarygroup 4.0K Nov 12 20:27 .
drwxrwxrwt 37 root root 11M Nov 12 20:27 ..
-rw-r--r-- 1 goenka primarygroup 2.2G Nov 12 20:27
550b3393ce2a4c35ba37135c20ebccb3.channel
-rw-r--r-- 1 goenka primarygroup 2.2G Nov 12 20:27
573f0fcb90732308857025108ffc74f6.channel
-rw-r--r-- 1 goenka primarygroup 2.2G Nov 12 20:27
fb08a9ea467a1045f6087088b395ea8d.channel
-rw-r--r-- 1 goenka primarygroup 2.2G Nov 12 20:27
fcca6cbf619f712d852d2371d9cb7046.channel
{code}
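For context, the two Flink settings most relevant to this failure mode are the temp-directory location (where the `.channel` spill files above land) and the TaskManager memory that backs the sort buffers. The sketch below is illustrative only and not part of the original report; key names are from the Flink 1.5/1.6-era configuration, and the path and sizes are hypothetical:
{code:java}
# Hypothetical single-node flink-conf.yaml tuning sketch (verify keys against
# the Flink version in use; values are illustrative, not from this report).

# Write spill/channel files to a volume with enough free space
# instead of the default /tmp.
io.tmp.dirs: /data/flink-tmp

# Give the TaskManager more heap; the managed-memory pool carved out of it
# backs the sort buffers (the failing sort buffer capped out at ~4.2 MB).
taskmanager.heap.mb: 4096

# Fraction of free heap handed to managed memory (sorting, hashing, caching).
taskmanager.memory.fraction: 0.7
{code}
Whether larger sort buffers avoid the "record exceeds the maximum size of a sort buffer" error here depends on how large the individual records produced by the CombineGlobally stage actually are.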
was:
The Chicago taxi example fails when run on a 1-node cluster with 256MB of input data.
Exception:
{code:java}
2018-11-07 15:20:10,736 INFO org.apache.flink.runtime.taskmanager.Task -
GroupReduce (GroupReduce at
Analyze/ComputeAnalyzerOutputs[0]/Analyze[scale_to_z_score_1/mean_and_var/mean/sum/]/CombineGlobally(_CombineFnWrapper)/CombinePerKey/GroupByKey)
(1/1) (6be52aa98e1f7233172f58eb8695fb6d) switched
from RUNNING to FAILED.
java.lang.Exception: The data preparation for task 'GroupReduce (GroupReduce at
Analyze/ComputeAnalyzerOutputs[0]/Analyze[scale_to_z_score_1/mean_and_var/mean/sum/]/CombineGlobally(_CombineFnWrapper)/CombinePerKey/GroupByKey)'
, caused an error: Error obtaining the sorted input: Thread 'SortMerger
Reading Thread' terminated due to an exception: The record exceeds the maximum size of a sort
buffer (current maximum: 4456448 bytes).
at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:479)
at org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:712)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: Error obtaining the sorted input: Thread
'SortMerger Reading Thread' terminated due to an exception: The record exceeds
the maximum size of a sort buffer (current maximum: 4456448 bytes).
at
org.apache.flink.runtime.operators.sort.UnilateralSortMerger.getIterator(UnilateralSortMerger.java:650)
at org.apache.flink.runtime.operators.BatchTask.getInput(BatchTask.java:1108)
at
org.apache.flink.runtime.operators.GroupReduceDriver.prepare(GroupReduceDriver.java:99)
at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:473)
... 3 more
Caused by: java.io.IOException: Thread 'SortMerger Reading Thread' terminated
due to an exception: The record exceeds the maximum size of a sort buffer
(current maximum: 4456448 bytes).
at
org.apache.flink.runtime.operators.sort.UnilateralSortMerger$ThreadBase.run(UnilateralSortMerger.java:831)
Caused by: java.io.IOException: The record exceeds the maximum size of a sort
buffer (current maximum: 4456448 bytes).
at
org.apache.flink.runtime.operators.sort.UnilateralSortMerger$ReadingThread.go(UnilateralSortMerger.java:986)
at
org.apache.flink.runtime.operators.sort.UnilateralSortMerger$ThreadBase.run(UnilateralSortMerger.java:827)
2018-11-07 15:20:10,741 ERROR org.apache.flink.runtime.operators.BatchTask -
Error in task code: GroupReduce (GroupReduce at
Analyze/ComputeAnalyzerOutputs[0]/Analyze[scale_to_z_score/mean_and_var/mean/sum/]/CombineGlobally(_CombineFnWrapper)/CombinePerKey/GroupByKey)
(1/1)
java.lang.Exception: The data preparation for task 'GroupReduce (GroupReduce at
Analyze/ComputeAnalyzerOutputs[0]/Analyze[scale_to_z_score/mean_and_var/mean/sum/]/CombineGlobally(_CombineFnWrapper)/CombinePerKey/GroupByKey)'
, caused an error: Error obtaining the sorted input: Thread 'SortMerger
Reading Thread' terminated due to an exception: The record exceeds the maximum size of a sort buffer
(current maximum: 4456448 bytes).
at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:479)
at org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:712)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: Error obtaining the sorted input: Thread
'SortMerger Reading Thread' terminated due to an exception: The record exceeds
the maximum size of a sort buffer (current maximum: 4456448 bytes).
at
org.apache.flink.runtime.operators.sort.UnilateralSortMerger.getIterator(UnilateralSortMerger.java:650)
at org.apache.flink.runtime.operators.BatchTask.getInput(BatchTask.java:1108)
at
org.apache.flink.runtime.operators.GroupReduceDriver.prepare(GroupReduceDriver.java:99)
at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:473)
... 3 more
Caused by: java.io.IOException: Thread 'SortMerger Reading Thread' terminated
due to an exception: The record exceeds the maximum size of a sort buffer
(current maximum: 4456448 bytes).
at
org.apache.flink.runtime.operators.sort.UnilateralSortMerger$ThreadBase.run(UnilateralSortMerger.java:831)
Caused by: java.io.IOException: The record exceeds the maximum size of a sort
buffer (current maximum: 4456448 bytes).
at
org.apache.flink.runtime.operators.sort.UnilateralSortMerger$ReadingThread.go(UnilateralSortMerger.java:986)
at
org.apache.flink.runtime.operators.sort.UnilateralSortMerger$ThreadBase.run(UnilateralSortMerger.java:827)
2018-11-07 15:20:10,743 INFO org.apache.flink.runtime.taskmanager.Task -
GroupReduce (GroupReduce at
Analyze/ComputeAnalyzerOutputs[0]/Analyze[scale_to_z_score/mean_and_var/mean/sum/]/CombineGlobally(_CombineFnWrapper)/CombinePerKey/GroupByKey)
(1/1) (ae0b703b1e3a194644b32a2fa534436f) switched from RUNNING to FAILED.
java.lang.Exception: The data preparation for task 'GroupReduce (GroupReduce at
Analyze/ComputeAnalyzerOutputs[0]/Analyze[scale_to_z_score/mean_and_var/mean/sum/]/CombineGlobally(_CombineFnWrapper)/CombinePerKey/GroupByKey)'
, caused an error: Error obtaining the sorted input: Thread 'SortMerger
Reading Thread' terminated due to an exception: The record exceeds the maximum size of a sort buffer
(current maximum: 4456448 bytes).
at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:479)
at org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:368)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:712)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: Error obtaining the sorted input: Thread
'SortMerger Reading Thread' terminated due to an exception: The record exceeds
the maximum size of a sort buffer (current maximum: 4456448 bytes).
at
org.apache.flink.runtime.operators.sort.UnilateralSortMerger.getIterator(UnilateralSortMerger.java:650)
at org.apache.flink.runtime.operators.BatchTask.getInput(BatchTask.java:1108)
at
org.apache.flink.runtime.operators.GroupReduceDriver.prepare(GroupReduceDriver.java:99)
at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:473)
... 3 more
Caused by: java.io.IOException: Thread 'SortMerger Reading Thread' terminated
due to an exception: The record exceeds the maximum size of a sort buffer
(current maximum: 4456448 bytes).
at
org.apache.flink.runtime.operators.sort.UnilateralSortMerger$ThreadBase.run(UnilateralSortMerger.java:831)
Caused by: java.io.IOException: The record exceeds the maximum size of a sort
buffer (current maximum: 4456448 bytes).
at
org.apache.flink.runtime.operators.sort.UnilateralSortMerger$ReadingThread.go(UnilateralSortMerger.java:986)
at
org.apache.flink.runtime.operators.sort.UnilateralSortMerger$ThreadBase.run(UnilateralSortMerger.java:827)
{code}
> TensorFlow Chicago taxi example failing for 256MB data with disk exhausted
> ---------------------------------------------------------------------------
>
> Key: BEAM-6051
> URL: https://issues.apache.org/jira/browse/BEAM-6051
> Project: Beam
> Issue Type: Bug
> Components: java-fn-execution, runner-flink
> Reporter: Ankur Goenka
> Assignee: Ankur Goenka
> Priority: Major
> Fix For: Not applicable
>
>
> The Chicago taxi example fails when run on a 1-node cluster with 256MB of
> input data.
> Disk usage:
> {code:java}
> goenka@goenka:/home/build$ ls -lha
> /tmp/flink-io-b9f13afc-0c5a-40ab-9a29-037d35068c1c/
> total 8.5G
> drwxr-xr-x 2 goenka primarygroup 4.0K Nov 12 20:27 .
> drwxrwxrwt 37 root root 11M Nov 12 20:27 ..
> -rw-r--r-- 1 goenka primarygroup 2.2G Nov 12 20:27
> 550b3393ce2a4c35ba37135c20ebccb3.channel
> -rw-r--r-- 1 goenka primarygroup 2.2G Nov 12 20:27
> 573f0fcb90732308857025108ffc74f6.channel
> -rw-r--r-- 1 goenka primarygroup 2.2G Nov 12 20:27
> fb08a9ea467a1045f6087088b395ea8d.channel
> -rw-r--r-- 1 goenka primarygroup 2.2G Nov 12 20:27
> fcca6cbf619f712d852d2371d9cb7046.channel
> {code}
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)