[ https://issues.apache.org/jira/browse/SPARK-5081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14324154#comment-14324154 ]
Dr. Christian Betz edited comment on SPARK-5081 at 2/17/15 12:58 PM:
---------------------------------------------------------------------
That's Spark 1.1.0 with Hadoop 2.5.0, in addition to the attached document [^Spark_Debug.pdf]:
The log shows a lot of spilling from org.apache.spark.util.collection.ExternalAppendOnlyMap.
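For reference, a minimal sketch (a standalone snippet, not the job from the attached PDF; the app name, local master, and values are only examples) of the Spark 1.1.x settings that govern spilling in ExternalAppendOnlyMap:
{code:scala}
import org.apache.spark.{SparkConf, SparkContext}

// ExternalAppendOnlyMap starts spilling to disk once in-memory aggregation maps
// exceed the heap share given by spark.shuffle.memoryFraction (default 0.2 in 1.1).
val conf = new SparkConf()
  .setAppName("spill-tuning-sketch")          // illustrative name
  .setMaster("local[*]")                      // local master so the snippet runs standalone
  .set("spark.shuffle.memoryFraction", "0.4") // example value: give aggregation more heap
  .set("spark.shuffle.spill", "true")         // the default; spilling stays enabled
val sc = new SparkContext(conf)
{code}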
Task performance in Stage 10 is low (minutes rather than tens of seconds).
No shuffle spill is reported in the Web UI:
*Details for Stage 10*
Total task time across all tasks: 0 ms
*Summary Metrics for 3 Completed Tasks*
||Metric||Min||25th percentile||Median||75th percentile||Max||
|Result serialization time|0 ms|0 ms|0 ms|0 ms|0 ms|
|Duration|4.8 min|4.8 min|5.0 min|5.0 min|5.0 min|
|Time spent fetching task results|0 ms|0 ms|0 ms|0 ms|0 ms|
|Scheduler delay|33 ms|33 ms|34 ms|45 ms|45 ms|
*Aggregated Metrics by Executor*
||Executor ID||Address||Task Time||Total Tasks||Failed Tasks||Succeeded Tasks||Input||Shuffle Read||Shuffle Write||Shuffle Spill (Memory)||Shuffle Spill (Disk)||
|localhost|CANNOT FIND ADDRESS|15 min|3|0|3|0.0 B|0.0 B|0.0 B|0.0 B|0.0 B|
*Tasks*
||Index||ID||Attempt||Status||Locality Level||Executor||Launch Time||Duration||GC Time||Accumulators||Errors||
|0|291|0|SUCCESS|ANY|localhost|2015/02/17 13:48:39|4.8 min|35 s| | |
|1|292|0|SUCCESS|ANY|localhost|2015/02/17 13:48:39|5.0 min|35 s| | |
|2|293|0|SUCCESS|ANY|localhost|2015/02/17 13:48:39|5.0 min|35 s| | |
> Shuffle write increases
> -----------------------
>
> Key: SPARK-5081
> URL: https://issues.apache.org/jira/browse/SPARK-5081
> Project: Spark
> Issue Type: Bug
> Components: Shuffle
> Affects Versions: 1.2.0
> Reporter: Kevin Jung
> Priority: Critical
> Attachments: Spark_Debug.pdf
>
>
> The size of the shuffle write shown in the Spark Web UI differs greatly when I
> execute the same Spark job with the same input data on Spark 1.1 and Spark 1.2.
> At the sortBy stage, the shuffle write is 98.1MB in Spark 1.1 but 146.9MB in
> Spark 1.2.
> I set the spark.shuffle.manager option to hash because its default value changed
> in 1.2, but Spark 1.2 still writes more shuffle output than Spark 1.1 (a
> configuration sketch follows below).
> This can increase disk I/O overhead considerably as the input file grows, and it
> causes jobs to take more time to complete.
> For an input of about 100GB, for example, the shuffle write is 39.7GB in Spark
> 1.1 but 91.0GB in Spark 1.2.
> spark 1.1
> ||Stage Id||Description||Input||Shuffle Read||Shuffle Write||
> |9|saveAsTextFile| |1169.4KB| |
> |12|combineByKey| |1265.4KB|1275.0KB|
> |6|sortByKey| |1276.5KB| |
> |8|mapPartitions| |91.0MB|1383.1KB|
> |4|apply| |89.4MB| |
> |5|sortBy|155.6MB| |98.1MB|
> |3|sortBy|155.6MB| | |
> |1|collect| |2.1MB| |
> |2|mapValues|155.6MB| |2.2MB|
> |0|first|184.4KB| | |
> spark 1.2
> ||Stage Id||Description||Input||Shuffle Read||Shuffle Write||
> |12|saveAsTextFile| |1170.2KB| |
> |11|combineByKey| |1264.5KB|1275.0KB|
> |8|sortByKey| |1273.6KB| |
> |7|mapPartitions| |134.5MB|1383.1KB|
> |5|zipWithIndex| |132.5MB| |
> |4|sortBy|155.6MB| |146.9MB|
> |3|sortBy|155.6MB| | |
> |2|collect| |2.0MB| |
> |1|mapValues|155.6MB| |2.2MB|
> |0|first|184.4KB| | |