[
https://issues.apache.org/jira/browse/SPARK-35596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17364654#comment-17364654
]
zhengruifeng commented on SPARK-35596:
--------------------------------------
I think I encountered a similar case:
In a skewed stage, the shuffle read size metrics are:
min: 13.2M, 25th percentile: 32.4M, median: 48.2M, 75th percentile: 80.6M, max: 13.2G
while in OptimizeSkewedJoin, the statistics show the left side as even, with all
partitions around 82M:
Optimizing skewed join for [cast(batch_id#340 as bigint)], [batch_id#466L], LeftOuter.
Left side (canSplit=true) partitions size info:
median size: 85996393, max size: 85996393, min size: 85953247, avg size: 85995559
Right side (canSplit=false) partitions size info:
median size: 648168, max size: 648168, min size: 648168, avg size: 648168
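For context, HighlyCompressedMapStatus only keeps exact sizes for blocks at or
above `spark.shuffle.accurateBlockThreshold` (100M by default); every other
non-empty block is reported as the average of the small blocks. Below is a
simplified, self-contained sketch of that bookkeeping (the names MapStatusSketch,
Summary, etc. are mine, not Spark's), showing how a block far larger than its
siblings but still under the threshold gets reported at roughly the average size,
which is how the left-side statistics above can look perfectly even:

{code:scala}
// Simplified sketch of the size summary in a highly compressed map status.
// Not the real Spark class: blocks >= the accurate-block threshold keep their
// exact size, all other non-empty blocks are replaced by their average.
object MapStatusSketch {
  final case class Summary(
      avgSize: Long,                    // reported for every non-empty, non-huge block
      hugeBlockSizes: Map[Int, Long],   // exact sizes for blocks >= threshold
      emptyBlocks: Set[Int])

  def summarize(sizes: Array[Long], accurateBlockThreshold: Long): Summary = {
    var smallTotal = 0L
    var smallCount = 0
    val huge = Map.newBuilder[Int, Long]
    val empty = Set.newBuilder[Int]
    sizes.zipWithIndex.foreach { case (size, reduceId) =>
      if (size == 0) empty += reduceId
      else if (size >= accurateBlockThreshold) huge += reduceId -> size
      else { smallTotal += size; smallCount += 1 }
    }
    val avg = if (smallCount > 0) smallTotal / smallCount else 0L
    Summary(avg, huge.result(), empty.result())
  }

  def sizeForBlock(s: Summary, reduceId: Int): Long =
    if (s.emptyBlocks(reduceId)) 0L
    else s.hugeBlockSizes.getOrElse(reduceId, s.avgSize)

  def main(args: Array[String]): Unit = {
    // One reducer's block is ~45x larger than the rest but still below the
    // 100 MiB threshold, so it is folded into the average.
    val sizes = Array.fill(2000)(2L * 1024 * 1024)
    sizes(7) = 90L * 1024 * 1024
    val summary = summarize(sizes, accurateBlockThreshold = 100L * 1024 * 1024)
    println(sizeForBlock(summary, 7))   // reports ~2 MiB, actual size is 90 MiB
    println(sizeForBlock(summary, 0))   // same averaged value for every small block
  }
}
{code}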
> HighlyCompressedMapStatus should record accurately the size of skewed shuffle
> blocks
> ------------------------------------------------------------------------------------
>
> Key: SPARK-35596
> URL: https://issues.apache.org/jira/browse/SPARK-35596
> Project: Spark
> Issue Type: Improvement
> Components: Spark Core
> Affects Versions: 3.0.2, 3.1.2
> Reporter: exmy
> Priority: Major
>
> HighlyCompressedMapStatus currently cannot accurately record the size of
> shuffle blocks that are much larger than the other blocks but still smaller
> than `spark.shuffle.accurateBlockThreshold`, which is likely to lead to OOM
> when fetching shuffle blocks. We have to tune extra properties such as
> `spark.reducer.maxReqsInFlight` to prevent it, so it is better to fix this in
> HighlyCompressedMapStatus.
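Until the map status records these sizes accurately, the mitigation mentioned in
the description is to tune the fetch-side properties. A hedged sketch of such
tuning follows; the config keys are existing Spark properties, but the values are
illustrative assumptions, not recommendations from this ticket:

{code:scala}
import org.apache.spark.SparkConf

object SkewedFetchWorkaround {
  // Values are placeholders and must be tuned per workload.
  def conf(): SparkConf = new SparkConf()
    // Record blocks above ~8 MiB exactly in the map status instead of
    // folding them into the average.
    .set("spark.shuffle.accurateBlockThreshold", "8m")
    // Bound how much data and how many requests one reducer keeps in flight.
    .set("spark.reducer.maxSizeInFlight", "24m")
    .set("spark.reducer.maxReqsInFlight", "64")
}
{code}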