[
https://issues.apache.org/jira/browse/SPARK-4630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14283809#comment-14283809
]
Lianhui Wang edited comment on SPARK-4630 at 1/20/15 1:28 PM:
--------------------------------------------------------------
I think it is better that we use a stage's output size to decide the number of
the next stage's partitions. Because an RDD is only one of a stage's operators
and the number of partitions is only relevant to the shuffle, I think a stage's
statistics are a better basis than the RDD's.
Also, in SQL there are many optimizations that choose between different
physical plans based on statistics, for example hash join versus sort merge
join, but that is a separate topic.
[~sandyr] What you described before depends on the number of partitions in the
map writer being very large, so that reducers can fetch data using range
partitioning. If the initial number of partitions is small, we need to
repartition the data, and it is very expensive to scan the data twice. I do
not know whether there is a better way in this situation.
So for now I use the input size of the parent stage to determine the number of
a stage's partitions.
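A minimal sketch of this statistics-based approach, assuming the per-partition
shuffle output sizes of the parent stage are already available (for example,
collected by the map output tracker); the helper name and the target partition
size are illustrative, not an existing Spark API:
{code:scala}
object PartitionSizing {
  // Assumed sweet-spot size for a single reduce partition (purely illustrative).
  val targetPartitionBytes: Long = 64L * 1024 * 1024  // 64 MB

  /**
   * Hypothetical helper: given the shuffle bytes written for each map output
   * partition of the parent stage, pick a partition count for the next stage
   * so that each reduce partition receives roughly targetPartitionBytes.
   */
  def decideNumPartitions(parentOutputSizes: Seq[Long]): Int = {
    val totalBytes = parentOutputSizes.sum
    math.max(1, math.ceil(totalBytes.toDouble / targetPartitionBytes).toInt)
  }
}

// A parent stage that wrote ~1 GB of shuffle data would get 16 partitions:
// PartitionSizing.decideNumPartitions(Seq.fill(8)(128L * 1024 * 1024))  // => 16
{code}
The key point is that the decision uses the parent stage's measured output
rather than anything stored on the RDD itself.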
> Dynamically determine optimal number of partitions
> --------------------------------------------------
>
> Key: SPARK-4630
> URL: https://issues.apache.org/jira/browse/SPARK-4630
> Project: Spark
> Issue Type: Improvement
> Components: Spark Core
> Reporter: Kostas Sakellis
> Assignee: Kostas Sakellis
>
> Partition sizes play a big part in how fast stages execute during a Spark
> job. There is a direct relationship between the size of partitions and the
> number of tasks: larger partitions mean fewer tasks. For better performance,
> Spark has a sweet spot for how large the partitions executed by a task should
> be. If partitions are too small, the user pays a disproportionate cost in
> scheduling overhead. If partitions are too large, task execution slows down
> due to GC pressure and spilling to disk.
> To increase the performance of jobs, users often hand-optimize the number
> (and therefore the size) of partitions that the next stage gets. Factors that
> come into play are:
> - incoming partition sizes from the previous stage
> - number of available executors
> - available memory per executor (taking into account
>   spark.shuffle.memoryFraction)
> Spark has access to this data and so should be able to automatically do the
> partition sizing for the user. This feature can be turned off/on with a
> configuration option.
> To make this happen, we propose modifying the DAGScheduler to take partition
> sizes into account upon stage completion. Before scheduling the next stage,
> the scheduler can examine the sizes of the completed stage's partitions and
> determine the appropriate number of tasks to create. Since this change
> requires non-trivial modifications to the DAGScheduler, a detailed design doc
> will be attached before proceeding with the work.
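> A rough sketch of the kind of hand-tuning described above, assuming the user
> already knows the incoming data size, the executor count, and the memory per
> executor; the helper and the constants are illustrative, not part of Spark's
> API:
> {code:scala}
> // Hypothetical back-of-the-envelope calculation a user might do by hand today.
> def handTunedPartitions(
>     incomingBytes: Long,          // output size of the previous stage (assumed known)
>     numExecutors: Int,            // number of available executors
>     executorMemoryBytes: Long,    // memory per executor
>     shuffleMemoryFraction: Double = 0.2): Int = {
>   // Shuffle memory budget available on each executor.
>   val shuffleBytesPerExecutor = (executorMemoryBytes * shuffleMemoryFraction).toLong
>   // Enough partitions that each one fits within one executor's shuffle budget...
>   val byMemory = math.ceil(incomingBytes.toDouble / shuffleBytesPerExecutor).toInt
>   // ...and at least one task per available executor.
>   math.max(byMemory, numExecutors)
> }
>
> // e.g. 100 GB incoming, 20 executors with 8 GB each and a 0.2 shuffle fraction:
> // handTunedPartitions(100L << 30, 20, 8L << 30)  // => 63, then rdd.repartition(63)
> {code}
> The proposal above would have the DAGScheduler perform an equivalent
> calculation automatically, using the partition sizes it observes when a stage
> completes.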