[
https://issues.apache.org/jira/browse/SPARK-9096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Gisle Ytrestøl updated SPARK-9096:
----------------------------------
Description:
When using JavaRDD.subtract(), it seems that the tasks are unevenly distributed
in the following operations on the new JavaRDD which is created by
"subtract". The result is that in the following operation on the new JavaRDD, a
few tasks process almost all the data, and these tasks will take a long time to
finish.
was: When using JavaRDD.subtract(), it seems that the tasks are unevenly
distributed in the following operations on the new JavaRDD which is created
by "subtract". The result is that in the following operation on the new
JavaRDD, a few tasks process almost all the data, and these tasks will take a
long time to finish.
> Unevenly distributed task loads after using JavaRDD.subtract()
> --------------------------------------------------------------
>
> Key: SPARK-9096
> URL: https://issues.apache.org/jira/browse/SPARK-9096
> Project: Spark
> Issue Type: Bug
> Components: Java API
> Affects Versions: 1.4.0, 1.4.1
> Reporter: Gisle Ytrestøl
> Attachments: ReproduceBug.java, reproduce.1.3.1.log.gz,
> reproduce.1.4.1.log.gz
>
>
> When using JavaRDD.subtract(), it seems that the tasks are unevenly
> distributed in the following operations on the new JavaRDD which is
> created by "subtract". The result is that in the following operation on the
> new JavaRDD, a few tasks process almost all the data, and these tasks will
> take a long time to finish.
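
One way to put a number on "a few tasks process almost all the data" is to
compare the largest partition with the mean partition size. The sketch below is
a hypothetical helper written for illustration; it is not part of Spark and is
not the attached ReproduceBug.java. The class name, method name, and sample
sizes are all assumptions. It uses plain Java so it runs without a cluster.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of Spark or of the
// ReproduceBug.java attachment.
final class PartitionSkew {
    private PartitionSkew() {}

    // Ratio of the largest partition to the mean partition size.
    // 1.0 means perfectly even; a value close to the number of partitions
    // means one partition holds almost all of the records.
    static double skewRatio(List<Integer> partitionSizes) {
        if (partitionSizes.isEmpty()) {
            throw new IllegalArgumentException("no partitions");
        }
        long total = 0;
        long max = 0;
        for (int size : partitionSizes) {
            total += size;
            max = Math.max(max, size);
        }
        if (total == 0) {
            return 1.0; // every partition is empty: treat as even
        }
        double mean = (double) total / partitionSizes.size();
        return max / mean;
    }

    public static void main(String[] args) {
        // Made-up sizes for illustration only. With Spark on the classpath,
        // a list like this could presumably be collected from the RDD
        // returned by subtract(), e.g. via glom() and the size of each list.
        List<Integer> even = Arrays.asList(250, 250, 250, 250);
        List<Integer> skewed = Arrays.asList(970, 10, 10, 10);
        System.out.println("even:   " + skewRatio(even));
        System.out.println("skewed: " + skewRatio(skewed));
    }
}
```

Comparing this ratio for the RDD before and after subtract(), and between
1.3.1 and 1.4.1, would show whether the imbalance is introduced by subtract()
itself or was already present in the input.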
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)