[ 
https://issues.apache.org/jira/browse/SPARK-9096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gisle Ytrestøl updated SPARK-9096:
----------------------------------
    Description: 
When using JavaRDD.subtract(), it seems that tasks are unevenly distributed 
in subsequent operations on the new JavaRDD created by "subtract". As a 
result, a few tasks process almost all of the data and take a long time to 
finish. 



  was:When using JavaRDD.subtract(), it seems that the tasks are unevenly 
distributed in the following operations on the new JavaRDD which is created 
by "subtract". The result is that in the following operation on the new 
JavaRDD, a few tasks process almost all the data, and these tasks will take a 
long time to finish. 


> Unevenly distributed task loads after using JavaRDD.subtract()
> --------------------------------------------------------------
>
>                 Key: SPARK-9096
>                 URL: https://issues.apache.org/jira/browse/SPARK-9096
>             Project: Spark
>          Issue Type: Bug
>          Components: Java API
>    Affects Versions: 1.4.0, 1.4.1
>            Reporter: Gisle Ytrestøl
>         Attachments: ReproduceBug.java, reproduce.1.3.1.log.gz, 
> reproduce.1.4.1.log.gz
>
>
> When using JavaRDD.subtract(), it seems that tasks are unevenly distributed 
> in subsequent operations on the new JavaRDD created by "subtract". As a 
> result, a few tasks process almost all of the data and take a long time to 
> finish. 



