Gisle Ytrestøl created SPARK-9096:
-------------------------------------
Summary: Unevenly distributed task loads after using
JavaRDD.subtract()
Key: SPARK-9096
URL: https://issues.apache.org/jira/browse/SPARK-9096
Project: Spark
Issue Type: Bug
Components: Java API
Affects Versions: 1.4.0, 1.4.1
Reporter: Gisle Ytrestøl
Attachments: ReproduceBug.java, reproduce.1.3.1.log.gz,
reproduce.1.4.1.log.gz
When using JavaRDD.subtract(), tasks appear to be unevenly distributed in the
operations that follow on the JavaRDD returned by subtract(). As a result, in
the next operation on that JavaRDD, a few tasks process almost all of the data,
and these tasks take a long time to finish.
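One way to put a number on "a few tasks process almost all the data" is to
compare the largest partition with the mean partition size. The sketch below is
not taken from the attached ReproduceBug.java; the class name PartitionSkew and
the method skewRatio are hypothetical, and it uses plain Java so it runs without
a cluster. It assumes the per-partition record counts have already been
collected from the RDD, for example via glom() on a small test data set.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Spark or of the attached reproduction.
class PartitionSkew {

    /**
     * Returns the ratio of the largest partition size to the mean partition
     * size. A value of 1.0 means a perfectly even distribution; larger values
     * mean that one partition (and hence one task) carries more of the data.
     */
    static double skewRatio(List<Long> partitionSizes) {
        if (partitionSizes.isEmpty()) {
            throw new IllegalArgumentException("no partitions given");
        }
        long total = 0;
        long max = 0;
        for (long size : partitionSizes) {
            total += size;
            max = Math.max(max, size);
        }
        if (total == 0) {
            return 1.0; // all partitions empty: treat as even
        }
        double mean = (double) total / partitionSizes.size();
        return max / mean;
    }

    public static void main(String[] args) {
        // Assumption: in a real job these counts would come from the RDD
        // produced by subtract(), e.g. by collecting the size of each list
        // returned by glom(). The numbers here are made up for illustration.
        List<Long> even = Arrays.asList(250L, 250L, 250L, 250L);
        List<Long> skewed = Arrays.asList(970L, 10L, 10L, 10L);

        System.out.println("even:   " + skewRatio(even));
        System.out.println("skewed: " + skewRatio(skewed));
    }
}
```

Comparing this ratio for the same input on 1.3.1 and 1.4.1 would show whether
the imbalance is introduced by subtract() itself or is already present in the
input RDDs.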
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)