DjvuLee created SPARK-2138:
------------------------------

             Summary: The KMeans algorithm in MLlib can cause the 
serialized task size to grow larger and larger
                 Key: SPARK-2138
                 URL: https://issues.apache.org/jira/browse/SPARK-2138
             Project: Spark
          Issue Type: Bug
          Components: MLlib
    Affects Versions: 0.9.1, 0.9.0
            Reporter: DjvuLee


When the algorithm reaches a certain stage, while running reduceByKey(), 
executors and tasks can be lost. After this happens several times, the 
application exits.

When this error occurs, the size of the serialized task is larger than 10 MB, 
and the size grows as the number of iterations increases.
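
The symptom described above (a serialized size that grows each iteration 
and eventually passes 10 MB) can be checked with a small helper like the 
sketch below. This is only an illustration: it uses Python's pickle as a 
stand-in for Spark's JVM task serialization, the names serialized_size, 
first_oversized_iteration and grows_every_iteration are hypothetical, and 
the accumulating "state" list is an assumed example, not the cause of this 
bug, which the report does not identify.

```python
import pickle

LIMIT_BYTES = 10 * 1024 * 1024  # the 10 MB figure mentioned in the report


def serialized_size(task):
    """Return how many bytes pickle needs to serialize `task`."""
    return len(pickle.dumps(task, protocol=pickle.HIGHEST_PROTOCOL))


def first_oversized_iteration(sizes, limit=LIMIT_BYTES):
    """Return the 1-based iteration whose size first exceeds `limit`, or None."""
    for iteration, size in enumerate(sizes, start=1):
        if size > limit:
            return iteration
    return None


def grows_every_iteration(sizes):
    """True if each recorded size is strictly larger than the previous one."""
    return all(a < b for a, b in zip(sizes, sizes[1:]))


# Assumed example: state captured by the "task" gains 4 MiB per iteration.
state = []
sizes = []
for _ in range(4):
    state.append(bytes(4 * 1024 * 1024))
    sizes.append(serialized_size(state))

print(sizes)
print(first_oversized_iteration(sizes), grows_every_iteration(sizes))
```

Logging the per-iteration size in this way would show whether the growth 
is steady and at which iteration the 10 MB mark is crossed.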




--
This message was sent by Atlassian JIRA
(v6.2#6252)
