Xiangrui Meng created SPARK-5858:
------------------------------------

             Summary: Using first() to get feature size causes performance regression
                 Key: SPARK-5858
                 URL: https://issues.apache.org/jira/browse/SPARK-5858
             Project: Spark
          Issue Type: Bug
          Components: MLlib, PySpark
    Affects Versions: 1.3.0
            Reporter: Xiangrui Meng
            Assignee: Xiangrui Meng
            Priority: Critical


We call `.first()` to get the feature size. This causes a performance 
regression because `first()` still runs on the driver.
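
As a rough illustration of the pattern described above, the sketch below 
shows one generic way to avoid the extra call: let the caller supply the 
feature size and fall back to `first()` only when it is missing. This is 
an assumption for illustration, not necessarily the fix adopted for this 
issue. `LazyDataset`, `feature_size`, and `num_features` are hypothetical 
names and are not part of the Spark API.

```python
class LazyDataset:
    """Hypothetical stand-in for a lazily evaluated dataset (not the Spark API)."""

    def __init__(self, make_rows):
        self._make_rows = make_rows  # callable returning a fresh iterator of rows
        self.first_calls = 0         # counts how often first() was evaluated

    def first(self):
        # Each call re-evaluates the dataset far enough to produce one row.
        self.first_calls += 1
        return next(iter(self._make_rows()))


def feature_size(data, num_features=None):
    """Return the feature size, calling first() only when it is not supplied."""
    if num_features is not None:
        return num_features
    return len(data.first())


data = LazyDataset(lambda: ([float(i), 2.0 * i, 3.0 * i] for i in range(1000)))

inferred = feature_size(data)                   # falls back to first()
supplied = feature_size(data, num_features=3)   # skips first() entirely
```

In this sketch only the first call evaluates the dataset; the second 
returns the supplied size without touching the data.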



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
