Xiangrui Meng created SPARK-5858:
------------------------------------
Summary: Using first() to get feature size causes performance regression
Key: SPARK-5858
URL: https://issues.apache.org/jira/browse/SPARK-5858
Project: Spark
Issue Type: Bug
Components: MLlib, PySpark
Affects Versions: 1.3.0
Reporter: Xiangrui Meng
Assignee: Xiangrui Meng
Priority: Critical
We call `.first()` to get the feature size. This causes a performance
regression because first() still runs on the driver.
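
A minimal sketch of one way to skip the extra `first()` call, assuming the
caller already knows the feature size. This is not the fix applied in Spark.
`CountingDataset` and `resolve_num_features` are hypothetical names used only
for illustration; the stand-in class counts `first()` calls so the example runs
without a Spark installation.

```python
class CountingDataset:
    """Hypothetical stand-in for an RDD of feature vectors; counts first() calls."""

    def __init__(self, rows):
        self._rows = list(rows)
        self.first_calls = 0

    def first(self):
        # On a real RDD this would trigger a job; here we only record the call.
        self.first_calls += 1
        return self._rows[0]


def resolve_num_features(data, num_features=None):
    """Return the feature size, calling data.first() only when it is not supplied."""
    if num_features is not None:
        return num_features
    return len(data.first())


data = CountingDataset([[0.0, 1.0, 2.0], [3.0, 4.0, 5.0]])
inferred = resolve_num_features(data)     # falls back to first()
supplied = resolve_num_features(data, 3)  # no additional first() call
```

When the size is passed in explicitly, the dataset is not touched before
training starts; the fallback keeps the current behavior for callers that do
not know the size.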