[ https://issues.apache.org/jira/browse/SPARK-6948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15394190#comment-15394190 ]
Sean Owen commented on SPARK-6948:
----------------------------------

I tend to agree, ran into this just recently in the exact same context. It was surprising when a tiny vector was made sparse and then not usable with StandardScaler (with "subtract mean" enabled).

> VectorAssembler should choose dense/sparse for output based on number of zeros
> ------------------------------------------------------------------------------
>
>                 Key: SPARK-6948
>                 URL: https://issues.apache.org/jira/browse/SPARK-6948
>             Project: Spark
>          Issue Type: Improvement
>          Components: MLlib
>    Affects Versions: 1.4.0
>            Reporter: Xiangrui Meng
>            Assignee: Xiangrui Meng
>            Priority: Minor
>             Fix For: 1.4.0
>
>
> Now VectorAssembler only outputs sparse vectors. We should choose
> dense/sparse format automatically, whichever uses less memory.
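
For illustration only, here is a rough sketch of the "whichever uses less memory" rule from the issue description, in plain Python rather than Spark code. The function name `choose_format` and the cost model (8 bytes per stored double, 4 bytes per stored index, object overhead ignored) are assumptions made for this example; they are not taken from the issue or from Spark's source.

```python
def choose_format(values):
    """Pick a dense or sparse layout for one assembled feature vector.

    Hypothetical helper, not a Spark API. Assumed cost model:
      dense  = 8 bytes per element
      sparse = 8 bytes per non-zero value + 4 bytes per non-zero index
    """
    size = len(values)
    # Positions of the entries a sparse layout would have to store.
    indices = [i for i, v in enumerate(values) if v != 0.0]
    nnz = len(indices)

    dense_bytes = 8 * size
    sparse_bytes = (8 + 4) * nnz

    # Ties go to dense, so tiny vectors are never made sparse needlessly.
    if sparse_bytes < dense_bytes:
        return ("sparse", size, indices, [values[i] for i in indices])
    return ("dense", list(values))


if __name__ == "__main__":
    print(choose_format([1.0, 0.0, 3.0]))    # small vector, mostly non-zero
    print(choose_format([0.0] * 9 + [5.0]))  # mostly zeros
```

Under this assumed model a tiny, mostly non-zero vector stays dense, which would avoid the StandardScaler surprise described in the comment above, while a vector that is mostly zeros is still stored sparsely.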