[ https://issues.apache.org/jira/browse/SPARK-10117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xiangrui Meng resolved SPARK-10117.
-----------------------------------
       Resolution: Fixed
    Fix Version/s: 1.6.0

Issue resolved by pull request 8537
[https://github.com/apache/spark/pull/8537]

> Implement SQL data source API for reading LIBSVM data
> -----------------------------------------------------
>
>                 Key: SPARK-10117
>                 URL: https://issues.apache.org/jira/browse/SPARK-10117
>             Project: Spark
>          Issue Type: New Feature
>          Components: ML
>            Reporter: Xiangrui Meng
>            Assignee: Kai Sasaki
>             Fix For: 1.6.0
>
>
> It is convenient to implement a data source API for the LIBSVM format, for
> better integration with DataFrames and the ML pipeline API.
> {code}
> import org.apache.spark.ml.source.libsvm._
> val training = sqlContext.read
>   .format("libsvm")
>   .option("numFeatures", "10000")
>   .load("path")
> {code}
> This JIRA covers the following:
> 1. Read LIBSVM data as a DataFrame with two columns: label: Double and
> features: Vector.
> 2. Accept `numFeatures` as an option.
> 3. The implementation should live under `org.apache.spark.ml.source.libsvm`.
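
For readers unfamiliar with the format, here is a minimal, Spark-free sketch of how a single LIBSVM line could map onto the label/features pair described in the issue. It is not the code from pull request 8537: `LabeledRow`, `LibSvmLineSketch`, and `parseLine` are hypothetical names chosen for illustration, and the sparse features are held as plain index/value arrays rather than a Spark `Vector`. It assumes the usual LIBSVM layout of `label index:value ...` with one-based indices, which the sketch shifts to zero-based.

```scala
// Hypothetical sketch only: these names are illustrative and not part of Spark.
// A row holds the label plus a sparse feature vector as (size, indices, values).
final case class LabeledRow(
    label: Double,
    numFeatures: Int,
    indices: Array[Int],
    values: Array[Double])

object LibSvmLineSketch {

  /** Parse one LIBSVM line such as "1.0 1:0.5 3:2.0". */
  def parseLine(line: String, numFeatures: Int): LabeledRow = {
    val tokens = line.trim.split("\\s+")
    require(tokens.head.nonEmpty, "empty line")

    // The first token is the label; the rest are index:value pairs.
    val label = tokens.head.toDouble
    val (indices, values) = tokens.tail.map { item =>
      val Array(index, value) = item.split(":", 2)
      // LIBSVM indices are one-based; shift them to zero-based.
      (index.toInt - 1, value.toDouble)
    }.unzip

    // numFeatures fixes the vector size, so every index must fit inside it.
    require(
      indices.forall(i => i >= 0 && i < numFeatures),
      s"feature index out of range for numFeatures=$numFeatures")

    LabeledRow(label, numFeatures, indices, values)
  }
}
```

Under these assumptions, `LibSvmLineSketch.parseLine("1.0 1:0.5 3:2.0", 10)` yields label `1.0`, indices `0, 2`, and values `0.5, 2.0` in a vector of size 10. Taking the size as a parameter mirrors the `numFeatures` option: the vector dimension is supplied up front instead of being inferred from the data.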