GitHub user mengxr opened a pull request:
https://github.com/apache/spark/pull/9798
[SPARK-6787] [ML] add read/write to estimators under ml.feature (1)
Add read/write support to the following estimators under spark.ml:
* CountVectorizer
* IDF
* MinMaxScaler
* StandardScaler (a little awkward because some params are stored in the
spark.mllib model)
* StringIndexer
Added some necessary methods for read/write. Maybe we should add
`private[ml] trait DefaultParamsReadable` and `DefaultParamsWritable` to save
some boilerplate code, though we still need to override `load` for Java
compatibility.
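
To make the boilerplate-saving idea concrete, here is a minimal, self-contained sketch of how such mix-in traits could look. It is not the code in this PR and does not use Spark: `Writer`, `Reader`, `HasParams`, `ToyScaler`, the `create` factory, and the plain-text `key=value` file format are all hypothetical stand-ins chosen for illustration. Only the trait names `DefaultParamsReadable` and `DefaultParamsWritable` and the explicit `load` override come from the description above.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Paths}

// Hypothetical, simplified stand-ins; these are NOT Spark's actual classes.
trait Writer { def save(path: String): Unit }
trait Reader[T] { def load(path: String): T }

// Anything with a uid and a flat map of string-valued params.
trait HasParams {
  def uid: String
  def params: Map[String, String]
}

// Mix-in for instances: supplies a default writer that stores the uid and
// one "key=value" line per param, so each class need not write its own.
trait DefaultParamsWritable { self: HasParams =>
  def write: Writer = new Writer {
    def save(path: String): Unit = {
      val lines = s"uid=$uid" +: params.toSeq.sorted.map { case (k, v) => s"$k=$v" }
      Files.write(Paths.get(path), lines.mkString("\n").getBytes(StandardCharsets.UTF_8))
    }
  }
}

// Mix-in for companion objects: supplies a default reader, given a factory
// that rebuilds an instance from the stored uid and params.
trait DefaultParamsReadable[T] {
  protected def create(uid: String, params: Map[String, String]): T

  def read: Reader[T] = new Reader[T] {
    def load(path: String): T = {
      val text = new String(Files.readAllBytes(Paths.get(path)), StandardCharsets.UTF_8)
      val kv = text.split("\n").filter(_.nonEmpty).map { line =>
        val idx = line.indexOf('=') // split at the first '=' only
        line.substring(0, idx) -> line.substring(idx + 1)
      }.toMap
      create(kv("uid"), kv - "uid")
    }
  }

  def load(path: String): T = read.load(path)
}

// A toy estimator that gets read/write support purely from the mix-ins.
class ToyScaler(val uid: String, val params: Map[String, String])
  extends HasParams with DefaultParamsWritable

object ToyScaler extends DefaultParamsReadable[ToyScaler] {
  protected def create(uid: String, params: Map[String, String]): ToyScaler =
    new ToyScaler(uid, params)

  // Explicit override, mirroring the note above that `load` still has to be
  // overridden for Java compatibility.
  override def load(path: String): ToyScaler = super.load(path)
}
```

With these definitions, `new ToyScaler("scaler_1", Map("min" -> "0.0")).write.save(path)` followed by `ToyScaler.load(path)` round-trips the uid and params. The per-class code shrinks to the factory plus the one-line `load` override.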
@jkbradley
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/mengxr/spark SPARK-6787
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/9798.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #9798
----
commit 7286bbb56f8c58ff9428ab07cbeb4f7441ad147a
Author: Xiangrui Meng <[email protected]>
Date: 2015-11-18T01:42:31Z
add read/write to StringIndexer
commit 60b6a101c0ddb90257d507b86c51d4f24e2b9e5b
Author: Xiangrui Meng <[email protected]>
Date: 2015-11-18T06:06:02Z
add read/write to IDF
commit 6df87ce47ede182881ded43a5ee4a1955eda57c1
Author: Xiangrui Meng <[email protected]>
Date: 2015-11-18T06:35:29Z
add read/write to MinMaxScaler
commit 06fedbdf8f572f39680d3d2bc20759362dfe4874
Author: Xiangrui Meng <[email protected]>
Date: 2015-11-18T07:00:11Z
add read/write to StandardScaler pending test
commit 750f4b76ee346b0fbf7de93ba1536afbf1646419
Author: Xiangrui Meng <[email protected]>
Date: 2015-11-18T07:03:00Z
Merge remote-tracking branch 'apache/master' into read-write-string-indexer
commit 7b62e6ca96a4ef8ff73cc5de6df504aedd04cff7
Author: Xiangrui Meng <[email protected]>
Date: 2015-11-18T07:09:05Z
add tests for StandardScaler read/write
commit 21e514f855ed5aa1579f10407b79a90c870d86ee
Author: Xiangrui Meng <[email protected]>
Date: 2015-11-18T07:28:37Z
add read/write to CountVectorizer
----