GitHub user mengxr opened a pull request:

    https://github.com/apache/spark/pull/9798

    [SPARK-6787] [ML] add read/write to estimators under ml.feature (1)

    Add read/write support to the following estimators under spark.ml:
    
    * CountVectorizer
    * IDF
    * MinMaxScaler
    * StandardScaler (a little awkward because some params are stored in the spark.mllib model)
    * StringIndexer
    
    Added some necessary methods for read/write. Maybe we should add
    `private[ml] trait DefaultParamsReadable` and `DefaultParamsWritable` to save
    some boilerplate code, though we would still need to override `load` for Java
    compatibility.
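
The boilerplate-saving idea above can be sketched roughly as follows. This is a toy, self-contained illustration and not the code in this PR: `ToyParams`, `ToyWritable`, `ToyReadable`, `ToyScaler`, the `create` hook, and the plain-text `key=value` file format are all hypothetical stand-ins, and the two `DefaultParams*` traits here only borrow the names proposed above. It assumes values contain no newlines.

```scala
import java.nio.charset.StandardCharsets.UTF_8
import java.nio.file.{Files, Paths}

// Toy stand-ins (hypothetical, not Spark classes).
trait ToyParams { def uid: String; def params: Map[String, String] }
trait ToyWritable { def save(path: String): Unit }
trait ToyReadable[T] { def load(path: String): T }

// Mixin that gives any ToyParams instance a default save: uid plus params as key=value lines.
trait DefaultParamsWritable extends ToyWritable { self: ToyParams =>
  override def save(path: String): Unit = {
    val entries = ("uid" -> uid) +: params.toSeq.sortBy(_._1)
    val text = entries.map { case (k, v) => s"$k=$v" }.mkString("\n")
    Files.write(Paths.get(path), text.getBytes(UTF_8))
  }
}

// Mixin for companion objects: a default load that rebuilds the instance from the saved lines.
trait DefaultParamsReadable[T] extends ToyReadable[T] {
  // Hypothetical hook: how to construct T from the uid and the remaining params.
  protected def create(uid: String, params: Map[String, String]): T

  override def load(path: String): T = {
    val text = new String(Files.readAllBytes(Paths.get(path)), UTF_8)
    val kv = text.split("\n").filter(_.nonEmpty).map { line =>
      val Array(k, v) = line.split("=", 2)
      k -> v
    }.toMap
    create(kv("uid"), kv - "uid")
  }
}

class ToyScaler(val uid: String, val params: Map[String, String])
  extends ToyParams with DefaultParamsWritable

object ToyScaler extends DefaultParamsReadable[ToyScaler] {
  protected def create(uid: String, params: Map[String, String]): ToyScaler =
    new ToyScaler(uid, params)

  // Explicit override kept in the companion, mirroring the Java-compatibility concern above.
  override def load(path: String): ToyScaler = super.load(path)
}
```

With these traits, each new class only mixes in the writer and its companion only supplies `create` plus the one-line `load` override, which is the boilerplate the proposal would leave behind.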
    
    @jkbradley 

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mengxr/spark SPARK-6787

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/9798.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #9798
    
----
commit 7286bbb56f8c58ff9428ab07cbeb4f7441ad147a
Author: Xiangrui Meng <[email protected]>
Date:   2015-11-18T01:42:31Z

    add read/write to StringIndexer

commit 60b6a101c0ddb90257d507b86c51d4f24e2b9e5b
Author: Xiangrui Meng <[email protected]>
Date:   2015-11-18T06:06:02Z

    add read/write to IDF

commit 6df87ce47ede182881ded43a5ee4a1955eda57c1
Author: Xiangrui Meng <[email protected]>
Date:   2015-11-18T06:35:29Z

    add read/write to MinMaxScaler

commit 06fedbdf8f572f39680d3d2bc20759362dfe4874
Author: Xiangrui Meng <[email protected]>
Date:   2015-11-18T07:00:11Z

    add read/write to StandardScaler pending test

commit 750f4b76ee346b0fbf7de93ba1536afbf1646419
Author: Xiangrui Meng <[email protected]>
Date:   2015-11-18T07:03:00Z

    Merge remote-tracking branch 'apache/master' into read-write-string-indexer

commit 7b62e6ca96a4ef8ff73cc5de6df504aedd04cff7
Author: Xiangrui Meng <[email protected]>
Date:   2015-11-18T07:09:05Z

    add tests for StandardScaler read/write

commit 21e514f855ed5aa1579f10407b79a90c870d86ee
Author: Xiangrui Meng <[email protected]>
Date:   2015-11-18T07:28:37Z

    add read/write to CountVectorizer

----

