GitHub user jkbradley opened a pull request:
https://github.com/apache/spark/pull/8752
[SPARK-10595] [ML] [MLLIB] [DOCS] Various ML guide cleanups
Various ML guide cleanups.
* ml-guide.md: Make it easier to access the algorithm-specific guides.
* LDA user guide: EM often begins with useless topics, but running longer
generally improves them dramatically. E.g., 10 iterations on a Wikipedia
dataset produces useless topics, but 50 iterations produces very meaningful
topics.
* mllib-feature-extraction.html#elementwiseproduct: "w" parameter
should be "scalingVec"
* Clean up Binarizer user guide a little.
* Document in Pipeline that users should not put an instance into the
Pipeline in more than 1 place.
* spark.ml Word2Vec user guide: clean up grammar/writing
* Chi Sq Feature Selector docs: Improve text in doc.
CC: @mengxr @feynmanliang
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/jkbradley/spark mlguide-fixes-1.5
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/8752.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #8752
----
commit 53d757a74f156893f2fafc5c65624acfb2920ffa
Author: Joseph K. Bradley <[email protected]>
Date: 2015-09-14T18:50:51Z
ml-guide.md: Make it easier to access the algorithm-specific guides.
LDA user guide: EM often begins with useless topics, but running longer
generally improves them dramatically. E.g., 10 iterations on a Wikipedia
dataset produces useless topics, but 50 iterations produces very meaningful
topics.
mllib-feature-extraction.html#elementwiseproduct
* "w" parameter should be "scalingVec"
Clean up Binarizer user guide a little.
Document in Pipeline that users should not put an instance into the
Pipeline in more than 1 place.
spark.ml Word2Vec user guide:
* clean up grammar/writing
Chi Sq Feature Selector docs
* Improve text in doc.
----