This is an automated email from the ASF dual-hosted git repository.
dongjoon pushed a commit to branch branch-3.2
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/branch-3.2 by this push:
new 786d773 [SPARK-36578][ML] UnivariateFeatureSelector API doc improvement
786d773 is described below
commit 786d773585a6c89bff5ec9c7c73940d0997474bc
Author: Huaxin Gao <[email protected]>
AuthorDate: Thu Aug 26 21:16:49 2021 -0700
[SPARK-36578][ML] UnivariateFeatureSelector API doc improvement
### What changes were proposed in this pull request?
Change API doc for `UnivariateFeatureSelector`
### Why are the changes needed?
Make the doc look better.
### Does this PR introduce _any_ user-facing change?
Yes, API doc change.
### How was this patch tested?
Manually checked
Closes #33855 from huaxingao/ml_doc.
Authored-by: Huaxin Gao <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
(cherry picked from commit 15e42b44423942be75a68993b3e34696ef2b21f6)
Signed-off-by: Dongjoon Hyun <[email protected]>
---
.../org/apache/spark/ml/feature/UnivariateFeatureSelector.scala | 9 ++++++---
python/pyspark/ml/feature.py | 8 +++++---
2 files changed, 11 insertions(+), 6 deletions(-)
diff --git a/mllib/src/main/scala/org/apache/spark/ml/feature/UnivariateFeatureSelector.scala b/mllib/src/main/scala/org/apache/spark/ml/feature/UnivariateFeatureSelector.scala
index 7fff159..7412c42 100644
--- a/mllib/src/main/scala/org/apache/spark/ml/feature/UnivariateFeatureSelector.scala
+++ b/mllib/src/main/scala/org/apache/spark/ml/feature/UnivariateFeatureSelector.scala
@@ -97,12 +97,15 @@ private[feature] trait UnivariateFeatureSelectorParams extends Params
}
/**
- * The user can set `featureType` and labelType`, and Spark will pick the score function based on
- * the specified `featureType` and labelType`.
+ * Feature selector based on univariate statistical tests against labels. Currently, Spark
+ * supports three Univariate Feature Selectors: chi-squared, ANOVA F-test and F-value.
+ * User can choose Univariate Feature Selector by setting `featureType` and `labelType`,
+ * and Spark will pick the score function based on the specified `featureType` and `labelType`.
+ *
* The following combination of `featureType` and `labelType` are supported:
* - `featureType` `categorical` and `labelType` `categorical`: Spark uses chi-squared,
* i.e. chi2 in sklearn.
- * - `featureType` `continuous` and `labelType` `categorical`: Spark uses ANOVATest,
+ * - `featureType` `continuous` and `labelType` `categorical`: Spark uses ANOVA F-test,
* i.e. f_classif in sklearn.
* - `featureType` `continuous` and `labelType` `continuous`: Spark uses F-value,
* i.e. f_regression in sklearn.
diff --git a/python/pyspark/ml/feature.py b/python/pyspark/ml/feature.py
index e066788..cf6b91c 100755
--- a/python/pyspark/ml/feature.py
+++ b/python/pyspark/ml/feature.py
@@ -5816,14 +5816,16 @@ class UnivariateFeatureSelector(JavaEstimator, _UnivariateFeatureSelectorParams,
JavaMLWritable):
"""
UnivariateFeatureSelector
- The user can set `featureType` and `labelType`, and Spark will pick the score function based on
- the specified `featureType` and `labelType`.
+ Feature selector based on univariate statistical tests against labels. Currently, Spark
+ supports three Univariate Feature Selectors: chi-squared, ANOVA F-test and F-value.
+ User can choose Univariate Feature Selector by setting `featureType` and `labelType`,
+ and Spark will pick the score function based on the specified `featureType` and `labelType`.
The following combination of `featureType` and `labelType` are supported:
- `featureType` `categorical` and `labelType` `categorical`, Spark uses chi-squared,
i.e. chi2 in sklearn.
- - `featureType` `continuous` and `labelType` `categorical`, Spark uses ANOVATest,
+ - `featureType` `continuous` and `labelType` `categorical`, Spark uses ANOVA F-test,
i.e. f_classif in sklearn.
- `featureType` `continuous` and `labelType` `continuous`, Spark uses F-value,
i.e. f_regression in sklearn.
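
Editor's note: the three supported combinations listed in the updated doc comment can be summarized as a small lookup. The sketch below is plain Python and is not Spark's implementation; the names SCORE_FUNCTIONS and pick_score_function are hypothetical, and rejecting unlisted pairs is an assumption based only on the doc listing three combinations.

```python
# Illustrative lookup only; this is NOT how Spark implements the selector.
# SCORE_FUNCTIONS and pick_score_function are hypothetical names.

# (featureType, labelType) -> (test named in the doc, sklearn analogue named in the doc)
SCORE_FUNCTIONS = {
    ("categorical", "categorical"): ("chi-squared", "chi2"),
    ("continuous", "categorical"): ("ANOVA F-test", "f_classif"),
    ("continuous", "continuous"): ("F-value", "f_regression"),
}


def pick_score_function(feature_type, label_type):
    """Return (test name, sklearn analogue) for a combination listed in the doc."""
    try:
        return SCORE_FUNCTIONS[(feature_type, label_type)]
    except KeyError:
        # Assumption: combinations not listed in the doc are treated as unsupported.
        raise ValueError(
            f"unsupported combination: featureType={feature_type!r}, "
            f"labelType={label_type!r}"
        ) from None


if __name__ == "__main__":
    print(pick_score_function("continuous", "categorical"))
```

With continuous features and a categorical label, the lookup yields the ANOVA F-test entry, matching the wording this commit introduces in place of "ANOVATest".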