GitHub user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/20257#discussion_r162043736
--- Diff: docs/ml-features.md ---
@@ -783,11 +783,11 @@ Because this existing `OneHotEncoder` is a stateless
transformer, it is not usab
## OneHotEncoderEstimator
-[One-hot encoding](http://en.wikipedia.org/wiki/One-hot) maps a column of
label indices to a column of binary vectors, and each output binary vector
includes at most a single one-value. This encoding allows algorithms which
expect continuous features, such as Logistic Regression, to use categorical
features. For string type input data, it is common to encode categorical
features using [StringIndexer](ml-features.html#stringindexer) first.
+[One-hot encoding](http://en.wikipedia.org/wiki/One-hot) maps a
categorical feature, represented as a label index, to a binary vector with at
most a single one-value indicating the presence of a specific feature value
from among the set of all feature values.
--- End diff --
No problem. Added it back.
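
For readers of the archive, the mapping described in the diff can be sketched in plain Python. This is only an illustration and not Spark's implementation: the function name `one_hot` and its `drop_last` parameter are hypothetical, chosen here to show why a vector may contain "at most" a single one-value (when the last category is dropped, it is encoded as all zeros).

```python
def one_hot(index, num_categories, drop_last=True):
    """Encode a label index as a dense list of 0.0/1.0 values.

    Hypothetical helper for illustration only; not part of any Spark API.
    """
    if not 0 <= index < num_categories:
        raise ValueError("index must be in [0, num_categories)")
    # With drop_last, the vector has one fewer slot and the last
    # category is represented by the all-zeros vector.
    size = num_categories - 1 if drop_last else num_categories
    vec = [0.0] * size
    if index < size:
        vec[index] = 1.0
    return vec


if __name__ == "__main__":
    # Three categories with label indices 0, 1, 2.
    for i in range(3):
        print(i, one_hot(i, 3), one_hot(i, 3, drop_last=False))
```

Dropping the last category keeps the encoded columns from always summing to one, which is the usual motivation for such an option in linear models.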