GitHub user MLnick commented on a diff in the pull request:
https://github.com/apache/spark/pull/20257#discussion_r161740882
--- Diff: examples/src/main/python/ml/onehot_encoder_estimator_example.py ---
@@ -18,32 +18,31 @@
from __future__ import print_function
# $example on$
-from pyspark.ml.feature import OneHotEncoder, StringIndexer
+from pyspark.ml.feature import OneHotEncoderEstimator
# $example off$
from pyspark.sql import SparkSession
if __name__ == "__main__":
    spark = SparkSession\
        .builder\
-        .appName("OneHotEncoderExample")\
+        .appName("OneHotEncoderEstimatorExample")\
        .getOrCreate()
    # $example on$
+    # Notice: this categorical features are usually encoded with `StringIndexer`.
--- End diff --
Same applies here
---