Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/20257#discussion_r161928937
--- Diff:
examples/src/main/java/org/apache/spark/examples/ml/JavaOneHotEncoderEstimatorExample.java
---
@@ -35,41 +34,37 @@
import org.apache.spark.sql.types.StructType;
// $example off$
-public class JavaOneHotEncoderExample {
+public class JavaOneHotEncoderEstimatorExample {
public static void main(String[] args) {
SparkSession spark = SparkSession
.builder()
- .appName("JavaOneHotEncoderExample")
+ .appName("JavaOneHotEncoderEstimatorExample")
.getOrCreate();
// $example on$
+ // Notice: this categorical features are usually encoded with
`StringIndexer`.
List<Row> data = Arrays.asList(
- RowFactory.create(0, "a"),
- RowFactory.create(1, "b"),
- RowFactory.create(2, "c"),
- RowFactory.create(3, "a"),
- RowFactory.create(4, "a"),
- RowFactory.create(5, "c")
+ RowFactory.create(0.0, 1.0),
+ RowFactory.create(1.0, 0.0),
+ RowFactory.create(2.0, 1.0),
+ RowFactory.create(0.0, 2.0),
+ RowFactory.create(0.0, 1.0),
+ RowFactory.create(2.0, 0.0)
);
StructType schema = new StructType(new StructField[]{
- new StructField("id", DataTypes.IntegerType, false,
Metadata.empty()),
- new StructField("category", DataTypes.StringType, false,
Metadata.empty())
+ new StructField("categoryIndex1", DataTypes.DoubleType, false,
Metadata.empty()),
+ new StructField("categoryIndex2", DataTypes.DoubleType, false,
Metadata.empty())
--- End diff --
Since this is a Java example, the default parameter value doesn't seem to work (Scala default arguments are not visible to Java callers):
```
error: no suitable constructor found for
StructField(String,DataType,boolean)
[error] new StructField("categoryIndex1", DataTypes.DoubleType,
false),
[error] ^
[error] /root/repos/spark-1/constructor
StructField.StructField(String,DataType,boolean,Metadata) is not applicable
[error] (actual and formal argument lists differ in length)
[error] constructor StructField.StructField() is not applicable
```
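For reference, here is a minimal sketch of two ways a Java caller can build the same schema without relying on Scala's default argument for `metadata`. The class name `StructFieldFromJava` and its method names are hypothetical and not part of the pull request. The sketch assumes Spark SQL is on the classpath.

```java
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.Metadata;
import org.apache.spark.sql.types.StructField;
import org.apache.spark.sql.types.StructType;

// Hypothetical helper class, for illustration only.
public class StructFieldFromJava {

  // Option 1: call the full four-argument constructor and pass
  // Metadata.empty() explicitly, as the diff above does.
  public static StructType explicitMetadata() {
    return new StructType(new StructField[]{
      new StructField("categoryIndex1", DataTypes.DoubleType, false, Metadata.empty()),
      new StructField("categoryIndex2", DataTypes.DoubleType, false, Metadata.empty())
    });
  }

  // Option 2: use the Java-friendly factory methods on DataTypes.
  // The three-argument createStructField fills in empty metadata.
  public static StructType factoryMethod() {
    return DataTypes.createStructType(new StructField[]{
      DataTypes.createStructField("categoryIndex1", DataTypes.DoubleType, false),
      DataTypes.createStructField("categoryIndex2", DataTypes.DoubleType, false)
    });
  }

  public static void main(String[] args) {
    // Print both schemas so they can be compared by eye.
    System.out.println(explicitMetadata().treeString());
    System.out.println(factoryMethod().treeString());
  }
}
```

Either form compiles from Java. The first matches what the example file already does, so keeping `Metadata.empty()` in the diff looks like the simplest choice.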