GitHub user WeichenXu123 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20257#discussion_r161857103
  
    --- Diff: examples/src/main/java/org/apache/spark/examples/ml/JavaOneHotEncoderEstimatorExample.java ---
    @@ -35,41 +34,37 @@
     import org.apache.spark.sql.types.StructType;
     // $example off$
     
    -public class JavaOneHotEncoderExample {
    +public class JavaOneHotEncoderEstimatorExample {
       public static void main(String[] args) {
         SparkSession spark = SparkSession
           .builder()
    -      .appName("JavaOneHotEncoderExample")
    +      .appName("JavaOneHotEncoderEstimatorExample")
           .getOrCreate();
     
         // $example on$
    +    // Notice: this categorical features are usually encoded with `StringIndexer`.
         List<Row> data = Arrays.asList(
    -      RowFactory.create(0, "a"),
    -      RowFactory.create(1, "b"),
    -      RowFactory.create(2, "c"),
    -      RowFactory.create(3, "a"),
    -      RowFactory.create(4, "a"),
    -      RowFactory.create(5, "c")
    +      RowFactory.create(0.0, 1.0),
    +      RowFactory.create(1.0, 0.0),
    +      RowFactory.create(2.0, 1.0),
    +      RowFactory.create(0.0, 2.0),
    +      RowFactory.create(0.0, 1.0),
    +      RowFactory.create(2.0, 0.0)
         );
     
         StructType schema = new StructType(new StructField[]{
    -      new StructField("id", DataTypes.IntegerType, false, Metadata.empty()),
    -      new StructField("category", DataTypes.StringType, false, Metadata.empty())
    +      new StructField("categoryIndex1", DataTypes.DoubleType, false, Metadata.empty()),
    +      new StructField("categoryIndex2", DataTypes.DoubleType, false, Metadata.empty())
    --- End diff --
    
    There is no need to pass `Metadata.empty()` explicitly; it is the default value.
    It would be better to keep the example code simpler.
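
For illustration, here is a minimal sketch of what a simpler schema definition might look like in Java. It rests on an assumption that goes beyond the comment above: Scala default arguments are not directly usable from a Java constructor call, so the sketch uses the `DataTypes.createStructField(name, dataType, nullable)` factory, which creates a field with empty metadata. The class name `SimplerSchemaSketch` and the helper `buildSchema` are hypothetical and not part of the pull request.

```java
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.Metadata;
import org.apache.spark.sql.types.StructField;
import org.apache.spark.sql.types.StructType;

// Hypothetical class name, used only for this sketch.
public class SimplerSchemaSketch {

  // Hypothetical helper: builds the two-column schema from the diff
  // without spelling out Metadata.empty() for each field.
  static StructType buildSchema() {
    return new StructType(new StructField[]{
      // The three-argument factory leaves the metadata empty.
      DataTypes.createStructField("categoryIndex1", DataTypes.DoubleType, false),
      DataTypes.createStructField("categoryIndex2", DataTypes.DoubleType, false)
    });
  }

  public static void main(String[] args) {
    StructType schema = buildSchema();
    // Print the schema so the field names, types, and nullability can be checked.
    System.out.println(schema.treeString());
  }
}
```

Whether this reads as simpler than the explicit constructor is a judgment call; the resulting fields should be equivalent to the ones in the diff, only with the metadata argument left implicit.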

