[GitHub] spark pull request #20257: [SPARK-23048][ML] Add OneHotEncoderEstimator docu...

MLnick Tue, 16 Jan 2018 04:26:59 -0800

Github user MLnick commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20257#discussion_r161740612
  
    --- Diff: examples/src/main/java/org/apache/spark/examples/ml/JavaOneHotEncoderEstimatorExample.java ---
    @@ -35,41 +34,37 @@
     import org.apache.spark.sql.types.StructType;
     // $example off$
     
    -public class JavaOneHotEncoderExample {
    +public class JavaOneHotEncoderEstimatorExample {
       public static void main(String[] args) {
         SparkSession spark = SparkSession
           .builder()
    -      .appName("JavaOneHotEncoderExample")
    +      .appName("JavaOneHotEncoderEstimatorExample")
           .getOrCreate();
     
         // $example on$
    +    // Notice: this categorical features are usually encoded with `StringIndexer`.
    --- End diff --
    
    Perhaps we can move the note above the `$example on$` marker. I don't think it needs to appear in the user guide, as we've already mentioned it above.
    
    Also, perhaps reword it as: `Note: categorical features are usually first encoded with StringIndexer`
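
    To illustrate why moving the note above `$example on$` would keep it out of the user guide, here is a minimal sketch of marker-based extraction. It is not Spark's actual documentation tooling: the class `ExampleMarkerSketch` and the helper `extractExample` are hypothetical, and the sketch only assumes that the guide shows the lines between the `$example on$` and `$example off$` markers.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ExampleMarkerSketch {
  // Hypothetical helper (not part of Spark): keeps only the lines that sit
  // between an "$example on$" marker and the next "$example off$" marker.
  static List<String> extractExample(List<String> source) {
    List<String> kept = new ArrayList<>();
    boolean inExample = false;
    for (String line : source) {
      if (line.contains("$example on$")) {
        inExample = true;
      } else if (line.contains("$example off$")) {
        inExample = false;
      } else if (inExample) {
        kept.add(line);
      }
    }
    return kept;
  }

  public static void main(String[] args) {
    // Placeholder source lines; the note sits above the "on" marker.
    List<String> source = Arrays.asList(
        "// Note: categorical features are usually first encoded with StringIndexer",
        "// $example on$",
        "OneHotEncoderEstimator encoder = new OneHotEncoderEstimator();",
        "// $example off$",
        "spark.stop();");
    // Only the encoder line is printed; the note is excluded.
    for (String line : extractExample(source)) {
      System.out.println(line);
    }
  }
}
```

    With the note placed before the marker, it remains in the source file for readers of the code but is not part of the extracted snippet.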



---

