xuyang1706 commented on issue #9355: [FLINK-13577][ml] Add an util class to 
build result row and generate …
URL: https://github.com/apache/flink/pull/9355#issuecomment-539352393
 
 
   > Hi @xuyang1706, I read the javadoc again and I think I have some confusion 
on what you are trying to achieve. please kindly take a look at my comments and 
see if you can provide some examples in test to better clarify? thanks -Rong
   
   Thanks @walterddr. I have refined the javadoc with an example, and I hope it 
makes things clear. Furthermore, I’d like to give three use cases following 
your example.
   
   Input data schema: `["a":INT, "b":INT, "c":BOOL, "d":DOUBLE]`
   Input data values:
   ```
   {{1,3,true,1.0},
    {2,4,false,3.0},
    {3,5,false,5.0}}
   ```
   
   Apply this operator:
   ```
   new MinMaxScaler()
        .setSelectedCols("a","b","d")
        .setOutputCols("a_std","b_std","d_std");
   ```
   MinMaxScaler’s result type is DOUBLE. If no reserved column names are 
defined, it keeps all of the original input columns and appends the output 
columns after them.
   Output data schema: `["a":INT, "b":INT, "c":BOOL, "d":DOUBLE, 
"a_std":DOUBLE, "b_std":DOUBLE, "d_std":DOUBLE]`
   Output data values:
   ```
   {{1,3,true,1.0, 0.0,0.0,0.0},
    {2,4,false,3.0, 0.5,0.5,0.5},
    {3,5,false,5.0, 1.0,1.0,1.0}}
   ```
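
The `_std` values above follow from the usual min-max formula `(x - min) / (max - min)` applied per column. As a rough check, here is a minimal sketch in plain Java; `MinMaxSketch` and `scale` are hypothetical names for illustration and are not the operator's actual implementation:

```java
import java.util.Arrays;

// Hypothetical helper for illustration only; not the Flink MinMaxScaler code.
class MinMaxSketch {
    // Rescales one column to [0, 1] using (x - min) / (max - min).
    static double[] scale(double[] values) {
        double min = Arrays.stream(values).min().orElse(0.0);
        double max = Arrays.stream(values).max().orElse(0.0);
        double range = max - min;
        double[] out = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            // Assumption of this sketch: a constant column (zero range) maps to 0.0.
            out[i] = range == 0.0 ? 0.0 : (values[i] - min) / range;
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(scale(new double[] {1, 2, 3})));       // column a: [0.0, 0.5, 1.0]
        System.out.println(Arrays.toString(scale(new double[] {3, 4, 5})));       // column b: [0.0, 0.5, 1.0]
        System.out.println(Arrays.toString(scale(new double[] {1.0, 3.0, 5.0}))); // column d: [0.0, 0.5, 1.0]
    }
}
```

All three selected columns are evenly spaced, which is why every `_std` column comes out as 0.0, 0.5, 1.0.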
   
   Apply this operator to the original input data:
   ```
   new MinMaxScaler()
        .setSelectedCols("a","b","d");
   ```
   
   MinMaxScaler’s result type is DOUBLE. If no output column names are 
defined, it stores the results in the selected columns and changes the types 
of those columns accordingly.
   Output data schema: `["a":DOUBLE, "b":DOUBLE, "c":BOOL, "d":DOUBLE]`
   Output data values:
   ```
   {{0.0,0.0,true,0.0},
    {0.5,0.5,false,0.5},
    {1.0,1.0,false,1.0}}
   ```
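
The schema change in this case can be pictured as overwriting the type of each selected column while leaving the column order untouched. A minimal sketch, assuming a schema modelled as an ordered name-to-type map; `InPlaceSchemaSketch` and `overwrite` are hypothetical names and do not reflect the util class in this PR:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration; the schema is modelled as an ordered name -> type map.
class InPlaceSchemaSketch {
    // Replaces the type of every selected column, keeping each column's position.
    static Map<String, String> overwrite(Map<String, String> input, String[] selected, String resultType) {
        Map<String, String> out = new LinkedHashMap<>(input);
        for (String col : selected) {
            if (!out.containsKey(col)) {
                throw new IllegalArgumentException("Unknown column: " + col);
            }
            // put() on an existing key in a LinkedHashMap does not change its position.
            out.put(col, resultType);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> input = new LinkedHashMap<>();
        input.put("a", "INT");
        input.put("b", "INT");
        input.put("c", "BOOL");
        input.put("d", "DOUBLE");
        // prints {a=DOUBLE, b=DOUBLE, c=BOOL, d=DOUBLE}
        System.out.println(overwrite(input, new String[] {"a", "b", "d"}, "DOUBLE"));
    }
}
```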
   
   Apply this operator to the original input data:
   ```
   new LogisticRegressionModel()
        .setFeatureCols("a","b","c","d")
        .setPredictionCol("pred")
        .setReservedCols("a", "d");
   ```
   
   The prediction result type is decided by the model, e.g. BOOL. Only the 
reserved columns "a" and "d" are kept, followed by the prediction column.
   Output data schema: `["a":INT, "d":DOUBLE, "pred":BOOL]`
   Output data values:
   ```
   {{1,1.0,false},
    {2,3.0,true},
    {3,5.0,false}}
   ```
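
The schema in this case amounts to filtering the input columns down to the reserved ones and then appending the prediction column. A minimal sketch under the same assumption of an ordered name-to-type map; `ReservedSchemaSketch` and `resolve` are hypothetical names, not the API proposed in this PR:

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of "keep reserved columns, then append the output column".
class ReservedSchemaSketch {
    static Map<String, String> resolve(Map<String, String> input, List<String> reserved,
                                       String outputCol, String outputType) {
        Map<String, String> out = new LinkedHashMap<>();
        // Keep only reserved columns, in their original input order.
        for (Map.Entry<String, String> e : input.entrySet()) {
            if (reserved.contains(e.getKey())) {
                out.put(e.getKey(), e.getValue());
            }
        }
        // Append the output column; if the name matches a reserved column,
        // this sketch simply overwrites that column's type in place.
        out.put(outputCol, outputType);
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> input = new LinkedHashMap<>();
        input.put("a", "INT");
        input.put("b", "INT");
        input.put("c", "BOOL");
        input.put("d", "DOUBLE");
        // prints {a=INT, d=DOUBLE, pred=BOOL}
        System.out.println(resolve(input, Arrays.asList("a", "d"), "pred", "BOOL"));
    }
}
```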
   

