[flink-ml] branch master updated: [FLINK-29171] Add documents for MaxAbs, FeatureHasher, Interaction, VectorSlicer, ElementwiseProduct and Binarizer

zhangzp Mon, 05 Sep 2022 02:35:14 -0700

This is an automated email from the ASF dual-hosted git repository.

zhangzp pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/flink-ml.git



The following commit(s) were added to refs/heads/master by this push:
     new e007030  [FLINK-29171] Add documents for MaxAbs, FeatureHasher, 
Interaction, VectorSlicer, ElementwiseProduct and Binarizer
e007030 is described below

commit e007030da6e4a35ef0d11c3ad0c76cf723f17d63
Author: weibo <wbz...@pku.edu.cn>
AuthorDate: Mon Sep 5 17:33:55 2022 +0800

    [FLINK-29171] Add documents for MaxAbs, FeatureHasher, Interaction, 
VectorSlicer, ElementwiseProduct and Binarizer
    
    This closes #151.
---
 docs/content/docs/operators/classification/knn.md  |  14 +-
 .../docs/operators/classification/linearsvc.md     |  28 ++--
 .../operators/classification/logisticregression.md |  40 ++---
 .../docs/operators/classification/naivebayes.md    |  24 +--
 .../clustering/agglomerativeclustering.md          |   3 +-
 docs/content/docs/operators/clustering/kmeans.md   |  12 +-
 .../evaluation/binaryclassificationevaluator.md    |  34 ++--
 docs/content/docs/operators/feature/binarizer.md   | 183 +++++++++++++++++++++
 docs/content/docs/operators/feature/bucketizer.md  |  24 +--
 .../docs/operators/feature/elementwiseproduct.md   | 157 ++++++++++++++++++
 .../docs/operators/feature/featurehasher.md        | 177 ++++++++++++++++++++
 docs/content/docs/operators/feature/hashingtf.md   | 165 +++++++++++++++++++
 docs/content/docs/operators/feature/interaction.md | 169 +++++++++++++++++++
 .../feature/{minmaxscaler.md => maxabsscaler.md}   |  76 ++++-----
 .../content/docs/operators/feature/minmaxscaler.md |  14 +-
 .../docs/operators/feature/onehotencoder.md        |  24 +--
 .../docs/operators/feature/standardscaler.md       |  14 +-
 .../docs/operators/feature/stringindexer.md        |  22 +--
 .../docs/operators/feature/vectorassembler.md      |  14 +-
 .../content/docs/operators/feature/vectorslicer.md | 158 ++++++++++++++++++
 .../docs/operators/regression/linearregression.md  |  16 +-
 21 files changed, 1188 insertions(+), 180 deletions(-)

diff --git a/docs/content/docs/operators/classification/knn.md 
b/docs/content/docs/operators/classification/knn.md
index 5af83f3..9d51c4e 100644
--- a/docs/content/docs/operators/classification/knn.md
+++ b/docs/content/docs/operators/classification/knn.md
@@ -32,16 +32,16 @@ to that label.
 
 ### Input Columns
 
-| Param name  | Type    | Default      | Description      |
-| :---------- | :------ | :----------- | :--------------- |
-| featuresCol | Vector  | `"features"` | Feature vector   |
-| labelCol    | Integer | `"label"`    | Label to predict |
+| Param name  | Type    | Default      | Description       |
+| :---------- | :------ | :----------- |:------------------|
+| featuresCol | Vector  | `"features"` | Feature vector.   |
+| labelCol    | Integer | `"label"`    | Label to predict. |
 
 ### Output Columns
 
-| Param name    | Type    | Default        | Description     |
-| :------------ | :------ | :------------- | :-------------- |
-| predictionCol | Integer | `"prediction"` | Predicted label |
+| Param name    | Type    | Default        | Description      |
+| :------------ | :------ | :------------- |:-----------------|
+| predictionCol | Integer | `"prediction"` | Predicted label. |
 
 ### Parameters
 
diff --git a/docs/content/docs/operators/classification/linearsvc.md 
b/docs/content/docs/operators/classification/linearsvc.md
index c2530d5..6808578 100644
--- a/docs/content/docs/operators/classification/linearsvc.md
+++ b/docs/content/docs/operators/classification/linearsvc.md
@@ -31,28 +31,28 @@ a hyperplane to maximize the distance between classified 
samples.
 
 ### Input Columns
 
-| Param name  | Type    | Default      | Description      |
-| :---------- | :------ | :----------- | :--------------- |
-| featuresCol | Vector  | `"features"` | Feature vector   |
-| labelCol    | Integer | `"label"`    | Label to predict |
-| weightCol   | Double  | `"weight"`   | Weight of sample |
+| Param name  | Type    | Default      | Description       |
+| :---------- | :------ | :----------- |:------------------|
+| featuresCol | Vector  | `"features"` | Feature vector.   |
+| labelCol    | Integer | `"label"`    | Label to predict. |
+| weightCol   | Double  | `"weight"`   | Weight of sample. |
 
 ### Output Columns
 
-| Param name       | Type    | Default           | Description                 
            |
-| :--------------- | :------ | :---------------- | 
:-------------------------------------- |
-| predictionCol    | Integer | `"prediction"`    | Label of the max 
probability            |
-| rawPredictionCol | Vector  | `"rawPrediction"` | Vector of the probability 
of each label |
+| Param name       | Type    | Default           | Description                 
             |
+| :--------------- | :------ | :---------------- 
|:-----------------------------------------|
+| predictionCol    | Integer | `"prediction"`    | Label of the max 
probability.            |
+| rawPredictionCol | Vector  | `"rawPrediction"` | Vector of the probability 
of each label. |
 
 ### Parameters
 
 Below are the parameters required by `LinearSVCModel`.
 
-| Key              | Default           | Type   | Required | Description       
                                           |
-| ---------------- | ----------------- | ------ | -------- | 
------------------------------------------------------------ |
-| featuresCol      | `"features"`      | String | no       | Features column 
name.                                        |
-| predictionCol    | `"prediction"`    | String | no       | Prediction column 
name.                                      |
-| rawPredictionCol | `"rawPrediction"` | String | no       | Raw prediction 
column name.                                  |
+| Key              | Default           | Type   | Required | Description       
                                                      |
+|------------------|-------------------|--------|----------|-------------------------------------------------------------------------|
+| featuresCol      | `"features"`      | String | no       | Features column 
name.                                                   |
+| predictionCol    | `"prediction"`    | String | no       | Prediction column 
name.                                                 |
+| rawPredictionCol | `"rawPrediction"` | String | no       | Raw prediction 
column name.                                             |
 | threshold        | `0.0`             | Double | no       | Threshold in 
binary classification prediction applied to rawPrediction. |
 
 `LinearSVC` needs parameters above and also below.
diff --git a/docs/content/docs/operators/classification/logisticregression.md 
b/docs/content/docs/operators/classification/logisticregression.md
index 26818e3..7f73664 100644
--- a/docs/content/docs/operators/classification/logisticregression.md
+++ b/docs/content/docs/operators/classification/logisticregression.md
@@ -30,18 +30,18 @@ widely used to predict a binary response.
 
 ### Input Columns
 
-| Param name  | Type    | Default      | Description      |
-| :---------- | :------ | :----------- | :--------------- |
-| featuresCol | Vector  | `"features"` | Feature vector   |
-| labelCol    | Integer | `"label"`    | Label to predict |
-| weightCol   | Double  | `"weight"`   | Weight of sample |
+| Param name  | Type    | Default      | Description       |
+| :---------- | :------ | :----------- |:------------------|
+| featuresCol | Vector  | `"features"` | Feature vector.   |
+| labelCol    | Integer | `"label"`    | Label to predict. |
+| weightCol   | Double  | `"weight"`   | Weight of sample. |
 
 ### Output Columns
 
-| Param name       | Type    | Default           | Description                 
            |
-| :--------------- | :------ | :---------------- | 
:-------------------------------------- |
-| predictionCol    | Integer | `"prediction"`    | Label of the max 
probability            |
-| rawPredictionCol | Vector  | `"rawPrediction"` | Vector of the probability 
of each label |
+| Param name       | Type    | Default           | Description                 
             |
+| :--------------- | :------ | :---------------- 
|:-----------------------------------------|
+| predictionCol    | Integer | `"prediction"`    | Label of the max 
probability.            |
+| rawPredictionCol | Vector  | `"rawPrediction"` | Vector of the probability 
of each label. |
 
 ### Parameters
 
@@ -55,17 +55,17 @@ Below are the parameters required by 
`LogisticRegressionModel`.
 
 `LogisticRegression` needs parameters above and also below.
 
-| Key             | Default   | Type    | Required | Description               
                                   |
-| --------------- | --------- | ------- | -------- | 
------------------------------------------------------------ |
-| labelCol        | `"label"` | String  | no       | Label column name.        
                                   |
-| weightCol       | `null`    | String  | no       | Weight column name.       
                                   |
-| maxIter         | `20`      | Integer | no       | Maximum number of 
iterations.                                |
-| reg             | `0.`      | Double  | no       | Regularization parameter. 
                                   |
-| elasticNet      | `0.`      | Double  | no       | ElasticNet parameter.     
                                   |
-| learningRate    | `0.1`     | Double  | no       | Learning rate of 
optimization method.                        |
-| globalBatchSize | `32`      | Integer | no       | Global batch size of 
training algorithms.                    |
-| tol             | `1e-6`    | Double  | no       | Convergence tolerance for 
iterative algorithms.              |
-| multiClass      | `"auto"`  | String  | no       | Classification type. 
Supported values: "auto", "binomial", "multinomial" |
+| Key             | Default   | Type    | Required | Description               
                                                |
+|-----------------|-----------|---------|----------|---------------------------------------------------------------------------|
+| labelCol        | `"label"` | String  | no       | Label column name.        
                                                |
+| weightCol       | `null`    | String  | no       | Weight column name.       
                                                |
+| maxIter         | `20`      | Integer | no       | Maximum number of 
iterations.                                             |
+| reg             | `0.`      | Double  | no       | Regularization parameter. 
                                                |
+| elasticNet      | `0.`      | Double  | no       | ElasticNet parameter.     
                                                |
+| learningRate    | `0.1`     | Double  | no       | Learning rate of 
optimization method.                                     |
+| globalBatchSize | `32`      | Integer | no       | Global batch size of 
training algorithms.                                 |
+| tol             | `1e-6`    | Double  | no       | Convergence tolerance for 
iterative algorithms.                           |
+| multiClass      | `"auto"`  | String  | no       | Classification type. 
Supported values: "auto", "binomial", "multinomial". |
 
 ### Examples
 {{< tabs examples >}}
diff --git a/docs/content/docs/operators/classification/naivebayes.md 
b/docs/content/docs/operators/classification/naivebayes.md
index b913c3f..756c3c0 100644
--- a/docs/content/docs/operators/classification/naivebayes.md
+++ b/docs/content/docs/operators/classification/naivebayes.md
@@ -30,26 +30,26 @@ there is strong (naive) independence between every pair of 
features.
 
 ### Input Columns
 
-| Param name  | Type    | Default      | Description      |
-| :---------- | :------ | :----------- | :--------------- |
-| featuresCol | Vector  | `"features"` | Feature vector   |
-| labelCol    | Integer | `"label"`    | Label to predict |
+| Param name  | Type    | Default      | Description       |
+| :---------- | :------ | :----------- |:------------------|
+| featuresCol | Vector  | `"features"` | Feature vector.   |
+| labelCol    | Integer | `"label"`    | Label to predict. |
 
 ### Output Columns
 
-| Param name    | Type    | Default        | Description     |
-| :------------ | :------ | :------------- | :-------------- |
-| predictionCol | Integer | `"prediction"` | Predicted label |
+| Param name    | Type    | Default        | Description      |
+| :------------ | :------ | :------------- |:-----------------|
+| predictionCol | Integer | `"prediction"` | Predicted label. |
 
 ### Parameters
 
 Below are parameters required by `NaiveBayesModel`.
 
-| Key           | Default         | Type   | Required | Description            
                         |
-| ------------- | --------------- | ------ | -------- | 
----------------------------------------------- |
-| modelType     | `"multinomial"` | String | no       | The model type. 
Supported values: "multinomial" |
-| featuresCol   | `"features"`    | String | no       | Features column name.  
                         |
-| predictionCol | `"prediction"`  | String | no       | Prediction column 
name.                         |
+| Key           | Default         | Type   | Required | Description            
                          |
+| ------------- | --------------- | ------ | -------- 
|--------------------------------------------------|
+| modelType     | `"multinomial"` | String | no       | The model type. 
Supported values: "multinomial". |
+| featuresCol   | `"features"`    | String | no       | Features column name.  
                          |
+| predictionCol | `"prediction"`  | String | no       | Prediction column 
name.                          |
 
 `NaiveBayes` needs parameters above and also below.
 
diff --git a/docs/content/docs/operators/clustering/agglomerativeclustering.md 
b/docs/content/docs/operators/clustering/agglomerativeclustering.md
index c2988b5..7bca4b9 100644
--- a/docs/content/docs/operators/clustering/agglomerativeclustering.md
+++ b/docs/content/docs/operators/clustering/agglomerativeclustering.md
@@ -120,8 +120,7 @@ public class AgglomerativeClusteringExample {
 
 {{< tab "Python">}}
 ```python
-# Simple program that creates a Bucketizer instance and uses it for feature
-# engineering.
+# Simple program that creates an agglomerativeclustering instance and uses it 
for clustering.
 
 from pyflink.common import Types
 from pyflink.datastream import StreamExecutionEnvironment
diff --git a/docs/content/docs/operators/clustering/kmeans.md 
b/docs/content/docs/operators/clustering/kmeans.md
index f5c4bd2..68ee678 100644
--- a/docs/content/docs/operators/clustering/kmeans.md
+++ b/docs/content/docs/operators/clustering/kmeans.md
@@ -30,15 +30,15 @@ into a predefined number of clusters.
 
 ### Input Columns
 
-| Param name  | Type   | Default      | Description    |
-|:------------|:-------|:-------------|:---------------|
-| featuresCol | Vector | `"features"` | Feature vector |
+| Param name  | Type   | Default      | Description     |
+|:------------|:-------|:-------------|:----------------|
+| featuresCol | Vector | `"features"` | Feature vector. |
 
 ### Output Columns
 
-| Param name    | Type    | Default        | Description              |
-|:--------------|:--------|:---------------|:-------------------------|
-| predictionCol | Integer | `"prediction"` | Predicted cluster center |
+| Param name    | Type    | Default        | Description               |
+|:--------------|:--------|:---------------|:--------------------------|
+| predictionCol | Integer | `"prediction"` | Predicted cluster center. |
 
 ### Parameters
 
diff --git 
a/docs/content/docs/operators/evaluation/binaryclassificationevaluator.md 
b/docs/content/docs/operators/evaluation/binaryclassificationevaluator.md
index a30189a..0fd657e 100644
--- a/docs/content/docs/operators/evaluation/binaryclassificationevaluator.md
+++ b/docs/content/docs/operators/evaluation/binaryclassificationevaluator.md
@@ -35,29 +35,29 @@ predictions, scores, or label probabilities). The output 
may contain different
 metrics defined by the parameter `MetricsNames`.
 ### Input Columns
 
-| Param name       | Type          | Default         | Description             
  |
-| :--------------- | :------------ | :-------------- | 
:------------------------ |
-| labelCol         | Number        | `"label"`       | The label of this entry 
  |
-| rawPredictionCol | Vector/Number | `rawPrediction` | The raw prediction 
result |
-| weightCol        | Number        | `null`          | The weight of this 
entry  |
+| Param name       | Type          | Default         | Description             
   |
+| :--------------- | :------------ | :-------------- 
|:---------------------------|
+| labelCol         | Number        | `"label"`       | The label of this 
entry.   |
+| rawPredictionCol | Vector/Number | `rawPrediction` | The raw prediction 
result. |
+| weightCol        | Number        | `null`          | The weight of this 
entry.  |
 
 ### Output Columns
 
-| Column name       | Type   | Description                                     
             |
-| ----------------- | ------ | 
------------------------------------------------------------ |
-| "areaUnderROC"    | Double | the area under the receiver operating 
characteristic (ROC) curve |
-| "areaUnderPR"     | Double | the area under the precision-recall curve       
             |
-| "areaUnderLorenz" | Double | Kolmogorov-Smirnov, measures the ability of the 
model to separate positive and negative samples |
-| "ks"              | Double | the area under the lorenz curve                 
             |
+| Column name       | Type   | Description                                     
                                                 |
+| ----------------- | ------ 
|--------------------------------------------------------------------------------------------------|
+| "areaUnderROC"    | Double | The area under the receiver operating 
characteristic (ROC) curve.                                |
+| "areaUnderPR"     | Double | The area under the precision-recall curve.      
                                                 |
+| "areaUnderLorenz" | Double | Kolmogorov-Smirnov, measures the ability of the 
model to separate positive and negative samples. |
+| "ks"              | Double | The area under the lorenz curve.                
                                                 |
 
 ### Parameters
 
-| Key              | Default                                                   
   | Type         | Required | Description                  |
-| ---------------- | 
------------------------------------------------------------ | ------------ | 
-------- | ---------------------------- |
-| labelCol         | `"label"`                                                 
   | String       | no       | Label column name.           |
-| weightCol        | `null`                                                    
   | String       | no       | Weight column name.          |
-| rawPredictionCol | `"rawPrediction"`                                         
   | String       | no       | Raw prediction column name.  |
-| metricsNames     | `[BinaryClassificationEvaluatorParams.AREA_UNDER_ROC, 
BinaryClassificationEvaluatorParams.AREA_UNDER_PR]` | String Array | no       | 
Names of the output metrics. |
+| Key              | Default                           | Type     | Required | 
Description                                                                     
                       |
+|------------------|-----------------------------------|----------|----------|--------------------------------------------------------------------------------------------------------|
+| labelCol         | `"label"`                         | String   | no       | 
Label column name.                                                              
                       |
+| weightCol        | `null`                            | String   | no       | 
Weight column name.                                                             
                       |
+| rawPredictionCol | `"rawPrediction"`                 | String   | no       | 
Raw prediction column name.                                                     
                       |
+| metricsNames     | `["areaUnderROC", "areaUnderPR"]` | String[] | no       | 
Names of the output metrics. Supported values: 'areaUnderROC', 'areaUnderPR', 
'areaUnderLorenz', 'ks'. |
 
 ### Examples
 
diff --git a/docs/content/docs/operators/feature/binarizer.md 
b/docs/content/docs/operators/feature/binarizer.md
new file mode 100644
index 0000000..74388b4
--- /dev/null
+++ b/docs/content/docs/operators/feature/binarizer.md
@@ -0,0 +1,183 @@
+---
+title: "Binarizer"
+weight: 1
+type: docs
+aliases:
+- /operators/feature/binarizer.html
+---
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+## Binarizer
+
+Binarizer binarizes the columns of continuous features by the given thresholds.
+The continuous features may be DenseVector, SparseVector, or Numerical Value.
+
+### Input Columns
+
+| Param name | Type          | Default | Description                     |
+|:-----------|:--------------|:--------|:--------------------------------|
+| inputCols  | Number/Vector | `null`  | Number/Vectors to be binarized. |
+
+### Output Columns
+
+| Param name | Type          | Default | Description               |
+|:-----------|:--------------|:--------|:--------------------------|
+| outputCols | Number/Vector | `null`  | Binarized Number/Vectors. |
+
+### Parameters
+
+| Key         | Default   | Type     | Required | Description                  
                        |
+|-------------|-----------|----------|----------|------------------------------------------------------|
+| inputCols   | `null`    | String[] | yes      | Input column names.          
                        |
+| outputCols  | `null`    | String[] | yes      | Output column name.          
                        |
+| thresholds  | `null`    | Double[] | yes      | The thresholds used to 
binarize continuous features. |
+
+### Examples
+
+{{< tabs examples >}}
+
+{{< tab "Java">}}
+
+```java
+import org.apache.flink.ml.feature.binarizer.Binarizer;
+import org.apache.flink.ml.linalg.Vectors;
+import org.apache.flink.streaming.api.datastream.DataStream;
+import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
+import org.apache.flink.table.api.Table;
+import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;
+import org.apache.flink.types.Row;
+import org.apache.flink.util.CloseableIterator;
+
+import java.util.Arrays;
+
+/** Simple program that creates a Binarizer instance and uses it for feature 
engineering. */
+public class BinarizerExample {
+    public static void main(String[] args) {
+        StreamExecutionEnvironment env = 
StreamExecutionEnvironment.getExecutionEnvironment();
+        StreamTableEnvironment tEnv = StreamTableEnvironment.create(env);
+
+        // Generates input data.
+        DataStream<Row> inputStream =
+                env.fromElements(
+                        Row.of(
+                                1,
+                                Vectors.dense(1, 2),
+                                Vectors.sparse(
+                                        17, new int[] {0, 3, 9}, new double[] 
{1.0, 2.0, 7.0})),
+                        Row.of(
+                                2,
+                                Vectors.dense(2, 1),
+                                Vectors.sparse(
+                                        17, new int[] {0, 2, 14}, new double[] 
{5.0, 4.0, 1.0})),
+                        Row.of(
+                                3,
+                                Vectors.dense(5, 18),
+                                Vectors.sparse(
+                                        17, new int[] {0, 11, 12}, new 
double[] {2.0, 4.0, 4.0})));
+
+        Table inputTable = tEnv.fromDataStream(inputStream).as("f0", "f1", 
"f2");
+
+        // Creates a Binarizer object and initializes its parameters.
+        Binarizer binarizer =
+                new Binarizer()
+                        .setInputCols("f0", "f1", "f2")
+                        .setOutputCols("of0", "of1", "of2")
+                        .setThresholds(0.0, 0.0, 0.0);
+
+        // Transforms input data.
+        Table outputTable = binarizer.transform(inputTable)[0];
+
+        // Extracts and displays the results.
+        for (CloseableIterator<Row> it = outputTable.execute().collect(); 
it.hasNext(); ) {
+            Row row = it.next();
+
+            Object[] inputValues = new Object[binarizer.getInputCols().length];
+            Object[] outputValues = new 
Object[binarizer.getInputCols().length];
+            for (int i = 0; i < inputValues.length; i++) {
+                inputValues[i] = row.getField(binarizer.getInputCols()[i]);
+                outputValues[i] = row.getField(binarizer.getOutputCols()[i]);
+            }
+
+            System.out.printf(
+                    "Input Values: %s\tOutput Values: %s\n",
+                    Arrays.toString(inputValues), 
Arrays.toString(outputValues));
+        }
+    }
+}
+
+```
+
+{{< /tab>}}
+
+{{< tab "Python">}}
+
+```python
+# Simple program that creates a Binarizer instance and uses it for feature
+# engineering.
+
+from pyflink.common import Types
+from pyflink.datastream import StreamExecutionEnvironment
+from pyflink.ml.core.linalg import Vectors, DenseVectorTypeInfo
+from pyflink.ml.lib.feature.binarizer import Binarizer
+from pyflink.table import StreamTableEnvironment
+
+# create a new StreamExecutionEnvironment
+env = StreamExecutionEnvironment.get_execution_environment()
+
+# create a StreamTableEnvironment
+t_env = StreamTableEnvironment.create(env)
+
+# generate input data
+input_data_table = t_env.from_data_stream(
+    env.from_collection([
+        (1,
+         Vectors.dense(3, 4)),
+        (2,
+         Vectors.dense(6, 2))
+    ],
+        type_info=Types.ROW_NAMED(
+            ['f0', 'f1'],
+            [Types.INT(), DenseVectorTypeInfo()])))
+
+# create an binarizer object and initialize its parameters
+binarizer = Binarizer() \
+    .set_input_cols('f0', 'f1') \
+    .set_output_cols('of0', 'of1') \
+    .set_thresholds(1.5, 3.5)
+
+# use the binarizer for feature engineering
+output = binarizer.transform(input_data_table)[0]
+
+# extract and display the results
+field_names = output.get_schema().get_field_names()
+input_values = [None for _ in binarizer.get_input_cols()]
+output_values = [None for _ in binarizer.get_output_cols()]
+for result in t_env.to_data_stream(output).execute_and_collect():
+    for i in range(len(binarizer.get_input_cols())):
+        input_values[i] = 
result[field_names.index(binarizer.get_input_cols()[i])]
+        output_values[i] = 
result[field_names.index(binarizer.get_output_cols()[i])]
+    print('Input Values: ' + str(input_values) + '\tOutput Values: ' + 
str(output_values))
+
+```
+
+{{< /tab>}}
+
+{{< /tabs>}}
diff --git a/docs/content/docs/operators/feature/bucketizer.md 
b/docs/content/docs/operators/feature/bucketizer.md
index 9430691..c898ab3 100644
--- a/docs/content/docs/operators/feature/bucketizer.md
+++ b/docs/content/docs/operators/feature/bucketizer.md
@@ -32,24 +32,24 @@ multiple columns of discrete features, i.e., buckets 
indices. The indices are in
 [0, numSplitsInThisColumn - 1].
 ### Input Columns
 
-| Param name | Type   | Default | Description                          |
-| :--------- | :----- | :------ | :----------------------------------- |
-| inputCols  | Number | `null`  | Continuous features to be bucketized |
+| Param name | Type   | Default | Description                           |
+|:-----------|:-------|:--------|:--------------------------------------|
+| inputCols  | Number | `null`  | Continuous features to be bucketized. |
 
 ### Output Columns
 
-| Param name | Type   | Default | Description                  |
-| :--------- | :----- | :------ | :--------------------------- |
-| outputCols | Double | `null`  | Discrete bucketized features |
+| Param name | Type   | Default | Description           |
+|:-----------|:-------|:--------|:----------------------|
+| outputCols | Double | `null`  | Discretized features. |
 
 ### Parameters
 
-| Key           | Default                          | Type        | Required | 
Description                                                  |
-| ------------- | -------------------------------- | ----------- | -------- | 
------------------------------------------------------------ |
-| inputCols     | `null`                           | String      | yes      | 
Input column names.                                          |
-| outputCols    | `null`                           | String      | yes      | 
Output column names.                                         |
-| handleInvalid | `HasHandleInvalid.ERROR_INVALID` | String      | No       | 
Strategy to handle invalid entries.                          |
-| splitsArray   | `null`                           | Double\[][] | yes      | 
Array of split points for mapping continuous features into buckets. |
+| Key           | Default   | Type        | Required | Description             
                                                       |
+|---------------|-----------|-------------|----------|--------------------------------------------------------------------------------|
+| inputCols     | `null`    | String[]    | yes      | Input column names.     
                                                       |
+| outputCols    | `null`    | String[]    | yes      | Output column names.    
                                                       |
+| handleInvalid | `"error"` | String      | no       | Strategy to handle 
invalid entries. Supported values: 'error', 'skip', 'keep'. |
+| splitsArray   | `null`    | Double\[][] | yes      | Array of split points 
for mapping continuous features into buckets.            |
 
 ### Examples
 
diff --git a/docs/content/docs/operators/feature/elementwiseproduct.md 
b/docs/content/docs/operators/feature/elementwiseproduct.md
new file mode 100644
index 0000000..1d33634
--- /dev/null
+++ b/docs/content/docs/operators/feature/elementwiseproduct.md
@@ -0,0 +1,157 @@
+---
+title: "Elementwise Product"
+weight: 1
+type: docs
+aliases:
+- /operators/feature/elementwiseproduct.html
+---
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+## Elementwise Product
+
+Elementwise Product multiplies each input vector with a given scaling vector 
using 
+Hadamard product. If the size of the input vector does not equal the size of 
the 
+scaling vector, the transformer will throw an IllegalArgumentException.
+
+### Input Columns
+
+| Param name | Type   | Default   | Description            |
+|:-----------|:-------|:----------|:-----------------------|
+| inputCol   | Vector | `"input"` | Features to be scaled. |
+
+### Output Columns
+
+| Param name | Type   | Default    | Description      |
+|:-----------|:-------|:-----------|:-----------------|
+| outputCol  | Vector | `"output"` | Scaled features. |
+
+### Parameters
+
+| Key        | Default    | Type   | Required | Description         |
+|------------|------------|--------|----------|---------------------|
+| inputCol   | `"input"`  | String | no       | Input column name.  |
+| outputCol  | `"output"` | String | no       | Output column name. |
+| scalingVec | `null`     | String | yes      | The scaling vector. |
+### Examples
+
+{{< tabs examples >}}
+
+{{< tab "Java">}}
+
+```java
+import org.apache.flink.ml.feature.elementwiseproduct.ElementwiseProduct;
+import org.apache.flink.ml.linalg.Vector;
+import org.apache.flink.ml.linalg.Vectors;
+import org.apache.flink.streaming.api.datastream.DataStream;
+import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
+import org.apache.flink.table.api.Table;
+import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;
+import org.apache.flink.types.Row;
+import org.apache.flink.util.CloseableIterator;
+
+/**
+ * Simple program that creates an ElementwiseProduct instance and uses it for 
feature engineering.
+ */
+public class ElementwiseProductExample {
+    public static void main(String[] args) {
+        StreamExecutionEnvironment env = 
StreamExecutionEnvironment.getExecutionEnvironment();
+        StreamTableEnvironment tEnv = StreamTableEnvironment.create(env);
+
+        // Generates input data.
+        DataStream<Row> inputStream =
+                env.fromElements(
+                        Row.of(0, Vectors.dense(1.1, 3.2)), Row.of(1, 
Vectors.dense(2.1, 3.1)));
+
+        Table inputTable = tEnv.fromDataStream(inputStream).as("id", "vec");
+
+        // Creates an ElementwiseProduct object and initializes its parameters.
+        ElementwiseProduct elementwiseProduct =
+                new ElementwiseProduct()
+                        .setInputCol("vec")
+                        .setOutputCol("outputVec")
+                        .setScalingVec(Vectors.dense(1.1, 1.1));
+
+        // Transforms input data.
+        Table outputTable = elementwiseProduct.transform(inputTable)[0];
+
+        // Extracts and displays the results.
+        for (CloseableIterator<Row> it = outputTable.execute().collect(); 
it.hasNext(); ) {
+            Row row = it.next();
+            Vector inputValue = (Vector) 
row.getField(elementwiseProduct.getInputCol());
+            Vector outputValue = (Vector) 
row.getField(elementwiseProduct.getOutputCol());
+            System.out.printf("Input Value: %s \tOutput Value: %s\n", 
inputValue, outputValue);
+        }
+    }
+}
+
+```
+
+{{< /tab>}}
+
+{{< tab "Python">}}
+
+```python
+# Simple program that creates an ElementwiseProduct instance and uses it for 
feature
+# engineering.
+
+from pyflink.common import Types
+from pyflink.datastream import StreamExecutionEnvironment
+from pyflink.ml.core.linalg import Vectors, DenseVectorTypeInfo
+from pyflink.ml.lib.feature.elementwiseproduct import ElementwiseProduct
+from pyflink.table import StreamTableEnvironment
+
+# create a new StreamExecutionEnvironment
+env = StreamExecutionEnvironment.get_execution_environment()
+
+# create a StreamTableEnvironment
+t_env = StreamTableEnvironment.create(env)
+
+# generate input data
+input_data_table = t_env.from_data_stream(
+    env.from_collection([
+        (1, Vectors.dense(2.1, 3.1)),
+        (2, Vectors.dense(1.1, 3.3))
+    ],
+        type_info=Types.ROW_NAMED(
+            ['id', 'vec'],
+            [Types.INT(), DenseVectorTypeInfo()])))
+
+# create an elementwise product object and initialize its parameters
+elementwise_product = ElementwiseProduct() \
+    .set_input_col('vec') \
+    .set_output_col('output_vec') \
+    .set_scaling_vec(Vectors.dense(1.1, 1.1))
+
+# use the elementwise product object for feature engineering
+output = elementwise_product.transform(input_data_table)[0]
+
+# extract and display the results
+field_names = output.get_schema().get_field_names()
+for result in t_env.to_data_stream(output).execute_and_collect():
+    input_value = 
result[field_names.index(elementwise_product.get_input_col())]
+    output_value = 
result[field_names.index(elementwise_product.get_output_col())]
+    print('Input Value: ' + str(input_value) + '\tOutput Value: ' + 
str(output_value))
+
+```
+
+{{< /tab>}}
+
+{{< /tabs>}}
diff --git a/docs/content/docs/operators/feature/featurehasher.md 
b/docs/content/docs/operators/feature/featurehasher.md
new file mode 100644
index 0000000..78ade1f
--- /dev/null
+++ b/docs/content/docs/operators/feature/featurehasher.md
@@ -0,0 +1,177 @@
+---
+title: "Feature Hasher"
+weight: 1
+type: docs
+aliases:
+- /operators/feature/featurehasher.html
+---
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+## Feature Hasher
+
+Feature Hasher transforms a set of categorical or numerical features into a 
sparse vector of
+a specified dimension. The rules of hashing categorical columns and numerical 
columns are as
+follows:
+
+<ul>
+<li>For numerical columns, the index of this feature in the output vector is 
the hash value of
+      the column name and its correponding value is the same as the input.
+<li>For categorical columns, the index of this feature in the output vector is 
the hash value
+      of the string "column_name=value" and the corresponding value is 1.0.
+</ul>
+
+<p>If multiple features are projected into the same column, the output values 
are accumulated.
+For the hashing trick, see https://en.wikipedia.org/wiki/Feature_hashing for 
details.
+
+### Input Columns
+
+| Param name | Type                  | Default | Description           |
+|:-----------|:----------------------|:--------|:----------------------|
+| inputCols  | Number/String/Boolean | `null`  | Columns to be hashed. |
+
+### Output Columns
+
+| Param name | Type   | Default    | Description    |
+|:-----------|:-------|:-----------|:---------------|
+| outputCol  | Vector | `"output"` | Output vector. |
+
+### Parameters
+
+| Key             | Default    | Type      | Required | Description            
   |
+|-----------------|------------|-----------|----------|---------------------------|
+| inputCols       | `null`     | String[]  | yes      | Input column names.    
   |
+| outputCol       | `"output"` | String    | no       | Output column name.    
   |
+| categoricalCols | `[]`       | String[]  | no       | Categorical column 
names. |
+| numFeatures     | `262144`   | Integer   | no       | The number of 
features.   |
+### Examples
+
+{{< tabs examples >}}
+
+{{< tab "Java">}}
+
+```java
+import org.apache.flink.ml.feature.featurehasher.FeatureHasher;
+import org.apache.flink.ml.linalg.Vector;
+import org.apache.flink.streaming.api.datastream.DataStream;
+import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
+import org.apache.flink.table.api.Table;
+import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;
+import org.apache.flink.types.Row;
+import org.apache.flink.util.CloseableIterator;
+
+import java.util.Arrays;
+
+/** Simple program that creates a FeatureHasher instance and uses it for 
feature engineering. */
+public class FeatureHasherExample {
+    public static void main(String[] args) {
+
+        StreamExecutionEnvironment env = 
StreamExecutionEnvironment.getExecutionEnvironment();
+        StreamTableEnvironment tEnv = StreamTableEnvironment.create(env);
+
+        // Generates input data.
+        DataStream<Row> dataStream =
+                env.fromCollection(
+                        Arrays.asList(Row.of(0, "a", 1.0, true), Row.of(1, 
"c", 1.0, false)));
+        Table inputDataTable = tEnv.fromDataStream(dataStream).as("id", "f0", 
"f1", "f2");
+
+        // Creates a FeatureHasher object and initializes its parameters.
+        FeatureHasher featureHash =
+                new FeatureHasher()
+                        .setInputCols("f0", "f1", "f2")
+                        .setCategoricalCols("f0", "f2")
+                        .setOutputCol("vec")
+                        .setNumFeatures(1000);
+
+        // Uses the FeatureHasher object for feature transformations.
+        Table outputTable = featureHash.transform(inputDataTable)[0];
+
+        // Extracts and displays the results.
+        for (CloseableIterator<Row> it = outputTable.execute().collect(); 
it.hasNext(); ) {
+            Row row = it.next();
+
+            Object[] inputValues = new 
Object[featureHash.getInputCols().length];
+            for (int i = 0; i < inputValues.length; i++) {
+                inputValues[i] = row.getField(featureHash.getInputCols()[i]);
+            }
+            Vector outputValue = (Vector) 
row.getField(featureHash.getOutputCol());
+
+            System.out.printf(
+                    "Input Values: %s \tOutput Value: %s\n",
+                    Arrays.toString(inputValues), outputValue);
+        }
+    }
+}
+
+```
+
+{{< /tab>}}
+
+{{< tab "Python">}}
+
+```python
+# Simple program that creates a FeatureHasher instance and uses it for feature
+# engineering.
+
+from pyflink.common import Types
+from pyflink.datastream import StreamExecutionEnvironment
+from pyflink.ml.lib.feature.featurehasher import FeatureHasher
+from pyflink.table import StreamTableEnvironment
+
+# create a new StreamExecutionEnvironment
+env = StreamExecutionEnvironment.get_execution_environment()
+
+# create a StreamTableEnvironment
+t_env = StreamTableEnvironment.create(env)
+
+# generate input data
+input_data_table = t_env.from_data_stream(
+    env.from_collection([
+        (0, 'a', 1.0, True),
+        (1, 'c', 1.0, False),
+    ],
+        type_info=Types.ROW_NAMED(
+            ['id', 'f0', 'f1', 'f2'],
+            [Types.INT(), Types.STRING(), Types.DOUBLE(), Types.BOOLEAN()])))
+
+# create a feature hasher object and initialize its parameters
+feature_hasher = FeatureHasher() \
+    .set_input_cols('f0', 'f1', 'f2') \
+    .set_categorical_cols('f0', 'f2') \
+    .set_output_col('vec') \
+    .set_num_features(1000)
+
+# use the feature hasher for feature engineering
+output = feature_hasher.transform(input_data_table)[0]
+
+# extract and display the results
+field_names = output.get_schema().get_field_names()
+input_values = [None for _ in feature_hasher.get_input_cols()]
+for result in t_env.to_data_stream(output).execute_and_collect():
+    for i in range(len(feature_hasher.get_input_cols())):
+        input_values[i] = 
result[field_names.index(feature_hasher.get_input_cols()[i])]
+    output_value = result[field_names.index(feature_hasher.get_output_col())]
+    print('Input Values: ' + str(input_values) + '\tOutput Value: ' + 
str(output_value))
+
+```
+
+{{< /tab>}}
+
+{{< /tabs>}}
diff --git a/docs/content/docs/operators/feature/hashingtf.md 
b/docs/content/docs/operators/feature/hashingtf.md
new file mode 100644
index 0000000..088176a
--- /dev/null
+++ b/docs/content/docs/operators/feature/hashingtf.md
@@ -0,0 +1,165 @@
+---
+title: "HashingTF"
+weight: 1
+type: docs
+aliases:
+- /operators/feature/hashingtf.html
+---
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+## HashingTF
+
+HashingTF maps a sequence of terms(strings, numbers, booleans)
+to a sparse vector with a specified dimension using the hashing
+trick. If multiple features are projected into the same column,
+the output values are accumulated by default.
+
+### Input Columns
+
+| Param name | Type                                          | Default   | 
Description              |
+|:-----------|:----------------------------------------------|:----------|:-------------------------|
+| inputCol   | List/Array of primitive data types or strings | `"input"` | 
Input sequence of terms. |
+
+### Output Columns
+
+| Param name | Type         | Default    | Description           |
+|:-----------|:-------------|:-----------|:----------------------|
+| outputCol  | SparseVector | `"output"` | Output sparse vector. |
+
+### Parameters
+
+| Key         | Default    | Type    | Required | Description                  
                                       |
+|:------------|:-----------|:--------|:---------|:--------------------------------------------------------------------|
+| binary      | `false`    | Boolean | no       | Whether each dimension of 
the output vector is binary or not.       |
+| inputCol    | `"input"`  | String  | no       | Input column name.           
                                       |
+| outputCol   | `"output"` | String  | no       | Output column name.          
                                       |
+| numFeatures | `262144`   | Integer | no       | The number of features. It 
will be the length of the output vector. |
+
+
+### Examples
+
+{{< tabs examples >}}
+
+{{< tab "Java">}}
+
+```java
+
+import org.apache.flink.ml.feature.hashingtf.HashingTF;
+import org.apache.flink.ml.linalg.SparseVector;
+import org.apache.flink.streaming.api.datastream.DataStream;
+import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
+import org.apache.flink.table.api.Table;
+import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;
+import org.apache.flink.types.Row;
+import org.apache.flink.util.CloseableIterator;
+
+import java.util.Arrays;
+import java.util.List;
+
+/** Simple program that creates a HashingTF instance and uses it for feature 
engineering. */
+public class HashingTFExample {
+       public static void main(String[] args) {
+               StreamExecutionEnvironment env = 
StreamExecutionEnvironment.getExecutionEnvironment();
+               StreamTableEnvironment tEnv = 
StreamTableEnvironment.create(env);
+
+               // Generates input data.
+               DataStream<Row> inputStream =
+                       env.fromElements(
+                               Row.of(
+                                       Arrays.asList(
+                                               "HashingTFTest", "Hashing", 
"Term", "Frequency", "Test")),
+                               Row.of(
+                                       Arrays.asList(
+                                               "HashingTFTest", "Hashing", 
"Hashing", "Test", "Test")));
+
+               Table inputTable = tEnv.fromDataStream(inputStream).as("input");
+
+               // Creates a HashingTF object and initializes its parameters.
+               HashingTF hashingTF =
+                       new 
HashingTF().setInputCol("input").setOutputCol("output").setNumFeatures(128);
+
+               // Uses the HashingTF object for feature transformations.
+               Table outputTable = hashingTF.transform(inputTable)[0];
+
+               // Extracts and displays the results.
+               for (CloseableIterator<Row> it = 
outputTable.execute().collect(); it.hasNext(); ) {
+                       Row row = it.next();
+
+                       List<Object> inputValue = (List<Object>) 
row.getField(hashingTF.getInputCol());
+                       SparseVector outputValue = (SparseVector) 
row.getField(hashingTF.getOutputCol());
+
+                       System.out.printf(
+                               "Input Value: %s \tOutput Value: %s\n",
+                               Arrays.toString(inputValue.stream().toArray()), 
outputValue);
+               }
+       }
+}
+
+```
+
+{{< /tab>}}
+
+{{< tab "Python">}}
+
+```python
+# Simple program that creates a HashingTF instance and uses it for feature
+# engineering.
+
+from pyflink.common import Types
+from pyflink.datastream import StreamExecutionEnvironment
+from pyflink.ml.lib.feature.hashingtf import HashingTF
+from pyflink.table import StreamTableEnvironment
+
+env = StreamExecutionEnvironment.get_execution_environment()
+
+t_env = StreamTableEnvironment.create(env)
+
+# Generates input data.
+input_data_table = t_env.from_data_stream(
+    env.from_collection([
+        (['HashingTFTest', 'Hashing', 'Term', 'Frequency', 'Test'],),
+        (['HashingTFTest', 'Hashing', 'Hashing', 'Test', 'Test'],),
+    ],
+        type_info=Types.ROW_NAMED(
+            ["input", ],
+            [Types.OBJECT_ARRAY(Types.STRING())])))
+
+# Creates a HashingTF object and initializes its parameters.
+hashing_tf = HashingTF() \
+    .set_input_col('input') \
+    .set_num_features(128) \
+    .set_output_col('output')
+
+# Uses the HashingTF object for feature transformations.
+output = hashing_tf.transform(input_data_table)[0]
+
+# Extracts and displays the results.
+field_names = output.get_schema().get_field_names()
+for result in t_env.to_data_stream(output).execute_and_collect():
+    input_value = result[field_names.index(hashing_tf.get_input_col())]
+    output_value = result[field_names.index(hashing_tf.get_output_col())]
+    print('Input Value: ' + ' '.join(input_value) + '\tOutput Value: ' + 
str(output_value))
+
+```
+
+{{< /tab>}}
+
+{{< /tabs>}}
diff --git a/docs/content/docs/operators/feature/interaction.md 
b/docs/content/docs/operators/feature/interaction.md
new file mode 100644
index 0000000..bc109f4
--- /dev/null
+++ b/docs/content/docs/operators/feature/interaction.md
@@ -0,0 +1,169 @@
+---
+title: "Interaction"
+weight: 1
+type: docs
+aliases:
+- /operators/feature/interaction.html
+---
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+## Interaction
+
+Interaction takes vector or numerical columns, and generates a single vector 
column that contains
+the product of all combinations of one value from each input column.
+
+For example, when the input feature values are Double(2) and Vector(3, 4), the 
output would be 
+Vector(6, 8). When the input feature values are Vector(1, 2) and Vector(3, 4), 
the output would
+be Vector(3, 4, 6, 8). If you change the position of these two input Vectors, 
the output would 
+be Vector(3, 6, 4, 8).
+
+### Input Columns
+
+| Param name | Type   | Default | Description               |
+|:-----------|:-------|:--------|:--------------------------|
+| inputCols  | Vector | `null`  | Columns to be interacted. |
+
+### Output Columns
+
+| Param name | Type   | Default    | Description        |
+|:-----------|:-------|:-----------|:-------------------|
+| outputCol  | Vector | `"output"` | Interacted vector. |
+
+### Parameters
+
+| Key             | Default    | Type      | Required | Description            
    |
+|-----------------|------------|-----------|----------|----------------------------|
+| inputCols       | `null`     | String[]  | yes      | Input column names.    
    |
+| outputCol       | `"output"` | String    | no       | Output column name.    
    |
+
+### Examples
+
+{{< tabs examples >}}
+
+{{< tab "Java">}}
+
+```java
+import org.apache.flink.ml.feature.interaction.Interaction;
+import org.apache.flink.ml.linalg.Vector;
+import org.apache.flink.ml.linalg.Vectors;
+import org.apache.flink.streaming.api.datastream.DataStream;
+import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
+import org.apache.flink.table.api.Table;
+import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;
+import org.apache.flink.types.Row;
+import org.apache.flink.util.CloseableIterator;
+
+import java.util.Arrays;
+
+/** Simple program that creates an Interaction instance and uses it for 
feature engineering. */
+public class InteractionExample {
+    public static void main(String[] args) {
+        StreamExecutionEnvironment env = 
StreamExecutionEnvironment.getExecutionEnvironment();
+        StreamTableEnvironment tEnv = StreamTableEnvironment.create(env);
+
+        // Generates input data.
+        DataStream<Row> inputStream =
+                env.fromElements(
+                        Row.of(0, Vectors.dense(1.1, 3.2), Vectors.dense(2, 
3)),
+                        Row.of(1, Vectors.dense(2.1, 3.1), Vectors.dense(1, 
3)));
+
+        Table inputTable = tEnv.fromDataStream(inputStream).as("f0", "f1", 
"f2");
+
+        // Creates an Interaction object and initializes its parameters.
+        Interaction interaction =
+                new Interaction().setInputCols("f0", "f1", 
"f2").setOutputCol("outputVec");
+
+        // Transforms input data.
+        Table outputTable = interaction.transform(inputTable)[0];
+
+        // Extracts and displays the results.
+        for (CloseableIterator<Row> it = outputTable.execute().collect(); 
it.hasNext(); ) {
+            Row row = it.next();
+            Object[] inputValues = new 
Object[interaction.getInputCols().length];
+            for (int i = 0; i < inputValues.length; i++) {
+                inputValues[i] = row.getField(interaction.getInputCols()[i]);
+            }
+            Vector outputValue = (Vector) 
row.getField(interaction.getOutputCol());
+            System.out.printf(
+                    "Input Values: %s \tOutput Value: %s\n",
+                    Arrays.toString(inputValues), outputValue);
+        }
+    }
+}
+
+```
+
+{{< /tab>}}
+
+{{< tab "Python">}}
+
+```python
+# Simple program that creates an Interaction instance and uses it for feature
+# engineering.
+
+from pyflink.common import Types
+from pyflink.datastream import StreamExecutionEnvironment
+from pyflink.ml.core.linalg import Vectors, DenseVectorTypeInfo
+from pyflink.ml.lib.feature.interaction import Interaction
+from pyflink.table import StreamTableEnvironment
+
+# create a new StreamExecutionEnvironment
+env = StreamExecutionEnvironment.get_execution_environment()
+
+# create a StreamTableEnvironment
+t_env = StreamTableEnvironment.create(env)
+
+# generate input data
+input_data_table = t_env.from_data_stream(
+    env.from_collection([
+        (1,
+         Vectors.dense(1, 2),
+         Vectors.dense(3, 4)),
+        (2,
+         Vectors.dense(2, 8),
+         Vectors.dense(3, 4))
+    ],
+        type_info=Types.ROW_NAMED(
+            ['f0', 'f1', 'f2'],
+            [Types.INT(), DenseVectorTypeInfo(), DenseVectorTypeInfo()])))
+
+# create an interaction object and initialize its parameters
+interaction = Interaction() \
+    .set_input_cols('f0', 'f1', 'f2') \
+    .set_output_col('interaction_vec')
+
+# use the interaction for feature engineering
+output = interaction.transform(input_data_table)[0]
+
+# extract and display the results
+field_names = output.get_schema().get_field_names()
+input_values = [None for _ in interaction.get_input_cols()]
+for result in t_env.to_data_stream(output).execute_and_collect():
+    for i in range(len(interaction.get_input_cols())):
+        input_values[i] = 
result[field_names.index(interaction.get_input_cols()[i])]
+    output_value = result[field_names.index(interaction.get_output_col())]
+    print('Input Values: ' + str(input_values) + '\tOutput Value: ' + 
str(output_value))
+
+```
+
+{{< /tab>}}
+
+{{< /tabs>}}
diff --git a/docs/content/docs/operators/feature/minmaxscaler.md 
b/docs/content/docs/operators/feature/maxabsscaler.md
similarity index 69%
copy from docs/content/docs/operators/feature/minmaxscaler.md
copy to docs/content/docs/operators/feature/maxabsscaler.md
index 8b1ff6c..b57490e 100644
--- a/docs/content/docs/operators/feature/minmaxscaler.md
+++ b/docs/content/docs/operators/feature/maxabsscaler.md
@@ -1,9 +1,9 @@
 ---
-title: "Min Max Scaler"
+title: "Max Abs Scaler"
 weight: 1
 type: docs
 aliases:
-- /operators/feature/minmaxscaler.html
+- /operators/feature/maxabsscaler.html
 ---
 
 <!--
@@ -25,30 +25,30 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-## Min Max Scaler
+## Max Abs Scaler
+
+Max Abs Scaler is an algorithm rescales feature values to the range [-1, 1] 
+by dividing through the largest maximum absolute value in each feature. 
+It does not shift/center the data and thus does not destroy any sparsity.
 
-Min Max Scaler is an algorithm that rescales feature values to a common range
-[min, max] which defined by user.
 ### Input Columns
 
-| Param name | Type   | Default   | Description           |
-| :--------- | :----- | :-------- | :-------------------- |
-| inputCol   | Vector | `"input"` | features to be scaled |
+| Param name | Type   | Default   | Description            |
+|:-----------|:-------|:----------|:-----------------------|
+| inputCol   | Vector | `"input"` | Features to be scaled. |
 
 ### Output Columns
 
-| Param name | Type   | Default    | Description     |
-| :--------- | :----- | :--------- | :-------------- |
-| outputCol  | Vector | `"output"` | scaled features |
+| Param name | Type   | Default    | Description      |
+|:-----------|:-------|:-----------|:-----------------|
+| outputCol  | Vector | `"output"` | Scaled features. |
 
 ### Parameters
 
-| Key       | Default    | Type   | Required | Description                     
         |
-| --------- | ---------- | ------ | -------- | 
---------------------------------------- |
-| inputCol  | `"input"`  | String | no       | Input column name.              
         |
-| outputCol | `"output"` | String | no       | Output column name.             
         |
-| min       | `0.0`      | Double | no       | Lower bound of the output 
feature range. |
-| max       | `1.0`      | Double | no       | Upper bound of the output 
feature range. |
+| Key       | Default    | Type   | Required | Description         |
+|-----------|------------|--------|----------|---------------------|
+| inputCol  | `"input"`  | String | no       | Input column name.  |
+| outputCol | `"output"` | String | no       | Output column name. |
 
 ### Examples
 
@@ -57,8 +57,8 @@ Min Max Scaler is an algorithm that rescales feature values 
to a common range
 {{< tab "Java">}}
 
 ```java
-import org.apache.flink.ml.feature.minmaxscaler.MinMaxScaler;
-import org.apache.flink.ml.feature.minmaxscaler.MinMaxScalerModel;
+import org.apache.flink.ml.feature.maxabsscaler.MaxAbsScaler;
+import org.apache.flink.ml.feature.maxabsscaler.MaxAbsScalerModel;
 import org.apache.flink.ml.linalg.DenseVector;
 import org.apache.flink.ml.linalg.Vectors;
 import org.apache.flink.streaming.api.datastream.DataStream;
@@ -68,8 +68,8 @@ import 
org.apache.flink.table.api.bridge.java.StreamTableEnvironment;
 import org.apache.flink.types.Row;
 import org.apache.flink.util.CloseableIterator;
 
-/** Simple program that trains a MinMaxScaler model and uses it for feature 
engineering. */
-public class MinMaxScalerExample {
+/** Simple program that trains a MaxAbsScaler model and uses it for feature 
engineering. */
+public class MaxAbsScalerExample {
     public static void main(String[] args) {
         StreamExecutionEnvironment env = 
StreamExecutionEnvironment.getExecutionEnvironment();
         StreamTableEnvironment tEnv = StreamTableEnvironment.create(env);
@@ -91,20 +91,20 @@ public class MinMaxScalerExample {
                         Row.of(Vectors.dense(100.0, 50.0)));
         Table predictTable = tEnv.fromDataStream(predictStream).as("input");
 
-        // Creates a MinMaxScaler object and initializes its parameters.
-        MinMaxScaler minMaxScaler = new MinMaxScaler();
+        // Creates a MaxAbsScaler object and initializes its parameters.
+        MaxAbsScaler maxAbsScaler = new MaxAbsScaler();
 
-        // Trains the MinMaxScaler Model.
-        MinMaxScalerModel minMaxScalerModel = minMaxScaler.fit(trainTable);
+        // Trains the MaxAbsScaler Model.
+        MaxAbsScalerModel maxAbsScalerModel = maxAbsScaler.fit(trainTable);
 
-        // Uses the MinMaxScaler Model for predictions.
-        Table outputTable = minMaxScalerModel.transform(predictTable)[0];
+        // Uses the MaxAbsScaler Model for predictions.
+        Table outputTable = maxAbsScalerModel.transform(predictTable)[0];
 
         // Extracts and displays the results.
         for (CloseableIterator<Row> it = outputTable.execute().collect(); 
it.hasNext(); ) {
             Row row = it.next();
-            DenseVector inputValue = (DenseVector) 
row.getField(minMaxScaler.getInputCol());
-            DenseVector outputValue = (DenseVector) 
row.getField(minMaxScaler.getOutputCol());
+            DenseVector inputValue = (DenseVector) 
row.getField(maxAbsScaler.getInputCol());
+            DenseVector outputValue = (DenseVector) 
row.getField(maxAbsScaler.getOutputCol());
             System.out.printf("Input Value: %-15s\tOutput Value: %s\n", 
inputValue, outputValue);
         }
     }
@@ -117,13 +117,13 @@ public class MinMaxScalerExample {
 {{< tab "Python">}}
 
 ```python
-# Simple program that trains a MinMaxScaler model and uses it for feature
+# Simple program that trains a MaxAbsScaler model and uses it for feature
 # engineering.
 
 from pyflink.common import Types
 from pyflink.datastream import StreamExecutionEnvironment
 from pyflink.ml.core.linalg import Vectors, DenseVectorTypeInfo
-from pyflink.ml.lib.feature.minmaxscaler import MinMaxScaler
+from pyflink.ml.lib.feature.maxabsscaler import MaxAbsScaler
 from pyflink.table import StreamTableEnvironment
 
 # create a new StreamExecutionEnvironment
@@ -157,20 +157,20 @@ predict_data = t_env.from_data_stream(
             [DenseVectorTypeInfo()])
     ))
 
-# create a min-max-scaler object and initialize its parameters
-min_max_scaler = MinMaxScaler()
+# create a maxabs scaler object and initialize its parameters
+max_abs_scaler = MaxAbsScaler()
 
-# train the min-max-scaler model
-model = min_max_scaler.fit(train_data)
+# train the maxabs scaler model
+model = max_abs_scaler.fit(train_data)
 
-# use the min-max-scaler model for predictions
+# use the maxabs scaler model for predictions
 output = model.transform(predict_data)[0]
 
 # extract and display the results
 field_names = output.get_schema().get_field_names()
 for result in t_env.to_data_stream(output).execute_and_collect():
-    input_value = result[field_names.index(min_max_scaler.get_input_col())]
-    output_value = result[field_names.index(min_max_scaler.get_output_col())]
+    input_value = result[field_names.index(max_abs_scaler.get_input_col())]
+    output_value = result[field_names.index(max_abs_scaler.get_output_col())]
     print('Input Value: ' + str(input_value) + ' \tOutput Value: ' + 
str(output_value))
 
 ```
diff --git a/docs/content/docs/operators/feature/minmaxscaler.md 
b/docs/content/docs/operators/feature/minmaxscaler.md
index 8b1ff6c..3e7b17f 100644
--- a/docs/content/docs/operators/feature/minmaxscaler.md
+++ b/docs/content/docs/operators/feature/minmaxscaler.md
@@ -31,20 +31,20 @@ Min Max Scaler is an algorithm that rescales feature values 
to a common range
 [min, max] which defined by user.
 ### Input Columns
 
-| Param name | Type   | Default   | Description           |
-| :--------- | :----- | :-------- | :-------------------- |
-| inputCol   | Vector | `"input"` | features to be scaled |
+| Param name | Type   | Default   | Description            |
+|:-----------|:-------|:----------|:-----------------------|
+| inputCol   | Vector | `"input"` | Features to be scaled. |
 
 ### Output Columns
 
-| Param name | Type   | Default    | Description     |
-| :--------- | :----- | :--------- | :-------------- |
-| outputCol  | Vector | `"output"` | scaled features |
+| Param name | Type   | Default    | Description      |
+|:-----------|:-------|:-----------|:-----------------|
+| outputCol  | Vector | `"output"` | Scaled features. |
 
 ### Parameters
 
 | Key       | Default    | Type   | Required | Description                     
         |
-| --------- | ---------- | ------ | -------- | 
---------------------------------------- |
+|-----------|------------|--------|----------|------------------------------------------|
 | inputCol  | `"input"`  | String | no       | Input column name.              
         |
 | outputCol | `"output"` | String | no       | Output column name.             
         |
 | min       | `0.0`      | Double | no       | Lower bound of the output 
feature range. |
diff --git a/docs/content/docs/operators/feature/onehotencoder.md 
b/docs/content/docs/operators/feature/onehotencoder.md
index 53884f4..d0344d4 100644
--- a/docs/content/docs/operators/feature/onehotencoder.md
+++ b/docs/content/docs/operators/feature/onehotencoder.md
@@ -37,24 +37,24 @@ vector column for each input column.
 
 ### Input Columns
 
-| Param name | Type    | Default | Description |
-| :--------- | :------ | :------ | :---------- |
-| inputCols  | Integer | `null`  | Label index |
+| Param name | Type    | Default | Description  |
+| :--------- | :------ | :------ |:-------------|
+| inputCols  | Integer | `null`  | Label index. |
 
 ### Output Columns
 
-| Param name | Type   | Default | Description           |
-| :--------- | :----- | :------ | :-------------------- |
-| outputCols | Vector | `null`  | Encoded binary vector |
+| Param name | Type   | Default | Description            |
+| :--------- | :----- | :------ |:-----------------------|
+| outputCols | Vector | `null`  | Encoded binary vector. |
 
 ### Parameters
 
-| Key           | Default                          | Type    | Required | 
Description                                                  |
-| ------------- | -------------------------------- | ------- | -------- | 
------------------------------------------------------------ |
-| inputCols     | `null`                           | String  | yes      | 
Input column names.                                          |
-| outputCols    | `null`                           | String  | yes      | 
Output column names.                                         |
-| handleInvalid | `HasHandleInvalid.ERROR_INVALID` | String  | No       | 
Strategy to handle invalid entries. Supported values: 
`HasHandleInvalid.ERROR_INVALID`, `HasHandleInvalid.SKIP_INVALID` |
-| dropLast      | `true`                           | Boolean | no       | 
Whether to drop the last category.                           |
+| Key           | Default   | Type     | Required | Description                
                                                    |
+|---------------|-----------|----------|----------|--------------------------------------------------------------------------------|
+| inputCols     | `null`    | String[] | yes      | Input column names.        
                                                    |
+| outputCols    | `null`    | String[] | yes      | Output column names.       
                                                    |
+| handleInvalid | `"error"` | String   | no       | Strategy to handle invalid 
entries. Supported values: 'error', 'skip', 'keep'. |
+| dropLast      | `true`    | Boolean  | no       | Whether to drop the last 
category.                                             |
 
 ### Examples
 
diff --git a/docs/content/docs/operators/feature/standardscaler.md 
b/docs/content/docs/operators/feature/standardscaler.md
index 9bf17c1..ab17f56 100644
--- a/docs/content/docs/operators/feature/standardscaler.md
+++ b/docs/content/docs/operators/feature/standardscaler.md
@@ -31,20 +31,20 @@ Standard Scaler is an algorithm that standardizes the input 
features by removing
 the mean and scaling each dimension to unit variance.
 ### Input Columns
 
-| Param name | Type   | Default   | Description           |
-| :--------- | :----- | :-------- | :-------------------- |
-| inputCol   | Vector | `"input"` | features to be scaled |
+| Param name | Type   | Default   | Description            |
+|:-----------|:-------|:----------|:-----------------------|
+| inputCol   | Vector | `"input"` | Features to be scaled. |
 
 ### Output Columns
 
-| Param name | Type   | Default    | Description     |
-| :--------- | :----- | :--------- | :-------------- |
-| outputCol  | Vector | `"output"` | scaled features |
+| Param name | Type   | Default    | Description      |
+|:-----------|:-------|:-----------|:-----------------|
+| outputCol  | Vector | `"output"` | Scaled features. |
 
 ### Parameters
 
 | Key       | Default    | Type    | Required | Description                    
                    |
-| --------- | ---------- | ------- | -------- | 
-------------------------------------------------- |
+|-----------|------------|---------|----------|----------------------------------------------------|
 | inputCol  | `"input"`  | String  | no       | Input column name.             
                    |
 | outputCol | `"output"` | String  | no       | Output column name.            
                    |
 | withMean  | `false`    | Boolean | no       | Whether centers the data with 
mean before scaling. |
diff --git a/docs/content/docs/operators/feature/stringindexer.md 
b/docs/content/docs/operators/feature/stringindexer.md
index 94096ba..36b8d80 100644
--- a/docs/content/docs/operators/feature/stringindexer.md
+++ b/docs/content/docs/operators/feature/stringindexer.md
@@ -38,30 +38,30 @@ StringIndexerModel.
 ### Input Columns
 
 | Param name | Type          | Default | Description                           
 |
-| :--------- | :------------ | :------ | 
:------------------------------------- |
-| inputCols  | Number/String | `null`  | string/numerical values to be 
indexed. |
+| :--------- | :------------ | :------ 
|:---------------------------------------|
+| inputCols  | Number/String | `null`  | String/Numerical values to be 
indexed. |
 
 ### Output Columns
 
 | Param name | Type   | Default | Description                         |
-| :--------- | :----- | :------ | :---------------------------------- |
+|:-----------|:-------|:--------|:------------------------------------|
 | outputCols | Double | `null`  | Indices of string/numerical values. |
 
 ### Parameters
 
 Below are the parameters required by `StringIndexerModel`.
 
-| Key           | Default                          | Type   | Required | 
Description                         |
-| ------------- | -------------------------------- | ------ | -------- | 
----------------------------------- |
-| inputCols     | `null`                           | String | yes      | Input 
column names.                 |
-| outputCols    | `null`                           | String | yes      | 
Output column names.                |
-| handleInvalid | `HasHandleInvalid.ERROR_INVALID` | String | No       | 
Strategy to handle invalid entries. |
+| Key           | Default   | Type     | Required | Description                
                                                    |
+|---------------|-----------|----------|----------|--------------------------------------------------------------------------------|
+| inputCols     | `null`    | String[] | yes      | Input column names.        
                                                    |
+| outputCols    | `null`    | String[] | yes      | Output column names.       
                                                    |
+| handleInvalid | `"error"` | String   | no       | Strategy to handle invalid 
entries. Supported values: 'error', 'skip', 'keep'. |
 
 `StringIndexer` needs parameters above and also below.
 
-| Key             | Default                               | Type   | Required 
| Description                          |
-| --------------- | ------------------------------------- | ------ | -------- 
| ------------------------------------ |
-| stringOrderType | `StringIndexerParams.ARBITRARY_ORDER` | String | no       
| How to order strings of each column. |
+| Key             | Default       | Type   | Required | Description            
                                                                                
                             |
+|-----------------|---------------|--------|----------|-------------------------------------------------------------------------------------------------------------------------------------|
+| stringOrderType | `"arbitrary"` | String | no       | How to order strings 
of each column. Supported values: 'arbitrary', 'frequencyDesc', 'frequencyAsc', 
'alphabetDesc', 'alphabetAsc'. |
 
 ### Examples
 
diff --git a/docs/content/docs/operators/feature/vectorassembler.md 
b/docs/content/docs/operators/feature/vectorassembler.md
index f5af483..10f3fc3 100644
--- a/docs/content/docs/operators/feature/vectorassembler.md
+++ b/docs/content/docs/operators/feature/vectorassembler.md
@@ -33,22 +33,22 @@ Types of input columns must be either vector or numerical 
value.
 ### Input Columns
 
 | Param name | Type          | Default | Description                     |
-| :--------- | :------------ | :------ | :------------------------------ |
+|:-----------|:--------------|:--------|:--------------------------------|
 | inputCols  | Number/Vector | `null`  | Number/Vectors to be assembled. |
 
 ### Output Columns
 
 | Param name | Type   | Default    | Description       |
-| :--------- | :----- | :--------- | :---------------- |
+|:-----------|:-------|:-----------|:------------------|
 | outputCol  | Vector | `"output"` | Assembled vector. |
 
 ### Parameters
 
-| Key           | Default                          | Type   | Required | 
Description                         |
-| ------------- | -------------------------------- | ------ | -------- | 
----------------------------------- |
-| inputCols     | `null`                           | String | yes      | Input 
column names.                 |
-| outputCol     | `"output"`                       | String | No       | 
Output column name.                 |
-| handleInvalid | `HasHandleInvalid.ERROR_INVALID` | String | No       | 
Strategy to handle invalid entries. |
+| Key           | Default    | Type     | Required | Description               
                                                     |
+|---------------|------------|----------|----------|--------------------------------------------------------------------------------|
+| inputCols     | `null`     | String[] | yes      | Input column names.       
                                                     |
+| outputCol     | `"output"` | String   | no       | Output column name.       
                                                     |
+| handleInvalid | `"error"`  | String   | no       | Strategy to handle 
invalid entries. Supported values: 'error', 'skip', 'keep'. |
 
 ### Examples
 
diff --git a/docs/content/docs/operators/feature/vectorslicer.md 
b/docs/content/docs/operators/feature/vectorslicer.md
new file mode 100644
index 0000000..66a921b
--- /dev/null
+++ b/docs/content/docs/operators/feature/vectorslicer.md
@@ -0,0 +1,158 @@
+---
+title: "Vector Slicer"
+weight: 1
+type: docs
+aliases:
+- /operators/feature/vectorslicer.html
+---
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+## Vector Slicer
+
+Vector Slicer transforms a vector to a new feature, which is a sub-array of 
the original
+feature. It is useful for extracting features from a given vector.
+
+Note that duplicate features are not allowed, so there can be no overlap 
between selected
+indices. If the max value of the indices is greater than the size of the input 
vector, 
+it throws an IllegalArgumentException.
+
+### Input Columns
+
+| Param name | Type   | Default   | Description          |
+|:-----------|:-------|:----------|:---------------------|
+| inputCol   | Vector | `"input"` | Vector to be sliced. |
+
+### Output Columns
+
+| Param name | Type   | Default    | Description    |
+|:-----------|:-------|:-----------|:---------------|
+| outputCol  | Vector | `"output"` | Sliced vector. |
+
+### Parameters
+
+| Key       | Default    | Type      | Required | Description                  
                                 |
+|-----------|------------|-----------|----------|---------------------------------------------------------------|
+| inputCol  | `"input"`  | String    | no       | Input column name.           
                                 |
+| outputCol | `"output"` | String    | no       | Output column name.          
                                 |
+| indices   | `null`     | Integer[] | yes      | An array of indices to 
select features from a vector column.  |
+### Examples
+
+{{< tabs examples >}}
+
+{{< tab "Java">}}
+
+```java
+import org.apache.flink.ml.feature.vectorslicer.VectorSlicer;
+import org.apache.flink.ml.linalg.Vector;
+import org.apache.flink.ml.linalg.Vectors;
+import org.apache.flink.streaming.api.datastream.DataStream;
+import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
+import org.apache.flink.table.api.Table;
+import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;
+import org.apache.flink.types.Row;
+import org.apache.flink.util.CloseableIterator;
+
+/** Simple program that creates a VectorSlicer instance and uses it for 
feature engineering. */
+public class VectorSlicerExample {
+    public static void main(String[] args) {
+        StreamExecutionEnvironment env = 
StreamExecutionEnvironment.getExecutionEnvironment();
+        StreamTableEnvironment tEnv = StreamTableEnvironment.create(env);
+
+        // Generates input data.
+        DataStream<Row> inputStream =
+                env.fromElements(
+                        Row.of(Vectors.dense(2.1, 3.1, 1.2, 3.1, 4.6)),
+                        Row.of(Vectors.dense(1.2, 3.1, 4.6, 2.1, 3.1)));
+        Table inputTable = tEnv.fromDataStream(inputStream).as("vec");
+
+        // Creates a VectorSlicer object and initializes its parameters.
+        VectorSlicer vectorSlicer =
+                new VectorSlicer().setInputCol("vec").setIndices(1, 2, 
3).setOutputCol("slicedVec");
+
+        // Uses the VectorSlicer object for feature transformations.
+        Table outputTable = vectorSlicer.transform(inputTable)[0];
+
+        // Extracts and displays the results.
+        for (CloseableIterator<Row> it = outputTable.execute().collect(); 
it.hasNext(); ) {
+            Row row = it.next();
+
+            Vector inputValue = (Vector) 
row.getField(vectorSlicer.getInputCol());
+
+            Vector outputValue = (Vector) 
row.getField(vectorSlicer.getOutputCol());
+
+            System.out.printf("Input Value: %s \tOutput Value: %s\n", 
inputValue, outputValue);
+        }
+    }
+}
+
+```
+
+{{< /tab>}}
+
+{{< tab "Python">}}
+
+```python
+# Simple program that creates a VectorSlicer instance and uses it for feature
+# engineering.
+
+from pyflink.common import Types
+from pyflink.datastream import StreamExecutionEnvironment
+from pyflink.ml.core.linalg import Vectors, DenseVectorTypeInfo
+from pyflink.ml.lib.feature.vectorslicer import VectorSlicer
+from pyflink.table import StreamTableEnvironment
+
+# create a new StreamExecutionEnvironment
+env = StreamExecutionEnvironment.get_execution_environment()
+
+# create a StreamTableEnvironment
+t_env = StreamTableEnvironment.create(env)
+
+# generate input data
+input_data_table = t_env.from_data_stream(
+    env.from_collection([
+        (1, Vectors.dense(2.1, 3.1, 1.2, 2.1)),
+        (2, Vectors.dense(2.3, 2.1, 1.3, 1.2)),
+    ],
+        type_info=Types.ROW_NAMED(
+            ['id', 'vec'],
+            [Types.INT(), DenseVectorTypeInfo()])))
+
+# create a vector slicer object and initialize its parameters
+vector_slicer = VectorSlicer() \
+    .set_input_col('vec') \
+    .set_indices(1, 2, 3) \
+    .set_output_col('sub_vec')
+
+# use the vector slicer model for feature engineering
+output = vector_slicer.transform(input_data_table)[0]
+
+# extract and display the results
+field_names = output.get_schema().get_field_names()
+for result in t_env.to_data_stream(output).execute_and_collect():
+    input_value = result[field_names.index(vector_slicer.get_input_col())]
+    output_value = result[field_names.index(vector_slicer.get_output_col())]
+    print('Input Value: ' + str(input_value) + '\tOutput Value: ' + 
str(output_value))
+
+```
+
+{{< /tab>}}
+
+{{< /tabs>}}
diff --git a/docs/content/docs/operators/regression/linearregression.md 
b/docs/content/docs/operators/regression/linearregression.md
index ff8c5e9..596b927 100644
--- a/docs/content/docs/operators/regression/linearregression.md
+++ b/docs/content/docs/operators/regression/linearregression.md
@@ -31,17 +31,17 @@ between a scalar response and one or more explanatory 
variables.
 
 ### Input Columns
 
-| Param name  | Type    | Default      | Description      |
-| :---------- | :------ | :----------- | :--------------- |
-| featuresCol | Vector  | `"features"` | Feature vector   |
-| labelCol    | Integer | `"label"`    | Label to predict |
-| weightCol   | Double  | `"weight"`   | Weight of sample |
+| Param name  | Type    | Default      | Description       |
+| :---------- | :------ | :----------- |:------------------|
+| featuresCol | Vector  | `"features"` | Feature vector.   |
+| labelCol    | Integer | `"label"`    | Label to predict. |
+| weightCol   | Double  | `"weight"`   | Weight of sample. |
 
 ### Output Columns
 
-| Param name    | Type    | Default        | Description                  |
-| :------------ | :------ | :------------- | :--------------------------- |
-| predictionCol | Integer | `"prediction"` | Label of the max probability |
+| Param name    | Type    | Default        | Description                   |
+| :------------ | :------ | :------------- |:------------------------------|
+| predictionCol | Integer | `"prediction"` | Label of the max probability. |
 
 ### Parameters

[flink-ml] branch master updated: [FLINK-29171] Add documents for MaxAbs, FeatureHasher, Interaction, VectorSlicer, ElementwiseProduct and Binarizer

Reply via email to