This is an automated email from the ASF dual-hosted git repository. hxb pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/flink-ml.git
commit c8177c73c1b294baefbf63cdd9a247c4f1659d7b
Author: yunfengzhou-hub <[email protected]>
AuthorDate: Mon Aug 29 19:02:43 2022 +0800

    [FLINK-29115] Improve Python quickstart document

    This closes #149.
---
 README.md | 2 +-
 docs/content/_index.md | 2 +-
 docs/content/docs/development/building.md | 100 +++++++++
 docs/content/docs/operators/classification/knn.md | 6 -
 .../docs/operators/classification/linearsvc.md | 6 -
 .../operators/classification/logisticregression.md | 6 -
 .../docs/operators/classification/naivebayes.md | 6 -
 docs/content/docs/operators/clustering/kmeans.md | 6 -
 .../evaluation/binaryclassificationevaluator.md | 6 -
 docs/content/docs/operators/feature/bucketizer.md | 6 -
 .../content/docs/operators/feature/minmaxscaler.md | 6 -
 .../docs/operators/feature/onehotencoder.md | 6 -
 .../docs/operators/feature/standardscaler.md | 6 -
 .../docs/operators/feature/stringindexer.md | 12 --
 .../docs/operators/feature/vectorassembler.md | 6 -
 .../docs/operators/regression/linearregression.md | 6 -
 docs/content/docs/try-flink-ml/java/_index.md | 23 ++
 .../{ => java}/build-your-own-project.md | 2 +-
 .../docs/try-flink-ml/{ => java}/quick-start.md | 2 +-
 docs/content/docs/try-flink-ml/python/_index.md | 23 ++
 .../try-flink-ml/python/build-your-own-project.md | 232 +++++++++++++++++++++
 .../docs/try-flink-ml/python/quick-start.md | 94 +++++++++
 .../examples/ml/classification/knn_example.py | 6 -
 .../ml/classification/linearsvc_example.py | 6 -
 .../classification/logisticregression_example.py | 6 -
 .../ml/classification/naivebayes_example.py | 6 -
 .../examples/ml/clustering/kmeans_example.py | 6 -
 .../binaryclassificationevaluator_example.py | 6 -
 .../examples/ml/feature/bucketizer_example.py | 6 -
 .../pyflink/examples/ml/feature/dct_example.py | 8 +-
 .../ml/feature/elementwiseproduct_example.py | 6 -
 .../examples/ml/feature/featurehasher_example.py | 6 -
 .../examples/ml/feature/hashingtf_example.py | 6 -
 .../ml/feature/indextostringmodel_example.py | 6 -
 .../examples/ml/feature/interaction_example.py | 6 -
 .../ml/feature/kbinsdiscreteizer_example.py | 6 -
 .../examples/ml/feature/maxabsscaler_example.py | 6 -
 .../examples/ml/feature/minmaxscaler_example.py | 6 -
 .../examples/ml/feature/onehotencoder_example.py | 6 -
 .../examples/ml/feature/regextokenizer_example.py | 6 -
 .../examples/ml/feature/standardscaler_example.py | 6 -
 .../examples/ml/feature/stringindexer_example.py | 6 -
 .../examples/ml/feature/tokenizer_example.py | 6 -
 .../examples/ml/feature/vectorassembler_example.py | 6 -
 .../examples/ml/feature/vectorindexer_example.py | 6 -
 .../examples/ml/feature/vectorslicer_example.py | 6 -
 .../ml/regression/linearregression_example.py | 6 -
 47 files changed, 477 insertions(+), 239 deletions(-)

diff --git a/README.md b/README.md
index 41ea744..b3f89a4 100644
--- a/README.md
+++ b/README.md
@@ -9,7 +9,7 @@ Flink](https://flink.apache.org/).
 ## <a name="start"></a>Getting Started
 You can follow this [quick
-start](https://nightlies.apache.org/flink/flink-ml-docs-master/docs/try-flink-ml/quick-start/)
+start](https://nightlies.apache.org/flink/flink-ml-docs-master/docs/try-flink-ml/java/quick-start/)
 guideline to get hands-on experience with Flink ML.
 ## <a name="build"></a>Building the Project
diff --git a/docs/content/_index.md b/docs/content/_index.md
index e7afa36..b64b14b 100644
--- a/docs/content/_index.md
+++ b/docs/content/_index.md
@@ -33,7 +33,7 @@ build ML pipelines for both training and inference jobs.
 ## Try Flink ML
 If you’re interested in playing around with Flink ML, check out our [quick
-start]({{< ref "docs/try-flink-ml/quick-start" >}}). It provides a simple
+start]({{< ref "docs/try-flink-ml/java/quick-start" >}}). It provides a simple
 example to submit and execute a Flink ML job on a Flink cluster.
 <--->
diff --git a/docs/content/docs/development/building.md b/docs/content/docs/development/building.md
new file mode 100644
index 0000000..a91f7a1
--- /dev/null
+++ b/docs/content/docs/development/building.md
@@ -0,0 +1,100 @@
+---
+title: "Building Flink ML from Source"
+weight: 999
+type: docs
+aliases:
+- /development/building.html
+
+---
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Building Flink ML from Source
+
+This page covers how to build Flink ML from sources.
+
+## Build Flink ML Java SDK
+
+In order to build Flink ML you need the source code. Either [download the source
+of a release](https://flink.apache.org/downloads.html) or [clone the git
+repository](https://github.com/apache/flink-ml.git).
+
+In addition, you need **Maven 3** and a **JDK** (Java Development Kit). Flink ML
+requires **at least Java 8** to build.
+
+To clone from git, enter:
+
+```bash
+git clone https://github.com/apache/flink-ml.git
+```
+
+The simplest way of building Flink ML is by running:
+
+```bash
+mvn clean install -DskipTests
+```
+
+This instructs [Maven](http://maven.apache.org/) (`mvn`) to first remove all
+existing builds (`clean`) and then create a new Flink binary (`install`).
+
+After the build finishes, you can acquire the build result in the following path
+from the root directory of Flink ML:
+
+```
+./flink-ml-dist/target/flink-ml-*-bin/flink-ml*/
+```
+
+## Build Flink ML Python SDK
+
+### Prerequisites
+
+1. Building Flink ML Java SDK
+
+   If you want to build Flink ML's Python SDK that can be used for pip
+   installation, you must first build the Java SDK, as described in the section
+   above.
+
+2. Python version(3.6, 3.7, or 3.8) is required
+   ```shell
+   $ python --version
+   # the version printed here must be 3.6, 3.7 or 3.8
+   ```
+
+3. Install the dependencies with the following command:
+   ```shell
+   $ python -m pip install -r flink-ml-python/dev/dev-requirements.txt
+   ```
+
+### Installation
+
+Then go to the root directory of Flink ML source code and run this command to
+build the sdist package of `apache-flink-ml`:
+
+```shell
+cd flink-ml-python; python setup.py sdist; cd ..;
+```
+
+The sdist package of `apache-flink-ml` will be found under
+`./flink-ml-python/dist/`. It could be installed as follows:
+
+```shell
+python -m pip install flink-ml-python/dist/*.tar.gz
+```
+
diff --git a/docs/content/docs/operators/classification/knn.md b/docs/content/docs/operators/classification/knn.md
index 0a980f5..5af83f3 100644
--- a/docs/content/docs/operators/classification/knn.md
+++ b/docs/content/docs/operators/classification/knn.md
@@ -141,12 +141,6 @@ public class KnnExample {
 {{< tab "Python">}}
 ```python
 # Simple program that trains a Knn model and uses it for classification.
-#
-# Before executing this program, please make sure you have followed Flink ML's
-# quick start guideline to set up Flink ML and Flink environment. The guideline
-# can be found at
-#
-# https://nightlies.apache.org/flink/flink-ml-docs-master/docs/try-flink-ml/quick-start/
 from pyflink.common import Types
 from pyflink.datastream import StreamExecutionEnvironment
diff --git a/docs/content/docs/operators/classification/linearsvc.md b/docs/content/docs/operators/classification/linearsvc.md
index e5ddb39..c2530d5 100644
--- a/docs/content/docs/operators/classification/linearsvc.md
+++ b/docs/content/docs/operators/classification/linearsvc.md
@@ -139,12 +139,6 @@ public class LinearSVCExample {
 ```python
 # Simple program that trains a LinearSVC model and uses it for classification.
-#
-# Before executing this program, please make sure you have followed Flink ML's
-# quick start guideline to set up Flink ML and Flink environment. The guideline
-# can be found at
-#
-# https://nightlies.apache.org/flink/flink-ml-docs-master/docs/try-flink-ml/quick-start/
 from pyflink.common import Types
 from pyflink.datastream import StreamExecutionEnvironment
diff --git a/docs/content/docs/operators/classification/logisticregression.md b/docs/content/docs/operators/classification/logisticregression.md
index 298e1fa..26818e3 100644
--- a/docs/content/docs/operators/classification/logisticregression.md
+++ b/docs/content/docs/operators/classification/logisticregression.md
@@ -134,12 +134,6 @@ public class LogisticRegressionExample {
 ```python
 # Simple program that trains a LogisticRegression model and uses it for
 # classification.
-#
-# Before executing this program, please make sure you have followed Flink ML's
-# quick start guideline to set up Flink ML and Flink environment. The guideline
-# can be found at
-#
-# https://nightlies.apache.org/flink/flink-ml-docs-master/docs/try-flink-ml/quick-start/
 from pyflink.common import Types
 from pyflink.datastream import StreamExecutionEnvironment
diff --git a/docs/content/docs/operators/classification/naivebayes.md b/docs/content/docs/operators/classification/naivebayes.md
index df4a6e0..b913c3f 100644
--- a/docs/content/docs/operators/classification/naivebayes.md
+++ b/docs/content/docs/operators/classification/naivebayes.md
@@ -130,12 +130,6 @@ public class NaiveBayesExample {
 ```python
 # Simple program that trains a NaiveBayes model and uses it for classification.
-#
-# Before executing this program, please make sure you have followed Flink ML's
-# quick start guideline to set up Flink ML and Flink environment. The guideline
-# can be found at
-#
-# https://nightlies.apache.org/flink/flink-ml-docs-master/docs/try-flink-ml/quick-start/
 from pyflink.common import Types
 from pyflink.datastream import StreamExecutionEnvironment
diff --git a/docs/content/docs/operators/clustering/kmeans.md b/docs/content/docs/operators/clustering/kmeans.md
index e62b582..0883729 100644
--- a/docs/content/docs/operators/clustering/kmeans.md
+++ b/docs/content/docs/operators/clustering/kmeans.md
@@ -118,12 +118,6 @@ public class KMeansExample {
 {{< tab "Python">}}
 ```python
 # Simple program that trains a KMeans model and uses it for clustering.
-#
-# Before executing this program, please make sure you have followed Flink ML's
-# quick start guideline to set up Flink ML and Flink environment. The guideline
-# can be found at
-#
-# https://nightlies.apache.org/flink/flink-ml-docs-master/docs/try-flink-ml/quick-start/
 from pyflink.common import Types
 from pyflink.datastream import StreamExecutionEnvironment
diff --git a/docs/content/docs/operators/evaluation/binaryclassificationevaluator.md b/docs/content/docs/operators/evaluation/binaryclassificationevaluator.md
index b63f9c1..a30189a 100644
--- a/docs/content/docs/operators/evaluation/binaryclassificationevaluator.md
+++ b/docs/content/docs/operators/evaluation/binaryclassificationevaluator.md
@@ -134,12 +134,6 @@ public class BinaryClassificationEvaluatorExample {
 ```python
 # Simple program that creates a BinaryClassificationEvaluator instance and uses
 # it for evaluation.
-#
-# Before executing this program, please make sure you have followed Flink ML's
-# quick start guideline to set up Flink ML and Flink environment. The guideline
-# can be found at
-#
-# https://nightlies.apache.org/flink/flink-ml-docs-master/docs/try-flink-ml/quick-start/
 from pyflink.common import Types
 from pyflink.datastream import StreamExecutionEnvironment
diff --git a/docs/content/docs/operators/feature/bucketizer.md b/docs/content/docs/operators/feature/bucketizer.md
index 929c6fb..9430691 100644
--- a/docs/content/docs/operators/feature/bucketizer.md
+++ b/docs/content/docs/operators/feature/bucketizer.md
@@ -124,12 +124,6 @@ public class BucketizerExample {
 ```python
 # Simple program that creates a Bucketizer instance and uses it for feature
 # engineering.
-#
-# Before executing this program, please make sure you have followed Flink ML's
-# quick start guideline to set up Flink ML and Flink environment. The guideline
-# can be found at
-#
-# https://nightlies.apache.org/flink/flink-ml-docs-master/docs/try-flink-ml/quick-start/
 from pyflink.common import Types
 from pyflink.datastream import StreamExecutionEnvironment
diff --git a/docs/content/docs/operators/feature/minmaxscaler.md b/docs/content/docs/operators/feature/minmaxscaler.md
index d3a5b81..8b1ff6c 100644
--- a/docs/content/docs/operators/feature/minmaxscaler.md
+++ b/docs/content/docs/operators/feature/minmaxscaler.md
@@ -119,12 +119,6 @@ public class MinMaxScalerExample {
 ```python
 # Simple program that trains a MinMaxScaler model and uses it for feature
 # engineering.
-#
-# Before executing this program, please make sure you have followed Flink ML's
-# quick start guideline to set up Flink ML and Flink environment. The guideline
-# can be found at
-#
-# https://nightlies.apache.org/flink/flink-ml-docs-master/docs/try-flink-ml/quick-start/
 from pyflink.common import Types
 from pyflink.datastream import StreamExecutionEnvironment
diff --git a/docs/content/docs/operators/feature/onehotencoder.md b/docs/content/docs/operators/feature/onehotencoder.md
index 2c3b8c9..53884f4 100644
--- a/docs/content/docs/operators/feature/onehotencoder.md
+++ b/docs/content/docs/operators/feature/onehotencoder.md
@@ -114,12 +114,6 @@ public class OneHotEncoderExample {
 ```python
 # Simple program that trains a OneHotEncoder model and uses it for feature
 # engineering.
-#
-# Before executing this program, please make sure you have followed Flink ML's
-# quick start guideline to set up Flink ML and Flink environment. The guideline
-# can be found at
-#
-# https://nightlies.apache.org/flink/flink-ml-docs-master/docs/try-flink-ml/quick-start/
 from pyflink.common import Row
 from pyflink.datastream import StreamExecutionEnvironment
diff --git a/docs/content/docs/operators/feature/standardscaler.md b/docs/content/docs/operators/feature/standardscaler.md
index a70344f..9bf17c1 100644
--- a/docs/content/docs/operators/feature/standardscaler.md
+++ b/docs/content/docs/operators/feature/standardscaler.md
@@ -110,12 +110,6 @@ public class StandardScalerExample {
 ```python
 # Simple program that trains a StandardScaler model and uses it for feature
 # engineering.
-#
-# Before executing this program, please make sure you have followed Flink ML's
-# quick start guideline to set up Flink ML and Flink environment. The guideline
-# can be found at
-#
-# https://nightlies.apache.org/flink/flink-ml-docs-master/docs/try-flink-ml/quick-start/
 from pyflink.common import Types
 from pyflink.datastream import StreamExecutionEnvironment
diff --git a/docs/content/docs/operators/feature/stringindexer.md b/docs/content/docs/operators/feature/stringindexer.md
index 68a0aeb..94096ba 100644
--- a/docs/content/docs/operators/feature/stringindexer.md
+++ b/docs/content/docs/operators/feature/stringindexer.md
@@ -147,12 +147,6 @@ public class StringIndexerExample {
 ```python
 # Simple program that trains a StringIndexer model and uses it for feature
 # engineering.
-#
-# Before executing this program, please make sure you have followed Flink ML's
-# quick start guideline to set up Flink ML and Flink environment. The guideline
-# can be found at
-#
-# https://nightlies.apache.org/flink/flink-ml-docs-master/docs/try-flink-ml/quick-start/
 from pyflink.common import Types
 from pyflink.datastream import StreamExecutionEnvironment
@@ -324,12 +318,6 @@ public class IndexToStringModelExample {
 ```python
 # Simple program that creates an IndexToStringModelExample instance and uses it
 # for feature engineering.
-#
-# Before executing this program, please make sure you have followed Flink ML's
-# quick start guideline to set up Flink ML and Flink environment. The guideline
-# can be found at
-#
-# https://nightlies.apache.org/flink/flink-ml-docs-master/docs/try-flink-ml/quick-start/
 from pyflink.common import Types
 from pyflink.datastream import StreamExecutionEnvironment
diff --git a/docs/content/docs/operators/feature/vectorassembler.md b/docs/content/docs/operators/feature/vectorassembler.md
index ab13213..f5af483 100644
--- a/docs/content/docs/operators/feature/vectorassembler.md
+++ b/docs/content/docs/operators/feature/vectorassembler.md
@@ -127,12 +127,6 @@ public class VectorAssemblerExample {
 ```python
 # Simple program that creates a VectorAssembler instance and uses it for feature
 # engineering.
-#
-# Before executing this program, please make sure you have followed Flink ML's
-# quick start guideline to set up Flink ML and Flink environment. The guideline
-# can be found at
-#
-# https://nightlies.apache.org/flink/flink-ml-docs-master/docs/try-flink-ml/quick-start/
 from pyflink.common import Types
 from pyflink.datastream import StreamExecutionEnvironment
diff --git a/docs/content/docs/operators/regression/linearregression.md b/docs/content/docs/operators/regression/linearregression.md
index 0b2f418..ff8c5e9 100644
--- a/docs/content/docs/operators/regression/linearregression.md
+++ b/docs/content/docs/operators/regression/linearregression.md
@@ -133,12 +133,6 @@ public class LinearRegressionExample {
 ```python
 # Simple program that trains a LinearRegression model and uses it for
 # regression.
-#
-# Before executing this program, please make sure you have followed Flink ML's
-# quick start guideline to set up Flink ML and Flink environment. The guideline
-# can be found at
-#
-# https://nightlies.apache.org/flink/flink-ml-docs-master/docs/try-flink-ml/quick-start/
 from pyflink.common import Types
 from pyflink.datastream import StreamExecutionEnvironment
diff --git a/docs/content/docs/try-flink-ml/java/_index.md b/docs/content/docs/try-flink-ml/java/_index.md
new file mode 100644
index 0000000..7a56525
--- /dev/null
+++ b/docs/content/docs/try-flink-ml/java/_index.md
@@ -0,0 +1,23 @@
+---
+title: Java
+bookCollapseSection: true
+weight: 1
+---
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
diff --git a/docs/content/docs/try-flink-ml/build-your-own-project.md b/docs/content/docs/try-flink-ml/java/build-your-own-project.md
similarity index 99%
rename from docs/content/docs/try-flink-ml/build-your-own-project.md
rename to docs/content/docs/try-flink-ml/java/build-your-own-project.md
index 84a811e..5525548 100644
--- a/docs/content/docs/try-flink-ml/build-your-own-project.md
+++ b/docs/content/docs/try-flink-ml/java/build-your-own-project.md
@@ -3,7 +3,7 @@ title: "Building your own Flink ML project"
 weight: 2
 type: docs
 aliases:
-- /try-flink-ml/building-your-own-project.html
+- /try-flink-ml/java/building-your-own-project.html
 ---
 <!--
 Licensed to the Apache Software Foundation (ASF) under one
diff --git a/docs/content/docs/try-flink-ml/quick-start.md b/docs/content/docs/try-flink-ml/java/quick-start.md
similarity index 98%
rename from docs/content/docs/try-flink-ml/quick-start.md
rename to docs/content/docs/try-flink-ml/java/quick-start.md
index aca287f..645a8ec 100644
--- a/docs/content/docs/try-flink-ml/quick-start.md
+++ b/docs/content/docs/try-flink-ml/java/quick-start.md
@@ -3,7 +3,7 @@ title: "Quick Start"
 weight: 1
 type: docs
 aliases:
-- /try-flink-ml/quick-start.html
+- /try-flink-ml/java/quick-start.html
 ---
 <!--
 Licensed to the Apache Software Foundation (ASF) under one
diff --git a/docs/content/docs/try-flink-ml/python/_index.md b/docs/content/docs/try-flink-ml/python/_index.md
new file mode 100644
index 0000000..f86e4cc
--- /dev/null
+++ b/docs/content/docs/try-flink-ml/python/_index.md
@@ -0,0 +1,23 @@
+---
+title: Python
+bookCollapseSection: true
+weight: 1
+---
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
diff --git a/docs/content/docs/try-flink-ml/python/build-your-own-project.md b/docs/content/docs/try-flink-ml/python/build-your-own-project.md
new file mode 100644
index 0000000..e13a687
--- /dev/null
+++ b/docs/content/docs/try-flink-ml/python/build-your-own-project.md
@@ -0,0 +1,232 @@
+---
+title: "Building your own Flink ML project"
+weight: 2
+type: docs
+aliases:
+- /try-flink-ml/python/building-your-own-project.html
+
+---
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Building your own Flink ML project
+
+This document provides a quick introduction to using Flink ML. Readers of this
+document will be guided to create a simple Flink job that trains a Machine
+Learning Model and use it to provide prediction service.
+
+## Prerequisites
+
+Python version (3.6, 3.7, or 3.8) is required for Flink ML. Please run the
+following command to make sure that it meets the requirements:
+
+```shell
+$ python --version
+# the version printed here must be 3.6, 3.7 or 3.8
+```
+
+## Installation of Flink ML Python SDK
+
+Flink ML Python SDK is available in
+[PyPi](https://pypi.org/project/apache-flink-ml/) and can be installed as
+follows:
+
+{{< stable >}}
+
+```bash
+$ python -m pip install apache-flink-ml=={{< version >}}
+```
+
+{{< /stable >}} {{< unstable >}}
+
+```bash
+$ python -m pip install apache-flink-ml
+```
+
+{{< /unstable >}}
+
+You can also build Flink ML Python SDK from sources by following the
+[development guide]({{< ref "docs/development/building" >}}).
+
+## Flink ML Example
+
+Kmeans is a widely-used clustering algorithm and has been supported by Flink ML.
+The example code below creates a Flink job with Flink ML that initializes and
+trains a Kmeans model, and finally uses it to predict the cluster id of certain
+data points.
+
+```python
+from pyflink.common import Types
+from pyflink.datastream import StreamExecutionEnvironment
+from pyflink.ml.core.linalg import Vectors, DenseVectorTypeInfo
+from pyflink.ml.lib.clustering.kmeans import KMeans
+from pyflink.table import StreamTableEnvironment
+
+# create a new StreamExecutionEnvironment
+env = StreamExecutionEnvironment.get_execution_environment()
+
+# create a StreamTableEnvironment
+t_env = StreamTableEnvironment.create(env)
+
+# generate input data
+input_data = t_env.from_data_stream(
+    env.from_collection([
+        (Vectors.dense([0.0, 0.0]),),
+        (Vectors.dense([0.0, 0.3]),),
+        (Vectors.dense([0.3, 3.0]),),
+        (Vectors.dense([9.0, 0.0]),),
+        (Vectors.dense([9.0, 0.6]),),
+        (Vectors.dense([9.6, 0.0]),),
+    ],
+        type_info=Types.ROW_NAMED(
+            ['features'],
+            [DenseVectorTypeInfo()])))
+
+# create a kmeans object and initialize its parameters
+kmeans = KMeans().set_k(2).set_seed(1)
+
+# train the kmeans model
+model = kmeans.fit(input_data)
+
+# use the kmeans model for predictions
+output = model.transform(input_data)[0]
+
+# extract and display the results
+field_names = output.get_schema().get_field_names()
+for result in t_env.to_data_stream(output).execute_and_collect():
+    features = result[field_names.index(kmeans.get_features_col())]
+    cluster_id = result[field_names.index(kmeans.get_prediction_col())]
+    print('Features: ' + str(features) + ' \tCluster Id: ' + str(cluster_id))
+```
+
+After placing the code above into your Python file and executing it, information
+like the below will be printed out to your terminal window.
+
+```
+Vector: [0.3, 0.0] Cluster ID: 1
+Vector: [9.6, 0.0] Cluster ID: 0
+Vector: [9.0, 0.6] Cluster ID: 0
+Vector: [0.0, 0.0] Cluster ID: 1
+Vector: [0.0, 0.3] Cluster ID: 1
+Vector: [9.0, 0.0] Cluster ID: 0
+```
+
+## Breaking Down The Code
+
+### The Execution Environment
+
+The first lines set up the `StreamExecutionEnvironment` to execute the Flink ML
+job. You would have been familiar with this concept if you have experience using
+Flink. For the example program in this document, a simple
+`StreamExecutionEnvironment` without specific configurations would be enough.
+
+Given that Flink ML uses Flink's Table API, a `StreamTableEnvironment` would
+also be necessary for the following program.
+
+```python
+# create a new StreamExecutionEnvironment
+env = StreamExecutionEnvironment.get_execution_environment()
+
+# create a StreamTableEnvironment
+t_env = StreamTableEnvironment.create(env)
+```
+
+### Creating Training & Inference Data Table
+
+Then the program creates the Table containing data for the training and
+prediction process of the following Kmeans algorithm. Flink ML operators search
+the names of the columns of the input table for input data, and produce
+prediction results to designated column of the output Table.
+
+```python
+# generate input data
+input_data = t_env.from_data_stream(
+    env.from_collection([
+        (Vectors.dense([0.0, 0.0]),),
+        (Vectors.dense([0.0, 0.3]),),
+        (Vectors.dense([0.3, 3.0]),),
+        (Vectors.dense([9.0, 0.0]),),
+        (Vectors.dense([9.0, 0.6]),),
+        (Vectors.dense([9.6, 0.0]),),
+    ],
+        type_info=Types.ROW_NAMED(
+            ['features'],
+            [DenseVectorTypeInfo()])))
+```
+
+### Creating, Configuring, Training & Using Kmeans
+
+Flink ML classes for Kmeans algorithm include `KMeans` and `KMeansModel`.
+`KMeans` implements the training process of Kmeans algorithm based on the
+provided training data, and finally generates a `KMeansModel`.
+`KmeansModel.transform()` method encodes the Transformation logic of this
+algorithm and is used for predictions.
+
+Both `KMeans` and `KMeansModel` provides getter/setter methods for Kmeans
+algorithm's configuration parameters. The example program explicitly sets the
+following parameters, and other configuration parameters will have their default
+values used.
+
+- `k`, the number of clusters to create
+- `seed`, the random seed to initialize cluster centers
+
+When the program invokes `KMeans.fit()` to generate a `KMeansModel`, the
+`KMeansModel` will inherit the `KMeans` object's configuration parameters. Thus
+it is supported to set `KMeansModel`'s parameters directly in `KMeans` object.
+
+```python
+# create a kmeans object and initialize its parameters
+kmeans = KMeans().set_k(2).set_seed(1)
+
+# train the kmeans model
+model = kmeans.fit(input_data)
+
+# use the kmeans model for predictions
+output = model.transform(input_data)[0]
+
+```
+
+### Collecting Prediction Result
+
+Like all other Flink programs, the codes described in the sections above only
+configures the computation graph of a Flink job, and the program only evaluates
+the computation logic and collects outputs after the `execute()` method is
+invoked. Collected outputs from the output table would be `Row`s in which
+`featuresCol` contains input feature vectors, and `predictionCol` contains
+output prediction results, i.e., cluster IDs.
+
+```python
+# extract and display the results
+field_names = output.get_schema().get_field_names()
+for result in t_env.to_data_stream(output).execute_and_collect():
+    features = result[field_names.index(kmeans.get_features_col())]
+    cluster_id = result[field_names.index(kmeans.get_prediction_col())]
+    print('Features: ' + str(features) + ' \tCluster Id: ' + str(cluster_id))
+```
+
+```
+Features: [9.6,0.0] Cluster Id: 0
+Features: [9.0,0.6] Cluster Id: 0
+Features: [0.0,0.3] Cluster Id: 1
+Features: [0.0,0.0] Cluster Id: 1
+Features: [0.3,3.0] Cluster Id: 1
+Features: [9.0,0.0] Cluster Id: 0
+```
+
diff --git a/docs/content/docs/try-flink-ml/python/quick-start.md b/docs/content/docs/try-flink-ml/python/quick-start.md
new file mode 100644
index 0000000..bb632fa
--- /dev/null
+++ b/docs/content/docs/try-flink-ml/python/quick-start.md
@@ -0,0 +1,94 @@
+---
+title: "Quick Start"
+weight: 1
+type: docs
+aliases:
+- /try-flink-ml/python/quick-start.html
+
+---
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Quick Start
+
+This document provides a quick introduction to using Flink ML. Readers of this
+document will be guided to submit a simple Flink job that trains a Machine
+Learning Model and use it to provide prediction service.
+
+## Prerequisites
+
+Python version (3.6, 3.7, or 3.8) is required for Flink ML. Please run the
+following command to make sure that it meets the requirements:
+
+```shell
+$ python --version
+# the version printed here must be 3.6, 3.7 or 3.8
+```
+
+## Installation of Flink ML Python SDK
+
+Flink ML Python SDK is available in
+[PyPi](https://pypi.org/project/apache-flink-ml/) and can be installed as
+follows:
+
+{{< stable >}}
+
+```bash
+$ python -m pip install apache-flink-ml=={{< version >}}
+```
+
+{{< /stable >}} {{< unstable >}}
+
+```bash
+$ python -m pip install apache-flink-ml
+```
+
+{{< /unstable >}}
+
+You can also build Flink ML Python SDK from sources by following the
+[development guide]({{< ref "docs/development/building" >}}).
+
+## Run Flink ML example job
+
+After setting up Flink ML Python SDK, you can run a Flink ML example job as
+follows.
+
+```shell
+$ python -m pyflink.examples.ml.clustering.kmeans_example
+```
+
+The command above would create a Flink mini-cluster and execute Flink ML’s
+`kmeans_example` job. There are also example jobs for other Flink ML algorithms,
+and you can find them in `pyflink.ml.examples` module.
+
+A sample output in your terminal is as follows.
+
+```
+Features: [9.6,0.0] Cluster Id: 0
+Features: [9.0,0.6] Cluster Id: 0
+Features: [0.0,0.3] Cluster Id: 1
+Features: [0.0,0.0] Cluster Id: 1
+Features: [0.3,3.0] Cluster Id: 1
+Features: [9.0,0.0] Cluster Id: 0
+
+```
+
+Now you have successfully run a Flink ML job.
+ diff --git a/flink-ml-python/pyflink/examples/ml/classification/knn_example.py b/flink-ml-python/pyflink/examples/ml/classification/knn_example.py index 2d6c5a3..008aad5 100644 --- a/flink-ml-python/pyflink/examples/ml/classification/knn_example.py +++ b/flink-ml-python/pyflink/examples/ml/classification/knn_example.py @@ -17,12 +17,6 @@ ################################################################################ # Simple program that trains a Knn model and uses it for classification. -# -# Before executing this program, please make sure you have followed Flink ML's -# quick start guideline to set up Flink ML and Flink environment. The guideline -# can be found at -# -# https://nightlies.apache.org/flink/flink-ml-docs-master/docs/try-flink-ml/quick-start/ from pyflink.common import Types from pyflink.datastream import StreamExecutionEnvironment diff --git a/flink-ml-python/pyflink/examples/ml/classification/linearsvc_example.py b/flink-ml-python/pyflink/examples/ml/classification/linearsvc_example.py index d48a0fd..a877f19 100644 --- a/flink-ml-python/pyflink/examples/ml/classification/linearsvc_example.py +++ b/flink-ml-python/pyflink/examples/ml/classification/linearsvc_example.py @@ -17,12 +17,6 @@ ################################################################################ # Simple program that trains a LinearSVC model and uses it for classification. -# -# Before executing this program, please make sure you have followed Flink ML's -# quick start guideline to set up Flink ML and Flink environment. 
The guideline -# can be found at -# -# https://nightlies.apache.org/flink/flink-ml-docs-master/docs/try-flink-ml/quick-start/ from pyflink.common import Types from pyflink.datastream import StreamExecutionEnvironment diff --git a/flink-ml-python/pyflink/examples/ml/classification/logisticregression_example.py b/flink-ml-python/pyflink/examples/ml/classification/logisticregression_example.py index 45a54e3..3a14a71 100644 --- a/flink-ml-python/pyflink/examples/ml/classification/logisticregression_example.py +++ b/flink-ml-python/pyflink/examples/ml/classification/logisticregression_example.py @@ -18,12 +18,6 @@ # Simple program that trains a LogisticRegression model and uses it for # classification. -# -# Before executing this program, please make sure you have followed Flink ML's -# quick start guideline to set up Flink ML and Flink environment. The guideline -# can be found at -# -# https://nightlies.apache.org/flink/flink-ml-docs-master/docs/try-flink-ml/quick-start/ from pyflink.common import Types from pyflink.datastream import StreamExecutionEnvironment diff --git a/flink-ml-python/pyflink/examples/ml/classification/naivebayes_example.py b/flink-ml-python/pyflink/examples/ml/classification/naivebayes_example.py index ebf9b80..90786d1 100644 --- a/flink-ml-python/pyflink/examples/ml/classification/naivebayes_example.py +++ b/flink-ml-python/pyflink/examples/ml/classification/naivebayes_example.py @@ -17,12 +17,6 @@ ################################################################################ # Simple program that trains a NaiveBayes model and uses it for classification. -# -# Before executing this program, please make sure you have followed Flink ML's -# quick start guideline to set up Flink ML and Flink environment. 
The guideline -# can be found at -# -# https://nightlies.apache.org/flink/flink-ml-docs-master/docs/try-flink-ml/quick-start/ from pyflink.common import Types from pyflink.datastream import StreamExecutionEnvironment diff --git a/flink-ml-python/pyflink/examples/ml/clustering/kmeans_example.py b/flink-ml-python/pyflink/examples/ml/clustering/kmeans_example.py index 857863d..b5eb7f0 100644 --- a/flink-ml-python/pyflink/examples/ml/clustering/kmeans_example.py +++ b/flink-ml-python/pyflink/examples/ml/clustering/kmeans_example.py @@ -17,12 +17,6 @@ ################################################################################ # Simple program that trains a KMeans model and uses it for clustering. -# -# Before executing this program, please make sure you have followed Flink ML's -# quick start guideline to set up Flink ML and Flink environment. The guideline -# can be found at -# -# https://nightlies.apache.org/flink/flink-ml-docs-master/docs/try-flink-ml/quick-start/ from pyflink.common import Types from pyflink.datastream import StreamExecutionEnvironment diff --git a/flink-ml-python/pyflink/examples/ml/evaluation/binaryclassificationevaluator_example.py b/flink-ml-python/pyflink/examples/ml/evaluation/binaryclassificationevaluator_example.py index 99ebf53..3b39150 100644 --- a/flink-ml-python/pyflink/examples/ml/evaluation/binaryclassificationevaluator_example.py +++ b/flink-ml-python/pyflink/examples/ml/evaluation/binaryclassificationevaluator_example.py @@ -18,12 +18,6 @@ # Simple program that creates a BinaryClassificationEvaluator instance and uses # it for evaluation. -# -# Before executing this program, please make sure you have followed Flink ML's -# quick start guideline to set up Flink ML and Flink environment. 
The guideline -# can be found at -# -# https://nightlies.apache.org/flink/flink-ml-docs-master/docs/try-flink-ml/quick-start/ from pyflink.common import Types from pyflink.datastream import StreamExecutionEnvironment diff --git a/flink-ml-python/pyflink/examples/ml/feature/bucketizer_example.py b/flink-ml-python/pyflink/examples/ml/feature/bucketizer_example.py index f475b23..aeacd84 100644 --- a/flink-ml-python/pyflink/examples/ml/feature/bucketizer_example.py +++ b/flink-ml-python/pyflink/examples/ml/feature/bucketizer_example.py @@ -18,12 +18,6 @@ # Simple program that creates a Bucketizer instance and uses it for feature # engineering. -# -# Before executing this program, please make sure you have followed Flink ML's -# quick start guideline to set up Flink ML and Flink environment. The guideline -# can be found at -# -# https://nightlies.apache.org/flink/flink-ml-docs-master/docs/try-flink-ml/quick-start/ from pyflink.common import Types from pyflink.datastream import StreamExecutionEnvironment diff --git a/flink-ml-python/pyflink/examples/ml/feature/dct_example.py b/flink-ml-python/pyflink/examples/ml/feature/dct_example.py index 1ecff6d..31c1a92 100644 --- a/flink-ml-python/pyflink/examples/ml/feature/dct_example.py +++ b/flink-ml-python/pyflink/examples/ml/feature/dct_example.py @@ -16,14 +16,8 @@ # limitations under the License. ################################################################################ -# Simple program that creates a Bucketizer instance and uses it for feature +# Simple program that creates a DCT instance and uses it for feature # engineering. -# -# Before executing this program, please make sure you have followed Flink ML's -# quick start guideline to set up Flink ML and Flink environment. 
The guideline -# can be found at -# -# https://nightlies.apache.org/flink/flink-ml-docs-master/docs/try-flink-ml/quick-start/ from pyflink.common import Types from pyflink.datastream import StreamExecutionEnvironment diff --git a/flink-ml-python/pyflink/examples/ml/feature/elementwiseproduct_example.py b/flink-ml-python/pyflink/examples/ml/feature/elementwiseproduct_example.py index 9b893c4..a5cabcd 100644 --- a/flink-ml-python/pyflink/examples/ml/feature/elementwiseproduct_example.py +++ b/flink-ml-python/pyflink/examples/ml/feature/elementwiseproduct_example.py @@ -18,12 +18,6 @@ # Simple program that creates a ElementwiseProduct instance and uses it for feature # engineering. -# -# Before executing this program, please make sure you have followed Flink ML's -# quick start guideline to set up Flink ML and Flink environment. The guideline -# can be found at -# -# https://nightlies.apache.org/flink/flink-ml-docs-master/docs/try-flink-ml/quick-start/ from pyflink.common import Types from pyflink.datastream import StreamExecutionEnvironment diff --git a/flink-ml-python/pyflink/examples/ml/feature/featurehasher_example.py b/flink-ml-python/pyflink/examples/ml/feature/featurehasher_example.py index 582429c..67b50a7 100644 --- a/flink-ml-python/pyflink/examples/ml/feature/featurehasher_example.py +++ b/flink-ml-python/pyflink/examples/ml/feature/featurehasher_example.py @@ -18,12 +18,6 @@ # Simple program that creates a FeatureHasher instance and uses it for feature # engineering. -# -# Before executing this program, please make sure you have followed Flink ML's -# quick start guideline to set up Flink ML and Flink environment. 
The guideline -# can be found at -# -# https://nightlies.apache.org/flink/flink-ml-docs-master/docs/try-flink-ml/quick-start/ from pyflink.common import Types from pyflink.datastream import StreamExecutionEnvironment diff --git a/flink-ml-python/pyflink/examples/ml/feature/hashingtf_example.py b/flink-ml-python/pyflink/examples/ml/feature/hashingtf_example.py index 9352c81..50dce84 100644 --- a/flink-ml-python/pyflink/examples/ml/feature/hashingtf_example.py +++ b/flink-ml-python/pyflink/examples/ml/feature/hashingtf_example.py @@ -18,12 +18,6 @@ # Simple program that creates a VectorAssembler instance and uses it for feature # engineering. -# -# Before executing this program, please make sure you have followed Flink ML's -# quick start guideline to set up Flink ML and Flink environment. The guideline -# can be found at -# -# https://nightlies.apache.org/flink/flink-ml-docs-master/docs/try-flink-ml/quick-start/ from pyflink.common import Types from pyflink.datastream import StreamExecutionEnvironment diff --git a/flink-ml-python/pyflink/examples/ml/feature/indextostringmodel_example.py b/flink-ml-python/pyflink/examples/ml/feature/indextostringmodel_example.py index 9351d71..5d9b41a 100644 --- a/flink-ml-python/pyflink/examples/ml/feature/indextostringmodel_example.py +++ b/flink-ml-python/pyflink/examples/ml/feature/indextostringmodel_example.py @@ -18,12 +18,6 @@ # Simple program that creates an IndexToStringModelExample instance and uses it # for feature engineering. -# -# Before executing this program, please make sure you have followed Flink ML's -# quick start guideline to set up Flink ML and Flink environment. 
The guideline -# can be found at -# -# https://nightlies.apache.org/flink/flink-ml-docs-master/docs/try-flink-ml/quick-start/ from pyflink.common import Types from pyflink.datastream import StreamExecutionEnvironment diff --git a/flink-ml-python/pyflink/examples/ml/feature/interaction_example.py b/flink-ml-python/pyflink/examples/ml/feature/interaction_example.py index 8c7917c..0a9ee7d 100644 --- a/flink-ml-python/pyflink/examples/ml/feature/interaction_example.py +++ b/flink-ml-python/pyflink/examples/ml/feature/interaction_example.py @@ -18,12 +18,6 @@ # Simple program that creates a Interaction instance and uses it for feature # engineering. -# -# Before executing this program, please make sure you have followed Flink ML's -# quick start guideline to set up Flink ML and Flink environment. The guideline -# can be found at -# -# https://nightlies.apache.org/flink/flink-ml-docs-master/docs/try-flink-ml/quick-start/ from pyflink.common import Types from pyflink.datastream import StreamExecutionEnvironment diff --git a/flink-ml-python/pyflink/examples/ml/feature/kbinsdiscreteizer_example.py b/flink-ml-python/pyflink/examples/ml/feature/kbinsdiscreteizer_example.py index d33d613..412e6d9 100644 --- a/flink-ml-python/pyflink/examples/ml/feature/kbinsdiscreteizer_example.py +++ b/flink-ml-python/pyflink/examples/ml/feature/kbinsdiscreteizer_example.py @@ -18,12 +18,6 @@ # Simple program that trains a StringIndexer model and uses it for feature # engineering. -# -# Before executing this program, please make sure you have followed Flink ML's -# quick start guideline to set up Flink ML and Flink environment. 
The guideline -# can be found at -# -# https://nightlies.apache.org/flink/flink-ml-docs-master/docs/try-flink-ml/quick-start/ from pyflink.common import Types from pyflink.ml.core.linalg import Vectors, DenseVectorTypeInfo diff --git a/flink-ml-python/pyflink/examples/ml/feature/maxabsscaler_example.py b/flink-ml-python/pyflink/examples/ml/feature/maxabsscaler_example.py index f6c0008..c248702 100644 --- a/flink-ml-python/pyflink/examples/ml/feature/maxabsscaler_example.py +++ b/flink-ml-python/pyflink/examples/ml/feature/maxabsscaler_example.py @@ -18,12 +18,6 @@ # Simple program that trains a MaxAbsScaler model and uses it for feature # engineering. -# -# Before executing this program, please make sure you have followed Flink ML's -# quick start guideline to set up Flink ML and Flink environment. The guideline -# can be found at -# -# https://nightlies.apache.org/flink/flink-ml-docs-master/docs/try-flink-ml/quick-start/ from pyflink.common import Types from pyflink.datastream import StreamExecutionEnvironment diff --git a/flink-ml-python/pyflink/examples/ml/feature/minmaxscaler_example.py b/flink-ml-python/pyflink/examples/ml/feature/minmaxscaler_example.py index 5635140..d4d5bc2 100644 --- a/flink-ml-python/pyflink/examples/ml/feature/minmaxscaler_example.py +++ b/flink-ml-python/pyflink/examples/ml/feature/minmaxscaler_example.py @@ -18,12 +18,6 @@ # Simple program that trains a MinMaxScaler model and uses it for feature # engineering. -# -# Before executing this program, please make sure you have followed Flink ML's -# quick start guideline to set up Flink ML and Flink environment. 
The guideline -# can be found at -# -# https://nightlies.apache.org/flink/flink-ml-docs-master/docs/try-flink-ml/quick-start/ from pyflink.common import Types from pyflink.datastream import StreamExecutionEnvironment diff --git a/flink-ml-python/pyflink/examples/ml/feature/onehotencoder_example.py b/flink-ml-python/pyflink/examples/ml/feature/onehotencoder_example.py index 4dd8987..34ec752 100644 --- a/flink-ml-python/pyflink/examples/ml/feature/onehotencoder_example.py +++ b/flink-ml-python/pyflink/examples/ml/feature/onehotencoder_example.py @@ -18,12 +18,6 @@ # Simple program that trains a OneHotEncoder model and uses it for feature # engineering. -# -# Before executing this program, please make sure you have followed Flink ML's -# quick start guideline to set up Flink ML and Flink environment. The guideline -# can be found at -# -# https://nightlies.apache.org/flink/flink-ml-docs-master/docs/try-flink-ml/quick-start/ from pyflink.common import Row from pyflink.datastream import StreamExecutionEnvironment diff --git a/flink-ml-python/pyflink/examples/ml/feature/regextokenizer_example.py b/flink-ml-python/pyflink/examples/ml/feature/regextokenizer_example.py index 0a5b2a4..73c1da7 100644 --- a/flink-ml-python/pyflink/examples/ml/feature/regextokenizer_example.py +++ b/flink-ml-python/pyflink/examples/ml/feature/regextokenizer_example.py @@ -18,12 +18,6 @@ # Simple program that creates a VectorAssembler instance and uses it for feature # engineering. -# -# Before executing this program, please make sure you have followed Flink ML's -# quick start guideline to set up Flink ML and Flink environment. 
The guideline -# can be found at -# -# https://nightlies.apache.org/flink/flink-ml-docs-master/docs/try-flink-ml/quick-start/ from pyflink.common import Types from pyflink.datastream import StreamExecutionEnvironment diff --git a/flink-ml-python/pyflink/examples/ml/feature/standardscaler_example.py b/flink-ml-python/pyflink/examples/ml/feature/standardscaler_example.py index 5d0d40b..41c0021 100644 --- a/flink-ml-python/pyflink/examples/ml/feature/standardscaler_example.py +++ b/flink-ml-python/pyflink/examples/ml/feature/standardscaler_example.py @@ -18,12 +18,6 @@ # Simple program that trains a StandardScaler model and uses it for feature # engineering. -# -# Before executing this program, please make sure you have followed Flink ML's -# quick start guideline to set up Flink ML and Flink environment. The guideline -# can be found at -# -# https://nightlies.apache.org/flink/flink-ml-docs-master/docs/try-flink-ml/quick-start/ from pyflink.common import Types from pyflink.datastream import StreamExecutionEnvironment diff --git a/flink-ml-python/pyflink/examples/ml/feature/stringindexer_example.py b/flink-ml-python/pyflink/examples/ml/feature/stringindexer_example.py index f6d411d..5952cb4 100644 --- a/flink-ml-python/pyflink/examples/ml/feature/stringindexer_example.py +++ b/flink-ml-python/pyflink/examples/ml/feature/stringindexer_example.py @@ -18,12 +18,6 @@ # Simple program that trains a StringIndexer model and uses it for feature # engineering. -# -# Before executing this program, please make sure you have followed Flink ML's -# quick start guideline to set up Flink ML and Flink environment. 
The guideline -# can be found at -# -# https://nightlies.apache.org/flink/flink-ml-docs-master/docs/try-flink-ml/quick-start/ from pyflink.common import Types from pyflink.datastream import StreamExecutionEnvironment diff --git a/flink-ml-python/pyflink/examples/ml/feature/tokenizer_example.py b/flink-ml-python/pyflink/examples/ml/feature/tokenizer_example.py index b0f8308..05e56da 100644 --- a/flink-ml-python/pyflink/examples/ml/feature/tokenizer_example.py +++ b/flink-ml-python/pyflink/examples/ml/feature/tokenizer_example.py @@ -18,12 +18,6 @@ # Simple program that creates a VectorAssembler instance and uses it for feature # engineering. -# -# Before executing this program, please make sure you have followed Flink ML's -# quick start guideline to set up Flink ML and Flink environment. The guideline -# can be found at -# -# https://nightlies.apache.org/flink/flink-ml-docs-master/docs/try-flink-ml/quick-start/ from pyflink.common import Types from pyflink.datastream import StreamExecutionEnvironment diff --git a/flink-ml-python/pyflink/examples/ml/feature/vectorassembler_example.py b/flink-ml-python/pyflink/examples/ml/feature/vectorassembler_example.py index 7ae15ce..eb12679 100644 --- a/flink-ml-python/pyflink/examples/ml/feature/vectorassembler_example.py +++ b/flink-ml-python/pyflink/examples/ml/feature/vectorassembler_example.py @@ -18,12 +18,6 @@ # Simple program that creates a VectorAssembler instance and uses it for feature # engineering. -# -# Before executing this program, please make sure you have followed Flink ML's -# quick start guideline to set up Flink ML and Flink environment. 
The guideline -# can be found at -# -# https://nightlies.apache.org/flink/flink-ml-docs-master/docs/try-flink-ml/quick-start/ from pyflink.common import Types from pyflink.datastream import StreamExecutionEnvironment diff --git a/flink-ml-python/pyflink/examples/ml/feature/vectorindexer_example.py b/flink-ml-python/pyflink/examples/ml/feature/vectorindexer_example.py index a913444..f2bf9c1 100644 --- a/flink-ml-python/pyflink/examples/ml/feature/vectorindexer_example.py +++ b/flink-ml-python/pyflink/examples/ml/feature/vectorindexer_example.py @@ -18,12 +18,6 @@ # Simple program that trains a StringIndexer model and uses it for feature # engineering. -# -# Before executing this program, please make sure you have followed Flink ML's -# quick start guideline to set up Flink ML and Flink environment. The guideline -# can be found at -# -# https://nightlies.apache.org/flink/flink-ml-docs-master/docs/try-flink-ml/quick-start/ from pyflink.common import Types from pyflink.ml.core.linalg import Vectors, DenseVectorTypeInfo diff --git a/flink-ml-python/pyflink/examples/ml/feature/vectorslicer_example.py b/flink-ml-python/pyflink/examples/ml/feature/vectorslicer_example.py index af41715..ba233cc 100644 --- a/flink-ml-python/pyflink/examples/ml/feature/vectorslicer_example.py +++ b/flink-ml-python/pyflink/examples/ml/feature/vectorslicer_example.py @@ -18,12 +18,6 @@ # Simple program that creates a VectorSlicer instance and uses it for feature # engineering. -# -# Before executing this program, please make sure you have followed Flink ML's -# quick start guideline to set up Flink ML and Flink environment. 
The guideline -# can be found at -# -# https://nightlies.apache.org/flink/flink-ml-docs-master/docs/try-flink-ml/quick-start/ from pyflink.common import Types from pyflink.datastream import StreamExecutionEnvironment diff --git a/flink-ml-python/pyflink/examples/ml/regression/linearregression_example.py b/flink-ml-python/pyflink/examples/ml/regression/linearregression_example.py index cfd5b6e..cec6920 100644 --- a/flink-ml-python/pyflink/examples/ml/regression/linearregression_example.py +++ b/flink-ml-python/pyflink/examples/ml/regression/linearregression_example.py @@ -18,12 +18,6 @@ # Simple program that trains a LinearRegression model and uses it for # regression. -# -# Before executing this program, please make sure you have followed Flink ML's -# quick start guideline to set up Flink ML and Flink environment. The guideline -# can be found at -# -# https://nightlies.apache.org/flink/flink-ml-docs-master/docs/try-flink-ml/quick-start/ from pyflink.common import Types from pyflink.datastream import StreamExecutionEnvironment
