rszper commented on code in PR #27709:
URL: https://github.com/apache/beam/pull/27709#discussion_r1280925325


##########
website/www/site/content/en/documentation/transforms/python/overview.md:
##########
@@ -27,6 +27,7 @@ limitations under the License.
   <tr><td><a 
href="/documentation/transforms/python/elementwise/keys">Keys</a></td><td>Extracts
 the key from each element in a collection of key-value pairs.</td></tr>
   <tr><td><a 
href="/documentation/transforms/python/elementwise/kvswap">KvSwap</a></td><td>Swaps
 the key and value of each element in a collection of key-value pairs.</td></tr>
   <tr><td><a 
href="/documentation/transforms/python/elementwise/map">Map</a></td><td>Applies 
a function to every element in the input and outputs the result.</td></tr>
+  <tr><td><a 
href="/documentation/transforms/python/elementwise/mltransform">MLTransform</a></td><td>Applies
 Data processing transforms on the dataset.</td></tr>

Review Comment:
   ```suggestion
     <tr><td><a 
href="/documentation/transforms/python/elementwise/mltransform">MLTransform</a></td><td>Applies
 data processing transforms to the dataset.</td></tr>
   ```



##########
website/www/site/layouts/partials/section-menu/en/documentation.html:
##########
@@ -291,6 +291,7 @@
             <li><a 
href="/documentation/transforms/python/elementwise/keys/">Keys</a></li>
             <li><a 
href="/documentation/transforms/python/elementwise/kvswap/">KvSwap</a></li>
             <li><a 
href="/documentation/transforms/python/elementwise/map/">Map</a></li>
+            <li><a 
href="/documentation/transforms/python/elementwise/mltransform/">Map</a></li>

Review Comment:
   ```suggestion
               <li><a 
href="/documentation/transforms/python/elementwise/mltransform/">MLTransform</a></li>
   ```



##########
website/www/site/content/en/documentation/transforms/python/elementwise/mltransform.md:
##########
@@ -0,0 +1,111 @@
+---
+title: "MLTransform"
+---
+<!--
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+
+# MLTransform for data processing
+
+{{< localstorage language language-py >}}
+
+
+<table>
+  <tr>
+    <td>
+      <a>
+      {{< button-pydoc path="apache_beam.ml.transforms" class="MLTransform" >}}
+      </a>
+   </td>
+  </tr>
+</table>
+
+
+`MLTransform` is used to apply common machine learning processing tasks on 
keyed data. Apache Beam provides ML data processing transformations which can 
be used with `MLTransform`. You can find the full list of available data

Review Comment:
   ```suggestion
   Use `MLTransform` to apply common machine learning (ML) processing tasks on 
keyed data. Apache Beam provides ML data processing transformations that you 
can use with `MLTransform`. For the full list of available data
   ```



##########
website/www/site/content/en/documentation/transforms/python/elementwise/mltransform.md:
##########
@@ -0,0 +1,111 @@
+---
+title: "MLTransform"
+---
+<!--
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+
+# MLTransform for data processing
+
+{{< localstorage language language-py >}}
+
+
+<table>
+  <tr>
+    <td>
+      <a>
+      {{< button-pydoc path="apache_beam.ml.transforms" class="MLTransform" >}}
+      </a>
+   </td>
+  </tr>
+</table>
+
+
+`MLTransform` is used to apply common machine learning processing tasks on 
keyed data. Apache Beam provides ML data processing transformations which can 
be used with `MLTransform`. You can find the full list of available data
+processing transformations in the 
((GitHub)[https://github.com/apache/beam/blob/ab93fb1988051baac6c3b9dd1031f4d68bd9a149/sdks/python/apache_beam/ml/transforms/tft.py#L52])
 repository.

Review Comment:
   ```suggestion
   processing transformations, see the [tft.py 
file](https://github.com/apache/beam/blob/ab93fb1988051baac6c3b9dd1031f4d68bd9a149/sdks/python/apache_beam/ml/transforms/tft.py#L52)
 in GitHub.
   ```



##########
website/www/site/content/en/documentation/transforms/python/elementwise/mltransform.md:
##########
@@ -0,0 +1,111 @@
+---
+title: "MLTransform"
+---
+<!--
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+
+# MLTransform for data processing
+
+{{< localstorage language language-py >}}
+
+
+<table>
+  <tr>
+    <td>
+      <a>
+      {{< button-pydoc path="apache_beam.ml.transforms" class="MLTransform" >}}
+      </a>
+   </td>
+  </tr>
+</table>
+
+
+`MLTransform` is used to apply common machine learning processing tasks on 
keyed data. Apache Beam provides ML data processing transformations which can 
be used with `MLTransform`. You can find the full list of available data
+processing transformations in the 
((GitHub)[https://github.com/apache/beam/blob/ab93fb1988051baac6c3b9dd1031f4d68bd9a149/sdks/python/apache_beam/ml/transforms/tft.py#L52])
 repository.
+
+
+To define a data processing transformation using `MLTransform`, you need to 
create instances of data processing transforms with `columns` as input 
paramter. The data in the specified `columns` will be transformed and outputted 
in the `beam.Row` object.
+
+Let's look at an example

Review Comment:
   ```suggestion
   The following example demonstrates how to use `MLTransform` to normalize 
your data between 0 and 1 by using the minimum and maximum values from your 
entire dataset. `MLTransform` uses the `ScaleTo01` transformation.
   
   ```



##########
website/www/site/content/en/documentation/transforms/python/elementwise/mltransform.md:
##########
@@ -0,0 +1,111 @@
+---
+title: "MLTransform"
+---
+<!--
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+
+# MLTransform for data processing
+
+{{< localstorage language language-py >}}
+
+
+<table>
+  <tr>
+    <td>
+      <a>
+      {{< button-pydoc path="apache_beam.ml.transforms" class="MLTransform" >}}
+      </a>
+   </td>
+  </tr>
+</table>
+
+
+`MLTransform` is used to apply common machine learning processing tasks on 
keyed data. Apache Beam provides ML data processing transformations which can 
be used with `MLTransform`. You can find the full list of available data
+processing transformations in the 
((GitHub)[https://github.com/apache/beam/blob/ab93fb1988051baac6c3b9dd1031f4d68bd9a149/sdks/python/apache_beam/ml/transforms/tft.py#L52])
 repository.
+
+
+To define a data processing transformation using `MLTransform`, you need to 
create instances of data processing transforms with `columns` as input 
paramter. The data in the specified `columns` will be transformed and outputted 
in the `beam.Row` object.
+
+Let's look at an example
+
+```
+scale_to_z_score_transform = ScaleToZScore(columns=['x', 'y'])
+with beam.Pipeline() as p:
+  (data | 
MLTransform(artifact_location).with_transform(scale_to_z_score_transform))
+```
+
+In this example, `MLTransform` receives `artifact_location`. This location is 
used to store artifacts generated by the `MLTransform`. We will talk about the 
`artifacts` later. The data processing transform can be passed using the 
`with_transform` method of `MLTransform` or transforms can be passed as list to 
the MLTransform.

Review Comment:
   ```suggestion
   In this example, `MLTransform` receives a value for `artifact_location`. 
`MLTransform` then uses this location value to store artifacts generated by the 
transform. To pass the data processing transform, you can use either the 
`with_transform` method of `MLTransform` or a list.
   ```



##########
website/www/site/content/en/documentation/transforms/python/elementwise/mltransform.md:
##########
@@ -0,0 +1,111 @@
+---
+title: "MLTransform"
+---
+<!--
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+
+# MLTransform for data processing
+
+{{< localstorage language language-py >}}
+
+
+<table>
+  <tr>
+    <td>
+      <a>
+      {{< button-pydoc path="apache_beam.ml.transforms" class="MLTransform" >}}
+      </a>
+   </td>
+  </tr>
+</table>
+
+
+`MLTransform` is used to apply common machine learning processing tasks on 
keyed data. Apache Beam provides ML data processing transformations which can 
be used with `MLTransform`. You can find the full list of available data
+processing transformations in the 
((GitHub)[https://github.com/apache/beam/blob/ab93fb1988051baac6c3b9dd1031f4d68bd9a149/sdks/python/apache_beam/ml/transforms/tft.py#L52])
 repository.
+
+
+To define a data processing transformation using `MLTransform`, you need to 
create instances of data processing transforms with `columns` as input 
paramter. The data in the specified `columns` will be transformed and outputted 
in the `beam.Row` object.
+
+Let's look at an example
+
+```
+scale_to_z_score_transform = ScaleToZScore(columns=['x', 'y'])
+with beam.Pipeline() as p:
+  (data | 
MLTransform(artifact_location).with_transform(scale_to_z_score_transform))
+```
+
+In this example, `MLTransform` receives `artifact_location`. This location is 
used to store artifacts generated by the `MLTransform`. We will talk about the 
`artifacts` later. The data processing transform can be passed using the 
`with_transform` method of `MLTransform` or transforms can be passed as list to 
the MLTransform.
+
+```
+MLTransform(transforms=transforms, artifact_location=artifact_location)
+```
+
+All the transforms passed to `MLTransform` are applied sequentially on the 
dataset. `MLTransform` expects a keyed data (Link to the supported inputs).

Review Comment:
   ```suggestion
   The transforms passed to `MLTransform` are applied sequentially on the 
dataset. `MLTransform` expects keyed data (Link to the supported inputs).
   ```



##########
website/www/site/content/en/documentation/transforms/python/elementwise/mltransform.md:
##########
@@ -0,0 +1,111 @@
+---
+title: "MLTransform"
+---
+<!--
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+
+# MLTransform for data processing
+
+{{< localstorage language language-py >}}
+
+
+<table>
+  <tr>
+    <td>
+      <a>
+      {{< button-pydoc path="apache_beam.ml.transforms" class="MLTransform" >}}
+      </a>
+   </td>
+  </tr>
+</table>
+
+
+`MLTransform` is used to apply common machine learning processing tasks on 
keyed data. Apache Beam provides ML data processing transformations which can 
be used with `MLTransform`. You can find the full list of available data
+processing transformations in the 
((GitHub)[https://github.com/apache/beam/blob/ab93fb1988051baac6c3b9dd1031f4d68bd9a149/sdks/python/apache_beam/ml/transforms/tft.py#L52])
 repository.
+
+
+To define a data processing transformation using `MLTransform`, you need to 
create instances of data processing transforms with `columns` as input 
paramter. The data in the specified `columns` will be transformed and outputted 
in the `beam.Row` object.
+
+Let's look at an example
+
+```
+scale_to_z_score_transform = ScaleToZScore(columns=['x', 'y'])
+with beam.Pipeline() as p:
+  (data | 
MLTransform(artifact_location).with_transform(scale_to_z_score_transform))
+```
+
+In this example, `MLTransform` receives `artifact_location`. This location is 
used to store artifacts generated by the `MLTransform`. We will talk about the 
`artifacts` later. The data processing transform can be passed using the 
`with_transform` method of `MLTransform` or transforms can be passed as list to 
the MLTransform.
+
+```
+MLTransform(transforms=transforms, artifact_location=artifact_location)
+```
+
+All the transforms passed to `MLTransform` are applied sequentially on the 
dataset. `MLTransform` expects a keyed data (Link to the supported inputs).
+
+
+## Example 1
+
+In the example we will create a pipeline that uses MLTransform to scale the 
data between 0 and 1.

Review Comment:
   ```suggestion
   This example creates a pipeline that uses `MLTransform` to scale data 
between 0 and 1.
   The example takes a list of integers and scales them into the range of 0 to 1 
by using the `ScaleTo01` transform.
   ```
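
To make the intent of this wording concrete, here is a minimal plain-Python sketch of the min-max scaling that the suggestion describes. It is not the Beam or `ScaleTo01` implementation; the function name `scale_to_0_1` and the handling of a constant input are assumptions made only for illustration. It shows why the minimum and maximum of the whole dataset must be known before any single element can be scaled.

```python
def scale_to_0_1(values):
    """Min-max scale numbers into [0, 1] (illustrative sketch, not Beam code)."""
    # A full pass over the data is needed first to find the global min and max.
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # Assumption of this sketch only: map a constant dataset to 0.0.
        return [0.0 for _ in values]
    # Second pass: scale each element using the dataset-wide statistics.
    return [(v - lo) / span for v in values]


print(scale_to_0_1([1, 5, 3, 9]))  # [0.0, 0.5, 0.25, 1.0]
```

The two passes in the sketch correspond to the analyze-then-transform behavior that the page attributes to `MLTransform`.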



##########
website/www/site/content/en/documentation/transforms/python/elementwise/mltransform.md:
##########
@@ -0,0 +1,111 @@
+---
+title: "MLTransform"
+---
+<!--
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+
+# MLTransform for data processing
+
+{{< localstorage language language-py >}}
+
+
+<table>
+  <tr>
+    <td>
+      <a>
+      {{< button-pydoc path="apache_beam.ml.transforms" class="MLTransform" >}}
+      </a>
+   </td>
+  </tr>
+</table>
+
+
+`MLTransform` is used to apply common machine learning processing tasks on 
keyed data. Apache Beam provides ML data processing transformations which can 
be used with `MLTransform`. You can find the full list of available data
+processing transformations in the 
((GitHub)[https://github.com/apache/beam/blob/ab93fb1988051baac6c3b9dd1031f4d68bd9a149/sdks/python/apache_beam/ml/transforms/tft.py#L52])
 repository.
+
+
+To define a data processing transformation using `MLTransform`, you need to 
create instances of data processing transforms with `columns` as input 
paramter. The data in the specified `columns` will be transformed and outputted 
in the `beam.Row` object.
+
+Let's look at an example
+
+```
+scale_to_z_score_transform = ScaleToZScore(columns=['x', 'y'])
+with beam.Pipeline() as p:
+  (data | 
MLTransform(artifact_location).with_transform(scale_to_z_score_transform))
+```
+
+In this example, `MLTransform` receives `artifact_location`. This location is 
used to store artifacts generated by the `MLTransform`. We will talk about the 
`artifacts` later. The data processing transform can be passed using the 
`with_transform` method of `MLTransform` or transforms can be passed as list to 
the MLTransform.
+
+```
+MLTransform(transforms=transforms, artifact_location=artifact_location)
+```
+
+All the transforms passed to `MLTransform` are applied sequentially on the 
dataset. `MLTransform` expects a keyed data (Link to the supported inputs).
+
+
+## Example 1
+
+In the example we will create a pipeline that uses MLTransform to scale the 
data between 0 and 1.
+
+{{< highlight language="py" 
file="sdks/python/apache_beam/examples/snippets/transforms/elementwise/mltransform.py"
+  class="notebook-skip" >}}
+{{< code_sample 
"sdks/python/apache_beam/examples/snippets/transforms/elementwise/mltransform.py"
 mltransform_scale_to_0_1 >}}
+{{</ highlight >}}
+
+{{< paragraph class="notebook-skip" >}}
+Output:
+{{< /paragraph >}}
+{{< highlight class="notebook-skip" >}}
+{{< code_sample 
"sdks/python/apache_beam/examples/snippets/transforms/elementwise/mltransform_test.py"
 mltransform_scale_to_0_1 >}}
+{{< /highlight >}}
+
+
+The example takes a list of ints and converts them into the range of 0 to 1 
using the transform `ScaleTo01`.

Review Comment:
   Remove; I moved this content to the intro.



##########
website/www/site/content/en/documentation/transforms/python/elementwise/mltransform.md:
##########
@@ -0,0 +1,111 @@
+---
+title: "MLTransform"
+---
+<!--
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+
+# MLTransform for data processing
+
+{{< localstorage language language-py >}}
+
+
+<table>
+  <tr>
+    <td>
+      <a>
+      {{< button-pydoc path="apache_beam.ml.transforms" class="MLTransform" >}}
+      </a>
+   </td>
+  </tr>
+</table>
+
+
+`MLTransform` is used to apply common machine learning processing tasks on 
keyed data. Apache Beam provides ML data processing transformations which can 
be used with `MLTransform`. You can find the full list of available data
+processing transformations in the 
((GitHub)[https://github.com/apache/beam/blob/ab93fb1988051baac6c3b9dd1031f4d68bd9a149/sdks/python/apache_beam/ml/transforms/tft.py#L52])
 repository.
+
+
+To define a data processing transformation using `MLTransform`, you need to 
create instances of data processing transforms with `columns` as input 
paramter. The data in the specified `columns` will be transformed and outputted 
in the `beam.Row` object.
+
+Let's look at an example
+
+```
+scale_to_z_score_transform = ScaleToZScore(columns=['x', 'y'])
+with beam.Pipeline() as p:
+  (data | 
MLTransform(artifact_location).with_transform(scale_to_z_score_transform))
+```
+
+In this example, `MLTransform` receives `artifact_location`. This location is 
used to store artifacts generated by the `MLTransform`. We will talk about the 
`artifacts` later. The data processing transform can be passed using the 
`with_transform` method of `MLTransform` or transforms can be passed as list to 
the MLTransform.
+
+```
+MLTransform(transforms=transforms, artifact_location=artifact_location)
+```
+
+All the transforms passed to `MLTransform` are applied sequentially on the 
dataset. `MLTransform` expects a keyed data (Link to the supported inputs).
+
+
+## Example 1
+
+In the example we will create a pipeline that uses MLTransform to scale the 
data between 0 and 1.
+
+{{< highlight language="py" 
file="sdks/python/apache_beam/examples/snippets/transforms/elementwise/mltransform.py"
+  class="notebook-skip" >}}
+{{< code_sample 
"sdks/python/apache_beam/examples/snippets/transforms/elementwise/mltransform.py"
 mltransform_scale_to_0_1 >}}
+{{</ highlight >}}
+
+{{< paragraph class="notebook-skip" >}}
+Output:
+{{< /paragraph >}}
+{{< highlight class="notebook-skip" >}}
+{{< code_sample 
"sdks/python/apache_beam/examples/snippets/transforms/elementwise/mltransform_test.py"
 mltransform_scale_to_0_1 >}}
+{{< /highlight >}}
+
+
+The example takes a list of ints and converts them into the range of 0 to 1 
using the transform `ScaleTo01`.
+
+## Example 2
+
+In the example, we will create a pipeline that use MLTransform to compute 
vocabulary on the entire dataset and assign indices to each unique vocab.

Review Comment:
   ```suggestion
   This example creates a pipeline that uses `MLTransform` to compute vocabulary 
on the entire dataset and assign indices to each unique vocabulary item.
   It takes a list of strings, computes vocabulary over the entire dataset, and 
then applies a unique index to each vocabulary item.
   ```
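
As a rough illustration of "compute vocabulary over the entire dataset, then apply a unique index", the following plain-Python sketch may help. It is not how `ComputeAndApplyVocabulary` is implemented; the helper names and the first-seen index order are assumptions of this sketch, and the real transform may order the vocabulary differently.

```python
def compute_vocabulary(rows):
    """Build a token -> index map from the whole dataset (illustrative sketch)."""
    vocab = {}
    for row in rows:
        for token in row:
            if token not in vocab:
                # Assumption: indices are assigned in first-seen order.
                vocab[token] = len(vocab)
    return vocab


def apply_vocabulary(rows, vocab):
    """Replace each token with its index from the precomputed vocabulary."""
    return [[vocab[token] for token in row] for row in rows]


rows = [["I", "love", "Beam"], ["Beam", "is", "awesome"]]
vocab = compute_vocabulary(rows)       # full pass over the dataset
print(apply_vocabulary(rows, vocab))   # [[0, 1, 2], [2, 3, 4]]
```

The vocabulary has to be complete before the first row can be indexed, which is the reason this example needs a full pass over the dataset.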



##########
website/www/site/content/en/documentation/transforms/python/elementwise/mltransform.md:
##########
@@ -0,0 +1,111 @@
+---
+title: "MLTransform"
+---
+<!--
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+
+# MLTransform for data processing
+
+{{< localstorage language language-py >}}
+
+
+<table>
+  <tr>
+    <td>
+      <a>
+      {{< button-pydoc path="apache_beam.ml.transforms" class="MLTransform" >}}
+      </a>
+   </td>
+  </tr>
+</table>
+
+
+`MLTransform` is used to apply common machine learning processing tasks on 
keyed data. Apache Beam provides ML data processing transformations which can 
be used with `MLTransform`. You can find the full list of available data
+processing transformations in the 
((GitHub)[https://github.com/apache/beam/blob/ab93fb1988051baac6c3b9dd1031f4d68bd9a149/sdks/python/apache_beam/ml/transforms/tft.py#L52])
 repository.
+
+
+To define a data processing transformation using `MLTransform`, you need to 
create instances of data processing transforms with `columns` as input 
paramter. The data in the specified `columns` will be transformed and outputted 
in the `beam.Row` object.
+
+Let's look at an example
+
+```
+scale_to_z_score_transform = ScaleToZScore(columns=['x', 'y'])
+with beam.Pipeline() as p:
+  (data | 
MLTransform(artifact_location).with_transform(scale_to_z_score_transform))
+```
+
+In this example, `MLTransform` receives `artifact_location`. This location is 
used to store artifacts generated by the `MLTransform`. We will talk about the 
`artifacts` later. The data processing transform can be passed using the 
`with_transform` method of `MLTransform` or transforms can be passed as list to 
the MLTransform.
+
+```
+MLTransform(transforms=transforms, artifact_location=artifact_location)
+```
+
+All the transforms passed to `MLTransform` are applied sequentially on the 
dataset. `MLTransform` expects a keyed data (Link to the supported inputs).
+
+
+## Example 1
+
+In the example we will create a pipeline that uses MLTransform to scale the 
data between 0 and 1.
+
+{{< highlight language="py" 
file="sdks/python/apache_beam/examples/snippets/transforms/elementwise/mltransform.py"
+  class="notebook-skip" >}}
+{{< code_sample 
"sdks/python/apache_beam/examples/snippets/transforms/elementwise/mltransform.py"
 mltransform_scale_to_0_1 >}}
+{{</ highlight >}}
+
+{{< paragraph class="notebook-skip" >}}
+Output:
+{{< /paragraph >}}
+{{< highlight class="notebook-skip" >}}
+{{< code_sample 
"sdks/python/apache_beam/examples/snippets/transforms/elementwise/mltransform_test.py"
 mltransform_scale_to_0_1 >}}
+{{< /highlight >}}
+
+
+The example takes a list of ints and converts them into the range of 0 to 1 
using the transform `ScaleTo01`.
+
+## Example 2
+
+In the example, we will create a pipeline that use MLTransform to compute 
vocabulary on the entire dataset and assign indices to each unique vocab.
+
+{{< highlight language="py" 
file="sdks/python/apache_beam/examples/snippets/transforms/elementwise/mltransform.py"
+  class="notebook-skip" >}}
+{{< code_sample 
"sdks/python/apache_beam/examples/snippets/transforms/elementwise/mltransform.py"
 mltransform_compute_and_apply_vocabulary >}}
+{{</ highlight >}}
+
+{{< paragraph class="notebook-skip" >}}
+Output:
+{{< /paragraph >}}
+{{< highlight class="notebook-skip" >}}
+{{< code_sample 
"sdks/python/apache_beam/examples/snippets/transforms/elementwise/mltransform_test.py"
 mltransform_compute_and_apply_vocabulary >}}
+{{< /highlight >}}
+
+
+This example takes a list of strings, computes vocab over the entire dataset 
and applies a unique index to each vocab.
+
+
+The above two examples requires a full pass over the dataset to transform the 
dataset. For `ComputeAndApplyVocabulary`, all the unqiue words in the dataset 
needs to be known before transforming the data. For `ScaleTo01`, the minimum 
and maximum of the dataset needs to be known before transforming the dataset. 
This is acheived by `MLTransform`.
+
+
+In this example, we will create a pipeline that use MLTransform to compute 
vocabulary on the entire dataset and assign indices to each unique vocab. This 
pipeline takes a single element as input instead of list of elements.

Review Comment:
   ```suggestion
   This example creates a pipeline that uses `MLTransform` to compute 
vocabulary on the entire dataset and assign indices to each unique vocabulary 
item. This pipeline takes a single element as input instead of a list of 
elements.
   ```



##########
website/www/site/content/en/documentation/transforms/python/elementwise/mltransform.md:
##########
@@ -0,0 +1,111 @@
+---
+title: "MLTransform"
+---
+<!--
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+
+# MLTransform for data processing
+
+{{< localstorage language language-py >}}
+
+
+<table>
+  <tr>
+    <td>
+      <a>
+      {{< button-pydoc path="apache_beam.ml.transforms" class="MLTransform" >}}
+      </a>
+   </td>
+  </tr>
+</table>
+
+
+`MLTransform` is used to apply common machine learning processing tasks on 
keyed data. Apache Beam provides ML data processing transformations which can 
be used with `MLTransform`. You can find the full list of available data
+processing transformations in the 
((GitHub)[https://github.com/apache/beam/blob/ab93fb1988051baac6c3b9dd1031f4d68bd9a149/sdks/python/apache_beam/ml/transforms/tft.py#L52])
 repository.
+
+
+To define a data processing transformation using `MLTransform`, you need to 
create instances of data processing transforms with `columns` as input 
paramter. The data in the specified `columns` will be transformed and outputted 
in the `beam.Row` object.
+
+Let's look at an example
+
+```
+scale_to_z_score_transform = ScaleToZScore(columns=['x', 'y'])
+with beam.Pipeline() as p:
+  (data | 
MLTransform(artifact_location).with_transform(scale_to_z_score_transform))
+```
+
+In this example, `MLTransform` receives `artifact_location`. This location is 
used to store artifacts generated by the `MLTransform`. We will talk about the 
`artifacts` later. The data processing transform can be passed using the 
`with_transform` method of `MLTransform` or transforms can be passed as list to 
the MLTransform.

Review Comment:
   If we include this example on this page, it should be in the example section 
and formatted to match the other examples on the page.



##########
website/www/site/content/en/documentation/transforms/python/elementwise/mltransform.md:
##########
@@ -0,0 +1,111 @@
+---
+title: "MLTransform"
+---
+<!--
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+
+# MLTransform for data processing
+
+{{< localstorage language language-py >}}
+
+
+<table>
+  <tr>
+    <td>
+      <a>
+      {{< button-pydoc path="apache_beam.ml.transforms" class="MLTransform" >}}
+      </a>
+   </td>
+  </tr>
+</table>
+
+
+`MLTransform` is used to apply common machine learning processing tasks on 
keyed data. Apache Beam provides ML data processing transformations which can 
be used with `MLTransform`. You can find the full list of available data
+processing transformations in the 
((GitHub)[https://github.com/apache/beam/blob/ab93fb1988051baac6c3b9dd1031f4d68bd9a149/sdks/python/apache_beam/ml/transforms/tft.py#L52])
 repository.
+
+
+To define a data processing transformation using `MLTransform`, you need to 
create instances of data processing transforms with `columns` as input 
paramter. The data in the specified `columns` will be transformed and outputted 
in the `beam.Row` object.
+
+Let's look at an example
+
+```
+scale_to_z_score_transform = ScaleToZScore(columns=['x', 'y'])
+with beam.Pipeline() as p:
+  (data | 
MLTransform(artifact_location).with_transform(scale_to_z_score_transform))
+```
+
+In this example, `MLTransform` receives `artifact_location`. This location is 
used to store artifacts generated by the `MLTransform`. We will talk about the 
`artifacts` later. The data processing transform can be passed using the 
`with_transform` method of `MLTransform` or transforms can be passed as list to 
the MLTransform.
+
+```
+MLTransform(transforms=transforms, artifact_location=artifact_location)
+```
+
+All the transforms passed to `MLTransform` are applied sequentially on the 
dataset. `MLTransform` expects a keyed data (Link to the supported inputs).
+

Review Comment:
   ```suggestion
   ## Examples
   
   The following examples demonstrate how to create pipelines that use 
`MLTransform` to preprocess data.
   
   `MLTransform` can do a full pass on the dataset, which is useful when you need 
to transform a single element only after analyzing the entire dataset.
   The first two examples require a full pass over the dataset to complete the 
data transformation.
   
   * For the `ComputeAndApplyVocabulary` transform, the transform needs access 
to all of the unique words in the dataset.
   * For the `ScaleTo01` transform, the transform needs to know the minimum and 
maximum values in the dataset.
   ```
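
A rough illustration of why these two transforms need a full pass, written in plain Python rather than with Beam. The function names `scale_to_01` and `compute_and_apply_vocabulary` below are hypothetical helpers for this sketch, not the `MLTransform` API, and the index order (first-seen) and the constant-input behavior are assumptions of the sketch, not a statement about what Beam does.

```python
from typing import Dict, List, Sequence


def scale_to_01(values: Sequence[float]) -> List[float]:
    """Scale values into [0, 1] using the dataset-wide min and max."""
    # Pass 1: the minimum and maximum depend on every element.
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # Arbitrary choice for this sketch when all values are equal.
        return [0.0 for _ in values]
    # Pass 2: only now can each single element be transformed.
    return [(v - lo) / span for v in values]


def compute_and_apply_vocabulary(rows: Sequence[Sequence[str]]) -> List[List[int]]:
    """Replace each word with an index from a dataset-wide vocabulary."""
    # Pass 1: collect every unique word (indices assigned in first-seen order).
    vocab: Dict[str, int] = {}
    for row in rows:
        for word in row:
            if word not in vocab:
                vocab[word] = len(vocab)
    # Pass 2: map each word to its index.
    return [[vocab[word] for word in row] for row in rows]


scaled = scale_to_01([1, 5, 3])
indexed = compute_and_apply_vocabulary([["a", "b"], ["b", "c"]])
```

In both helpers the second loop cannot start until the first has seen the whole input, which is the property the intro text describes.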



##########
website/www/site/content/en/documentation/transforms/python/elementwise/mltransform.md:
##########
@@ -0,0 +1,111 @@
+---
+title: "MLTransform"
+---
+<!--
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+
+# MLTransform for data processing
+
+{{< localstorage language language-py >}}
+
+
+<table>
+  <tr>
+    <td>
+      <a>
+      {{< button-pydoc path="apache_beam.ml.transforms" class="MLTransform" >}}
+      </a>
+   </td>
+  </tr>
+</table>
+
+
+`MLTransform` is used to apply common machine learning processing tasks on 
keyed data. Apache Beam provides ML data processing transformations which can 
be used with `MLTransform`. You can find the full list of available data
+processing transformations in the 
((GitHub)[https://github.com/apache/beam/blob/ab93fb1988051baac6c3b9dd1031f4d68bd9a149/sdks/python/apache_beam/ml/transforms/tft.py#L52])
 repository.
+
+
+To define a data processing transformation using `MLTransform`, you need to 
create instances of data processing transforms with `columns` as input 
paramter. The data in the specified `columns` will be transformed and outputted 
in the `beam.Row` object.
+
+Let's look at an example
+
+```
+scale_to_z_score_transform = ScaleToZScore(columns=['x', 'y'])
+with beam.Pipeline() as p:
+  (data | 
MLTransform(artifact_location).with_transform(scale_to_z_score_transform))
+```
+
+In this example, `MLTransform` receives `artifact_location`. This location is 
used to store artifacts generated by the `MLTransform`. We will talk about the 
`artifacts` later. The data processing transform can be passed using the 
`with_transform` method of `MLTransform` or transforms can be passed as list to 
the MLTransform.
+
+```
+MLTransform(transforms=transforms, artifact_location=artifact_location)
+```
+
+All the transforms passed to `MLTransform` are applied sequentially on the 
dataset. `MLTransform` expects a keyed data (Link to the supported inputs).
+
+
+## Example 1

Review Comment:
   ```suggestion
   ### Example 1
   ```



##########
website/www/site/content/en/documentation/transforms/python/elementwise/mltransform.md:
##########
@@ -0,0 +1,111 @@
+---
+title: "MLTransform"
+---
+<!--
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+
+# MLTransform for data processing
+
+{{< localstorage language language-py >}}
+
+
+<table>
+  <tr>
+    <td>
+      <a>
+      {{< button-pydoc path="apache_beam.ml.transforms" class="MLTransform" >}}
+      </a>
+   </td>
+  </tr>
+</table>
+
+
+`MLTransform` is used to apply common machine learning processing tasks on 
keyed data. Apache Beam provides ML data processing transformations which can 
be used with `MLTransform`. You can find the full list of available data
+processing transformations in the 
((GitHub)[https://github.com/apache/beam/blob/ab93fb1988051baac6c3b9dd1031f4d68bd9a149/sdks/python/apache_beam/ml/transforms/tft.py#L52])
 repository.
+
+
+To define a data processing transformation using `MLTransform`, you need to 
create instances of data processing transforms with `columns` as input 
paramter. The data in the specified `columns` will be transformed and outputted 
in the `beam.Row` object.
+
+Let's look at an example
+
+```
+scale_to_z_score_transform = ScaleToZScore(columns=['x', 'y'])
+with beam.Pipeline() as p:
+  (data | 
MLTransform(artifact_location).with_transform(scale_to_z_score_transform))
+```
+
+In this example, `MLTransform` receives `artifact_location`. This location is 
used to store artifacts generated by the `MLTransform`. We will talk about the 
`artifacts` later. The data processing transform can be passed using the 
`with_transform` method of `MLTransform` or transforms can be passed as list to 
the MLTransform.
+
+```
+MLTransform(transforms=transforms, artifact_location=artifact_location)
+```
+
+All the transforms passed to `MLTransform` are applied sequentially on the 
dataset. `MLTransform` expects a keyed data (Link to the supported inputs).
+
+
+## Example 1
+
+In the example we will create a pipeline that uses MLTransform to scale the 
data between 0 and 1.
+
+{{< highlight language="py" 
file="sdks/python/apache_beam/examples/snippets/transforms/elementwise/mltransform.py"
+  class="notebook-skip" >}}
+{{< code_sample 
"sdks/python/apache_beam/examples/snippets/transforms/elementwise/mltransform.py"
 mltransform_scale_to_0_1 >}}
+{{</ highlight >}}
+
+{{< paragraph class="notebook-skip" >}}
+Output:
+{{< /paragraph >}}
+{{< highlight class="notebook-skip" >}}
+{{< code_sample 
"sdks/python/apache_beam/examples/snippets/transforms/elementwise/mltransform_test.py"
 mltransform_scale_to_0_1 >}}
+{{< /highlight >}}
+
+
+The example takes a list of ints and converts them into the range of 0 to 1 
using the transform `ScaleTo01`.
+
+## Example 2
+
+In the example, we will create a pipeline that use MLTransform to compute 
vocabulary on the entire dataset and assign indices to each unique vocab.
+
+{{< highlight language="py" 
file="sdks/python/apache_beam/examples/snippets/transforms/elementwise/mltransform.py"
+  class="notebook-skip" >}}
+{{< code_sample 
"sdks/python/apache_beam/examples/snippets/transforms/elementwise/mltransform.py"
 mltransform_compute_and_apply_vocabulary >}}
+{{</ highlight >}}
+
+{{< paragraph class="notebook-skip" >}}
+Output:
+{{< /paragraph >}}
+{{< highlight class="notebook-skip" >}}
+{{< code_sample 
"sdks/python/apache_beam/examples/snippets/transforms/elementwise/mltransform_test.py"
 mltransform_compute_and_apply_vocabulary >}}
+{{< /highlight >}}
+
+
+This example takes a list of strings, computes vocab over the entire dataset 
and applies a unique index to each vocab.

Review Comment:
   Remove; I moved this content to the intro.



##########
website/www/site/content/en/documentation/transforms/python/elementwise/mltransform.md:
##########
@@ -0,0 +1,111 @@
+---
+title: "MLTransform"
+---
+<!--
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+
+# MLTransform for data processing
+
+{{< localstorage language language-py >}}
+
+
+<table>
+  <tr>
+    <td>
+      <a>
+      {{< button-pydoc path="apache_beam.ml.transforms" class="MLTransform" >}}
+      </a>
+   </td>
+  </tr>
+</table>
+
+
+`MLTransform` is used to apply common machine learning processing tasks on 
keyed data. Apache Beam provides ML data processing transformations which can 
be used with `MLTransform`. You can find the full list of available data
+processing transformations in the 
((GitHub)[https://github.com/apache/beam/blob/ab93fb1988051baac6c3b9dd1031f4d68bd9a149/sdks/python/apache_beam/ml/transforms/tft.py#L52])
 repository.
+
+
+To define a data processing transformation using `MLTransform`, you need to 
create instances of data processing transforms with `columns` as input 
paramter. The data in the specified `columns` will be transformed and outputted 
in the `beam.Row` object.
+
+Let's look at an example
+
+```
+scale_to_z_score_transform = ScaleToZScore(columns=['x', 'y'])
+with beam.Pipeline() as p:
+  (data | 
MLTransform(artifact_location).with_transform(scale_to_z_score_transform))
+```
+
+In this example, `MLTransform` receives `artifact_location`. This location is 
used to store artifacts generated by the `MLTransform`. We will talk about the 
`artifacts` later. The data processing transform can be passed using the 
`with_transform` method of `MLTransform` or transforms can be passed as list to 
the MLTransform.
+
+```
+MLTransform(transforms=transforms, artifact_location=artifact_location)
+```
+
+All the transforms passed to `MLTransform` are applied sequentially on the 
dataset. `MLTransform` expects a keyed data (Link to the supported inputs).
+
+
+## Example 1
+
+In the example we will create a pipeline that uses MLTransform to scale the 
data between 0 and 1.
+
+{{< highlight language="py" 
file="sdks/python/apache_beam/examples/snippets/transforms/elementwise/mltransform.py"
+  class="notebook-skip" >}}
+{{< code_sample 
"sdks/python/apache_beam/examples/snippets/transforms/elementwise/mltransform.py"
 mltransform_scale_to_0_1 >}}
+{{</ highlight >}}
+
+{{< paragraph class="notebook-skip" >}}
+Output:
+{{< /paragraph >}}
+{{< highlight class="notebook-skip" >}}
+{{< code_sample 
"sdks/python/apache_beam/examples/snippets/transforms/elementwise/mltransform_test.py"
 mltransform_scale_to_0_1 >}}
+{{< /highlight >}}
+
+
+The example takes a list of ints and converts them into the range of 0 to 1 
using the transform `ScaleTo01`.
+
+## Example 2
+
+In the example, we will create a pipeline that use MLTransform to compute 
vocabulary on the entire dataset and assign indices to each unique vocab.
+
+{{< highlight language="py" 
file="sdks/python/apache_beam/examples/snippets/transforms/elementwise/mltransform.py"
+  class="notebook-skip" >}}
+{{< code_sample 
"sdks/python/apache_beam/examples/snippets/transforms/elementwise/mltransform.py"
 mltransform_compute_and_apply_vocabulary >}}
+{{</ highlight >}}
+
+{{< paragraph class="notebook-skip" >}}
+Output:
+{{< /paragraph >}}
+{{< highlight class="notebook-skip" >}}
+{{< code_sample 
"sdks/python/apache_beam/examples/snippets/transforms/elementwise/mltransform_test.py"
 mltransform_compute_and_apply_vocabulary >}}
+{{< /highlight >}}
+
+
+This example takes a list of strings, computes vocab over the entire dataset 
and applies a unique index to each vocab.
+
+
+The above two examples requires a full pass over the dataset to transform the 
dataset. For `ComputeAndApplyVocabulary`, all the unqiue words in the dataset 
needs to be known before transforming the data. For `ScaleTo01`, the minimum 
and maximum of the dataset needs to be known before transforming the dataset. 
This is acheived by `MLTransform`.
+

Review Comment:
   ```suggestion
   ### Example 3
   ```



##########
website/www/site/content/en/documentation/transforms/python/elementwise/mltransform.md:
##########
@@ -0,0 +1,111 @@
+---
+title: "MLTransform"
+---
+<!--
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+
+# MLTransform for data processing
+
+{{< localstorage language language-py >}}
+
+
+<table>
+  <tr>
+    <td>
+      <a>
+      {{< button-pydoc path="apache_beam.ml.transforms" class="MLTransform" >}}
+      </a>
+   </td>
+  </tr>
+</table>
+
+
+`MLTransform` is used to apply common machine learning processing tasks on 
keyed data. Apache Beam provides ML data processing transformations which can 
be used with `MLTransform`. You can find the full list of available data
+processing transformations in the 
((GitHub)[https://github.com/apache/beam/blob/ab93fb1988051baac6c3b9dd1031f4d68bd9a149/sdks/python/apache_beam/ml/transforms/tft.py#L52])
 repository.
+
+
+To define a data processing transformation using `MLTransform`, you need to 
create instances of data processing transforms with `columns` as input 
paramter. The data in the specified `columns` will be transformed and outputted 
in the `beam.Row` object.

Review Comment:
   ```suggestion
   To define a data processing transformation by using `MLTransform`, create 
instances of data processing transforms with `columns` as input parameters. The 
data in the specified `columns` is transformed and outputted to the `beam.Row` 
object.
   ```



##########
website/www/site/content/en/documentation/transforms/python/elementwise/mltransform.md:
##########
@@ -0,0 +1,111 @@
+---
+title: "MLTransform"
+---
+<!--
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+
+# MLTransform for data processing
+
+{{< localstorage language language-py >}}
+
+
+<table>
+  <tr>
+    <td>
+      <a>
+      {{< button-pydoc path="apache_beam.ml.transforms" class="MLTransform" >}}
+      </a>
+   </td>
+  </tr>
+</table>
+
+
+`MLTransform` is used to apply common machine learning processing tasks on 
keyed data. Apache Beam provides ML data processing transformations which can 
be used with `MLTransform`. You can find the full list of available data
+processing transformations in the 
((GitHub)[https://github.com/apache/beam/blob/ab93fb1988051baac6c3b9dd1031f4d68bd9a149/sdks/python/apache_beam/ml/transforms/tft.py#L52])
 repository.
+
+
+To define a data processing transformation using `MLTransform`, you need to 
create instances of data processing transforms with `columns` as input 
paramter. The data in the specified `columns` will be transformed and outputted 
in the `beam.Row` object.
+
+Let's look at an example
+
+```
+scale_to_z_score_transform = ScaleToZScore(columns=['x', 'y'])
+with beam.Pipeline() as p:
+  (data | 
MLTransform(artifact_location).with_transform(scale_to_z_score_transform))
+```
+
+In this example, `MLTransform` receives `artifact_location`. This location is 
used to store artifacts generated by the `MLTransform`. We will talk about the 
`artifacts` later. The data processing transform can be passed using the 
`with_transform` method of `MLTransform` or transforms can be passed as list to 
the MLTransform.
+
+```
+MLTransform(transforms=transforms, artifact_location=artifact_location)
+```
+
+All the transforms passed to `MLTransform` are applied sequentially on the 
dataset. `MLTransform` expects a keyed data (Link to the supported inputs).
+
+
+## Example 1
+
+In the example we will create a pipeline that uses MLTransform to scale the 
data between 0 and 1.
+
+{{< highlight language="py" 
file="sdks/python/apache_beam/examples/snippets/transforms/elementwise/mltransform.py"
+  class="notebook-skip" >}}
+{{< code_sample 
"sdks/python/apache_beam/examples/snippets/transforms/elementwise/mltransform.py"
 mltransform_scale_to_0_1 >}}
+{{</ highlight >}}
+
+{{< paragraph class="notebook-skip" >}}
+Output:
+{{< /paragraph >}}
+{{< highlight class="notebook-skip" >}}
+{{< code_sample 
"sdks/python/apache_beam/examples/snippets/transforms/elementwise/mltransform_test.py"
 mltransform_scale_to_0_1 >}}
+{{< /highlight >}}
+
+
+The example takes a list of ints and converts them into the range of 0 to 1 
using the transform `ScaleTo01`.
+
+## Example 2
+
+In the example, we will create a pipeline that use MLTransform to compute 
vocabulary on the entire dataset and assign indices to each unique vocab.
+
+{{< highlight language="py" 
file="sdks/python/apache_beam/examples/snippets/transforms/elementwise/mltransform.py"
+  class="notebook-skip" >}}
+{{< code_sample 
"sdks/python/apache_beam/examples/snippets/transforms/elementwise/mltransform.py"
 mltransform_compute_and_apply_vocabulary >}}
+{{</ highlight >}}
+
+{{< paragraph class="notebook-skip" >}}
+Output:
+{{< /paragraph >}}
+{{< highlight class="notebook-skip" >}}
+{{< code_sample 
"sdks/python/apache_beam/examples/snippets/transforms/elementwise/mltransform_test.py"
 mltransform_compute_and_apply_vocabulary >}}
+{{< /highlight >}}
+
+
+This example takes a list of strings, computes vocab over the entire dataset 
and applies a unique index to each vocab.
+
+
+The above two examples requires a full pass over the dataset to transform the 
dataset. For `ComputeAndApplyVocabulary`, all the unqiue words in the dataset 
needs to be known before transforming the data. For `ScaleTo01`, the minimum 
and maximum of the dataset needs to be known before transforming the dataset. 
This is acheived by `MLTransform`.

Review Comment:
   Remove; I moved this content to the intro of the Example section.


