ihji commented on code in PR #23837:
URL: https://github.com/apache/beam/pull/23837#discussion_r1006080365


##########
website/www/site/content/en/documentation/programming-guide.md:
##########
@@ -7282,7 +7282,16 @@ To create an SDK wrapper for use in a Python pipeline, 
do the following:
 
 #### 13.1.2. Creating cross-language Python transforms
 
-To make your Python transform usable with different SDK languages, you must 
create a Python module that registers an existing Python transform as a 
cross-language transform for use with the Python expansion service and calls 
into that existing transform to perform its intended operation.
+Any Python transforms defined in the scope of the expansion service should be 
accessible by specifying their fully qualified names. For example, you could 
use Python's `ReadFromText` transform in a Java pipeline with its fully 
qualified name `apache_beam.io.ReadFromText`:
+
+```java

Review Comment:
   I think it deserves to be a separate PR.



##########
website/www/site/content/en/documentation/programming-guide.md:
##########
@@ -7393,7 +7402,29 @@ Depending on the SDK language of the pipeline, you can 
use a high-level SDK-wrap
 
 #### 13.2.1. Using cross-language transforms in a Java pipeline
 
-Currently, to access cross-language transforms from the Java SDK, you have to 
use the lower-level 
[External](https://github.com/apache/beam/blob/master/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/External.java)
 class.
+Users have three options to use cross-language transforms in a Java pipeline. 
At the highest level of abstraction, some popular Python transforms are 
accessible through dedicated Java wrapper transforms. For example, the Java SDK 
has the `DataframeTransform` class, which uses the Python SDK's 
`DataframeTransform`, and it has the `RunInference` class, which uses the 
Python SDK's `RunInference`, and so on. When an SDK-specific wrapper transform 
is not available for a target Python transform, you can use the lower-level 
[PythonExternalTransform](https://github.com/apache/beam/blob/master/sdks/java/extensions/python/src/main/java/org/apache/beam/sdk/extensions/python/PythonExternalTransform.java)
 class instead by specifying the fully qualified name of the Python transform. 
If you want to try external transforms from SDKs other than Python, you can 
also use the lowest-level 
[External](https://github.com/apache/beam/blob/master/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/External.java) class.

Review Comment:
   Users should use `External.of` with a raw `ExpansionRequest` proto to 
expand external transforms from SDKs other than Python. We don't have enough 
utility functions for such use cases. While it's possible to use arbitrary 
transforms from any SDK, I don't think we need to emphasize it in the doc by 
creating a separate subsection.



##########
website/www/site/content/en/documentation/programming-guide.md:
##########
@@ -7282,7 +7282,16 @@ To create an SDK wrapper for use in a Python pipeline, 
do the following:
 
 #### 13.1.2. Creating cross-language Python transforms
 
-To make your Python transform usable with different SDK languages, you must 
create a Python module that registers an existing Python transform as a 
cross-language transform for use with the Python expansion service and calls 
into that existing transform to perform its intended operation.
+Any Python transforms defined in the scope of the expansion service should be 
accessible by specifying their fully qualified names. For example, you could 
use Python's `ReadFromText` transform in a Java pipeline with its fully 
qualified name `apache_beam.io.ReadFromText`:
+
+```java
+p.apply("Read",
+    PythonExternalTransform.<PBegin, PCollection<String>>from("apache_beam.io.ReadFromText")
+    .withKwarg("file_pattern", options.getInputFile())
+    .withKwarg("validate", false))
+```
+
+Alternatively, you may want to create a Python module that registers an 
existing Python transform as a cross-language transform for use with the Python 
expansion service and calls into that existing transform to perform its 
intended operation. A registered URN can be used later in an expansion request 
for indicating an expansion target.

Review Comment:
   The following sections already provide step-by-step examples for registering 
existing transforms. This sentence just introduces what comes next.



##########
website/www/site/content/en/documentation/programming-guide.md:
##########
@@ -7282,7 +7282,16 @@ To create an SDK wrapper for use in a Python pipeline, 
do the following:
 
 #### 13.1.2. Creating cross-language Python transforms
 
-To make your Python transform usable with different SDK languages, you must 
create a Python module that registers an existing Python transform as a 
cross-language transform for use with the Python expansion service and calls 
into that existing transform to perform its intended operation.
+Any Python transforms defined in the scope of the expansion service should be 
accessible by specifying their fully qualified names. For example, you could 
use Python's `ReadFromText` transform in a Java pipeline with its fully 
qualified name `apache_beam.io.ReadFromText`:

Review Comment:
   You don't need those methods in every transform since we have a proxy class 
here for looking up and wiring the fully qualified transform classes.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
