chamikaramj commented on code in PR #23837:
URL: https://github.com/apache/beam/pull/23837#discussion_r1006030776


##########
website/www/site/content/en/documentation/programming-guide.md:
##########
@@ -7282,7 +7282,16 @@ To create an SDK wrapper for use in a Python pipeline, 
do the following:
 
 #### 13.1.2. Creating cross-language Python transforms
 
-To make your Python transform usable with different SDK languages, you must 
create a Python module that registers an existing Python transform as a 
cross-language transform for use with the Python expansion service and calls 
into that existing transform to perform its intended operation.
+Any Python transforms defined in the scope of the expansion service should be 
accessible by specifying their fully qualified names. For example, you could 
use Python's `ReadFromText` transform in a Java pipeline with its fully 
qualified name `apache_beam.io.ReadFromText`:
+
+```java
+p.apply("Read",
+    PythonExternalTransform.<PBegin, 
PCollection<String>>from("apache_beam.io.ReadFromText")
+    .withKwarg("file_pattern", options.getInputFile())
+    .withKwarg("validate", false))
+```
+
+Alternatively, you may want to create a Python module that registers an 
existing Python transform as a cross-language transform for use with the Python 
expansion service and calls into that existing transform to perform its 
intended operation. A registered URN can be used later in an expansion request 
for indicating an expansion target.

Review Comment:
   Should we provide an example snippet for this as well?
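
   As a starting point, here is a toy model in plain Java of the idea in the
   last sentence of the hunk: a registry maps a URN to a factory, and an
   expansion request carries only the URN and a payload. This is not Beam code.
   `UrnRegistry`, the URN `beam:transforms:example:prefix:v1`, and the
   string-to-string "transform" are all hypothetical stand-ins, and the real
   Python registration API is deliberately not reproduced here.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Toy model only, NOT Beam's API: illustrates looking up a transform by URN.
final class UrnRegistry {
  // Maps a URN to a factory that builds a "transform" from a payload string.
  private final Map<String, Function<String, Function<String, String>>> factories =
      new HashMap<>();

  // Registers a factory under a URN and rejects duplicate registrations.
  void register(String urn, Function<String, Function<String, String>> factory) {
    if (factories.putIfAbsent(urn, factory) != null) {
      throw new IllegalArgumentException("URN already registered: " + urn);
    }
  }

  // Simulates an expansion request: find the factory by URN, then build the
  // transform from the payload that accompanied the request.
  Function<String, String> expand(String urn, String payload) {
    Function<String, Function<String, String>> factory = factories.get(urn);
    if (factory == null) {
      throw new IllegalArgumentException("Unknown URN: " + urn);
    }
    return factory.apply(payload);
  }
}

final class UrnRegistryDemo {
  public static void main(String[] args) {
    UrnRegistry registry = new UrnRegistry();
    // Hypothetical URN; the payload here is just a prefix string.
    registry.register("beam:transforms:example:prefix:v1", prefix -> s -> prefix + s);

    Function<String, String> transform =
        registry.expand("beam:transforms:example:prefix:v1", "row: ");
    System.out.println(transform.apply("hello")); // prints "row: hello"
  }
}
```

   The point of the sketch is only that the caller never names a class: the URN
   is the sole handle on the expansion target, which is what distinguishes this
   path from the fully qualified name path shown earlier in the hunk.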



##########
website/www/site/content/en/documentation/programming-guide.md:
##########
@@ -7282,7 +7282,16 @@ To create an SDK wrapper for use in a Python pipeline, 
do the following:
 
 #### 13.1.2. Creating cross-language Python transforms
 
-To make your Python transform usable with different SDK languages, you must 
create a Python module that registers an existing Python transform as a 
cross-language transform for use with the Python expansion service and calls 
into that existing transform to perform its intended operation.
+Any Python transforms defined in the scope of the expansion service should be 
accessible by specifying their fully qualified names. For example, you could 
use Python's `ReadFromText` transform in a Java pipeline with its fully 
qualified name `apache_beam.io.ReadFromText`:

Review Comment:
   This still requires the "from_runner_api_proto" and "to_runner_api_proto"
methods to be available in the Python transform, correct?
   
   If so, we should note that here.
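
   For readers coming from Java, the "fully qualified name" wording in this
   paragraph can be made concrete with a rough reflection analogue: a class is
   located purely from its dotted name and then constructed with an argument
   supplied by the caller. This is only an analogy. It makes no claim about how
   the Python expansion service resolves names or which methods it requires,
   and `QualifiedNameLookup` is a hypothetical helper.

```java
import java.lang.reflect.Constructor;

// Analogy only: resolves a Java class from its fully qualified name.
final class QualifiedNameLookup {
  // Loads the named class and invokes its public single-String constructor.
  // Reflection failures are rethrown unchecked to keep call sites simple.
  static Object construct(String qualifiedName, String arg) {
    try {
      Class<?> cls = Class.forName(qualifiedName);
      Constructor<?> ctor = cls.getConstructor(String.class);
      return ctor.newInstance(arg);
    } catch (ReflectiveOperationException e) {
      throw new IllegalArgumentException("Cannot construct " + qualifiedName, e);
    }
  }

  public static void main(String[] args) {
    Object sb = construct("java.lang.StringBuilder", "hello");
    // prints "java.lang.StringBuilder: hello"
    System.out.println(sb.getClass().getName() + ": " + sb);
  }
}
```

   As in the `withKwarg` calls in the diff, the caller supplies only a name and
   arguments. Any extra requirement on the target class would have to be stated
   separately, which is why a note in the guide seems worthwhile.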



##########
website/www/site/content/en/documentation/programming-guide.md:
##########
@@ -7393,7 +7402,29 @@ Depending on the SDK language of the pipeline, you can 
use a high-level SDK-wrap
 
 #### 13.2.1. Using cross-language transforms in a Java pipeline
 
-Currently, to access cross-language transforms from the Java SDK, you have to 
use the lower-level 
[External](https://github.com/apache/beam/blob/master/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/External.java)
 class.
+Users have three options to use cross-language transforms in a Java pipeline. 
At the highest level of abstraction, some popular Python transforms are 
accessible through dedicated Java wrapper transforms. For example, the Java SDK 
has the `DataframeTransform` class, which uses the Python SDK's 
`DataframeTransform`, and it has the `RunInference` class, which uses the 
Python SDK's `RunInference`, and so on. When an SDK-specific wrapper transform 
is not available for a target Python transform, you can use the lower-level 
[PythonExternalTransform](https://github.com/apache/beam/blob/master/sdks/java/extensions/python/src/main/java/org/apache/beam/sdk/extensions/python/PythonExternalTransform.java)
 class instead by specifying the fully qualified name of the Python transform. 
If you want to try external transforms from SDKs other than Python, you can 
also use the lowest-level 
[External](https://github.com/apache/beam/blob/master/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/External.java)
 class.

Review Comment:
   We should try to make this less specific to one language combination
(Python from Java) if possible. It could be structured as:
   
   \#\#\#\#\# Simplified APIs for using Python transforms from Java
   \#\#\#\#\# Generic API for using arbitrary transforms from Java
   
   Also, please mention that the latter applies to Java-on-Java as well.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

