[ https://issues.apache.org/jira/browse/BEAM-14483?focusedWorklogId=774808&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-774808 ]
ASF GitHub Bot logged work on BEAM-14483:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 25/May/22 20:54
Start Date: 25/May/22 20:54
Worklog Time Spent: 10m
Work Description: chamikaramj commented on code in PR #17696:
URL: https://github.com/apache/beam/pull/17696#discussion_r881987405
##########
sdks/java/extensions/python/src/main/java/org/apache/beam/sdk/extensions/python/transforms/PythonMap.java:
##########
@@ -0,0 +1,69 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.extensions.python.transforms;
+
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.extensions.python.PythonExternalTransform;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.util.PythonCallableSource;
+import org.apache.beam.sdk.values.PCollection;
+import org.checkerframework.checker.nullness.qual.Nullable;
+
+public class PythonMap<InputT, OutputT>
+ extends PTransform<PCollection<? extends InputT>, PCollection<OutputT>> {
+
+ private PythonCallableSource pythonFunction;
+ private @Nullable String expansionService;
+ private Coder<?> outputCoder;
+ private static final String PYTHON_MAP_FN_TRANSFORM = "apache_beam.Map";
+ private static final String PYTHON_FLATMAP_FN_TRANSFORM = "apache_beam.FlatMap";
+ private String pythonTransform;
+
+ private PythonMap(
+ PythonCallableSource pythonFunction, Coder<?> outputCoder, String pythonTransform) {
+ this.pythonFunction = pythonFunction;
+ this.outputCoder = outputCoder;
+ this.pythonTransform = pythonTransform;
+ }
+
+ public static <InputT, OutputT> PythonMap<InputT, OutputT> viaMapFn(
+ PythonCallableSource pythonFunction, Coder<?> outputCoder) {
+ return new PythonMap<>(pythonFunction, outputCoder, PYTHON_MAP_FN_TRANSFORM);
+ }
+
+ public static <InputT, OutputT> PythonMap<InputT, OutputT> viaFlatMapFn(
+ PythonCallableSource pythonFunction, Coder<?> outputCoder) {
+ return new PythonMap<>(pythonFunction, outputCoder, PYTHON_FLATMAP_FN_TRANSFORM);
+ }
+
+ public PythonMap<InputT, OutputT> withExpansionService(String expansionService) {
+ this.expansionService = expansionService;
+ return this;
+ }
+
+ @Override
+ public PCollection<OutputT> expand(PCollection<? extends InputT> input) {
+ PythonExternalTransform<PCollection<? extends InputT>, PCollection<OutputT>> pythonMapElements =
+ (expansionService == null)
Review Comment:
Updated.
##########
sdks/java/extensions/python/src/main/java/org/apache/beam/sdk/extensions/python/PythonExternalTransform.java:
##########
@@ -191,6 +196,23 @@ public PythonExternalTransform<InputT, OutputT> withTypeHint(
return this;
}
+ public PythonExternalTransform<InputT, OutputT> withOutputCoders(
+ Map<String, Coder<?>> outputCoders) {
+ if (this.outputCoders.size() > 0) {
+ throw new IllegalArgumentException("Output coders were already specified");
+ }
+ this.outputCoders.putAll(outputCoders);
+ return this;
+ }
+
+ public PythonExternalTransform<InputT, OutputT> withOutputCoder(Coder<?> outputCoder) {
Review Comment:
Done.
##########
sdks/java/extensions/python/src/main/java/org/apache/beam/sdk/extensions/python/PythonExternalTransform.java:
##########
@@ -191,6 +196,23 @@ public PythonExternalTransform<InputT, OutputT> withTypeHint(
return this;
}
+ public PythonExternalTransform<InputT, OutputT> withOutputCoders(
Review Comment:
Done.
##########
sdks/java/extensions/python/src/main/java/org/apache/beam/sdk/extensions/python/transforms/PythonMap.java:
##########
@@ -0,0 +1,69 @@
+package org.apache.beam.sdk.extensions.python.transforms;
+
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.extensions.python.PythonExternalTransform;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.util.PythonCallableSource;
+import org.apache.beam.sdk.values.PCollection;
+import org.checkerframework.checker.nullness.qual.Nullable;
+
+public class PythonMap<InputT, OutputT>
+ extends PTransform<PCollection<? extends InputT>, PCollection<OutputT>> {
+
+ private PythonCallableSource pythonFunction;
+ private @Nullable String expansionService;
+ private Coder<?> outputCoder;
+ private static final String PYTHON_MAP_FN_TRANSFORM = "apache_beam.Map";
+ private static final String PYTHON_FLATMAP_FN_TRANSFORM = "apache_beam.FlatMap";
+ private String pythonTransform;
+
+ private PythonMap(
+ PythonCallableSource pythonFunction, Coder<?> outputCoder, String pythonTransform) {
+ this.pythonFunction = pythonFunction;
+ this.outputCoder = outputCoder;
+ this.pythonTransform = pythonTransform;
+ }
+
+ public static <InputT, OutputT> PythonMap<InputT, OutputT> viaMapFn(
+ PythonCallableSource pythonFunction, Coder<?> outputCoder) {
Review Comment:
I think it makes sense to simplify this to String for now. We can add an
overloaded method for PythonCallableSource later if needed.
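
As a rough illustration of the suggestion above, the sketch below shows a factory method that takes the Python function as a plain String, with an overload that could be added later and simply delegates to it. `CallableSourceSketch` and `MapSketch` are hypothetical stand-ins written for this example, not Beam classes, and the output coder parameter is left out for brevity.

```java
import java.util.Objects;

/** Hypothetical stand-in for a wrapper around Python source text; not a Beam class. */
final class CallableSourceSketch {
  private final String source;

  private CallableSourceSketch(String source) {
    this.source = source;
  }

  static CallableSourceSketch of(String source) {
    return new CallableSourceSketch(Objects.requireNonNull(source));
  }

  String getSource() {
    return source;
  }
}

/** Hypothetical builder that mirrors the shape of the factory methods under review. */
final class MapSketch {
  final String pythonFunction;
  final String pythonTransform;

  private MapSketch(String pythonFunction, String pythonTransform) {
    this.pythonFunction = Objects.requireNonNull(pythonFunction);
    this.pythonTransform = pythonTransform;
  }

  // Simplest entry point: the Python function is passed as source text.
  static MapSketch viaMapFn(String pythonFunction) {
    return new MapSketch(pythonFunction, "apache_beam.Map");
  }

  // Overload that could be added later; it only unwraps and delegates.
  static MapSketch viaMapFn(CallableSourceSketch pythonFunction) {
    return viaMapFn(pythonFunction.getSource());
  }
}
```

Because the second overload only delegates, adding it later would not change the behavior of existing String-based callers.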
##########
sdks/java/extensions/python/src/main/java/org/apache/beam/sdk/extensions/python/transforms/PythonMap.java:
##########
@@ -0,0 +1,69 @@
+package org.apache.beam.sdk.extensions.python.transforms;
+
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.extensions.python.PythonExternalTransform;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.util.PythonCallableSource;
+import org.apache.beam.sdk.values.PCollection;
+import org.checkerframework.checker.nullness.qual.Nullable;
+
+public class PythonMap<InputT, OutputT>
+ extends PTransform<PCollection<? extends InputT>, PCollection<OutputT>> {
+
+ private PythonCallableSource pythonFunction;
+ private @Nullable String expansionService;
+ private Coder<?> outputCoder;
+ private static final String PYTHON_MAP_FN_TRANSFORM = "apache_beam.Map";
+ private static final String PYTHON_FLATMAP_FN_TRANSFORM = "apache_beam.FlatMap";
+ private String pythonTransform;
+
+ private PythonMap(
+ PythonCallableSource pythonFunction, Coder<?> outputCoder, String pythonTransform) {
+ this.pythonFunction = pythonFunction;
+ this.outputCoder = outputCoder;
+ this.pythonTransform = pythonTransform;
+ }
+
+ public static <InputT, OutputT> PythonMap<InputT, OutputT> viaMapFn(
+ PythonCallableSource pythonFunction, Coder<?> outputCoder) {
Review Comment:
Unfortunately, Python Map/FlatMap transforms end up using PickleCoder by
default when a coder hint is not provided, so we have to provide a coder hint
to keep the expansion from breaking. We could use StringUTF8Coder or RowCoder
(with a user-provided Schema) as the default, but neither of these options
seems good either.
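
The toy model below is one way to picture the problem described above. It is not Beam code: coders are represented by their names only, and the set of coders the Java side is assumed to understand is made up for the example. The only behavior taken from the comment is that a missing hint falls back to PickleCoder.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Optional;
import java.util.Set;

/** Toy model of coder selection; all names and the "understood" set are assumptions. */
final class CoderHintSketch {
  // Coder names the consuming (Java) side is assumed to understand in this model.
  static final Set<String> JAVA_UNDERSTOOD =
      new HashSet<>(Arrays.asList("StringUtf8Coder", "RowCoder"));

  // Models the Python side: with no hint, Map/FlatMap fall back to PickleCoder.
  static String resolveOutputCoder(Optional<String> coderHint) {
    return coderHint.orElse("PickleCoder");
  }

  // Models the check that fails when the resolved coder is Python-only.
  static boolean expansionUsable(Optional<String> coderHint) {
    return JAVA_UNDERSTOOD.contains(resolveOutputCoder(coderHint));
  }
}
```

In this model, `expansionUsable(Optional.empty())` is false while `expansionUsable(Optional.of("StringUtf8Coder"))` is true, which is why the factory methods take the output coder as a required argument instead of picking a default.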
##########
sdks/java/extensions/python/src/main/java/org/apache/beam/sdk/extensions/python/PythonExternalTransform.java:
##########
@@ -72,6 +73,9 @@
private @Nullable Object @NonNull [] argsArray;
private @Nullable Row providedKwargsRow;
+ Map<String, Coder<?>> outputCoders;
+ private static final String PYTHON_MAIN_OUTPUT_KEY = "0";
Review Comment:
I don't think the key we use here matters when there is only a single output.
I changed it to a different string and added a comment.
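
A minimal sketch of the pattern discussed above, assuming the single-coder method simply wraps its argument in a one-entry map and delegates to the multi-coder method (the diff does not show that method's body, so this is a guess). Coders are modeled as plain Strings, and `OutputCodersSketch` and the key value `"output"` are hypothetical names chosen for the example.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical set-once holder for output coders; coders are modeled as Strings. */
final class OutputCodersSketch {
  // Placeholder key for the single-output case; its exact value is arbitrary.
  static final String MAIN_OUTPUT_KEY = "output";

  private final Map<String, String> outputCoders = new HashMap<>();

  // Accepts coders once; a second call is rejected, as in the diff under review.
  OutputCodersSketch withOutputCoders(Map<String, String> coders) {
    if (!outputCoders.isEmpty()) {
      throw new IllegalArgumentException("Output coders were already specified");
    }
    outputCoders.putAll(coders);
    return this;
  }

  // Single-output convenience: store the coder under the placeholder key.
  OutputCodersSketch withOutputCoder(String coder) {
    return withOutputCoders(Collections.singletonMap(MAIN_OUTPUT_KEY, coder));
  }

  Map<String, String> getOutputCoders() {
    return Collections.unmodifiableMap(outputCoders);
  }
}
```

With only one entry in the map, a consumer can take the sole value without ever looking at the key, which is consistent with the point that the key string does not matter for a single output.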
Issue Time Tracking
-------------------
Worklog Id: (was: 774808)
Time Spent: 2h 20m (was: 2h 10m)
> Add Java cross-language transforms for invoking Python Map and FlatMap
> ----------------------------------------------------------------------
>
> Key: BEAM-14483
> URL: https://issues.apache.org/jira/browse/BEAM-14483
> Project: Beam
> Issue Type: New Feature
> Components: cross-language
> Reporter: Chamikara Madhusanka Jayalath
> Assignee: Chamikara Madhusanka Jayalath
> Priority: P1
> Time Spent: 2h 20m
> Remaining Estimate: 0h
>
--
This message was sent by Atlassian Jira
(v8.20.7#820007)