[
https://issues.apache.org/jira/browse/BEAM-6858?focusedWorklogId=290596&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-290596
]
ASF GitHub Bot logged work on BEAM-6858:
----------------------------------------
Author: ASF GitHub Bot
Created on: 07/Aug/19 16:46
Start Date: 07/Aug/19 16:46
Worklog Time Spent: 10m
Work Description: salmanVD commented on pull request #9275: [BEAM-6858]
Support side inputs injected into a DoFn
URL: https://github.com/apache/beam/pull/9275#discussion_r311655427
##########
File path:
sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/ParDo.java
##########
@@ -652,6 +650,17 @@ public static DoFnSchemaInformation getDoFnSchemaInformation(
return withSideInputs(Arrays.asList(sideInputs));
}
+ /**
+ * Returns a new {@link ParDo} {@link PTransform} that's like this {@link PTransform} but with
+ * the specified additional side inputs. Does not modify this {@link PTransform}.
+ *
+ * <p>See the discussion of Side Inputs above for more explanation.
+ */
+ public SingleOutput<InputT, OutputT> withSideInput(String tagId, PCollectionView<?> sideInput) {
+ sideInput.setTagInternalId(tagId);
Review comment:
I tried to copy the view with `SerializableUtils.clone`, but because
`PCollection` and `PValueBase` do not implement `Serializable`, the view cannot
be cloned this way: the clone comes back with `PCollection`, `Pipeline`, and a
couple of other fields set to `null`. I also tried adding setters, but
`Pipeline` is final in `PValueBase`, and changing that affects a lot of other
classes.
Cloning the object and setting just the `PCollection` on the clone works, but
pipeline execution then fails the check in
`SideInputContainer.createReaderForViews` with the error `Can't create a
SideInputReader with unknown views`.
I think we would need to restore every field that becomes `null` after
cloning. Do you have any other suggestions?
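The `null` fields described above match how plain Java serialization treats state that is either `transient` or declared in a non-serializable superclass. The following is a minimal, self-contained sketch of that behavior only. `CloneDemo`, `Base`, and `View` are hypothetical stand-ins, not Beam classes, and `roundTrip` is an assumed approximation of a serialize-then-deserialize clone, not the actual `SerializableUtils.clone` implementation.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class CloneDemo {
  // Hypothetical stand-in for a non-serializable base class with a final field.
  static class Base {
    final String pipeline;

    // Deserialization of a subclass runs this no-arg constructor,
    // so the field is reset instead of being copied.
    protected Base() {
      this.pipeline = null;
    }

    protected Base(String pipeline) {
      this.pipeline = pipeline;
    }
  }

  // Hypothetical stand-in for a serializable view that extends the base class.
  static class View extends Base implements Serializable {
    private static final long serialVersionUID = 1L;

    transient String pcollection; // transient: never written to the stream
    String tagId;                 // ordinary field: survives the round trip

    View(String pipeline, String pcollection, String tagId) {
      super(pipeline);
      this.pcollection = pcollection;
      this.tagId = tagId;
    }
  }

  // Copies a value by serializing it to bytes and reading it back.
  @SuppressWarnings("unchecked")
  static <T extends Serializable> T roundTrip(T value) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(value);
      }
      try (ObjectInputStream in =
          new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
        return (T) in.readObject();
      }
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException("Serialization round trip failed", e);
    }
  }

  public static void main(String[] args) {
    View copy = roundTrip(new View("pipeline", "pcollection", "tag1"));
    System.out.println(copy.tagId + " " + copy.pipeline + " " + copy.pcollection);
  }
}
```

In this sketch only `tagId` survives the copy. Both `pipeline` (superclass state) and `pcollection` (transient state) come back as `null`, which is why a clone made this way would need those references restored by other means.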
----------------------------------------------------------------
Issue Time Tracking
-------------------
Worklog Id: (was: 290596)
Time Spent: 1h 20m (was: 1h 10m)
> Support side inputs injected into a DoFn
> ----------------------------------------
>
> Key: BEAM-6858
> URL: https://issues.apache.org/jira/browse/BEAM-6858
> Project: Beam
> Issue Type: Bug
> Components: sdk-java-core
> Reporter: Reuven Lax
> Assignee: Shehzaad Nakhoda
> Priority: Major
> Time Spent: 1h 20m
> Remaining Estimate: 0h
>
> Beam currently supports injecting main inputs into a DoFn process method. A
> user can write the following:
> @ProcessElement public void process(@Element InputT element)
> And Beam will (using ByteBuddy code generation) inject the input element into
> the process method.
> We would like to also support the same for side inputs. For example:
> @ProcessElement public void process(@Element InputT element,
> @SideInput("tag1") String input1, @SideInput("tag2") Integer input2)
> This requires the existing process-method analysis framework to capture these
> side inputs. The ParDo code would have to verify the type of each side input
> and include it in the list of side inputs. This would also eliminate the
> need for the user to explicitly call withSideInputs on the ParDo.
>
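The analysis step described in the issue (find annotated parameters, verify their types, collect the tags) can be illustrated with plain reflection. This is only a sketch under stated assumptions: the issue says Beam uses ByteBuddy code generation, which is not shown here, and `SideInputScan`, `SideInputTag`, `MyFn`, and `requiredTags` are hypothetical names, not Beam APIs.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.lang.reflect.Parameter;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class SideInputScan {
  // Hypothetical annotation standing in for the proposed @SideInput.
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.PARAMETER)
  @interface SideInputTag {
    String value();
  }

  // Hypothetical user function with two tagged side-input parameters.
  static class MyFn {
    public void process(
        String element,
        @SideInputTag("tag1") String input1,
        @SideInputTag("tag2") Integer input2) {}
  }

  // Returns the tags a method requires, in declaration order, after checking
  // that each tagged parameter can accept the type registered for its tag.
  static List<String> requiredTags(Method method, Map<String, Class<?>> registered) {
    List<String> tags = new ArrayList<>();
    for (Parameter parameter : method.getParameters()) {
      SideInputTag tag = parameter.getAnnotation(SideInputTag.class);
      if (tag == null) {
        continue; // not a side input, e.g. the main element
      }
      Class<?> viewType = registered.get(tag.value());
      if (viewType == null) {
        throw new IllegalArgumentException("No side input registered for tag " + tag.value());
      }
      if (!parameter.getType().isAssignableFrom(viewType)) {
        throw new IllegalArgumentException(
            "Side input " + tag.value() + " has type " + viewType.getName()
                + " but the parameter expects " + parameter.getType().getName());
      }
      tags.add(tag.value());
    }
    return tags;
  }
}
```

With `tag1` registered as `String` and `tag2` as `Integer`, scanning `MyFn.process` yields both tags in order. A mismatched or unregistered tag fails at analysis time rather than during pipeline execution.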