[
https://issues.apache.org/jira/browse/BEAM-11099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Robert Burke updated BEAM-11099:
--------------------------------
Resolution: Duplicate
Status: Resolved (was: Open)
Turns out this is a duplicate of BEAM-3305, which I didn't recognize as the
same feature.
> Go SDK Custom - Pre-Processing of SideInput data.
> -------------------------------------------------
>
> Key: BEAM-11099
> URL: https://issues.apache.org/jira/browse/BEAM-11099
> Project: Beam
> Issue Type: Wish
> Components: sdk-go
> Reporter: Robert Burke
> Priority: P4
>
> An idea borrowed from Python: allow users to specify a way to pre-process
> side input data on first use, and leverage the caching. This can simplify
> user DoFns by letting them convert their side input data (mostly lists)
> into a form better suited to their access pattern.
> It is strongly recommended to add Map Side Inputs
> https://issues.apache.org/jira/browse/BEAM-3293 before implementing this
> suggestion, and caching must already be implemented
> https://issues.apache.org/jira/browse/BEAM-11097. Otherwise, very little
> benefit is achieved.
> See https://issues.apache.org/jira/browse/BEAM-3293 for where code might need
> to be changed.
> In particular, it would require a mechanism for the SDK to determine that a
> given unknown type is actually representing a side input, and a method by
> which to pre-process the data associated with it.
> Positional handling would be expected to remain, so that the types of
> side inputs can still be identified for pipeline type checking.
> Some "magic method", similar to how the structural DoFn methods work, is
> likely the right approach. However, it's an open question how to make this
> scale properly to more than a single side input. Otherwise, perhaps
> something that takes in a valid side input form and returns a single value
> to be used instead?
>
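The "pre-process on first use, then reuse" idea in the description can be
sketched in plain Go. This is only an illustration and does not use the Beam
Go SDK: lazySideInput, user, and indexByID are hypothetical names, and the
load function merely stands in for however the SDK would read the raw side
input. The sketch assumes Go 1.18 or later for generics.

```go
package main

import "sync"

// lazySideInput is a hypothetical helper, not part of the Beam Go SDK.
// It converts raw side input elements into another form the first time
// Get is called, and returns the cached result on every later call.
type lazySideInput[T, R any] struct {
	once    sync.Once
	load    func() []T  // stand-in for reading the raw side input
	process func([]T) R // user-supplied pre-processing step
	value   R
}

// Get runs load and process at most once, even under concurrent use.
func (l *lazySideInput[T, R]) Get() R {
	l.once.Do(func() { l.value = l.process(l.load()) })
	return l.value
}

// user is a hypothetical element type for the example.
type user struct {
	ID   int
	Name string
}

// indexByID is an example pre-processing function: it turns a list of
// users into a map keyed by ID, which suits point lookups.
func indexByID(users []user) map[int]user {
	byID := make(map[int]user, len(users))
	for _, u := range users {
		byID[u.ID] = u
	}
	return byID
}
```

A DoFn-like caller would hold a lazySideInput[user, map[int]user] built with
its own load function and process set to indexByID, then call Get() per
element; the list is read and indexed only on the first call.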