Thank you for getting back to me. I would be happy to help contribute - has there been any discussion around this issue before?
At the least, I think it be preferable to raise a not implemented error in Python when encountering this case. It seems like multi-input for CoGroupByKey is represented as a Union of all the component collection types. Would it make sense to do the same for the output types? Is this a better discussion for the dev group? Thanks again for your time and help. Best, Joshua On Mon, Mar 30, 2020 at 11:22 AM Robert Bradshaw <[email protected]> wrote: > That is correct, type hints unfortunately are not yet supported for > multiple-output PTransforms. > > On Thu, Mar 26, 2020 at 10:05 PM Joshua B. Harrison < > [email protected]> wrote: > >> Hello all, >> >> I am working on adding type hints to my pipeline, and ran into an issue >> with PTransforms that produce multiple, tagged outputs. >> >> My class looks like this: >> >> @with_input_types(mytype.Data) >>> @with_output_types(mytype.KeyedData) >>> class DenormalizeData(ptransform.PTransform): >>> MAIN = 'denormalized' >>> SKIPPED = functions.DenormalizeData.SKIPPED >>> def expand(self, pcol: mytype.Data) -> mytype.KeyedPriceData: >>> return (pcol >>> | 'Denormalize PriceData' >> core.ParDo( >>> functions.DenormalizeData()).with_outputs( >>> self.SKIPPED, main=self.MAIN)) >> >> >> Where functions.DenormalizeData is a core.DoFn. From what I can tell, the >> type checking code here at >> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/transforms/ptransform.py#L429 >> attempts >> to access the pvalue._element_type. But in this case, the pvalue is a >> DoOutputsTuple ( >> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/pvalue.py#L239) >> which overrides __getattr__ to check for tag names. In this case, >> _element_type is not a valid tag, and I get the following partial stack: >> >> "apache_beam_2_17_0/apache_beam/transforms/ptransform.py", line 401, in >>> type_check_inputs_or_outputs >>> if pvalue_.element_type is None: >>> File "apache_beam_2_17_0/apache_beam/pvalue.py", line 241, in >>> __getattr__ >>> return self[tag] >>> File "apache_beam_2_17_0/apache_beam/pvalue.py", line 256, in >>> __getitem__ >>> tag, self._main_tag, self._tags)) >>> ValueError: Tag 'element_type' is neither the main tag 'denormalized' >>> nor any of the tags ('skipped',) >> >> >> Is my diagnoses correct? Is this a known issue? Can type hints exist on >> DoOutputsTuples? >> >> Thank you for your time and help. >> >> Best, >> Joshua >> >> -- >> Joshua Harrison | Software Engineer | [email protected] >> <[email protected]> | 404-433-0242 <(404)%20433-0242> >> > -- Joshua Harrison | Software Engineer | [email protected] <[email protected]> | 404-433-0242
