Much appreciated, that option is super helpful. Thanks again for the help!

Evan

On Fri, Jul 30, 2021 at 12:33 PM Robert Bradshaw <[email protected]>
wrote:

> You can try using --performance_runtime_type_check which will
> hopefully help you pinpoint the bad type.
>
> On Fri, Jul 30, 2021 at 9:23 AM Evan Galpin <[email protected]> wrote:
> >
> > Ah, that makes loads of sense. Ya I was thinking this must be more of
> the flavour of “silent failure” without type checking, but without your
> comments it would have taken a lot longer to figure that out for certain.
> Thanks!
> >
> > I must have an incorrect hint somewhere, such as hinting with Iterable,
> which would be technically correct for either a string or a list of strings.
> >
> > Thanks again,
> > Evan
> >
> > On Fri, Jul 30, 2021 at 12:18 Robert Bradshaw <[email protected]>
> wrote:
> >>
> >> Running with --no_pipeline_type_check also disables any type inference
> >> for coders, so in this case (essentially) all your coders will be
> >> PickleCoder. You're getting an error here because Beam inferred the
> >> output of MyDoFn to be a str and chose a (more efficient) str coder to
> >> encode its outputs, but in fact it outputs lists.
> >>
> >> It would be useful to see your code to see if this is a bug in Beam or
> >> an erroneous type declaration.
> >>
> >> On Fri, Jul 30, 2021 at 7:28 AM Evan Galpin <[email protected]>
> wrote:
> >> >
> >> > Hi all,
> >> >
> >> > I wonder if anyone can shed some light on an issue I'm having with
> type checking and the Beam python SDK.
> >> >
> >> > Without changing any python code, I'm finding that running my
> pipeline with or without the option no_pipeline_type_check starts the
> pipeline without issue; no type check errors are raised and the pipeline
> begins processing.
> >> >
> >> > However, if I allow pipeline type checking, I get a coder error later
> at runtime:
> >> >
> >> >   File "apache_beam/coders/coder_impl.py", line 271, in
> apache_beam.coders.coder_impl.CallbackCoderImpl.estimate_size
> >> >   File
> "/my/path/lib/python3.6/site-packages/apache_beam/coders/coders.py", line
> 189, in estimate_size
> >> >     return len(self.encode(value))
> >> >   File
> "/my/path/lib/python3.6/site-packages/apache_beam/coders/coders.py", line
> 411, in encode
> >> >     return value.encode('utf-8')
> >> > AttributeError: 'list' object has no attribute 'encode' [while
> running 'ParDo(MyDoFn)']
> >> >
> >> > I do realize that the attribute error itself is a legitimate type of
> error, and list type does in fact have no "encode" attribute. But if I run
> my pipeline with --no_pipeline_type_check, this attr error does not arise
> and the pipeline completes without errors. What could be happening here?
> >> >
> >> > Thanks,
> >> > Evan
>
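
The mismatch discussed in this thread can be reproduced in miniature with the standard library alone. The sketch below is only a toy model, not Beam's implementation: str_encode and pickle_encode are hypothetical stand-ins for a str-specific coder and a pickle-based fallback, and the element value is made up. The only line taken from the thread is the value.encode('utf-8') call shown in the traceback.

```python
import pickle
from collections.abc import Iterable


def str_encode(value):
    # Hypothetical stand-in for a str-specific coder; this is the call
    # that appears in the traceback quoted in the thread.
    return value.encode('utf-8')


def pickle_encode(value):
    # Hypothetical stand-in for a generic fallback coder that accepts
    # any picklable object.
    return pickle.dumps(value)


# Assumed for illustration: the DoFn emits a list although a str was inferred.
element = ['a', 'b']

# A str and a list of str are both iterables, so a loose Iterable hint
# cannot tell the two apart.
both_iterable = isinstance('ab', Iterable) and isinstance(element, Iterable)

# The generic path round-trips the list without complaint.
round_tripped = pickle.loads(pickle_encode(element))

# The str-specific path fails with an AttributeError, as in the report.
try:
    str_encode(element)
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(both_iterable, round_tripped, error_message)
```

Under these assumptions the list survives the pickle path and fails only on the str path, which matches the behaviour described above: with type-based coder inference disabled nothing breaks, and with it enabled a wrong hint surfaces as an encoding error at runtime.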
