Ah, that makes a lot of sense. Yes, I suspected this was a kind of
"silent failure" when type checking is disabled, but without your
comments it would have taken much longer to confirm that for certain.
Thanks!
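
To convince myself, here is a rough stdlib-only sketch of the mechanism as
I understand it from your explanation. It is not Beam's actual code:
encode_as_str and encode_with_pickle are hypothetical stand-ins for a str
coder and for the pickle fallback.

```python
import pickle


def encode_as_str(value):
    # Stand-in for a str coder: assumes the value really is a str.
    return value.encode("utf-8")


def encode_with_pickle(value):
    # Stand-in for the pickle fallback: accepts any picklable object.
    return pickle.dumps(value)


element = ["a", "b"]  # what the DoFn actually emits

# Fallback path: the list round-trips fine, so a wrong hint goes unnoticed.
roundtrip = pickle.loads(encode_with_pickle(element))

# Inferred-str path: fails the same way as the traceback in my first mail.
try:
    encode_as_str(element)
    error_message = None
except AttributeError as exc:
    error_message = str(exc)
```

So the pickle path hides the mismatch, and the str path exposes it only
once an element is actually encoded.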

I must have an incorrect type hint somewhere, such as hinting with Iterable,
which would be technically correct for both a str and a list of strings.
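
A quick check of why such a hint would not catch the mix-up. This is plain
Python, nothing Beam-specific, and the helper first_item is only there for
illustration: both a str and a list of strings are iterables whose items
are strings.

```python
from collections.abc import Iterable


def first_item(value):
    # Return the first item produced by iterating over the value.
    return next(iter(value))


as_str = "abc"
as_list = ["abc", "def"]

# Both values satisfy a bare Iterable check.
both_iterable = isinstance(as_str, Iterable) and isinstance(as_list, Iterable)

# Iterating either one yields str items, so "iterable of str" fits both.
item_types = (type(first_item(as_str)), type(first_item(as_list)))
```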

Thanks again,
Evan

On Fri, Jul 30, 2021 at 12:18 Robert Bradshaw wrote:

> Running with --no_pipeline_type_check also disables any type inference
> for coders, so in this case (essentially) all your coders will be
> PickleCoder. You're getting an error here because Beam inferred the
> output of MyDoFn to be a str and chose a (more efficient) str coder to
> encode its outputs, but in fact it outputs lists.
>
> It would be useful to see your code to see if this is a bug in Beam or
> an erroneous type declaration.
>
> On Fri, Jul 30, 2021 at 7:28 AM Evan Galpin wrote:
> >
> > Hi all,
> >
> > I wonder if anyone can shed some light on an issue I'm having with type
> > checking and the Beam Python SDK.
> >
> > Without changing any Python code, I'm finding that my pipeline starts
> > without issue whether or not I pass --no_pipeline_type_check: no type
> > check errors are raised and the pipeline begins processing.
> >
> > However, if I allow pipeline type checking, I get a coder error later at
> > runtime:
> >
> >   File "apache_beam/coders/coder_impl.py", line 271, in apache_beam.coders.coder_impl.CallbackCoderImpl.estimate_size
> >   File "/my/path/lib/python3.6/site-packages/apache_beam/coders/coders.py", line 189, in estimate_size
> >     return len(self.encode(value))
> >   File "/my/path/lib/python3.6/site-packages/apache_beam/coders/coders.py", line 411, in encode
> >     return value.encode('utf-8')
> > AttributeError: 'list' object has no attribute 'encode' [while running 'ParDo(MyDoFn)']
> >
> > I do realize that the AttributeError itself is legitimate, since a list
> > does in fact have no "encode" attribute. But if I run my pipeline with
> > --no_pipeline_type_check, this error does not arise and the pipeline
> > completes without errors. What could be happening here?
> >
> > Thanks,
> > Evan
>
