Much appreciated, that option is super helpful. Thanks again for the help!

Evan
On Fri, Jul 30, 2021 at 12:33 PM Robert Bradshaw <[email protected]> wrote:
> You can try using --performance_runtime_type_check which will
> hopefully help you pinpoint the bad type.
>
> On Fri, Jul 30, 2021 at 9:23 AM Evan Galpin <[email protected]> wrote:
> >
> > Ah, that makes loads of sense. Ya I was thinking this must be more of the flavour of “silent failure” without type checking, but without your comments it would have taken a lot longer to figure that out for certain. Thanks!
> >
> > I must have an incorrect hint somewhere, such as hinting with Iterable which would be technically correct for both a String or a list of strings.
> >
> > Thanks again,
> > Evan
> >
> > On Fri, Jul 30, 2021 at 12:18 Robert Bradshaw <[email protected]> wrote:
> >>
> >> Running with --no_pipeline_type_check also disables any type inference
> >> for coders, so in this case (essentially) all your coders will be
> >> PickleCoder. You're getting an error here because beam inferred the
> >> output of MyDoFn to be a str and chose a (more efficient) str coder to
> >> encode its outputs, but in fact it outputs lists.
> >>
> >> It would be useful to see your code to see if this is a bug in Beam or
> >> an erroneous type declaration.
> >>
> >> On Fri, Jul 30, 2021 at 7:28 AM Evan Galpin <[email protected]> wrote:
> >> >
> >> > Hi all,
> >> >
> >> > I wonder if anyone can shed some light on an issue I'm having with type checking and the Beam python SDK.
> >> >
> >> > Without changing any python code, I'm finding that running my pipeline with or without the option no_pipeline_type_check starts the pipeline without issue; no type check errors are raised and the pipeline begins processing.
> >> >
> >> > However, if I allow pipeline type checking, I get a coder error later at runtime:
> >> >
> >> >   File "apache_beam/coders/coder_impl.py", line 271, in apache_beam.coders.coder_impl.CallbackCoderImpl.estimate_size
> >> >   File "/my/path/lib/python3.6/site-packages/apache_beam/coders/coders.py", line 189, in estimate_size
> >> >     return len(self.encode(value))
> >> >   File "/my/path/lib/python3.6/site-packages/apache_beam/coders/coders.py", line 411, in encode
> >> >     return value.encode('utf-8')
> >> > AttributeError: 'list' object has no attribute 'encode' [while running 'ParDo(MyDoFn)']
> >> >
> >> > I do realize that the attribute error itself is a legitimate type of error, and list type does in fact have no "encode" attribute. But if I run my pipeline with --no_pipeline_type_check, this attr error does not arise and the pipeline completes without errors. What could be happening here?
> >> >
> >> > Thanks,
> >> > Evan
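
For anyone landing on this thread later, the mechanism described above can be reproduced in miniature without Beam. The sketch below is stdlib-only and makes no claim about Beam's internals: StrCoderSketch and PickleCoderSketch are hypothetical stand-ins (not Beam classes), and try_estimate is a helper invented for the illustration. The only parts taken from the thread are the two calls visible in the traceback, value.encode('utf-8') and len(self.encode(value)), and the idea that a pickle-based coder is used when no type is inferred.

```python
import pickle


class StrCoderSketch:
    """Hypothetical stand-in for a str-specific coder (not Beam's class)."""

    def encode(self, value):
        # Same call that appears in the traceback; only works for str values.
        return value.encode('utf-8')

    def estimate_size(self, value):
        return len(self.encode(value))


class PickleCoderSketch:
    """Hypothetical stand-in for a generic pickle-based fallback coder."""

    def encode(self, value):
        # Pickle can serialize a list just as well as a str.
        return pickle.dumps(value)

    def estimate_size(self, value):
        return len(self.encode(value))


def try_estimate(coder, value):
    """Return the estimated size, or the error message if encoding fails."""
    try:
        return coder.estimate_size(value)
    except AttributeError as err:
        return str(err)


# What the DoFn actually emits in this thread: a list, not a str.
element = ['a', 'b']

# Coder chosen from a (wrong) str hint: fails on the list.
str_result = try_estimate(StrCoderSketch(), element)

# Generic fallback coder: encodes the list without complaint.
pickle_result = try_estimate(PickleCoderSketch(), element)

print(str_result)
print(pickle_result)
```

This is why the wrong hint stays silent with type checking off: the generic coder accepts whatever it is given. If the DoFn really does emit lists, declaring its output as a list of strings rather than a str should presumably let a matching coder be chosen, but that depends on the actual pipeline code, which was not posted.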
