You can try using --performance_runtime_type_check which will
hopefully help you pinpoint the bad type.
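
For anyone following along, here is a rough, stdlib-only model of the coder
mismatch described in the quoted message below. It is not Beam's code:
StrCoderSketch, PickleCoderSketch, choose_coder and my_do_fn_output are
hypothetical names made up for illustration. The only idea taken from the
thread is that a str coder is picked when the output type is believed to be
str, and a pickling coder is used when type checking is off.

```python
import pickle


class StrCoderSketch:
    """Hypothetical stand-in for a coder picked when the output is believed to be str."""

    def encode(self, value):
        # Mirrors the failing line in the traceback: only works for str values.
        return value.encode('utf-8')


class PickleCoderSketch:
    """Hypothetical stand-in for a generic fallback coder that pickles anything."""

    def encode(self, value):
        return pickle.dumps(value)


def choose_coder(declared_type, type_check_enabled):
    # Assumption for illustration: with type checking disabled there is no
    # type-based coder selection, so everything falls back to pickling.
    if type_check_enabled and declared_type is str:
        return StrCoderSketch()
    return PickleCoderSketch()


def my_do_fn_output():
    # Hinted (or inferred) as str, but the value is really a list of strings.
    return ['a', 'b']


def try_encode(type_check_enabled):
    coder = choose_coder(str, type_check_enabled)
    try:
        coder.encode(my_do_fn_output())
        return 'ok'
    except AttributeError as exc:
        return str(exc)
```

In this sketch try_encode(False) succeeds because pickling accepts a list,
while try_encode(True) hits the same AttributeError as in the traceback. The
fix in the real pipeline would presumably be to correct the hint rather than
to disable type checking.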

On Fri, Jul 30, 2021 at 9:23 AM Evan Galpin <[email protected]> wrote:
>
> Ah, that makes loads of sense. Ya I was thinking this must be more of the 
> flavour of “silent failure” without type checking, but without your comments 
> it would have taken a lot longer to figure that out for certain. Thanks!
>
> I must have an incorrect hint somewhere, such as hinting with Iterable, which 
> would be technically correct for both a string and a list of strings.
>
> Thanks again,
> Evan
>
> On Fri, Jul 30, 2021 at 12:18 Robert Bradshaw <[email protected]> wrote:
>>
>> Running with --no_pipeline_type_check also disables any type inference
>> for coders, so in this case (essentially) all your coders will be
>> PickleCoder. You're getting an error here because Beam inferred the
>> output of MyDoFn to be a str and chose a (more efficient) str coder to
>> encode its outputs, but in fact it outputs lists.
>>
>> It would be useful to see your code to see if this is a bug in Beam or
>> an erroneous type declaration.
>>
>> On Fri, Jul 30, 2021 at 7:28 AM Evan Galpin <[email protected]> wrote:
>> >
>> > Hi all,
>> >
>> > I wonder if anyone can shed some light on an issue I'm having with type 
>> > checking and the Beam python SDK.
>> >
>> > Without changing any python code, I'm finding that running my pipeline 
>> > with or without the option no_pipeline_type_check starts the pipeline 
>> > without issue; no type check errors are raised and the pipeline begins 
>> > processing.
>> >
>> > However, if I allow pipeline type checking, I get a coder error later at 
>> > runtime:
>> >
>> >   File "apache_beam/coders/coder_impl.py", line 271, in apache_beam.coders.coder_impl.CallbackCoderImpl.estimate_size
>> >   File "/my/path/lib/python3.6/site-packages/apache_beam/coders/coders.py", line 189, in estimate_size
>> >     return len(self.encode(value))
>> >   File "/my/path/lib/python3.6/site-packages/apache_beam/coders/coders.py", line 411, in encode
>> >     return value.encode('utf-8')
>> > AttributeError: 'list' object has no attribute 'encode' [while running 'ParDo(MyDoFn)']
>> >
>> > I do realize that the attribute error itself is a legitimate type of 
>> > error, and list type does in fact have no "encode" attribute. But if I run 
>> > my pipeline with --no_pipeline_type_check, this attr error does not arise 
>> > and the pipeline completes without errors. What could be happening here?
>> >
>> > Thanks,
>> > Evan
