damccorm opened a new issue, #21203:
URL: https://github.com/apache/beam/issues/21203
After upgrading our Python project from 2.31.0 to 2.33.0, we started getting
TypeCheckErrors such as
> apache_beam.typehints.decorators.TypeCheckError: Type hint violation for
'all_data/combine_new_and_all': requires `Tuple[Tuple[Any, Any], Dict[str,
Iterable[_CombinedEntry]]]` but got `Tuple[Tuple[int, int], Dict[str,
List[Union[]]]]` for element

where the output value of a `CoGroupByKey()` is apparently incorrectly
deduced to be a `Dict[str, List[Union[]]]`.
I managed to build a small repro case:
```
import apache_beam as beam
from typing import Dict, Iterable, Tuple

{
    "foo": [(42, "foo")],
    "bar": [(42, "bar")],
} | beam.CoGroupByKey().with_output_types(
    Tuple[int, Dict[str, Iterable[str]]])
```
which raises
> apache_beam.typehints.decorators.TypeCheckError: Output type hint
violation at CoGroupByKey: expected `Tuple[int, Dict[str, Iterable[str]]]`, got
`Tuple[int, Dict[str, List[Union[]]]]`

Alternatively, using a TestPipeline:
```
import apache_beam as beam
from apache_beam.testing.test_pipeline import TestPipeline
from apache_beam.testing.util import assert_that, equal_to
from typing import Dict, Iterable, Tuple

with TestPipeline() as p:
    actual = {
        "foo": p | "create_foo" >> beam.Create([(42, "foo")]),
        "bar": p | "create_bar" >> beam.Create([(42, "bar")]),
    } | beam.CoGroupByKey().with_output_types(
        Tuple[int, Dict[str, Iterable[str]]])
    assert_that(actual, equal_to([(42, {"foo": ["foo"], "bar": ["bar"]})]))
```
Oh, and one more thing, about that `Tuple[Any, Any]` from the original error
message I posted. We can reproduce that like this:
```
import apache_beam as beam
from typing import Dict, Iterable, NewType, Tuple

key = NewType("key", int)

{
    "foo": [(key(1337), "foo")],
    "bar": [(key(1337), "bar")],
} | beam.CoGroupByKey().with_output_types(
    Tuple[key, Dict[str, Iterable[str]]])
```
> apache_beam.typehints.decorators.TypeCheckError: Output type hint
violation at CoGroupByKey: expected `Tuple[Any, Dict[str, Iterable[str]]]`, got
`Tuple[int, Dict[str, List[Union[]]]]`

Looks like `NewType` is treated as `Any`? That surprised me.
I could also reproduce the issue in 2.32.0.
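
On the `NewType` question above: the following is a minimal sketch using only the standard library (no Beam involved), so it says nothing definite about what Beam does internally. It only illustrates that a `NewType` alias is erased at runtime, which would be consistent with the "got" side of the error reporting a plain `int`, and that the underlying type is still recoverable through `__supertype__`.

```python
from typing import NewType

# Same alias as in the repro above.
key = NewType("key", int)

value = key(1337)

# Calling a NewType returns its argument unchanged, so the element
# that ends up in the pipeline is a plain int.
print(type(value) is int)

# The NewType object itself is not a class, so isinstance() checks
# against it are not possible.
print(isinstance(key, type))

# The wrapped type is available via the __supertype__ attribute.
print(key.__supertype__ is int)
```

A type checker that only understands classes and `typing` generics has no obvious way to handle `key` unless it explicitly unwraps `__supertype__`; whether that is what leads to the `Any` seen here is an assumption, not something confirmed from the Beam source.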
Imported from Jira
[BEAM-13217](https://issues.apache.org/jira/browse/BEAM-13217). Original Jira
may contain additional context.
Reported by: mrwonko.