damccorm opened a new issue, #21161:
URL: https://github.com/apache/beam/issues/21161
The code below throws this type error on the affected versions, but works as
expected on 2.28.0:
`TypeError: Unable to deterministically encode '2021-11-02' of type '<class
'datetime.date'>', please provide a type hint for the input of 'GroupByKey'
[while running 'Create/Map(decode)']`
```
import typing
from datetime import date

import apache_beam as beam
from apache_beam.testing.test_pipeline import TestPipeline

with TestPipeline() as pipeline:
    today = date.today()

    results = (
        pipeline
        | beam.Create([(1, {'d': today}), (1, {'d': today})])
        | beam.MapTuple(lambda i, d: (d['d'], i))
        # ^ this step only requires output type hints on versions after 2.28.0,
        #   and only if the date is being "projected" from some other data structure
        | beam.CombinePerKey(sum)
        # ^ if this aggregation is removed, the pipeline also works without error
    )

    results | beam.Map(print)
```
This Stack Overflow question describes the same problem:
https://stackoverflow.com/questions/69409693/how-do-i-use-a-datetime-date-value-in-apache-beam-groupby
It's possible to fix the errors by registering a `DateCoder` and adding
output type hints to the projection `MapTuple` step, but since this isn't
necessary in other situations and versions, it seems this is a bug. Our
production pipelines will need to add many of these tedious type hints in order
to work properly, so we're effectively blocked from upgrading to the newest
version.
Imported from Jira
[BEAM-13166](https://issues.apache.org/jira/browse/BEAM-13166). Original Jira
may contain additional context.
Reported by: [email protected].