claudevdm commented on code in PR #34348:
URL: https://github.com/apache/beam/pull/34348#discussion_r2014764448
##########
sdks/go/pkg/beam/runners/prism/internal/urns/urns.go:
##########
@@ -124,6 +124,7 @@ var (
CoderTimer = cdrUrn(pipepb.StandardCoders_TIMER)
CoderKV = cdrUrn(pipepb.StandardCoders_KV)
+ CoderTuple = "beam:coder:tuple:v1"
Review Comment:
> The actual shuffle needs KVs
What do you mean by this? My understanding is:
- Reshuffle adds random keys, producing (k, v)
- ReifyMetadata keeps the kv, with a nested tuple as the value: (value, timestamp, pane_info)
https://github.com/apache/beam/blob/db0aa824461610463df455fee69ca52b0b8ba3f4/sdks/python/apache_beam/transforms/util.py#L966
I guess if we want to avoid this, we could use a nested kv in ReifyMetadata, so that the element is
- key, (value, (timestamp, pane_info))
Then the regular kv coder should work?
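
To make the two shapes concrete, here is a rough plain-Python sketch. It does not import Beam, and `reify_flat`, `reify_nested` and `restore_nested` are made-up names for illustration, not functions in util.py. Whether the regular kv coder actually handles the nested shape on the Prism side is the open question above; this only shows the structure.

```python
# Illustration only: plain tuples stand in for Beam elements.
# All function names here are hypothetical, not part of util.py.

def reify_flat(key, value, timestamp, pane_info):
    # Current shape: the value is a 3-tuple, so it is not a plain KV.
    return key, (value, timestamp, pane_info)


def reify_nested(key, value, timestamp, pane_info):
    # Proposed shape: every level is a 2-tuple, i.e. a KV of KVs.
    return key, (value, (timestamp, pane_info))


def restore_nested(element):
    # Unpack the nested shape again after the shuffle.
    key, (value, (timestamp, pane_info)) = element
    return key, value, timestamp, pane_info


if __name__ == "__main__":
    flat = reify_flat("k", "v", 123, "pane")
    nested = reify_nested("k", "v", 123, "pane")
    print(flat)
    print(nested)
    print(restore_nested(nested))
```

With the nested shape, each level has exactly two components, which is what a kv coder expects.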
Or could we use a windowed_value as the value in the reify output, instead
of a tuple with the metadata?
The original reify just used a kv as the value in the reify function
https://github.com/apache/beam/blob/57d1c35aaf7fadc62e69ddf72e95af803d92b1c3/sdks/python/apache_beam/transforms/util.py#L972
We are now including pane info, as mentioned above, so it becomes a tuple.
This only happens in the global window case; in the custom window case, the
value for reify is a windowed value
https://github.com/apache/beam/blob/57d1c35aaf7fadc62e69ddf72e95af803d92b1c3/sdks/python/apache_beam/transforms/util.py#L996