[ 
https://issues.apache.org/jira/browse/BEAM-11719?focusedWorklogId=591804&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-591804
 ]

ASF GitHub Bot logged work on BEAM-11719:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 01/May/21 03:59
            Start Date: 01/May/21 03:59
    Worklog Time Spent: 10m 
      Work Description: shoyer commented on pull request #14680:
URL: https://github.com/apache/beam/pull/14680#issuecomment-830512097


   > Nice. The overhead of encoding the type with each element is still there, 
but this makes it possible.
   
   Yes, this is a little unfortunate. For my use cases, it doesn't matter: we 
pass around small custom objects for keys, with the bulk of our data in the 
form of `values` full of large NumPy arrays, which pickle handles very efficiently.
   
   One thing that I originally wanted to do as part of this change is make 
`FastPrimitivesCoder` use explicitly registered coders for nested values. But 
this turns out not to be possible (at least in general) because coders are 
registered by _typehint_ rather than types, and multiple typehints can exist 
for the same type.
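
   To illustrate the typehint-versus-type mismatch described above, here is a 
minimal sketch in plain Python. It does not use Beam's actual coder registry: 
the `registry` dict, the coder names, and `lookup_by_runtime_type` are all 
hypothetical stand-ins, chosen only to show why a lookup keyed by typehint 
cannot be resolved from a value's runtime type alone.

```python
from typing import List, get_origin

# Hypothetical registry keyed by typehint (an assumption for illustration,
# not Beam's real data structure). The coder names are placeholders.
registry = {
    List[int]: "IntListCoder",
    List[str]: "StrListCoder",
}


def lookup_by_runtime_type(value):
    """Try to find a coder using only the value's runtime type."""
    return registry.get(type(value))


ints = [1, 2, 3]
strs = ["a", "b"]

# Both values have the same runtime type, so the type alone cannot tell
# the two registered typehints apart.
same_runtime_type = type(ints) is type(strs) is list

# The runtime type `list` is not itself a key in the registry.
direct_hit = lookup_by_runtime_type(ints)

# More than one registered typehint maps back to the same runtime type.
candidates = [hint for hint in registry if get_origin(hint) is list]
```

   Here `direct_hit` is `None` and `candidates` holds two entries, so a nested 
value of type `list` has no single registered coder to dispatch to.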


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 591804)
    Time Spent: 14h 20m  (was: 14h 10m)

> Enforce deterministic coding for GroupByKey and Stateful DoFns
> --------------------------------------------------------------
>
>                 Key: BEAM-11719
>                 URL: https://issues.apache.org/jira/browse/BEAM-11719
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-py-core
>            Reporter: Robert Bradshaw
>            Assignee: Robert Bradshaw
>            Priority: P1
>             Fix For: 2.29.0
>
>          Time Spent: 14h 20m
>  Remaining Estimate: 0h
>
> If a non-deterministic coder, such as pickling, is used for keys this can 
> result in two copies of the same key being grouped separately (based on their 
> encodings). 
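
The failure mode in the issue description can be reproduced with the standard 
library alone. The sketch below is an illustration, not Beam code: the key 
contents and the `groups` dict are made up, and grouping by encoded bytes is an 
assumption standing in for what a runner does with encoded keys. Two dicts that 
compare equal are pickled to different bytes because pickle follows insertion 
order.

```python
import pickle

# Two keys that compare equal but were built in a different insertion order.
key_a = {"user": "alice", "region": "eu"}
key_b = {"region": "eu", "user": "alice"}
keys_equal = key_a == key_b

# Pickle writes dict items in iteration order, so the encodings differ.
encoded_a = pickle.dumps(key_a)
encoded_b = pickle.dumps(key_b)

# Grouping by encoded bytes (a stand-in for grouping on encoded keys)
# puts the two equal keys into separate groups.
groups = {}
for encoded, value in [(encoded_a, 1), (encoded_b, 2)]:
    groups.setdefault(encoded, []).append(value)

# One possible canonical form: sort the items before encoding.
# This is only a sketch of the idea, not necessarily what Beam does.
canonical_a = pickle.dumps(tuple(sorted(key_a.items())))
canonical_b = pickle.dumps(tuple(sorted(key_b.items())))
```

With the raw pickles, `groups` ends up with two entries for what is logically 
one key; with the sorted form, both keys encode to the same bytes.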



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
