claudevdm commented on issue #21298:
URL: https://github.com/apache/beam/issues/21298#issuecomment-2895796986

   Hi @mattalbr, I hope we can get this done in the coming months. The current
blocker is that Deterministic FastPrimitivesCoder encodes the type using dill:
https://github.com/apache/beam/blob/b9fa49a9827dd28349e382f479ebd1a8bbe27d07/sdks/python/apache_beam/coders/coder_impl.py#L529
This works because dill's encoding of a type is deterministic.
   
   Cloudpickle, on the other hand, uses UUIDs to track dynamic types. Replacing
dill here would break GroupByKey, which requires the encoded key bytes to be
deterministic.
   
   We need to modify cloudpickle to generate deterministic IDs for dynamic
types, which will require a fair amount of validation.
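
   To illustrate why a random ID in the encoded type matters for GroupByKey,
here is a minimal standard-library sketch. It does not use Beam, dill, or
cloudpickle: `Worker`, `make_key_type`, and `run` are hypothetical names, and
the name-based hash is only one possible way to get a stable ID, not
necessarily the scheme cloudpickle would adopt.

```python
import hashlib
import uuid
from collections import defaultdict


def make_key_type():
    # A dynamically created class, rebuilt independently on each "worker".
    return type("UserKey", (), {})


class Worker:
    """Stand-in for one worker process (hypothetical, not a Beam class)."""

    def __init__(self, deterministic):
        self.deterministic = deterministic
        self._type_ids = {}  # per-worker cache of type IDs

    def _type_id(self, cls):
        if cls not in self._type_ids:
            if self.deterministic:
                # Assumption: derive the ID from the qualified name.
                name = f"{cls.__module__}.{cls.__qualname__}"
                type_id = hashlib.sha256(name.encode()).hexdigest()[:32]
            else:
                # Random per-worker ID, mimicking UUID-based tracking.
                type_id = uuid.uuid4().hex
            self._type_ids[cls] = type_id
        return self._type_ids[cls]

    def encode_key(self, key):
        # Encoded key = type ID + a stable rendering of the instance state.
        state = repr(sorted(vars(key).items())).encode()
        return self._type_id(type(key)).encode() + b"|" + state


def group_by_encoded_key(encoded_pairs):
    # Grouping compares the encoded key bytes, not the decoded objects.
    groups = defaultdict(list)
    for key_bytes, value in encoded_pairs:
        groups[key_bytes].append(value)
    return dict(groups)


def run(deterministic):
    pairs = []
    for value in (1, 2):
        worker = Worker(deterministic)  # each value comes from a new worker
        key = make_key_type()()
        key.user = "alice"
        pairs.append((worker.encode_key(key), value))
    return group_by_encoded_key(pairs)
```

With `run(True)` both values land in a single group, while `run(False)` splits
the logically identical key into two groups because each worker embeds a
different random ID.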
   
   If this is not viable, we could also consider continuing to use dill for
encoding types here while relaxing the strict dependency, which would also
require validation.
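
   As a rough sketch of what relaxing the strict dependency could look like,
the snippet below imports a pickler lazily and fails with a clear message only
when type encoding is actually needed. `load_optional` and
`encode_type_deterministically` are hypothetical helpers, not Beam APIs, and
the optional `pickler` argument exists only so the sketch can be exercised
without dill installed.

```python
import importlib


def load_optional(module_name):
    """Return the imported module, or None if it is not installed."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


def encode_type_deterministically(cls, pickler=None):
    # Assumption: dill is looked up only when a type must be encoded,
    # so pipelines that never hit this path do not need it installed.
    if pickler is None:
        pickler = load_optional("dill")
    if pickler is None:
        raise RuntimeError(
            "Encoding types deterministically requires dill; "
            "install dill to use this code path."
        )
    return pickler.dumps(cls)
```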

