tvalentyn commented on issue #22893:
URL: https://github.com/apache/beam/issues/22893#issuecomment-1502354194

   Thanks all, here is an update.
   
   - Dill has made several breaking changes between dill version 0.3.1.1 and 
dill version 0.3.6, which affect what gets pickled and how: 
https://github.com/uqfoundation/dill/issues?q=is%3Aissue+regression 
   - Switching Apache Beam to a newer version of dill will very likely 
negatively affect some users, while it may be a non-issue for others.
   - Dill has also changed internal code that Beam made assumptions about. 
Beam 2.47.0 will include code changes to be able to work with `dill==0.3.6`: 
https://github.com/apache/beam/pull/26086. However, the default and required 
version of dill remains `dill==0.3.1.1` for now. 
   - `dill==0.3.1.1` doesn't support Python 3.11. For Beam, one of the 
primary motivations to upgrade dill is to support Python 3.11, in addition to 
the concerns in this bug. To unblock Python 3.11 support, we went with 
monkey-patching dill 0.3.1.1 at runtime. The patch is applied only if the dill 
version is 0.3.1.1 and the Python version is 3.11 or higher: [`3d0ee7b` 
(#26121)](https://github.com/apache/beam/pull/26121/commits/3d0ee7b4ccbebe6069e0dca81d1bfe46381d546f). 
The alternative we considered was to vendor dill. I decided against vendoring 
at this time, at the last minute, because dill makes changes to global state; 
for example, it modifies the global dispatch table used by the standard 
pickler. Given the demand for this issue, it is clear that newer versions of 
dill will be installed alongside the vendored version. The vendored version 
and a stock version installed at runtime may modify the global state 
differently. I didn't have enough time before the 2.47.0 release cut to 
properly evaluate and address the risk of such concurrent modification. 
   - I have also evaluated setting cloudpickle as the default pickler, and 
encountered one issue that warrants additional investigation: 
https://github.com/apache/beam/issues/26209. 
   - I plan to have a conversation with the dill maintainers to see if we can 
mitigate the impact of the breaking changes they have introduced, so that we 
can either update smoothly or switch to a different default pickler.
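
   To illustrate the version gate described in the list above, here is a rough 
sketch of the condition only. It is not the code from #26121; the helper name 
`should_patch_dill` and the two constants are hypothetical and exist only for 
this example.

```python
import sys

# Hypothetical constants for illustration; not taken from the Beam codebase.
PATCHED_DILL_VERSION = "0.3.1.1"
MIN_PYTHON_FOR_PATCH = (3, 11)


def should_patch_dill(dill_version, python_version=None):
    """Return True only for dill 0.3.1.1 running on Python 3.11 or higher.

    `dill_version` is a version string such as dill.__version__.
    `python_version` is a (major, minor) tuple; defaults to the running
    interpreter's version.
    """
    if python_version is None:
        python_version = sys.version_info[:2]
    return (
        dill_version == PATCHED_DILL_VERSION
        and tuple(python_version[:2]) >= MIN_PYTHON_FOR_PATCH
    )
```

   With a gate like this, any other dill version (for example 0.3.6) or any 
older Python version is left untouched.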
   
   In the meantime, with Beam 2.47.0, users can try to update to a newer 
version of dill, even though Beam requires dill 0.3.1.1. Users can 
force-install a newer version of dill in their submission environment ***as 
long as they install the same version of dill in the runtime environment***. 
As I mentioned above, some users may not be affected by dill's breaking 
changes while others may be. Dill's breaking changes are not something Beam 
controls, but Beam did make code changes to work with newer versions of dill. 
I will also continue to work on a better solution for this issue.
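
   One way to sanity-check the "same version in both environments" 
requirement is sketched below. This is only an assumption about how one might 
check it by hand, not something Beam does for you; `installed_dill_version` 
and `dill_versions_match` are hypothetical helper names.

```python
from importlib import metadata


def installed_dill_version():
    """Return the installed dill version string, or None if dill is absent."""
    try:
        return metadata.version("dill")
    except metadata.PackageNotFoundError:
        return None


def dill_versions_match(submission_version, runtime_version):
    """True only if both environments report the same, known dill version."""
    return (
        submission_version is not None
        and submission_version == runtime_version
    )
```

   For example, one could record `installed_dill_version()` where the pipeline 
is submitted, call it again on a worker, and compare the two values with 
`dill_versions_match` before trusting pickled payloads.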


