Thanks Claude! Great to see so much progress on this effort. The dependency on an old version of dill has been a persistent pain point for many users.
Please call out this change in the release notes so that customers can provide feedback and find instructions on how to unblock themselves. The note can link to a GitHub issue with a summary and examples of the necessary code changes.

On Mon, Apr 28, 2025 at 3:52 PM Claudius van der Merwe <claud...@vdmza.com> wrote:

> Hi Beam Devs,
>
> I am making progress on making cloudpickle the default pickling library
> and removing the strict dependency on dill as outlined in
> https://s.apache.org/beam-cloudpickle-next-steps.
>
> The current plan is to:
>
> 1. Make cloudpickle the default library in Beam 2.65.0 release (see
> https://github.com/apache/beam/pull/34695). Users will be able to specify
> pickle_library='dill' without any additional requirements. There will still
> be a hard dependency on dill (blocked by #2) but it is a step in the right
> direction.
>
> 2. Remove the strict dependency on dill in Beam 2.66.0 release. Dill is
> directly used for coder's encoding types in FastPrimitivesCoderImpl [1][2].
> I prefer to submit a fix for this after the branch cut so we have more time
> to identify any issues.
>
> Cloudpickle has some fundamentally different pickling behavior to dill that
> is likely to break:
>
> - Unit tests that rely on globals
>   - This can be fixed by using apache_beam.utils.shared [3]
> - Closures and dynamic classes that reference unpicklable globals
>   - This can be fixed by defining functions at the top level, and using
>     functools.partial to bind parameters if necessary
>
> [1]
> https://github.com/apache/beam/blob/b9fa49a9827dd28349e382f479ebd1a8bbe27d07/sdks/python/apache_beam/coders/coder_impl.py#L529
>
> [2]
> https://github.com/apache/beam/blob/b9fa49a9827dd28349e382f479ebd1a8bbe27d07/sdks/python/apache_beam/coders/coder_impl.py#L595
>
> [3]
> https://github.com/apache/beam/blob/b9fa49a9827dd28349e382f479ebd1a8bbe27d07/sdks/python/apache_beam/internal/cloudpickle_pickler_test.py#L54
>
> I'd appreciate any feedback or concerns.
>
> Best,
> Claude
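
As a starting point for the release-note examples, the functools.partial fix mentioned in the quoted message could be illustrated roughly like this. It is only a sketch under assumptions: `_STATS_LOCK`, `make_scaler_closure`, `scale`, `make_scaler`, and `lock_is_picklable` are hypothetical names, and the standard-library pickle module is used solely to show that a lock cannot be pickled. It does not reproduce how cloudpickle or dill actually serialize functions.

```python
import functools
import pickle
import threading

# Hypothetical module-level global that cannot be pickled.
_STATS_LOCK = threading.Lock()


def make_scaler_closure(factor):
    """Before: a nested function that closes over `factor` and uses a global lock."""
    def _scale(x):
        with _STATS_LOCK:
            return x * factor
    return _scale


def scale(x, factor):
    """After: a top-level function that receives everything it needs as arguments."""
    return x * factor


def make_scaler(factor):
    """Bind the parameter with functools.partial instead of capturing it in a closure."""
    return functools.partial(scale, factor=factor)


def lock_is_picklable():
    """Show that the global lock itself is not picklable (stdlib pickle raises TypeError)."""
    try:
        pickle.dumps(_STATS_LOCK)
    except TypeError:
        return False
    return True
```

The object returned by `make_scaler(3)` can be passed wherever the closure was passed before. The only state it carries is the bound keyword `factor`, which is a plain picklable value, and `scale` itself no longer refers to the lock.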