chuckwondo commented on issue #28558:
URL: https://github.com/apache/beam/issues/28558#issuecomment-1735171162

   > > Instead PickleCoder, which uses the pickle module, is what is being 
invoked to pickle the function, rather than dill.
   > 
   > This is not the case, functions are pickled using 
`apache_beam.internal.pickler`. Coders are only used to encode PCollection 
elements. The `--pickle_library` option did not intend to influence the 
pickler's selection of PickleCoder - the intent of that coder was to use 
Python's standard pickler module.
   
   Perhaps that should be the case, but that is not what I am experiencing, 
which is why I'm reporting this. Apache Beam is very new to me, so it could 
very well be that I simply don't know what I'm doing, and I'm missing something 
important.
   
   I'll attempt to summarize and clarify what I'm doing and what I'm 
encountering:
   
   1. My code snippets from above are pulled from this issue I created in 
`pangeo-forge-recipes`: 
https://github.com/pangeo-forge/pangeo-forge-recipes/issues/616
   2. Since I wrote that issue, I discovered beam's `save_main_session` and 
`pickle_library` options as possibilities for addressing the pickling error I'm 
encountering.
   3. Finding that no combination of setting those options eliminates the 
pickling error, I created the issue here. (Only by tweaking my locally 
installed apache_beam dependency's `PickleCoder` to use the internal `pickler` 
module was I able to eliminate the pickling error.)
   
   My goal is to drop a problematic variable (`"lst_unc_sys"`) from my dataset, 
but using the `"preprocess"` option of the `mzz_kwargs` argument to 
`CombineReferences` is failing because something in the bowels of beam seems 
not to realize that it should be using `apache_beam.internal.pickler` to pickle 
the preprocess function I'm supplying.
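   
   To illustrate the kind of failure I mean, here is a minimal sketch that uses 
only the standard library, not Beam itself. It assumes the preprocess function 
is a closure returned by a factory; `make_drop_preprocess`, 
`drop_lst_unc_sys`, and `can_pickle` are hypothetical names made up for this 
example, and the key-prefix filtering is only a stand-in for whatever the real 
preprocess function does.

```python
import pickle


def make_drop_preprocess(field):
    """Hypothetical factory: returns a preprocess function that drops one variable."""

    def preprocess(refs):
        # Keep every reference whose key is not under the dropped variable.
        prefix = field + "/"
        return {k: v for k, v in refs.items() if not k.startswith(prefix)}

    return preprocess


def drop_lst_unc_sys(refs):
    """Module-level equivalent; stdlib pickle stores it by qualified name."""
    return {k: v for k, v in refs.items() if not k.startswith("lst_unc_sys/")}


def can_pickle(obj):
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError):
        # Locally defined functions and lambdas cannot be pickled by reference.
        return False


closure = make_drop_preprocess("lst_unc_sys")
print("closure picklable with stdlib pickle:", can_pickle(closure))
print("module-level function picklable:", can_pickle(drop_lst_unc_sys))
```

   When run as a script, the closure is rejected by the standard `pickle` 
module, whereas by-value picklers such as dill can serialize it. That is why it 
matters which pickler ends up handling the function.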
   
   Is there something I'm missing in order to make that happen?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.