Kalpana-chavhan opened a new issue, #37209: URL: https://github.com/apache/beam/issues/37209
### Description

Apache Beam Python SDK requires user-defined functions to be serializable for distributed execution. Currently, when users pass non-serializable lambdas or closures to beam.Map or beam.FlatMap, the resulting error is a low-level pickling exception that does not explain the cause or resolution.

This issue proposes improving the error message during serialization failure to:

- Clearly explain why serialization is required
- Highlight common causes (captured variables, non-serializable objects)
- Suggest correct patterns (named functions or DoFn classes)
- Link to official documentation

### Proposed Solution

Wrap the serialization call in apache_beam/internal/pickler.py with a clearer RuntimeError while preserving the original exception. Add unit tests to ensure the improved message is raised.

### Why this is valid

This is a pure DX (Developer Experience) improvement. It does not change the execution logic of pipelines but significantly reduces the onboarding friction for new developers.

### I am willing to contribute

I have identified the location in `pickler.py` and have a draft implementation ready with unit tests.

### Impact

- Improves developer experience
- Helps new Beam users debug pipelines faster
- No behavior change or backward compatibility impact
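
For illustration, the wrapping described under Proposed Solution might look roughly like the sketch below. This is not the draft implementation mentioned in the issue and not Beam's actual `pickler.py` code: it uses the standard-library `pickle` module as a stand-in for Beam's serializer, and the helper name `dumps_with_hint` and the message text `_HINT` are hypothetical.

```python
import pickle
import threading

# Hypothetical message text; the wording in a real patch may differ.
_HINT = (
    "Beam must serialize user functions to ship them to workers. "
    "A common cause of failure is a lambda or closure that captures a "
    "non-serializable object (lock, client, open file). "
    "Consider a named module-level function, or a DoFn that creates "
    "such resources in setup()."
)


def dumps_with_hint(obj):
    """Serialize obj; on failure, raise RuntimeError chained to the original error.

    Stand-in for the serialization call the issue proposes to wrap;
    stdlib pickle is used here only to keep the sketch self-contained.
    """
    try:
        return pickle.dumps(obj)
    except Exception as exc:  # failures surface as PicklingError, TypeError, AttributeError, ...
        raise RuntimeError(
            f"Could not serialize {type(obj).__name__}: {exc}\n{_HINT}"
        ) from exc  # "from exc" keeps the original exception as __cause__


if __name__ == "__main__":
    try:
        dumps_with_hint(threading.Lock())  # locks cannot be pickled
    except RuntimeError as err:
        print(err)
        print("original error:", repr(err.__cause__))
```

Chaining with `raise ... from exc` keeps the low-level pickling exception in the traceback, so the friendlier message adds context without hiding the original failure.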
