damccorm commented on code in PR #36453:
URL: https://github.com/apache/beam/pull/36453#discussion_r2417911376
##########
sdks/python/apache_beam/options/pipeline_options.py:
##########
@@ -874,6 +874,18 @@ def _add_argparse_args(cls, parser):
'their condition met. Some operations, such as GroupByKey, disallow '
'this. This exists for cases where such loss is acceptable and for '
'backwards compatibility. See BEAM-9487.')
+ parser.add_argument(
+ '--force_cloudpickle_deterministic_coders',
Review Comment:
I don't quite follow the motivation - why would a user ever want to do this?
Won't this always lead to correctness problems?
##########
.github/workflows/beam_PreCommit_Python_Dill.yml:
##########
@@ -106,9 +106,9 @@ jobs:
arguments: |
-Pposargs="${{
contains(matrix.os, 'self-hosted') &&
- 'apache_beam/internal/ apache_beam/ml/ apache_beam/transforms/
apache_beam/typehints/ apache_beam/runners/portability/ -m (uses_dill and not
require_docker_in_docker)' ||
- 'apache_beam/internal/ apache_beam/ml/ apache_beam/transforms/
apache_beam/typehints/ apache_beam/runners/portability/ -m (uses_dill and
require_docker_in_docker)'
- }}" \
+ 'apache_beam/internal/pickler_test.py
apache_beam/io/gcp/bigquery_file_loads_test.py
apache_beam/options/pipeline_options_validator_test.py
apache_beam/transforms/combinefn_lifecycle_test.py
apache_beam/transforms/util_test.py apache_beam/transforms/ptransform_test.py
apache_beam/typehints/schemas_test.py
apache_beam/runners/portability/stager_test.py -m (uses_dill and not
require_docker_in_docker)' ||
+ 'apache_beam/internal/pickler_test.py
apache_beam/io/gcp/bigquery_file_loads_test.py
apache_beam/options/pipeline_options_validator_test.py
apache_beam/transforms/combinefn_lifecycle_test.py
apache_beam/transforms/util_test.py apache_beam/transforms/ptransform_test.py
apache_beam/typehints/schemas_test.py
apache_beam/runners/portability/stager_test.py -m (uses_dill and
require_docker_in_docker)'
Review Comment:
Why do we need this? Shouldn't filtering to uses_dill be enough
##########
sdks/python/apache_beam/coders/coders.py:
##########
@@ -1004,8 +1004,12 @@ def to_type_hint(self):
return Any
-def _should_force_use_dill(update_compat_version):
+def _should_force_use_dill(registry):
+ if getattr(registry, 'force_cloudpickle_deterministic_coders', False):
Review Comment:
Rather than doing this, should we provide a default value in __init__ like
https://github.com/apache/beam/blob/6f31e56fcaca5cbb71b90f631c6562a1f1471c08/sdks/python/apache_beam/coders/typecoders.py#L87
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]