Abacn commented on code in PR #36271:
URL: https://github.com/apache/beam/pull/36271#discussion_r2392220036

##########
sdks/python/apache_beam/internal/cloudpickle_pickler.py:
##########
@@ -196,12 +196,35 @@ def _lock_reducer(obj):
 def dump_session(file_path):
-  # It is possible to dump session with cloudpickle. However, since references
-  # are saved it should not be necessary. See https://s.apache.org/beam-picklers
-  pass
+  # Since References are saved (https://s.apache.org/beam-picklers), we only
+  # dump supported Beam Registries (currently only logical type registry)
+  from apache_beam.typehints import schemas
+  from apache_beam.coders import typecoders
+
+  with _pickle_lock, open(file_path, 'wb') as file:
+    coder_reg = typecoders.registry.get_custom_type_coder_tuples()
+    logicaltype_reg = schemas.LogicalType._known_logical_types.copy()
+
+    pickler = cloudpickle.CloudPickler(file)
+    # TODO(https://github.com/apache/beam/issues/18500) add file system registry
+    # once implemented
+    pickler.dump({"coder": coder_reg, "logicaltype": logicaltype_reg})

Review Comment:
   > only pickling/loading the coders that are not registered in schemas.py?

   I understand this change already handles this. It uses "get_custom_type_coder_tuples". Standard coders are registered inside typecoders.py, which uses "_register_coder_internal" directly.
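
To illustrate the distinction, here is a minimal sketch using a toy registry. It is a simplified stand-in and not Beam's actual implementation: the class `ToyCoderRegistry`, the helper `dump_registries`, the string coder names, and the use of plain `pickle` instead of cloudpickle are all assumptions made for illustration. Only the method names `register_coder`, `_register_coder_internal`, and `get_custom_type_coder_tuples` are borrowed from the discussion above. The point it shows is that defaults registered through the internal path never enter the custom list, so they are not part of what gets dumped.

```python
import io
import pickle
from fractions import Fraction


class ToyCoderRegistry:
    """Simplified stand-in for a coder registry (not Beam's implementation)."""

    def __init__(self):
        self._coders = {}
        self.custom_types = []
        # Standard coders go through the internal path directly, so they are
        # never recorded as custom types. Coder names are plain strings here.
        self._register_coder_internal(int, "VarIntCoder")
        self._register_coder_internal(bytes, "BytesCoder")

    def _register_coder_internal(self, typ, coder_name):
        self._coders[typ] = coder_name

    def register_coder(self, typ, coder_name):
        # The public path records the type as custom, then registers it.
        if typ not in self.custom_types:
            self.custom_types.append(typ)
        self._register_coder_internal(typ, coder_name)

    def get_custom_type_coder_tuples(self):
        # Only user-registered (custom) entries are returned.
        return [(t, self._coders[t]) for t in self.custom_types]


def dump_registries(registry, stream):
    # Hypothetical helper: dump only the custom coder entries.
    pickle.dump({"coder": registry.get_custom_type_coder_tuples()}, stream)


registry = ToyCoderRegistry()
registry.register_coder(Fraction, "FractionCoder")  # hypothetical custom coder

buffer = io.BytesIO()
dump_registries(registry, buffer)
restored = pickle.loads(buffer.getvalue())
```

In this sketch `restored["coder"]` holds only the `Fraction` entry, while the `int` and `bytes` defaults stay in the registry and are rebuilt on construction rather than pickled.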
