[GitHub] [beam] lukecwik commented on a change in pull request #11222: [BEAM-4150] Don't window PCollection coders.
lukecwik commented on a change in pull request #11222: [BEAM-4150] Don't window PCollection coders.
URL: https://github.com/apache/beam/pull/11222#discussion_r398806850

## File path: sdks/python/apache_beam/runners/portability/fn_api_runner/translations.py ##

@@ -347,6 +348,22 @@ def add_or_get_coder_id(self,
     self.components.coders[new_coder_id].CopyFrom(coder_proto)
     return new_coder_id

+  def add_data_channel_coder(self, pcoll_id):
+    pcoll = self.components.pcollections[pcoll_id]
+    proto = beam_runner_api_pb2.Coder(
+        spec=beam_runner_api_pb2.FunctionSpec(
+            urn=common_urns.coders.WINDOWED_VALUE.urn),
+        component_coder_ids=[
+            pcoll.coder_id,
+            self.components.windowing_strategies[
+                pcoll.windowing_strategy_id].window_coder_id
+        ])
+    channel_coder = self.add_or_get_coder_id(
+        proto, pcoll.coder_id + '_windowed')
+    if pcoll.coder_id in self.safe_coders:
+      channel_coder = self.length_prefixed_coder(channel_coder)

Review comment:
I now understand the nuance of your statement.

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org

With regards,
Apache Git Services
[GitHub] [beam] lukecwik commented on a change in pull request #11222: [BEAM-4150] Don't window PCollection coders.
lukecwik commented on a change in pull request #11222: [BEAM-4150] Don't window PCollection coders.
URL: https://github.com/apache/beam/pull/11222#discussion_r398793631

## File path: sdks/python/apache_beam/runners/portability/fn_api_runner/translations.py ##

@@ -347,6 +348,22 @@ def add_or_get_coder_id(self,
     self.components.coders[new_coder_id].CopyFrom(coder_proto)
     return new_coder_id

+  def add_data_channel_coder(self, pcoll_id):
+    pcoll = self.components.pcollections[pcoll_id]
+    proto = beam_runner_api_pb2.Coder(
+        spec=beam_runner_api_pb2.FunctionSpec(
+            urn=common_urns.coders.WINDOWED_VALUE.urn),
+        component_coder_ids=[
+            pcoll.coder_id,
+            self.components.windowing_strategies[
+                pcoll.windowing_strategy_id].window_coder_id
+        ])
+    channel_coder = self.add_or_get_coder_id(
+        proto, pcoll.coder_id + '_windowed')
+    if pcoll.coder_id in self.safe_coders:
+      channel_coder = self.length_prefixed_coder(channel_coder)

Review comment:
Lines 398-399 already do this check as part of the `length_prefixed_coder` method.
[GitHub] [beam] lukecwik commented on a change in pull request #11222: [BEAM-4150] Don't window PCollection coders.
lukecwik commented on a change in pull request #11222: [BEAM-4150] Don't window PCollection coders.
URL: https://github.com/apache/beam/pull/11222#discussion_r398677414

## File path: sdks/python/apache_beam/runners/portability/fn_api_runner/translations.py ##

@@ -347,6 +348,22 @@ def add_or_get_coder_id(self,
     self.components.coders[new_coder_id].CopyFrom(coder_proto)
     return new_coder_id

+  def add_data_channel_coder(self, pcoll_id):
+    pcoll = self.components.pcollections[pcoll_id]
+    proto = beam_runner_api_pb2.Coder(
+        spec=beam_runner_api_pb2.FunctionSpec(
+            urn=common_urns.coders.WINDOWED_VALUE.urn),
+        component_coder_ids=[
+            pcoll.coder_id,
+            self.components.windowing_strategies[
+                pcoll.windowing_strategy_id].window_coder_id
+        ])
+    channel_coder = self.add_or_get_coder_id(
+        proto, pcoll.coder_id + '_windowed')
+    if pcoll.coder_id in self.safe_coders:
+      channel_coder = self.length_prefixed_coder(channel_coder)

Review comment:
Note that this check is already done within `length_prefixed_coder`, which returns the original coder id if it's a safe coder, so you can always use it.

```suggestion
    channel_coder = self.length_prefixed_coder(channel_coder)
```
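
The pattern the comment above describes, a helper that returns its argument unchanged when the coder is already safe, can be sketched as follows. This is a toy model rather than Beam's implementation: the `CoderRegistry` class, the coder ids, and the `_length_prefixed` suffix are all hypothetical names chosen for illustration.

```python
class CoderRegistry:
    """Toy stand-in for the pipeline context (hypothetical, not Beam's class)."""

    def __init__(self, safe_coders):
        self.safe_coders = set(safe_coders)

    def length_prefixed_coder(self, coder_id):
        # A safe coder id is returned unchanged, so callers that pass the
        # same id they would have checked do not need their own guard.
        if coder_id in self.safe_coders:
            return coder_id
        # Assumed naming scheme for the wrapped coder, for illustration only.
        return coder_id + '_length_prefixed'


registry = CoderRegistry(safe_coders={'bytes_coder'})
safe_result = registry.length_prefixed_coder('bytes_coder')
wrapped_result = registry.length_prefixed_coder('pickle_coder')
```

Note that in the diff above the guard tests `pcoll.coder_id` while the helper is called with `channel_coder`, so an outer guard and the helper's internal check are equivalent only when they examine the same id.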
[GitHub] [beam] lukecwik commented on a change in pull request #11222: [BEAM-4150] Don't window PCollection coders.
lukecwik commented on a change in pull request #11222: [BEAM-4150] Don't window PCollection coders.
URL: https://github.com/apache/beam/pull/11222#discussion_r398678099

## File path: sdks/python/apache_beam/runners/portability/fn_api_runner/fn_runner.py ##

@@ -426,7 +423,8 @@ def _collect_written_timers_and_add_to_deferred_inputs(
       pipeline_components,  # type: beam_runner_api_pb2.Components
       stage,  # type: translations.Stage
       bundle_context_manager,  # type: execution.BundleContextManager
-      deferred_inputs  # type: MutableMapping[str, PartitionableBuffer]
+      deferred_inputs,  # type: MutableMapping[str, PartitionableBuffer]
+      data_channel_coders,

Review comment:
If possible, try to keep the typing information on methods.
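
As an illustration of the request above, here is a minimal sketch of per-argument type comments in the style the hunk already uses, with every parameter annotated. The function name, its body, and the `Dict[str, str]` type chosen for `data_channel_coders` are assumptions made for this example; the review does not state the intended type.

```python
from typing import Dict, List


def collect_deferred(
    deferred_inputs,  # type: Dict[str, List[bytes]]
    data_channel_coders,  # type: Dict[str, str]
):
    # type: (...) -> List[str]

    """Hypothetical helper: ids of deferred inputs that have a channel coder."""
    return sorted(k for k in deferred_inputs if k in data_channel_coders)


# Only 'a' appears in both mappings.
matched = collect_deferred({'a': [], 'b': []}, {'a': 'coder_1'})
```

Keeping a comment on each new parameter lets a type checker such as mypy keep validating call sites after the signature grows.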