[ https://issues.apache.org/jira/browse/BEAM-6067?focusedWorklogId=169400&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-169400 ]
ASF GitHub Bot logged work on BEAM-6067:
----------------------------------------
Author: ASF GitHub Bot
Created on: 26/Nov/18 17:20
Start Date: 26/Nov/18 17:20
Worklog Time Spent: 10m
Work Description: CraigChambersG commented on a change in pull request
#7081: [BEAM-6067] In Python SDK, specify pipeline_proto_coder_id property in
non-Beam-standard CloudObject coders
URL: https://github.com/apache/beam/pull/7081#discussion_r236348888
##########
File path: sdks/python/apache_beam/runners/dataflow/dataflow_runner.py
##########
@@ -441,22 +443,25 @@ def _get_side_input_encoding(self, input_encoding):
   def _get_encoded_output_coder(self, transform_node, window_value=True):
     """Returns the cloud encoding of the coder for the output of a transform."""
+    from apache_beam.runners.dataflow.internal import apiclient
     if (len(transform_node.outputs) == 1
         and transform_node.outputs[None].element_type is not None):
       # TODO(robertwb): Handle type hints for multi-output transforms.
       element_type = transform_node.outputs[None].element_type
+      use_fnapi = apiclient._use_fnapi(transform_node.outputs[None].pipeline._options)
     else:
       # TODO(silviuc): Remove this branch (and assert) when typehints are
       # propagated everywhere. Returning an 'Any' as type hint will trigger
       # usage of the fallback coder (i.e., cPickler).
       element_type = typehints.Any
+      use_fnapi = False  # TODO(chambers): XXX do the right thing for this
Review comment:
One post-commit test failed on a checksum comparison. I don't have a deeper
understanding of what the test is doing or why it failed. I have seen other
tests that brittlely check for equivalence against an expected
representation, and such checks can break when an otherwise unused property
is added to CloudObjects.

To be clear, we *do* need the coder id in this case, at least once we
support multi-output DoFns over the fnapi using the worker code that reads
this property. We're not running such tests now. I need advice on how to get
hold of the pipeline object in this branch so that I can put in the proper
code. Also, the TODO in this branch suggests that the branch may be going
away, so maybe it doesn't need to be fixed.
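
On the question of reaching the pipeline from the fallback branch, one
possible approach is sketched below. It assumes that every output value of
the transform node carries a `pipeline` attribute, as
`transform_node.outputs[None].pipeline` in the diff suggests, so any output
could be used rather than only the `None`-tagged one. `FakePipeline`,
`FakeOutput`, `FakeNode` and `pipeline_options_for` are hypothetical
stand-ins written for this illustration; they are not Beam APIs.

```python
class FakePipeline(object):
  """Hypothetical stand-in for a pipeline that holds its options."""

  def __init__(self, options):
    self._options = options


class FakeOutput(object):
  """Hypothetical stand-in for an output value that knows its pipeline."""

  def __init__(self, pipeline):
    self.pipeline = pipeline


class FakeNode(object):
  """Hypothetical stand-in for a transform node with tagged outputs."""

  def __init__(self, outputs):
    self.outputs = outputs  # dict mapping output tag -> output value


def pipeline_options_for(transform_node):
  """Returns the options of the pipeline that owns any output of the node.

  Returns None when the node has no outputs, so the caller can choose a
  default instead of failing.
  """
  for output in transform_node.outputs.values():
    # All outputs are assumed to belong to the same pipeline, so the
    # first one found is as good as any other.
    return output.pipeline._options
  return None


options = {'experiments': ['beam_fn_api']}
pipeline = FakePipeline(options)
multi_output_node = FakeNode(
    {'main': FakeOutput(pipeline), 'side': FakeOutput(pipeline)})
assert pipeline_options_for(multi_output_node) is options
assert pipeline_options_for(FakeNode({})) is None
```

With a helper of this shape, both branches could compute `use_fnapi` from
the same options object, provided the assumption about output values holds
in the real runner.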
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
Issue Time Tracking
-------------------
Worklog Id: (was: 169400)
Time Spent: 5.5h (was: 5h 20m)
Remaining Estimate: 162.5h (was: 162h 40m)
> Dataflow runner should include portable pipeline coder id in CloudObject
> coder representation
> ---------------------------------------------------------------------------------------------
>
> Key: BEAM-6067
> URL: https://issues.apache.org/jira/browse/BEAM-6067
> Project: Beam
> Issue Type: Improvement
> Components: beam-model
> Reporter: Craig Chambers
> Assignee: Craig Chambers
> Priority: Major
> Original Estimate: 168h
> Time Spent: 5.5h
> Remaining Estimate: 162.5h
>
> When translating a Beam Java Coder into the DataflowRunner's CloudObject
> property map, include a property that specifies the id in the Beam model
> Pipeline coders map corresponding to that Coder. This will allow the
> DataflowRunner to reference the corresponding Beam coder in the FnAPI
> processing bundle.
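
The idea in the description can be illustrated with a minimal sketch in
Python, the SDK that the pull request above targets. It models a CloudObject
coder as a plain dict and records the coder id under the
`pipeline_proto_coder_id` property named in the pull request title. The
helper name, the example encoding, and the id value `coder_1` are
assumptions made for illustration and do not come from the Beam code base.

```python
import json


def add_pipeline_proto_coder_id(cloud_coder, coder_id):
  """Returns a copy of the encoding with the pipeline coder id recorded."""
  result = dict(cloud_coder)  # leave the caller's dict untouched
  result['pipeline_proto_coder_id'] = coder_id
  return result


# Hypothetical encoding; real CloudObject coders carry more detail.
plain = {'@type': 'kind:example', 'component_encodings': []}
tagged = add_pipeline_proto_coder_id(plain, 'coder_1')

assert tagged['pipeline_proto_coder_id'] == 'coder_1'
assert 'pipeline_proto_coder_id' not in plain

# The extra property changes the serialized form, so a test that compares
# exact representations would need its expected value updated.
assert json.dumps(plain, sort_keys=True) != json.dumps(tagged, sort_keys=True)
```

The last assertion shows why a check against an exact expected
representation is sensitive to an otherwise unused property.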
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)