[ https://issues.apache.org/jira/browse/BEAM-3515?focusedWorklogId=99253&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-99253 ]
ASF GitHub Bot logged work on BEAM-3515:
----------------------------------------
Author: ASF GitHub Bot
Created on: 07/May/18 23:07
Start Date: 07/May/18 23:07
Worklog Time Spent: 10m
Work Description: jkff commented on issue #5277: [BEAM-3515] Portable
translation of SplittableProcessKeyed
URL: https://github.com/apache/beam/pull/5277#issuecomment-387233631
I'm not sure how your understanding of "ParDo is a primitive" is different
from my understanding of the same, and how that is different from "well-defined
composite". It seems that in theory runners are free to evaluate both
transforms any way they wish, and for both there is a well-known particularly
common way to evaluate them, perhaps common enough to be used in all runners,
perhaps not. I don't see how in this sense SPK is different from COMBINE_PGBKCV.

I think we both agree that ultimately the pipeline proto, as submitted by
the SDK to a JobService, should contain a ParDoPayload with is_splittable=true,
and that this ParDoPayload should probably also contain a restriction coder.
I'm happy to add that to the current PR, but it will not be used just yet:
currently a splittable ParDo is expanded before the pipeline is translated to
proto, so the proto doesn't contain any ParDoPayload with is_splittable=true.

Are you saying that the SPK transform should not exist at all, or that it
should exist somehow less prominently, or that its representation should be
piggybacked onto ParDoPayload? Either way, it definitely has to be evaluated
differently from a regular ParDo, because it is splittable, and differently
from a splittable ParDo, because it takes element-restriction tuples rather
than elements. I'm also not sure which "three primitives" you're referring to.
----------------------------------------------------------------
Issue Time Tracking
-------------------
Worklog Id: (was: 99253)
Time Spent: 2h 20m (was: 2h 10m)
> Use portable ParDoPayload for SDF in DataflowRunner
> ---------------------------------------------------
>
> Key: BEAM-3515
> URL: https://issues.apache.org/jira/browse/BEAM-3515
> Project: Beam
> Issue Type: Sub-task
> Components: runner-dataflow
> Reporter: Kenneth Knowles
> Assignee: Eugene Kirpichov
> Priority: Major
> Labels: portability
> Time Spent: 2h 20m
> Remaining Estimate: 0h
>
> The Java-specific blobs transmitted to Dataflow need more context, in the
> form of portability framework protos.