[ https://issues.apache.org/jira/browse/BEAM-10940?focusedWorklogId=505400&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-505400 ]
ASF GitHub Bot logged work on BEAM-10940:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 27/Oct/20 20:21
Start Date: 27/Oct/20 20:21
Worklog Time Spent: 10m
Work Description: mxm commented on a change in pull request #13105:
URL: https://github.com/apache/beam/pull/13105#discussion_r513004878
##########
File path: runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkStreamingPortablePipelineTranslator.java
##########
@@ -683,8 +694,24 @@ private void translateStreamingImpulse(
inputPCollectionId,
valueCoder.getClass().getSimpleName()));
}
- keyCoder = ((KvCoder) valueCoder).getKeyCoder();
- keySelector = new KvToByteBufferKeySelector(keyCoder);
+ if (stateful) {
+ keyCoder = ((KvCoder) valueCoder).getKeyCoder();
+ keySelector = new KvToByteBufferKeySelector(keyCoder);
+ } else {
+ // For an SDF, we know that the input element should be
+ // KV<KV<element, KV<restriction, watermarkState>>, size>. We are going to use the element
+ // as the key.
+ if (!(((KvCoder) valueCoder).getKeyCoder() instanceof KvCoder)) {
+ throw new IllegalStateException(
+ String.format(
+ Locale.ENGLISH,
+ "The element coder for splittable DoFn '%s' must be KVCoder(KvCoder, DoubleCoder) but is: %s",
+ inputPCollectionId,
+ valueCoder.getClass().getSimpleName()));
+ }
+ keyCoder = ((KvCoder) ((KvCoder) valueCoder).getKeyCoder()).getKeyCoder();
+ keySelector = new SdfByteBufferKeySelector(keyCoder);
+ }
Review comment:
It's because we partition based on the element instead of on the key.
This wouldn't work with stateful operations, where we expect all elements
with the same key to land on the same partition.
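
To make the difference between the two branches concrete, here is a minimal, self-contained sketch. It does not use Beam or Flink classes: `KeySelectionSketch`, `statefulKey`, `sdfKey`, and `partition` are hypothetical stand-ins, `Map.Entry` stands in for `KV`, the restriction is simplified to a string (the real SDF input nests `KV<restriction, watermarkState>`), and the hash-modulo partitioning is only an assumption about how a keyed stream might be distributed.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Map;

// Hypothetical illustration only; these are not the selectors from the PR.
final class KeySelectionSketch {

  // Stateful DoFn input has the shape KV<key, value>: partition on the user key.
  static ByteBuffer statefulKey(Map.Entry<String, String> kv) {
    return encode(kv.getKey());
  }

  // SDF input (simplified) has the shape KV<KV<element, restriction>, size>:
  // partition on the element, i.e. the key of the inner KV.
  static ByteBuffer sdfKey(Map.Entry<Map.Entry<String, String>, Double> kv) {
    return encode(kv.getKey().getKey());
  }

  // Convenience factory for the nested SDF input shape.
  static Map.Entry<Map.Entry<String, String>, Double> sdfInput(
      String element, String restriction, double size) {
    return new SimpleImmutableEntry<>(new SimpleImmutableEntry<>(element, restriction), size);
  }

  static ByteBuffer encode(String s) {
    return ByteBuffer.wrap(s.getBytes(StandardCharsets.UTF_8));
  }

  // Assumed partitioning scheme: hash of the encoded key modulo the parallelism.
  static int partition(ByteBuffer key, int parallelism) {
    return Math.floorMod(key.hashCode(), parallelism);
  }

  public static void main(String[] args) {
    // Two restrictions of the same element share a partition under the SDF selector.
    System.out.println(partition(sdfKey(sdfInput("file-1", "[0, 10)", 10.0)), 4));
    System.out.println(partition(sdfKey(sdfInput("file-1", "[10, 20)", 10.0)), 4));
    // A stateful input is partitioned on its user key instead.
    System.out.println(partition(statefulKey(new SimpleImmutableEntry<>("user-a", "v")), 4));
  }
}
```

In this sketch, the SDF selector co-locates all restrictions of one element, but it says nothing about where elements sharing a user key end up, which is why the stateful branch keeps the key-based selector.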
----------------------------------------------------------------
Issue Time Tracking
-------------------
Worklog Id: (was: 505400)
Time Spent: 7h 50m (was: 7h 40m)
> Portable Flink runner should handle DelayedBundleApplication from
> ProcessBundleResponse.
> ----------------------------------------------------------------------------------------
>
> Key: BEAM-10940
> URL: https://issues.apache.org/jira/browse/BEAM-10940
> Project: Beam
> Issue Type: New Feature
> Components: runner-flink
> Reporter: Boyuan Zhang
> Assignee: Boyuan Zhang
> Priority: P2
> Time Spent: 7h 50m
> Remaining Estimate: 0h
>
> An SDF can produce residuals by self-checkpointing; these are returned to the
> runner in ProcessBundleResponse.DelayedBundleApplication. The portable runner
> should be able to handle the DelayedBundleApplication and reschedule it based
> on the timestamp.
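
The rescheduling described in the issue can be pictured with a small sketch. This is not the runner's implementation: `ResidualSchedulerSketch`, `Residual`, `defer`, and `drainDue` are hypothetical names, the residual payload is reduced to a string, and the resume time is assumed to be "now plus a requested delay" in milliseconds. It only illustrates holding residuals back until their timestamp is reached.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration only; not the Flink runner's actual handling.
final class ResidualSchedulerSketch {

  // A deferred piece of work and the time at which it may resume.
  static final class Residual {
    final String payload;
    final long resumeAtMillis;

    Residual(String payload, long resumeAtMillis) {
      this.payload = payload;
      this.resumeAtMillis = resumeAtMillis;
    }
  }

  // Residuals ordered by resume time, earliest first.
  private final PriorityQueue<Residual> pending =
      new PriorityQueue<>(Comparator.comparingLong((Residual r) -> r.resumeAtMillis));

  // Record a residual that should not run before nowMillis + delayMillis.
  void defer(String payload, long nowMillis, long delayMillis) {
    pending.add(new Residual(payload, nowMillis + delayMillis));
  }

  // Remove and return every residual whose resume time has been reached.
  List<String> drainDue(long nowMillis) {
    List<String> due = new ArrayList<>();
    while (!pending.isEmpty() && pending.peek().resumeAtMillis <= nowMillis) {
      due.add(pending.poll().payload);
    }
    return due;
  }

  public static void main(String[] args) {
    ResidualSchedulerSketch scheduler = new ResidualSchedulerSketch();
    scheduler.defer("restriction [10, 20)", 0L, 500L);
    scheduler.defer("restriction [20, 30)", 0L, 100L);
    System.out.println(scheduler.drainDue(200L)); // only the 100 ms residual is due
    System.out.println(scheduler.drainDue(600L)); // the 500 ms residual is now due
  }
}
```

A real runner would presumably drive this from a timer rather than polling, but the ordering by resume time is the part the issue description asks for.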