mxm commented on a change in pull request #13105:
URL: https://github.com/apache/beam/pull/13105#discussion_r513004878
##########
File path:
runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkStreamingPortablePipelineTranslator.java
##########
@@ -683,8 +694,24 @@ private void translateStreamingImpulse(
inputPCollectionId,
valueCoder.getClass().getSimpleName()));
}
- keyCoder = ((KvCoder) valueCoder).getKeyCoder();
- keySelector = new KvToByteBufferKeySelector(keyCoder);
+ if (stateful) {
+ keyCoder = ((KvCoder) valueCoder).getKeyCoder();
+ keySelector = new KvToByteBufferKeySelector(keyCoder);
+ } else {
+ // For an SDF, we know that the input element should be
+ // KV<KV<element, KV<restriction, watermarkState>>, size>. We are going to use the element
+ // as the key.
+ if (!(((KvCoder) valueCoder).getKeyCoder() instanceof KvCoder)) {
+ throw new IllegalStateException(
+ String.format(
+ Locale.ENGLISH,
+ "The element coder for splittable DoFn '%s' must be KVCoder(KvCoder, DoubleCoder) but is: %s",
+ inputPCollectionId,
+ valueCoder.getClass().getSimpleName()));
+ }
+ keyCoder = ((KvCoder) ((KvCoder) valueCoder).getKeyCoder()).getKeyCoder();
+ keySelector = new SdfByteBufferKeySelector(keyCoder);
+ }
Review comment:
It's because we partition based on the element instead of on the key.
That wouldn't work with stateful operations, where we expect all elements
with the same key to land on the same partition.
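
To make the difference concrete, here is a minimal sketch, not the runner's actual code. `PartitioningSketch`, `partitionOf`, and the sample records are hypothetical, and a plain `hashCode` modulo stands in for Flink's real key-group assignment. It only illustrates that records sharing a key stay together when we partition on the key, but can scatter when we partition on the whole element:

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; none of these names exist in Beam or Flink.
public class PartitioningSketch {

  // Stand-in for "select a partitioning key, then hash it to a partition".
  static int partitionOf(Object partitionKey, int numPartitions) {
    return Math.floorMod(partitionKey.hashCode(), numPartitions);
  }

  // Four records that all share the user key "user" but carry different values.
  static List<Map.Entry<String, Integer>> sampleRecords() {
    List<Map.Entry<String, Integer>> records = new ArrayList<>();
    for (int value = 0; value < 4; value++) {
      records.add(new SimpleEntry<>("user", value));
    }
    return records;
  }

  // Stateful case: partition on the key only, so same-key records stay together.
  static Set<Integer> partitionsByKey(List<Map.Entry<String, Integer>> records, int n) {
    Set<Integer> partitions = new HashSet<>();
    for (Map.Entry<String, Integer> record : records) {
      partitions.add(partitionOf(record.getKey(), n));
    }
    return partitions;
  }

  // SDF-like case: partition on the whole element, so same-key records may spread out.
  static Set<Integer> partitionsByElement(List<Map.Entry<String, Integer>> records, int n) {
    Set<Integer> partitions = new HashSet<>();
    for (Map.Entry<String, Integer> record : records) {
      partitions.add(partitionOf(record, n));
    }
    return partitions;
  }
}
```

With the sample records and four partitions, `partitionsByKey` yields a single partition while `partitionsByElement` yields several. That spreading is fine for distributing SDF work, but it would break per-key state.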