mosche commented on code in PR #23540:
URL: https://github.com/apache/beam/pull/23540#discussion_r1142922485


##########
sdks/java/io/amazon-web-services2/src/main/java/org/apache/beam/sdk/io/aws2/kinesis/KinesisIO.java:
##########
@@ -147,6 +150,42 @@
  * Read#withCustomRateLimitPolicy(RateLimitPolicyFactory)}. This requires 
implementing {@link
  * RateLimitPolicy} with a corresponding {@link RateLimitPolicyFactory}.
  *
+ * <h4>Enhanced Fan-Out</h4>
+ *
+ * Kinesis IO supports Consumers with Dedicated Throughput (Enhanced Fan-Out, 
EFO). This type of
+ * consumer doesn't have to contend with other consumers that are receiving 
data from the stream.
+ *
+ * <p>More details can be found here: <a
+ * 
href="https://docs.aws.amazon.com/streams/latest/dev/enhanced-consumers.html";>Consumers
 with
+ * Dedicated Throughput</a>
+ *
+ * <p>EFO can be enabled for one or more {@link KinesisIO.Read} instances via 
pipeline options:

Review Comment:
   @psolomin Keep in mind, there's still other runners that don't share the 
behavior of Flink. With Dataflow I'm 99% sure you can change the configuration 
when updating / resuming a streaming pipeline 
(https://cloud.google.com/dataflow/docs/guides/updating-a-pipeline). Though, 
happy to confirm that with some Googlers if you prefer. Honestly, I'd be rather 
reluctant to add a non-functional configuration such as 
`withEnhancedFanOutEnabled()` that then fully relies on configuration via 
pipeline options.
   I feel to comply with Beam standards it should be possible to configure this 
feature alone using the read spec (even though that is very limited when used 
with Flink). 
   
   Regarding the inconsistency discussed above (enabled via read spec, but how 
to disable then), it wouldn't be hard to use the pipeline options mechanism as 
well using a null mapping:
   
   ```
   --kinesisIOConsumerArns={"stream1":null}
   ```
   
   Just to get a 3rd opinion on his, @aromanenko-dev any thoughts from your 
side? 
   
   One use case we discussed is to temporarily enabled EFO consumers to 
initially backfill a pipeline with max. throughput, but then disable EFO again 
once the pipeline caught up to save on costs.
   
   Context to this discussion is the question how to enable Enhanced Fanout 
Consumers for Kinesis to support the above use case as well. Problem is that 
the read specification is serialized together with the source in Flink 
savepoints. So when resuming the pipeline any changes to the read configuration 
won't be noticed - and EFO can't be disabled. The only way to do this (for 
Flink) is to configure EFO consumers via pipeline options. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to