dannycranmer opened a new pull request #14406:
URL: https://github.com/apache/flink/pull/14406


   ## What is the purpose of the change
   
   The Kinesis EFO connector invokes `DescribeStream` during startup to acquire 
the stream ARN. The response also includes shard information, and the API is 
limited to 10 TPS. A similar API, `DescribeStreamSummary`, is limited to 20 TPS 
and returns a lighter response payload.
   
   During startup, sources with high parallelism compete to call this service 
(in `LAZY` mode), resulting in backoff and retry. With a 10 TPS limit, startup 
time can grow by roughly 1s for every 10 units of parallelism. Migrating to 
`DescribeStreamSummary` will improve startup time.
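
The estimate above can be sketched as simple arithmetic. This is an idealised model, not code from the connector: it assumes each parallel subtask makes exactly one call at startup and ignores backoff delays and jitter, so it gives only a lower bound. The class and method names are hypothetical.

```java
// Hypothetical helper, not part of the Flink Kinesis connector.
class StartupEstimate {

    /**
     * Lower bound, in seconds, on the time needed for every subtask to make
     * one successful call when all subtasks share a single TPS limit.
     * Assumes one call per subtask and no time lost to backoff.
     */
    static double minSecondsForAllCalls(int parallelism, int tpsLimit) {
        if (parallelism < 0 || tpsLimit <= 0) {
            throw new IllegalArgumentException("parallelism must be >= 0 and tpsLimit > 0");
        }
        return (double) parallelism / tpsLimit;
    }

    public static void main(String[] args) {
        int describeStreamTps = 10;        // limit quoted above for DescribeStream
        int describeStreamSummaryTps = 20; // limit quoted above for DescribeStreamSummary

        for (int parallelism : new int[] {10, 100, 500}) {
            System.out.printf(
                "parallelism=%d DescribeStream>=%.1fs DescribeStreamSummary>=%.1fs%n",
                parallelism,
                minSecondsForAllCalls(parallelism, describeStreamTps),
                minSecondsForAllCalls(parallelism, describeStreamSummaryTps));
        }
    }
}
```

Under this model, doubling the TPS limit halves the lower bound on the time spent acquiring the stream ARN. Real startup times will be higher because of retries.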
   
   ## Brief change log
   
   - Migrate the `DescribeStream` call to `DescribeStreamSummary` 
   - Updated Kinesis Connector documentation to reflect this change
   
   ## Verifying this change
   
   - Migrated unit tests; all pass
   - Deployed an application to a local Flink cluster and verified that it works 
   
   ## Does this pull request potentially affect one of the following parts:
   
     - Dependencies (does it add or upgrade a dependency): no
     - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
     - The serializers: no
     - The runtime per-record code paths (performance sensitive): no
     - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: no
     - The S3 file system connector: no
   
   ## Documentation
   
     - Does this pull request introduce a new feature? no
     - If yes, how is the feature documented? I tweaked the documentation to 
reflect this change
   


