Github user fmthoma commented on a diff in the pull request:

    https://github.com/apache/flink/pull/6021#discussion_r190154347
  
    --- Diff: flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/FlinkKinesisProducer.java ---
    @@ -326,6 +342,24 @@ private void checkAndPropagateAsyncError() throws Exception {
                }
        }
     
    +   /**
    +    * If the internal queue of the {@link KinesisProducer} gets too long,
    +    * flush some of the records until we are below the limit again.
    +    * We don't want to flush _all_ records at this point since that would
    +    * break record aggregation.
    +    */
    +   private void checkQueueLimit() {
    +           while (producer.getOutstandingRecordsCount() >= queueLimit) {
    +                   producer.flush();
    --- End diff --
    
    @tzulitai @bowenli86 I've given this some more thought. `wait()`/`notify()` 
requires a `synchronized` block, so if we simply notified some lock in the 
callback, every callback would pay the synchronization overhead. To avoid that, 
we would have to detect the transition from »queue size > queue limit« to 
»queue size <= queue limit« and synchronize only then, which adds a lot of 
complexity.
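
    To make the trade-off concrete, here is a minimal sketch (not the PR's 
code) of what a `wait()`/`notify()` variant could look like; `onRecordComplete` 
is a hypothetical stand-in for the producer's record-completion callback:

    ```java
    // Hypothetical sketch only, not part of this PR.
    private final Object queueNotFull = new Object();

    private void checkQueueLimit() throws InterruptedException {
    	synchronized (queueNotFull) {
    		while (producer.getOutstandingRecordsCount() >= queueLimit) {
    			producer.flush();
    			// Wait for the callback to signal that the queue has drained;
    			// wait() releases the lock while blocked.
    			queueNotFull.wait();
    		}
    	}
    }

    // Hypothetical hook, standing in for the record-completion callback.
    private void onRecordComplete() {
    	// Without detecting the transition from "above limit" to "at or below
    	// limit", every single callback enters this synchronized block.
    	synchronized (queueNotFull) {
    		if (producer.getOutstandingRecordsCount() < queueLimit) {
    			queueNotFull.notifyAll();
    		}
    	}
    }
    ```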
    
    On the other hand: Kinesis accepts up to 1 MB per second per shard. The 
queue limit should be chosen so that some data can still be accumulated before 
sending, i.e. more than one second's worth of data (more than 1 MB per shard). 
If the queue limit is chosen adequately, the `Thread.sleep(500)` does no harm, 
because the queued records take more than one second to flush anyway. If the 
queue limit is chosen too low, sleeping half a second may be too long, but we 
would not reach maximum throughput anyway because of the limit on the number 
of `Put` requests.
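
    For comparison, the polling variant boils down to something like the 
sketch below (reconstructed from the diff above plus the `Thread.sleep(500)` 
mentioned here, so it may not match the PR exactly). As an illustration: a 
queue limit worth 2 MB per shard takes at least two seconds to drain at 
1 MB/s, so an extra half-second of sleep is completely hidden.

    ```java
    // Sketch of the polling approach; details may differ from the actual PR.
    private void checkQueueLimit() {
    	while (producer.getOutstandingRecordsCount() >= queueLimit) {
    		producer.flush();
    		try {
    			// Back off instead of busy-waiting; with an adequately sized
    			// queue limit, the records take longer than this to drain anyway.
    			Thread.sleep(500);
    		} catch (InterruptedException e) {
    			Thread.currentThread().interrupt();
    			break;
    		}
    	}
    }
    ```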
    
    I think it's not worth the additional complexity.

