GitHub user PramodSSImmaneni commented on a diff in the pull request:

    
    https://github.com/apache/incubator-apex-malhar/pull/132#discussion_r48581219

    --- Diff: contrib/src/main/java/com/datatorrent/contrib/kafka/AbstractKafkaInputOperator.java ---
    @@ -392,9 +412,20 @@ public void emitTuples()
         if (maxTuplesPerWindow > 0) {
           count = Math.min(count, maxTuplesPerWindow - emitCount);
         }
    -    for (int i = 0; i < count; i++) {
    +    if (count > 0) {
    +      // If the total size transmitted in the window will be exceeded don't transmit anymore messages in this window
    +      // Make an exception for the case when no message has been transmitted in the window and transmit at least one
    +      // message even if the message size is greater than max message size so that the processing doesn't get stuck
    +      if ((emitCount > 0) && ((maxTotalMsgSizePerWindow - emitTotalMsgSize) < consumer.peekMessage().msg.size())) {
    +        return;
    +      }
    +    }
    +    int numEmitted = 0;
    +   for (int i = 0; i < count; i++) {
           KafkaConsumer.KafkaMessage message = consumer.pollMessage();
           emitTuple(message.msg);
    +      ++numEmitted;
    --- End diff --
    
    Oh, OK. On second thought, I think it is better to poll the message and
    keep it locally. That avoids two locked reads from the queue (a peek
    followed by a poll) and the associated performance impact. Anyway, in the
    other case there is a scenario where the message is already out of the
    queue in emitTuples, new messages are added to the holdingBuffer until it
    is full, and one message more than the limit ends up outstanding. I have
    made those changes.
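
    To illustrate the poll-and-hold idea described above, here is a minimal
    sketch. It is not the code from the pull request: the class
    WindowLimitedEmitter, the field pendingMessage, the beginWindow method and
    the plain java.util.Queue standing in for the consumer's holding buffer are
    all hypothetical, and the size check is applied to every message, which is
    an assumption. Only emitCount, emitTotalMsgSize and
    maxTotalMsgSizePerWindow are names taken from the diff.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Hypothetical stand-in for the operator; not the actual Malhar class. */
class WindowLimitedEmitter {
  private final Queue<byte[]> queue;            // stands in for the consumer's holding buffer
  private final long maxTotalMsgSizePerWindow;  // size budget per window, in bytes
  private long emitTotalMsgSize = 0;            // bytes emitted in the current window
  private int emitCount = 0;                    // messages emitted in the current window
  private byte[] pendingMessage = null;         // polled from the queue but not yet emitted
  final List<byte[]> emitted = new ArrayList<>();

  WindowLimitedEmitter(Queue<byte[]> queue, long maxTotalMsgSizePerWindow) {
    this.queue = queue;
    this.maxTotalMsgSizePerWindow = maxTotalMsgSizePerWindow;
  }

  /** Resets the per-window counters; a held message survives the reset. */
  void beginWindow() {
    emitCount = 0;
    emitTotalMsgSize = 0;
  }

  /** Emits up to count messages, reading each one from the queue only once. */
  void emitTuples(int count) {
    for (int i = 0; i < count; i++) {
      // Prefer the message held over from an earlier call; otherwise poll once.
      byte[] msg = (pendingMessage != null) ? pendingMessage : queue.poll();
      pendingMessage = null;
      if (msg == null) {
        return; // nothing available
      }
      // Stop when the remaining budget is too small, unless nothing has been
      // emitted in this window yet (so an oversized message cannot block progress).
      if (emitCount > 0 && (maxTotalMsgSizePerWindow - emitTotalMsgSize) < msg.length) {
        pendingMessage = msg; // keep it locally instead of peeking and polling again
        return;
      }
      emitted.add(msg);
      emitCount++;
      emitTotalMsgSize += msg.length;
    }
  }
}
```

    Note that while a message sits in pendingMessage it no longer occupies a
    slot in the queue, so the buffer can refill completely. That is the "one
    extra message outstanding" case mentioned in the comment.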


