[ 
https://issues.apache.org/jira/browse/NIFI-2456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Percivall updated NIFI-2456:
-----------------------------------
    Attachment: Screen Shot 2016-08-01 at 8.02.10 PM.png

Attached is an example flow where "GenerateFlowFile" runs on a "1 sec" run 
schedule, generating 10-byte flowfiles. PutKafka is configured with a 
"0 secs" run schedule and "5 secs" for the "Queue Buffering Max Time". 

You can see that the PutKafka processor processes only 60 flowfiles per 5 
minutes even though it is running the whole time (5 minutes' worth of run 
time). That works out to one flowfile per 5-second buffering period.
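
As a rough sanity check of those numbers, here is a minimal sketch of the 
arithmetic. The class and method names are made up for illustration, and it 
assumes GenerateFlowFile emits one flowfile per run (a batch size of 1), which 
is not stated above.

```java
// Hypothetical helper, not part of NiFi: back-of-the-envelope throughput check.
final class LingerThroughput {

    // Observed behavior: the processor blocks for the full buffering time on
    // every flowfile, so it sends one flowfile per buffering period.
    static long observedFlowFiles(long windowSecs, long bufferingSecs) {
        return windowSecs / bufferingSecs;
    }

    // Expected behavior: everything generated during the window is sent,
    // just grouped into batches. Assumes one flowfile per generator run.
    static long expectedFlowFiles(long windowSecs, long generateEverySecs) {
        return windowSecs / generateEverySecs;
    }

    public static void main(String[] args) {
        long windowSecs = 5 * 60;   // the 5-minute stats window
        long bufferingSecs = 5;     // "Queue Buffering Max Time" = 5 secs
        long generateEverySecs = 1; // GenerateFlowFile run schedule = 1 sec

        System.out.println("observed: "
                + observedFlowFiles(windowSecs, bufferingSecs));     // 60
        System.out.println("expected: "
                + expectedFlowFiles(windowSecs, generateEverySecs)); // 300
    }
}
```

The observed figure of 60 matches the screenshot. Under the stated assumption, 
roughly 300 flowfiles would be processed if the buffering time only delayed 
and grouped sends rather than blocking on each one.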

> In Publish/PutKafka, "linger.ms" and "Queue Buffering Max Time", 
> respectively, don't perform as expected
> --------------------------------------------------------------------------------------------------------
>
>                 Key: NIFI-2456
>                 URL: https://issues.apache.org/jira/browse/NIFI-2456
>             Project: Apache NiFi
>          Issue Type: Bug
>            Reporter: Joseph Percivall
>         Attachments: Screen Shot 2016-08-01 at 8.02.10 PM.png
>
>
> Focusing on PutKafka first, there exists a processor property called “Queue 
> Buffering Max Time” that has the description:
>       "Maximum time to buffer data before sending to Kafka. For example a 
> setting of 100 ms will try to batch together 100 milliseconds' worth of 
> messages to send at once. This will improve throughput but adds message 
> delivery latency due to the buffering." 
> It uses the “linger.ms” Kafka property[1] under the hood to configure this. 
> The documentation for that property can be found here[2] (search for 
> “linger.ms”).
> As a user, I would expect this property to mean that when I'm sending lots 
> of data to the processor within the buffering period, it will batch the 
> data together into one message to send to the server. Instead, it currently 
> blocks for that period and sends only one message per “Queue Buffering Max 
> Time”. So if I have the buffering time set to "5 secs" and have data 
> queueing up in the connection before it, it will only ever process one 
> FlowFile per 5 seconds. 
> This behavior is the same when "PublishKafka" has a dynamic property with the 
> name "linger.ms" set. 
> [1] 
> https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-kafka-bundle/nifi-kafka-processors/src/main/java/org/apache/nifi/processors/kafka/PutKafka.java#L493-L493
> [2] http://kafka.apache.org/082/documentation.html



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
