[
https://issues.apache.org/jira/browse/NIFI-1579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15174450#comment-15174450
]
Bryan Bende commented on NIFI-1579:
-----------------------------------
I've been investigating this by running a series of tests on my laptop, and
uncovered a few things...
* There is a yield in the onTrigger method when polling the queue with a 100ms
wait returns nothing. This could hurt performance: if the processor yields for
1 second, it can miss messages arriving at tens of thousands per second. Since
we already have the 100ms poll, we don't really need the yield.
* The internal queue was hard-coded to a max capacity of 10, which seems a bit
too small to handle possible surges. It would be much better to let the user
decide here how much data to buffer in memory.
* Running tests on my laptop where I sent millions of messages over a few
minutes, I would eventually see a checkpoint from the FlowFile repository with
a stop-the-world pause of upwards of 10-11 seconds. During this time, messages
were still being read in from the channel and queued, which could easily fill
the queue, start blocking, eventually back up into the OS buffer, and
potentially drop messages. It is not clear whether this would happen on a
high-performance server, but after discussing with [~markap14] we determined
that significantly reducing nifi.flowfile.repository.partitions in
nifi.properties from 256 (I used 8 in this case) reduces the number of
FileOutputStreams that need to be flushed and thus the overall wait.
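To illustrate the first point, here is a minimal sketch (not NiFi's actual ListenSyslog code; class and variable names are mine) of why the yield is redundant: a timed poll on a BlockingQueue already blocks briefly when the queue is empty, so it provides the back-off on its own.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class PollSketch {
    public static void main(String[] args) throws InterruptedException {
        // User-configurable capacity instead of a hard-coded 10,
        // so the queue can absorb surges of incoming messages.
        BlockingQueue<String> queue = new LinkedBlockingQueue<>(10_000);
        queue.offer("msg");

        // The 100ms timed poll blocks only while the queue is empty and
        // returns null on timeout; adding a yield on top of this just
        // delays the next poll and risks missing message bursts.
        String event = queue.poll(100, TimeUnit.MILLISECONDS);
        System.out.println(event == null ? "queue empty" : event); // prints "msg"
    }
}
```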
Using the previous 0.5.1 release, I was barely able to achieve 5k messages per
second without any data loss.
I then applied a patch that addresses the first two items above, and tested
with the following configuration which seems to be a sweet spot on my laptop:
* JDK 1.8
* 2GB Heap
* G1GC
* Reduced nifi.flowfile.repository.partitions to 8
* Increased nifi.provenance.repository.rollover.time to 60 seconds
* Set root logger to WARN
* 2MB Socket Buffer
* 10k Internal Queue size (default value from new patch)
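For reference, the nifi.properties changes above amount to something like the following (the property names are NiFi's; the exact value formatting is my assumption):

```properties
# Fewer partitions means fewer FileOutputStreams to flush
# during a FlowFile repository checkpoint
nifi.flowfile.repository.partitions=8

# Roll over provenance events less frequently
nifi.provenance.repository.rollover.time=60 secs
```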
Test 1
1 concurrent task, parsing on, batch size of 1: Up to 11k messages/sec with no
loss
4 concurrent tasks, parsing on, batch size of 1: Up to 15k messages/sec with no
loss
1 concurrent task, parsing off, batch size of 1000: Up to 53k messages/sec
with no loss
I will momentarily post the patch described above.
> Improve ListenSyslog Performance
> --------------------------------
>
> Key: NIFI-1579
> URL: https://issues.apache.org/jira/browse/NIFI-1579
> Project: Apache NiFi
> Issue Type: Improvement
> Reporter: Bryan Bende
> Assignee: Bryan Bende
> Fix For: 0.6.0
>
>
> While testing ListenSyslog at various data rates it was observed that a
> significant amount of packets were dropped when using UDP.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)