Bryan,

Understood, but wouldn't this processor then be inefficient when you are dealing with a very large number of syslog messages and can't use the batching option? I suppose we could have had the option of parsing each syslog record in a batch and then writing the syslog message, along with the syslog headers, to the flowfile content.

thanks
Clay

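To make the batch-parsing idea concrete, here is a rough Python sketch. It is only an illustration of the suggestion above, not how ListenSyslog behaves. The pattern is a simplified RFC 3164-style match, and the names `parse_batch` and `to_json_lines` are hypothetical.

```python
import json
import re

# Simplified RFC 3164-style pattern: <PRI>TIMESTAMP HOST BODY
# (an assumption for illustration; real syslog traffic is messier).
SYSLOG_RE = re.compile(
    r"^<(?P<pri>\d{1,3})>"
    r"(?P<timestamp>[A-Z][a-z]{2}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<body>.*)$"
)


def parse_batch(batch):
    """Parse a newline-delimited batch; return one dict per message."""
    records = []
    for line in batch.splitlines():
        if not line:
            continue
        match = SYSLOG_RE.match(line)
        if match is None:
            # Keep unparseable messages instead of dropping them.
            records.append({"valid": False, "raw": line})
            continue
        pri = int(match.group("pri"))
        records.append({
            "valid": True,
            "priority": pri,
            "facility": pri // 8,   # PRI = facility * 8 + severity
            "severity": pri % 8,
            "timestamp": match.group("timestamp"),
            "hostname": match.group("host"),
            "body": match.group("body"),
            "raw": line,
        })
    return records


def to_json_lines(records):
    """Serialize the records as one JSON object per line (the new content)."""
    return "\n".join(json.dumps(record) for record in records)
```

The header fields travel with each message inside the content, so no per-message attributes are needed and a batch of 1k messages stays in one flowfile.
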
On Mon, Aug 5, 2019 at 12:12 PM Bryan Bende <[email protected]> wrote:
> Clay,
>
> You can only parse when it's 1 message per flow file because parsing adds all the field/value pairs as flow file attributes, which wouldn't really make sense when you have, say, 1k messages with all different values for those fields.
>
> -Bryan
>
> On Mon, Aug 5, 2019 at 11:25 AM Clay Teahouse <[email protected]> wrote:
> >
> > Hi Edward, Bryan
> > One more question regarding ListenSyslog. Is it possible to set batch size > 1 with parse set to true? I am ingesting a very high volume of syslog records and want to avoid flowfiles containing only one record, but at the same time I want to be able to parse the records. Is there a way around this?
> >
> > thanks
> > Clay
> >
> > On Fri, Aug 2, 2019 at 8:50 AM Edward Armes <[email protected]> wrote:
> >>
> >> Hi Clay,
> >>
> >> So, as Bryan has said, the actual connections are managed by a selector. All it does is go through each connection; once a connection has data to receive, the selector hands it over to a thread in the TCP receiving thread pool, which then does some basic TCP processing and puts the data into a buffer for an instance of the associated ListenSyslog processor to process when the framework executes an instance of that processor.
> >>
> >> Just so you're aware, while setting the maximum number of connections does create a thread pool of 4,000 threads, in reality these threads don't exist until one is created by the selector to run on the pool. So, in short, unless a single NiFi server gets 4,000 syslog messages in a very short space of time (< 1 microsecond), I can't see it being an issue.
> >>
> >> Edward
> >>
> >> On Fri, Aug 2, 2019 at 2:06 PM Bryan Bende <[email protected]> wrote:
> >>>
> >>> The actual connections themselves are managed with a selector, so if all the connections are idle there should only be one thread for the socket.
> >>>
> >>> As soon as a connection has something available to read, a thread is spawned to start reading the connection until either no more data is available or it is closed.
> >>>
> >>> On Fri, Aug 2, 2019 at 7:18 AM Clay Teahouse <[email protected]> wrote:
> >>> >
> >>> > Hello Edward,
> >>> > So, if I have to listen to 32,000 TCP connections and I have only 80 cores, and I configure each ListenSyslog instance for 4,000 connections, doesn't each spawn 4,000 threads behind the scenes? The TCP connections will be idle most of the time.
> >>> >
> >>> > thanks
> >>> > Clay
> >>> >
> >>> >
> >>> > On Fri, Aug 2, 2019 at 6:10 AM Edward Armes <[email protected]> wrote:
> >>> >>
> >>> >> Hi Clay,
> >>> >>
> >>> >> Because NiFi uses a thread pool for its own threading underneath, and each processor instance runs in its own thread, I don't see any reason why not. One thing to note is that the ListenTCP processor appears to have been written such that it gets all the requests that have been received on that socket and processes them until either it has no more requests left to process or that instance of the processor is no longer scheduled to run.
> >>> >>
> >>> >> Hope that helps
> >>> >>
> >>> >> Edward
> >>> >>
> >>> >> On Fri, Aug 2, 2019 at 11:28 AM Clay Teahouse <[email protected]> wrote:
> >>> >>>
> >>> >>> Hello All,
> >>> >>>
> >>> >>> I need to listen to and process thousands of persistent TCP connections. I have 10 nodes, each having 8 cores.
> >>> >>> My understanding is that with existing NiFi listening processors, such as ListenSyslog, a thread is utilized for each TCP connection. Does this scale? Do I need to write a custom processor that utilizes a thread pool for reading the data from the socket and processing it?
> >>> >>>
> >>> >>> thanks
> >>> >>> Clay
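
The selector-plus-thread-pool model described in the thread can be sketched roughly as follows. This is a simplified Python illustration using only the standard library, not NiFi's actual Java implementation; the names `drain` and `serve` are hypothetical, and the loop waits for each round of reads to finish before selecting again to keep the selector single-threaded.

```python
import queue
import selectors
import socket
from concurrent.futures import ThreadPoolExecutor


def drain(conn, buffer):
    """Read until no more data is available or the peer closes.

    Returns True if the connection is still open, False if it was closed.
    """
    while True:
        try:
            data = conn.recv(4096)
        except BlockingIOError:
            return True        # nothing left to read right now
        if not data:
            return False       # peer closed the connection
        buffer.put(data)       # hand the bytes to whoever processes them


def serve(conns, buffer, max_workers=4):
    """Watch all connections with one selector; use pool threads only to read."""
    sel = selectors.DefaultSelector()
    for conn in conns:
        conn.setblocking(False)
        sel.register(conn, selectors.EVENT_READ)
    open_count = len(conns)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while open_count:
            # Idle connections cost nothing here: only this thread waits on them.
            ready = [key.fileobj for key, _ in sel.select(timeout=1.0)]
            jobs = [(conn, pool.submit(drain, conn, buffer)) for conn in ready]
            for conn, job in jobs:
                if not job.result():
                    sel.unregister(conn)
                    conn.close()
                    open_count -= 1
    sel.close()
```

With this shape, thousands of idle connections are watched by a single thread, and the number of busy worker threads tracks the number of connections that have data at that moment rather than the number of open connections.
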
