Clay,

You can only parse when there is one message per flow file, because parsing adds all the field/value pairs as flow file attributes. That wouldn't really make sense when you have, say, 1k messages with different values for those fields.
-Bryan

On Mon, Aug 5, 2019 at 11:25 AM Clay Teahouse <[email protected]> wrote:
>
> Hi Edward, Bryan
> One more question regarding ListenSyslog. Is it possible to set batch size > 1
> with parse set to true? I am ingesting a very high volume of syslog records
> and want to avoid flowfiles containing only one record, but at the same time
> I want to be able to parse the records. Is there a way around this?
>
> thanks
> Clay
>
> On Fri, Aug 2, 2019 at 8:50 AM Edward Armes <[email protected]> wrote:
>>
>> Hi Clay,
>>
>> So, as Bryan has said, the actual connections are managed by a selector.
>> All it does is go through each connection, and once a connection has data
>> to receive, the selector hands it over to a thread in the TCP receiving
>> thread pool, which does some basic TCP processing and puts the data into a
>> buffer for an instance of the associated ListenSyslog processor to process
>> when the framework executes an instance of that processor.
>>
>> Just so you're aware, while setting the maximum number of connections does
>> create a thread pool of 4,000 threads, in reality these threads don't
>> really exist until one is created by the selector to run on the pool. So,
>> in short, unless a single NiFi server gets 4,000 syslog messages in a very
>> short space of time (< 1 microsecond), I can't see it being an issue.
>>
>> Edward
>>
>> On Fri, Aug 2, 2019 at 2:06 PM Bryan Bende <[email protected]> wrote:
>>>
>>> The actual connections themselves are managed with a selector, so if
>>> all the connections are idle there should only be one thread for the
>>> socket.
>>>
>>> As soon as a connection has something available to read, a thread
>>> is spawned to start reading the connection until either no more data is
>>> available, or it is closed.
>>>
>>> On Fri, Aug 2, 2019 at 7:18 AM Clay Teahouse <[email protected]> wrote:
>>> >
>>> > Hello Edward,
>>> > So, if I have to listen to 32,000 TCP connections and I have only 80
>>> > cores, and I configure each ListenSyslog instance for 4,000 connections,
>>> > doesn't each spawn 4,000 threads behind the scenes? The TCP connections
>>> > will be idle most of the time.
>>> >
>>> > thanks
>>> > Clay
>>> >
>>> >
>>> > On Fri, Aug 2, 2019 at 6:10 AM Edward Armes <[email protected]>
>>> > wrote:
>>> >>
>>> >> Hi Clay,
>>> >>
>>> >> Because NiFi uses a thread pool for its own threading underneath, and
>>> >> each processor instance runs in its own thread, I don't see any reason
>>> >> why not. One thing to note is that the ListenTCP processor appears to
>>> >> have been written such that it gets all the requests that have been
>>> >> received on that socket and processes them until either it has no more
>>> >> requests left to process or that instance of the processor is no longer
>>> >> scheduled to run.
>>> >>
>>> >> Hope that helps
>>> >>
>>> >> Edward
>>> >>
>>> >> On Fri, Aug 2, 2019 at 11:28 AM Clay Teahouse <[email protected]>
>>> >> wrote:
>>> >>>
>>> >>> Hello All,
>>> >>>
>>> >>> I need to listen to and process thousands of persistent TCP
>>> >>> connections. I have 10 nodes, each having 8 cores.
>>> >>> My understanding is that with existing NiFi listening processors, such
>>> >>> as ListenSyslog, a thread is utilized for each TCP connection. Does this
>>> >>> scale? Do I need to write a custom processor that utilizes a thread
>>> >>> pool for reading the data from the socket and processing them?
>>> >>>
>>> >>> thanks
>>> >>> Clay
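
For anyone reading this thread later, the selector arrangement Bryan and Edward describe can be sketched in plain Java NIO roughly as follows. This is only an illustration under stated assumptions, not NiFi's actual ListenSyslog code: the class name SelectorSketch and its methods (port, readerThreads, poll) are hypothetical, each drained read is treated as one chunk with no syslog framing, and accepted client channels are not cleaned up on close. The point it tries to show is that one selector thread watches every connection and that pool threads only come into existence when a connection actually has data.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; this is not NiFi's ListenSyslog implementation.
final class SelectorSketch implements AutoCloseable {
    private final Selector selector;
    private final ServerSocketChannel server;
    private final ThreadPoolExecutor readers;
    private final BlockingQueue<String> received = new LinkedBlockingQueue<>();

    SelectorSketch(int maxReaderThreads) throws IOException {
        // Pool threads are created lazily, only when a read task is submitted.
        readers = new ThreadPoolExecutor(maxReaderThreads, maxReaderThreads,
                0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        selector = Selector.open();
        server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress("127.0.0.1", 0)); // ephemeral port
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);
        Thread loop = new Thread(this::selectLoop, "selector-loop");
        loop.setDaemon(true);
        loop.start();
    }

    int port() throws IOException {
        return ((InetSocketAddress) server.getLocalAddress()).getPort();
    }

    int readerThreads() { return readers.getPoolSize(); }

    String poll(long timeoutMillis) throws InterruptedException {
        return received.poll(timeoutMillis, TimeUnit.MILLISECONDS);
    }

    // The single selector thread: accepts connections and dispatches readable ones.
    private void selectLoop() {
        try {
            while (selector.isOpen()) {
                selector.select();
                Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                while (it.hasNext()) {
                    SelectionKey key = it.next();
                    it.remove();
                    if (!key.isValid()) continue;
                    if (key.isAcceptable()) {
                        SocketChannel client = server.accept();
                        if (client != null) {
                            client.configureBlocking(false);
                            client.register(selector, SelectionKey.OP_READ);
                        }
                    } else if (key.isReadable()) {
                        key.interestOps(0); // pause selection while a worker reads
                        readers.execute(() -> drain(key));
                    }
                }
            }
        } catch (IOException | RuntimeException e) {
            // Expected when close() shuts the selector or the pool down.
        }
    }

    // Runs on a pool thread: read until no more data is available or the peer closes.
    private void drain(SelectionKey key) {
        SocketChannel ch = (SocketChannel) key.channel();
        ByteBuffer buf = ByteBuffer.allocate(4096);
        StringBuilder sb = new StringBuilder();
        try {
            int n;
            while ((n = ch.read(buf)) > 0) {
                buf.flip();
                sb.append(StandardCharsets.UTF_8.decode(buf));
                buf.clear();
            }
            if (sb.length() > 0) received.add(sb.toString());
            if (n < 0) {
                ch.close(); // peer closed the connection
            } else {
                key.interestOps(SelectionKey.OP_READ); // hand back to the selector
                selector.wakeup();
            }
        } catch (IOException | RuntimeException e) {
            try { ch.close(); } catch (IOException ignored) { }
        }
    }

    @Override
    public void close() throws IOException {
        readers.shutdownNow();
        server.close();
        selector.close(); // also wakes the selector thread so the loop exits
    }
}
```

With something like `new SelectorSketch(4)`, any number of idle clients can connect to `port()` while `readerThreads()` stays at 0; the count only rises once a client writes data, and it never exceeds the pool size passed to the constructor.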
