Received a direct reply - Elli cannot share the lsof output. I think unless someone else is able to replicate the behavior, there isn't much more we can tackle on this.

Thanks
On Thu, Feb 6, 2020 at 4:10 PM Joe Witt <[email protected]> wrote:

> Yes Elli, it is possible. Can we please get those lsof outputs in a JIRA,
> as well as more details about the configuration?
>
> Thanks
>
> On Thu, Feb 6, 2020 at 2:44 PM Andy LoPresto <[email protected]> wrote:
>
>> I have no input on the specific issue you're encountering, but a pattern
>> we have seen to reduce the overhead of multiple remote input ports is to
>> use a "central" remote input port and immediately follow it with a
>> RouteOnAttribute to distribute specific flowfiles to the appropriate
>> downstream flow / process group. Whatever sends data to this port can
>> use an UpdateAttribute to add a "tracking/routing" attribute to the
>> flowfiles before they are sent. Inserting Merge/Split will likely affect
>> your timing due to waiting for bins to fill, depending on your volume.
>> S2S is pretty good at transmitting data on demand with low overhead on
>> one port; it's when you have many remote input ports that there is
>> substantial overhead.
>>
>> Andy LoPresto
>> [email protected]
>> [email protected]
>> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4 BACE 3C6E F65B 2F7D EF69
>>
>> > On Feb 6, 2020, at 2:34 PM, Elli Schwarz <[email protected]> wrote:
>> >
>> > We ran that command - it appears the site-to-site connections are
>> > causing the issue. We had a lot of remote process groups that weren't
>> > even being used (no data was being sent to that part of the dataflow),
>> > yet when running the lsof command they each had a large number of open
>> > files - almost 2k! - showing CLOSE_WAIT. Again, there were no
>> > flowfiles being sent to them, so could it be some kind of bug where
>> > keeping a remote process group enabled somehow opens files without
>> > closing them? (BTW, the reason we had to upgrade from 1.9.2 to 1.11.0
>> > was that we had upgraded our Java version and that caused an
>> > IllegalBlockingModeException - is it possible that whatever fixed that
>> > problem is now causing an issue with open files?)
>> >
>> > We have now disabled all of the unused remote process groups. We still
>> > have several remote process groups that we are using, so if this is
>> > the issue it might be difficult to avoid, but at least we decreased
>> > the number of remote process groups we have. Another approach we are
>> > trying is a MergeContent before we send to the NiFi having the most
>> > issues, to have fewer flowfiles sent at once over site-to-site, and
>> > then splitting them after they are received.
>> >
>> > Thank you!
>> >
>> > On Thursday, February 6, 2020, 2:19:48 PM EST, Mike Thomsen
>> > <[email protected]> wrote:
>> >
>> > Can you share a description of your flows in terms of average flowfile
>> > size, queue size, data velocity, etc.?
>> >
>> > Thanks,
>> > Mike
>> >
>> > On Thu, Feb 6, 2020 at 1:59 PM Elli Schwarz <[email protected]> wrote:
>> >
>> > We seem to be experiencing the same problems. We recently upgraded
>> > several of our NiFis from 1.9.2 to 1.11.0, and now many of them are
>> > failing with "too many open files". Nothing else changed other than
>> > the upgrade, and our data volume is the same as before. The only
>> > solution we've been able to come up with is to run a script to check
>> > for this condition and restart the NiFi. Any other ideas?
>> >
>> > Thank you!
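(A minimal sketch of the kind of restart-on-condition script Elli
describes might look like the following; the paths, log file, and
threshold are assumptions for illustration, not details from the thread:)

    #!/bin/sh
    # Hypothetical watchdog: restart NiFi when its open-file count gets
    # close to the per-process limit. Paths and threshold are assumptions.
    PID_FILE=/opt/nifi/run/nifi.pid    # adjust to your install
    THRESHOLD=45000                    # tune to your ulimit -n

    PID=$(cat "$PID_FILE")
    # Count the file descriptors the NiFi JVM currently holds.
    OPEN=$(ls "/proc/$PID/fd" | wc -l)

    if [ "$OPEN" -gt "$THRESHOLD" ]; then
        echo "$(date): NiFi holds $OPEN open files, restarting" >> /var/log/nifi-watchdog.log
        /opt/nifi/bin/nifi.sh restart
    fi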
>> > On Sunday, February 2, 2020, 9:11:34 AM EST, Mike Thomsen
>> > <[email protected]> wrote:
>> >
>> > Without further details, this is what I did to see if it was something
>> > other than the usual issue of having not enough file handles available
>> > - something like a legitimate case of someone forgetting to close file
>> > objects in the code itself.
>> >
>> > 1. Set up an 8-core/32GB VM on AWS with an Amazon AMI.
>> > 2. Pushed 1.11.1RC1.
>> > 3. Pushed the RAM settings to 6/12GB.
>> > 4. Disabled flowfile archiving because I only allocated 8GB of storage.
>> > 5. Set up a flow that used 2 GenerateFlowFile instances to generate
>> >    massive amounts of garbage data using all available cores. (All
>> >    queues were set up to hold 250k flowfiles.)
>> > 6. Kicked it off and let it run for probably about 20 minutes.
>> >
>> > No apparent problem with closing and releasing resources here.
>> >
>> > On Sat, Feb 1, 2020 at 8:00 AM Joe Witt <[email protected]> wrote:
>> >
>> >> These are usually very easy to find.
>> >>
>> >> Run lsof -p <pid> and share the results.
>> >>
>> >> Thanks
>> >>
>> >> On Sat, Feb 1, 2020 at 7:56 AM Mike Thomsen <[email protected]>
>> >> wrote:
>> >>
>> >>> https://stackoverflow.com/questions/59991035/nifi-1-11-opening-more-than-50k-files/60017064#60017064
>> >>>
>> >>> No idea if this is valid or not. I asked for clarification to see if
>> >>> there might be a specific processor or something that is triggering
>> >>> this.
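(Spelling out Joe's lsof suggestion - a rough sketch, assuming NiFi's pid
file lives in a default run directory; adjust the path to your install:)

    # Find the NiFi JVM's pid (path is an assumption).
    PID=$(cat /opt/nifi/run/nifi.pid)

    # Full listing of everything the process has open - the output to share.
    lsof -p "$PID" > /tmp/nifi-lsof.txt

    # Quick summary: total entries, plus how many are sockets stuck in
    # CLOSE_WAIT, which is what Elli reported on the unused remote
    # process groups.
    wc -l < /tmp/nifi-lsof.txt
    grep -c CLOSE_WAIT /tmp/nifi-lsof.txt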

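(For completeness: the "usual issue" Mike mentions - a file-handle limit
that is simply too low - is worth ruling out before suspecting a leak.
The values below are illustrative only, not recommendations from this
thread:)

    # What open-files limit did the running NiFi process actually get?
    grep "open files" "/proc/$(cat /opt/nifi/run/nifi.pid)/limits"

    # Raise it persistently for the service account in
    # /etc/security/limits.conf (illustrative values):
    #   nifi  soft  nofile  50000
    #   nifi  hard  nofile  50000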