My setup was very similar, but didn't have the site to site reporting.

On Thu, Feb 6, 2020 at 5:13 PM Joe Witt <[email protected]> wrote:

> yeah will investigate
>
> thanks
>
> On Thu, Feb 6, 2020 at 4:49 PM Ryan Hendrickson <[email protected]> wrote:
>
> > Joe,
> >    We're running:
> >
> >    - OpenJDK Java 1.8.0_242
> >    - NiFi 1.11.0
> >    - CentOS Linux 7.7.1908
> >
> >
> >    We're seeing this across a dozen NiFi instances with the same setup.  To
> > reproduce the issue: GenerateFlowFile (100 GB across a couple million
> > files) -> Site to Site -> receive data -> MergeContent.  We had no issues
> > with this stack:
> >
> >    - OpenJDK Java 1.8.0_232.
> >    - NiFi 1.9.2
> >    - CentOS Linux 7.7.1908
> >
> >    Can your team set up a similar stack and test?
> >
> > Ryan
> >
> > On Thu, Feb 6, 2020 at 4:15 PM Joe Witt <[email protected]> wrote:
> >
> > > received a direct reply - Elli cannot share.
> > >
> > > I think unless someone else is able to replicate the behavior there isn't
> > > much more we can tackle on this.
> > >
> > > Thanks
> > >
> > > On Thu, Feb 6, 2020 at 4:10 PM Joe Witt <[email protected]> wrote:
> > >
> > > > Yes Elli it is possible.  Can we please get those lsof outputs in a JIRA?
> > > > As well as more details about configuration?
> > > >
> > > > Thanks
> > > >
> > > > On Thu, Feb 6, 2020 at 2:44 PM Andy LoPresto <[email protected]> wrote:
> > > >
> > > > >> I have no input on the specific issue you’re encountering, but a pattern
> > > > >> we have seen to reduce the overhead of multiple remote input ports being
> > > > >> required is to use a “central” remote input port and immediately follow it
> > > > >> with a RouteOnAttribute to distribute specific flowfiles to the appropriate
> > > > >> downstream flow / process group. Whatever sends data to this port can use
> > > > >> an UpdateAttribute to add some “tracking/routing” attribute on the
> > > > >> flowfiles before being sent. Inserting Merge/Split will likely affect your
> > > > >> timing due to waiting for bins to fill, depending on your volume. S2S is
> > > > >> pretty good at transmitting data on-demand with low overhead on one port;
> > > > >> it’s when you have many remote input ports that there is substantial
> > > > >> overhead.
> > > >>
> > > >>
> > > >> Andy LoPresto
> > > >> [email protected]
> > > >> [email protected]
> > > >> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
> > > >>
> > > > >> > On Feb 6, 2020, at 2:34 PM, Elli Schwarz <[email protected]> wrote:
> > > >> >
> > > > >> > We ran that command - it appears the site-to-sites are causing the
> > > > >> > issue. We had a lot of remote process groups that weren't even being used
> > > > >> > (no data was being sent to that part of the dataflow), yet when running the
> > > > >> > lsof command they each had a large number of open files - almost 2k! -
> > > > >> > showing CLOSE_WAIT. Again, there were no flowfiles being sent to them, so
> > > > >> > can it be some kind of bug that keeping a remote process group open is
> > > > >> > somehow opening files and not closing them? (BTW, the reason we had to
> > > > >> > upgrade from 1.9.2 to 1.11.0 was because we had upgraded our Java version
> > > > >> > and that caused an IllegalBlockingModeException - is it possible that
> > > > >> > whatever fixed that problem is now causing an issue with open files?)
> > > >> >
> > > > >> > We now disabled all of the unused remote process groups. We still have
> > > > >> > several remote process groups that we are using so if this is the issue it
> > > > >> > might be difficult to avoid, but at least we decreased the number of remote
> > > > >> > process groups we have. Another approach we are trying is a merge content
> > > > >> > before we send to the Nifi having the most issues, to have fewer flow files
> > > > >> > sent at once site to site, and then splitting them after they are received.
> > > >> > Thank you!
> > > >> >
> > > > >> >    On Thursday, February 6, 2020, 2:19:48 PM EST, Mike Thomsen <[email protected]> wrote:
> > > > >> >
> > > > >> > Can you share a description of your flows in terms of average flowfile
> > > > >> > size, queue size, data velocity, etc.?
> > > >> > Thanks,
> > > >> > Mike
> > > >> >
> > > > >> > On Thu, Feb 6, 2020 at 1:59 PM Elli Schwarz <[email protected]> wrote:
> > > >> >
> > > > >> >  We seem to be experiencing the same problems. We recently upgraded
> > > > >> > several of our Nifis from 1.9.2 to 1.11.0, and now many of them are failing
> > > > >> > with "too many open files". Nothing else changed other than the upgrade,
> > > > >> > and our data volume is the same as before. The only solution we've been
> > > > >> > able to come up with is to run a script to check for this condition and
> > > > >> > restart the Nifi. Any other ideas?
> > > >> > Thank you!
> > > >> >
> > > > >> >     On Sunday, February 2, 2020, 9:11:34 AM EST, Mike Thomsen <[email protected]> wrote:
> > > > >> >
> > > > >> >  Without further details, this is what I did to see if it was something
> > > > >> > other than the usual issue of not having enough file handles available.
> > > > >> > Something like a legitimate case of someone forgetting to close file
> > > > >> > objects or something in the code itself.
> > > >> >
> > > > >> > 1. Set up an 8-core/32GB VM on AWS w/ Amazon AMI.
> > > > >> > 2. Pushed 1.11.1RC1
> > > > >> > 3. Pushed the RAM settings to 6/12GB
> > > > >> > 4. Disabled flowfile archiving because I only allocated 8GB of storage.
> > > > >> > 5. Set up a flow that used 2 GenerateFlowFile instances to generate massive
> > > > >> >    amounts of garbage data using all available cores. (All queues were set up
> > > > >> >    to hold 250k flow files)
> > > > >> > 6. Kicked it off and let it run for probably about 20 minutes.
> > > >> >
> > > >> > No apparent problem with closing and releasing resources here.
> > > >> >
> > > > >> > On Sat, Feb 1, 2020 at 8:00 AM Joe Witt <[email protected]> wrote:
> > > >> >
> > > >> >> these are usually very easy to find.
> > > >> >>
> > > > >> >> run lsof -p <pid> and share the results
> > > >> >>
> > > >> >>
> > > >> >> thanks
> > > >> >>
> > > > >> >> On Sat, Feb 1, 2020 at 7:56 AM Mike Thomsen <[email protected]> wrote:
> > > > >> >>
> > > > >> >>> https://stackoverflow.com/questions/59991035/nifi-1-11-opening-more-than-50k-files/60017064#60017064
> > > > >> >>>
> > > > >> >>> No idea if this is valid or not. I asked for clarification to see if there
> > > > >> >>> might be a specific processor or something that is triggering this.
> > > > >> >>>
> > > >> >>
> > > >> >
> > > >>
> > > >>
> > >
> >
>
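
For anyone following along, the lsof check discussed in the thread (many sockets stuck in CLOSE_WAIT for remote process groups) can be summarized with a small script. This is only a minimal sketch, assuming the usual `lsof -p <pid>` column layout where a TCP row ends with `local->remote (STATE)`. The function name `count_close_wait` and the file name `lsof.txt` are made up for illustration and are not part of NiFi or lsof.

```python
from collections import Counter


def count_close_wait(lsof_text: str) -> Counter:
    """Count CLOSE_WAIT sockets per remote endpoint in `lsof -p <pid>` output.

    Assumes each TCP row ends with "local:port->remote:port (CLOSE_WAIT)".
    Rows in any other state, and non-socket rows, are ignored.
    """
    counts = Counter()
    for line in lsof_text.splitlines():
        fields = line.split()
        # The TCP state, when present, is the last whitespace-separated field.
        if len(fields) < 2 or fields[-1] != "(CLOSE_WAIT)":
            continue
        connection = fields[-2]  # e.g. "10.0.0.5:51234->10.0.0.9:8443"
        if "->" in connection:
            remote = connection.split("->", 1)[1]
            counts[remote] += 1
    return counts
```

To try it, save the output with `lsof -p <pid> > lsof.txt`, then call `count_close_wait(open("lsof.txt").read()).most_common(10)` to see which remote endpoints hold the most half-closed connections.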
