Another data point ... we had archiving turned on at first, and most
(but not all) of the files that lsof reported were

/content_repository/0/archive/123456789-123456 (deleted).

We turned archiving off, hoping that was related in some way, but it was
not.
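
For what it's worth, here is a /proc-based way we could count the open-but-deleted
descriptors without parsing lsof output (a Linux-only sketch; the function name is
ours, not anything from NiFi):

```shell
# Count open-but-deleted files held by a process (Linux-specific sketch).
count_deleted() {
    n=0
    for fd in /proc/"$1"/fd/*; do
        # a descriptor whose file was unlinked shows "(deleted)" in its link target
        case "$(readlink "$fd" 2>/dev/null)" in
            *'(deleted)'*) n=$((n + 1)) ;;
        esac
    done
    echo "$n"
}

count_deleted "$$"   # substitute the NiFi pid for $$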

-- Mike


On Wed, Apr 27, 2016 at 11:53 AM, Joe Witt <joe.w...@gmail.com> wrote:

> Mike,
>
> Definitely does not sound familiar.  However, just looked up what you
> describe and I do see it.  In my case there are only three files but
> they are sitting there open for writing by the nifi process and yet
> have been deleted.  So I do believe there is an issue...will dig in a
> bit but obviously if you make more discoveries here please share.
>
> Thanks
> Joe
>
>
>
> On Wed, Apr 27, 2016 at 11:31 AM, Michael Moser <moser...@gmail.com>
> wrote:
> > Devs,
> >
> > We recently upgraded from NiFi 0.4.1 to 0.5.1 on a cluster.  We noticed
> > half of our cluster nodes getting "too many open files" errors that require
> > a NiFi restart, while the other half works without this problem.  Using
> > 'lsof -p <pid>' to identify the open file descriptors at the time of the
> > problem, we see most of the file descriptors reference deleted files in
> the
> > content repository like this:
> >
> > java <pid> <user> <fd> ... /content_repository/81/123456789-123456 (deleted)
> >
> > An 'ls /content_repository/81/123456789-123456' confirms that the file has
> > been deleted.
> >
> > We are continuing our investigation into why some of our nodes have a
> > problem but others don't.  Has anyone else seen this?  Did anything change
> > between 0.4.1 and 0.5.1 related to deleting files from the content
> > repository?
> >
> > Regards,
> > -- Mike
>
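
For anyone trying to reproduce the symptom outside NiFi: a file that is unlinked
while a process still holds it open stays allocated on disk, invisible to ls but
reported by lsof as "(deleted)", until the last descriptor closes. A minimal shell
illustration (Linux, assuming /proc is available):

```shell
# Create a file, hold it open, then unlink it while the descriptor is live.
tmp=$(mktemp)
exec 9>"$tmp"          # fd 9 now holds the file open
rm "$tmp"              # unlink: ls can no longer see it ...
ls -l /proc/$$/fd/9    # ... but the fd target reads "<path> (deleted)"
exec 9>&-              # only closing the fd releases the inode and the fd slot
```

This is why the process can hit its open-files ulimit even though the content
repository looks small on disk: each leaked descriptor pins both an fd slot and
the unlinked file's space.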
