Another data point ... we had archiving turned on at first, and most (but
not all) of the files that lsof reported were
/content_repository/0/archive/123456789-123456 (deleted). We turned
archiving off, hoping that was related in some way, but it was not.

-- Mike

On Wed, Apr 27, 2016 at 11:53 AM, Joe Witt <joe.w...@gmail.com> wrote:
> Mike,
>
> Definitely does not sound familiar. However, I just looked up what you
> describe and I do see it. In my case there are only three files, but
> they are sitting there open for writing by the nifi process and yet
> have been deleted. So I do believe there is an issue... will dig in a
> bit, but obviously if you make more discoveries here, please share.
>
> Thanks
> Joe
>
> On Wed, Apr 27, 2016 at 11:31 AM, Michael Moser <moser...@gmail.com> wrote:
> > Devs,
> >
> > We recently upgraded from NiFi 0.4.1 to 0.5.1 on a cluster. We noticed
> > half of our cluster nodes getting "too many open files" errors that
> > require a NiFi restart, while the other half works without this
> > problem. Using 'lsof -p <pid>' to identify the open file descriptors
> > at the time of the problem, we see that most of the file descriptors
> > reference deleted files in the content repository, like this:
> >
> > java <pid> <user> <fd> ... /content_repository/81/123456789-123456 (deleted)
> >
> > An 'ls /content_repository/81/123456789-123456' confirms that the file
> > has been deleted.
> >
> > We are continuing to investigate why some of our nodes have a problem
> > but others don't. Has anyone else seen this? Did anything change
> > between 0.4.1 and 0.5.1 related to deleting files from the content
> > repository?
> >
> > Regards,
> > -- Mike
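For anyone wanting to script the check described above without parsing lsof output, here is a minimal Python sketch that reads /proc/<pid>/fd directly, which is where lsof gets this information on Linux. It is Linux-only, and the function name `deleted_open_files` is illustrative, not anything from NiFi:

```python
import os

def deleted_open_files(pid):
    """Return (fd, path) pairs for descriptors the process still holds
    open on files that have since been unlinked.

    Linux-only: walks /proc/<pid>/fd, where each entry is a symlink whose
    target ends in ' (deleted)' once the underlying file is removed --
    the same marker lsof prints in its NAME column.
    """
    fd_dir = "/proc/%d/fd" % pid
    held = []
    for fd in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, fd))
        except OSError:
            continue  # fd was closed between listdir() and readlink()
        if target.endswith(" (deleted)"):
            held.append((fd, target))
    return held

if __name__ == "__main__":
    # Point this at the NiFi pid to reproduce the lsof observation;
    # using our own pid here just as a self-contained demo.
    print(deleted_open_files(os.getpid()))
```

Counting the returned entries over time per node would show whether the descriptor leak grows steadily or in bursts, which may help narrow down which nodes are affected and when.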