This just filled up again even with nifi.content.repository.archive.enabled=false.
On the node that is still alive, our queued flowfiles are 91 / 16.47 GB,
but the content repository directory is using 646 GB. Is there a property
I can set to make it clean things up more frequently? I expected that once
I turned archive enabled off, it would delete things from the content
repository as soon as the flow files weren't queued anywhere. So far the
only way I have found to reliably get nifi to clear out the content
repository is to restart it.

Our version string is the following, if that interests you:
11/26/2016 04:39:37 PST Tagged nifi-1.1.0-RC2 From ${buildRevision} on
branch ${buildBranch}

Maybe we will go to the released 1.1 and see if that helps. Until then
I'll be restarting a lot and digging into the code to figure out where
this cleanup is supposed to happen. Any pointers on code/configs for that
would be appreciated.

Thanks,
Alan

On Sun, Dec 11, 2016 at 8:51 AM, Joe Gresock <jgres...@gmail.com> wrote:

> No, in my scenario a server restart would not affect the content
> repository size.
>
> On Sun, Dec 11, 2016 at 8:46 AM, Alan Jackoway <al...@cloudera.com> wrote:
>
> > If we were in the situation Joe G described, should we expect that
> > when we kill and restart nifi it would clean everything up? That
> > behavior has been consistent every time - when the disk hits 100%, we
> > kill nifi, delete enough old content files to bring it back up, and
> > before it brings the UI up it deletes things to get within the archive
> > policy again. That sounds less like the files are stuck and more like
> > it failed trying.
> >
> > For now I just turned off archiving, since we don't really need it for
> > this use case.
> >
> > I attached a jstack from last night's failure, which looks pretty
> > boring to me.
> >
> > On Sun, Dec 11, 2016 at 1:37 AM, Alan Jackoway <al...@cloudera.com> wrote:
> >
> >> The scenario Joe G describes is almost exactly what we are doing. We
> >> bring in large files and unpack them into many smaller ones.
> >> In the most recent iteration of this problem, I saw that we had many
> >> small files queued up at the time trouble was happening. We will try
> >> your suggestion to see if the situation improves.
> >>
> >> Thanks,
> >> Alan
> >>
> >> On Sat, Dec 10, 2016 at 6:57 AM, Joe Gresock <jgres...@gmail.com> wrote:
> >>
> >>> Not sure if your scenario is related, but one of the NiFi devs
> >>> recently explained to me that the files in the content repository
> >>> are actually appended together with other flow file content (please
> >>> correct me if I'm explaining it wrong). That means if you have many
> >>> small flow files in your current backlog, and several large flow
> >>> files have recently left the flow, the large ones could still be
> >>> hanging around in the content repository as long as the small ones
> >>> are still there, if they're in the same appended files on disk.
> >>>
> >>> This scenario recently happened to us: we had a flow with ~20 million
> >>> tiny flow files queued up, and at the same time we were also
> >>> processing a bunch of 1GB files, which left the flow quickly. The
> >>> content repository was much larger than what was actually being
> >>> reported in the flow stats, and our disks were almost full.
> >>> On a hunch, I tried the following strategy:
> >>> - MergeContent the tiny flow files using flow-file-v3 format (to
> >>>   capture all attributes)
> >>> - MergeContent 10,000 of the packaged flow files using tar format
> >>>   for easier storage on disk
> >>> - PutFile into a directory
> >>> - GetFile from the same directory, but using back pressure from here
> >>>   on out (so that the flow simply wouldn't pull the same files from
> >>>   disk until it was really ready for them)
> >>> - UnpackContent (untar them)
> >>> - UnpackContent (turn them back into flow files with the original
> >>>   attributes)
> >>> - Then do the processing they were originally designed for
> >>>
> >>> This had the effect of very quickly reducing the size of my content
> >>> repository to very nearly the actual size I saw reported in the
> >>> flow, and my disk usage dropped from ~95% to 50%, which is the
> >>> configured content repository max usage percentage. I haven't had
> >>> any problems since.
> >>>
> >>> Hope this helps.
> >>> Joe
> >>>
> >>> On Sat, Dec 10, 2016 at 12:04 AM, Joe Witt <joe.w...@gmail.com> wrote:
> >>>
> >>> > Alan,
> >>> >
> >>> > That retention percentage only has to do with the archive of data
> >>> > which kicks in once a given chunk of content is no longer
> >>> > reachable by active flowfiles in the flow. For it to grow to 100%
> >>> > typically would mean that you have data backlogged in the flow
> >>> > that account for that much space. If that is certainly not the
> >>> > case for you then we need to dig deeper. If you could do
> >>> > screenshots or share log files and stack dumps around this time
> >>> > those would all be helpful. If the screenshots and such are too
> >>> > sensitive please just share as much as you can.
> >>> >
> >>> > Thanks
> >>> > Joe
> >>> >
> >>> > On Fri, Dec 9, 2016 at 9:55 PM, Alan Jackoway <al...@cloudera.com> wrote:
> >>> > > One other note on this, when it came back up there were tons of
> >>> > > messages like this:
> >>> > >
> >>> > > 2016-12-09 18:36:36,244 INFO [main] o.a.n.c.repository.FileSystemRepository
> >>> > > Found unknown file /path/to/content_repository/498/1481329796415-87538
> >>> > > (1071114 bytes) in File System Repository; archiving file
> >>> > >
> >>> > > I haven't dug into what that means.
> >>> > > Alan
> >>> > >
> >>> > > On Fri, Dec 9, 2016 at 9:53 PM, Alan Jackoway <al...@cloudera.com> wrote:
> >>> > >
> >>> > >> Hello,
> >>> > >>
> >>> > >> We have a node on which the nifi content repository keeps
> >>> > >> growing to use 100% of the disk. It's a relatively high-volume
> >>> > >> process. It chewed through more than 100GB in the three hours
> >>> > >> between when we first saw it hit 100% of the disk and when we
> >>> > >> just cleaned it up again.
> >>> > >>
> >>> > >> We are running nifi 1.1 for this. Our nifi.properties looked
> >>> > >> like this:
> >>> > >>
> >>> > >> nifi.content.repository.implementation=org.apache.nifi.controller.repository.FileSystemRepository
> >>> > >> nifi.content.claim.max.appendable.size=10 MB
> >>> > >> nifi.content.claim.max.flow.files=100
> >>> > >> nifi.content.repository.directory.default=./content_repository
> >>> > >> nifi.content.repository.archive.max.retention.period=12 hours
> >>> > >> nifi.content.repository.archive.max.usage.percentage=50%
> >>> > >> nifi.content.repository.archive.enabled=true
> >>> > >> nifi.content.repository.always.sync=false
> >>> > >>
> >>> > >> I just bumped retention period down to 2 hours, but should max
> >>> > >> usage percentage protect us from using 100% of the disk?
> >>> > >>
> >>> > >> Unfortunately we didn't get jstacks on either failure.
> >>> > >> If it hits 100% again I will make sure to get that.
> >>> > >>
> >>> > >> Thanks,
> >>> > >> Alan
> >>> > >>
> >>> >
> >>>
> >>> --
> >>> I know what it is to be in need, and I know what it is to have
> >>> plenty. I have learned the secret of being content in any and every
> >>> situation, whether well fed or hungry, whether living in plenty or
> >>> in want. I can do all this through him who gives me strength.
> >>> *-Philippians 4:12-13*
> >>
> >
>
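[Editor's note: the thread above repeatedly compares the size reported for
queued flowfiles against the on-disk size of the content repository
(91 flowfiles / 16.47 GB queued vs. 646 GB on disk). A quick way to quantify
that gap is to walk the repository directory and total up file sizes,
splitting out files older than the configured retention period. The sketch
below is illustrative only, not part of NiFi: the directory path matches the
`nifi.content.repository.directory.default` value quoted in the thread, and
the 12-hour cutoff matches the quoted
`nifi.content.repository.archive.max.retention.period`; adjust both for your
install.]

```python
import os
import time

def repo_usage(repo_dir, older_than_hours=12.0):
    """Walk a content-repository-style directory tree and total file
    sizes. Returns (total_bytes, stale_bytes), where stale_bytes counts
    files last modified before the cutoff."""
    cutoff = time.time() - older_than_hours * 3600
    total = stale = 0
    for root, _dirs, files in os.walk(repo_dir):
        for name in files:
            path = os.path.join(root, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # file may vanish mid-walk; skip it
            total += st.st_size
            if st.st_mtime < cutoff:
                stale += st.st_size
    return total, stale

if __name__ == "__main__":
    # "./content_repository" is the default from the nifi.properties
    # quoted above; os.walk simply yields nothing if it does not exist.
    total, stale = repo_usage("./content_repository", older_than_hours=12)
    print("total: %.1f GB, older than retention: %.1f GB"
          % (total / 1e9, stale / 1e9))
```

If the "older than retention" number stays large while the flow reports a
small queue, that is consistent with Joe Gresock's explanation above: small
queued flowfiles pinning large departed content in the same appended claim
files.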