Can you share your flow.xml.gz?

On Thu, Sep 17, 2020 at 8:08 AM Ryan Hendrickson <[email protected]> wrote:

> 1.12.0
>
> Thanks,
> Ryan
>
> On Thu, Sep 17, 2020 at 11:04 AM Joe Witt <[email protected]> wrote:
>
>> Ryan
>>
>> What version are you using? I do think we had an issue that kept items
>> around longer than intended that has been addressed.
>>
>> Thanks
>>
>> On Thu, Sep 17, 2020 at 7:58 AM Ryan Hendrickson <[email protected]> wrote:
>>
>>> Hello,
>>> I've got ~15 million FlowFiles, each roughly 4KB, totaling about 55GB
>>> of data on my canvas.
>>>
>>> However, the content repository (on its own partition) is
>>> completely full with 350GB of data. I'm pretty certain the way Content
>>> Claims store the data is responsible for this. In previous experience,
>>> we've had files that are larger, and haven't seen this as much.
>>>
>>> My guess is that as data was streaming through and being added to a
>>> claim, the claim isn't always released as the small files leave the canvas.
>>>
>>> We've run into this issue enough times that I figure there's probably a
>>> "best practice for small files" for the content claims settings.
>>>
>>> These are our current settings:
>>>
>>> nifi.content.repository.implementation=org.apache.nifi.controller.repository.FileSystemRepository
>>> nifi.content.claim.max.appendable.size=1 MB
>>> nifi.content.claim.max.flow.files=100
>>> nifi.content.repository.directory.default=/var/nifi/repositories/content
>>> nifi.content.repository.archive.max.retention.period=12 hours
>>> nifi.content.repository.archive.max.usage.percentage=50%
>>> nifi.content.repository.archive.enabled=true
>>> nifi.content.repository.always.sync=false
>>>
>>> https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#content-repository
>>>
>>> There are 1024 folders on the disk (0-1023) for the Content Claims.
>>> The files inside the folders are each roughly 2MB to 8MB (which is odd
>>> because I thought the max appendable size would make them no larger than
>>> 1MB).
>>>
>>> Is there a way to expand the number of folders and/or reduce the number
>>> of individual FlowFiles that are stored in the claims?
>>>
>>> I'm hoping there might be a best practice out there though.
>>>
>>> Thanks,
>>> Ryan
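
For what it's worth, the guess in the original message (a claim file stays on disk until every FlowFile written into it has left the flow) can be put into rough numbers with a toy model. This is only a sketch under that assumption, not NiFi's actual implementation: `simulate_claim_retention` and all of its parameters are made up for illustration, and the packing rule (start a new claim once the byte or FlowFile limit would be exceeded) is a simplification of what `nifi.content.claim.max.appendable.size` and `nifi.content.claim.max.flow.files` do.

```python
import random


def simulate_claim_retention(num_flowfiles, flowfile_size, max_claim_bytes,
                             max_flowfiles_per_claim, survive_fraction, seed=0):
    """Toy model (hypothetical, not NiFi code).

    Packs FlowFiles into claim files in arrival order, marks a random
    fraction of them as still queued, and assumes a claim file is kept
    on disk while at least one of its FlowFiles is still queued.
    Returns (live_bytes, held_bytes).
    """
    rng = random.Random(seed)
    claims = []          # one list of booleans per claim file: True = still queued
    current = []
    current_bytes = 0

    for _ in range(num_flowfiles):
        claim_full = (current_bytes + flowfile_size > max_claim_bytes
                      or len(current) >= max_flowfiles_per_claim)
        if current and claim_full:
            claims.append(current)
            current, current_bytes = [], 0
        current.append(rng.random() < survive_fraction)
        current_bytes += flowfile_size
    if current:
        claims.append(current)

    # Bytes belonging to FlowFiles that are still queued.
    live_bytes = sum(sum(claim) for claim in claims) * flowfile_size
    # Bytes in claim files that cannot be reclaimed yet.
    held_bytes = sum(len(claim) for claim in claims if any(claim)) * flowfile_size
    return live_bytes, held_bytes


if __name__ == "__main__":
    # 4KB FlowFiles, 1 MB / 100 FlowFiles per claim, 10% still queued.
    live, held = simulate_claim_retention(100_000, 4096, 1024 * 1024, 100, 0.10)
    print(f"live: {live / 1e6:.1f} MB, held on disk: {held / 1e6:.1f} MB")
```

In this model, when only a small, scattered fraction of the FlowFiles in each claim is still queued, almost every claim file stays pinned, so the disk usage can be several times the size of the queued data. That would be at least consistent with 55GB on the canvas against 350GB in the repository, although it does not rule out the retention issue mentioned above.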
