Hi,

Brandon - agreed, this shouldn't be dismissed; I'm just saying that it is, as you said, a tough one.

Ben - out of curiosity, how many flow files can you have in the system at one time? Do you have flow files with a lot of attributes, or with large attributes (holding a lot of content)?

Pierre

2018-04-26 11:26 GMT+02:00 尹文才 <[email protected]>:

> Hi guys, thanks for all your answers. I have actually seen the flowfile repo on one of our OpenStack CentOS 7 machines grow to about 30 GB, which used up all the disk space allocated to the virtual machine. As a result, the flow inside NiFi couldn't proceed and many errors started to appear, such as failures to checkpoint. We currently use NiFi as an ETL tool to extract data from SQL Server for data analysis.
> I actually have no idea why the flowfile repo would grow like this; as I understand it, it is only used to hold flowfile attributes. It would be great if there were some options to limit the flowfile repo size.
>
> Thanks.
> Regards,
> Ben
>
> 2018-04-26 2:08 GMT+08:00 Brandon DeVries <[email protected]>:
>
> > All,
> >
> > This is something I think we shouldn't dismiss so easily. While the FlowFile repo is lighter than the content repo, allowing it to grow too large can cause major problems.
> >
> > Specifically, an "overgrown" FlowFile repo may prevent a NiFi instance from coming back up after a restart due to the way in which records are held in memory. If there is more memory available to give to the JVM, this can sometimes be worked around... but if there isn't you may just be out of luck. For that matter, allowing the FlowFile repo to grow so large that it consumes all the heap isn't going to be good for system health in general (OOM is probably never where you want to be...).
> >
> > To Pierre's point "you don't want to limit that repository in size since it would prevent the workflows from creating new flow files"... that's exactly why I would want to limit the size of the repo. You do then get into questions of how exactly to do this. For example, you may not want to simply block all transactions that create a FlowFile, because such a transaction may remove even more FlowFiles than it creates (e.g. MergeContent). Additionally, you have to be concerned about deadlocks (e.g. a "Wait" that hangs forever because its "Notify" is being starved). Or, perhaps that's all you can do... freeze everything at some threshold prior to actual damage being done, and alert operators that manual intervention is necessary (e.g. bring up the graph with autoResume=false, and bleed off data in a controlled fashion).
> >
> > In summary, I believe this is a problem. Even if it doesn't come up often, when it does it is significant. While the solution likely isn't simple, it's worth putting some thought towards.
> >
> > Brandon
> >
> > On Wed, Apr 25, 2018 at 9:43 AM Sivaprasanna <[email protected]> wrote:
> >
> > > No, he actually had mentioned “like content repository”. The answer is, there aren’t any properties that support this, AFAIK. Pierre’s response pretty much sums up why there aren’t any properties.
> > >
> > > Thanks,
> > > Sivaprasanna
> > >
> > > On Wed, 25 Apr 2018 at 7:10 PM, Mike Thomsen <[email protected]> wrote:
> > >
> > > > I have a feeling that what Ben meant was how to limit the content repository size.
> > > >
> > > > On Wed, Apr 25, 2018 at 8:26 AM Pierre Villard <[email protected]> wrote:
> > > >
> > > > > Hi Ben,
> > > > >
> > > > > Since the flow file repository contains the information of the flow files currently being processed by NiFi, you don't want to limit that repository in size since it would prevent the workflows from creating new flow files.
> > > > >
> > > > > Besides, this repository is very lightweight, so I'm not sure it'd need to be limited in size.
> > > > > Do you have a specific use case in mind?
> > > > >
> > > > > Pierre
> > > > >
> > > > > 2018-04-25 9:15 GMT+02:00 尹文才 <[email protected]>:
> > > > >
> > > > > > Hi guys, I checked NiFi's System Administrator's Guide trying to find a configuration item to limit the size of the flowfile repository, similar to the other repositories (e.g. content repository), but I didn't find one. Is there currently any configuration to limit the flowfile repository's size? Thanks.
> > > > > >
> > > > > > Regards,
> > > > > > Ben
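
Since the thread concludes that NiFi has no property to cap the FlowFile repository, one stopgap for the disk-full situation Ben describes is to watch the repository directory from outside NiFi and alert an operator before the disk fills, in the spirit of Brandon's "alert operators" suggestion. The following is only a minimal sketch, not a NiFi feature: the path `./flowfile_repository` and the 10 GB limit are assumptions to adapt to your install, and the helper names are made up for illustration. It does not address the heap/restart problem Brandon raises.

```python
import os

# Assumed values for illustration; adjust to your own installation.
REPO_DIR = "./flowfile_repository"
LIMIT_BYTES = 10 * 1024 ** 3  # hypothetical 10 GB alert threshold


def directory_size_bytes(path):
    """Return the total size in bytes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.path.getsize(os.path.join(root, name))
            except OSError:
                # Files can disappear between listing and stat
                # (e.g. during a checkpoint); skip them.
                pass
    return total


def over_threshold(path, limit_bytes):
    """True when the directory has reached or passed the limit."""
    return directory_size_bytes(path) >= limit_bytes


if __name__ == "__main__":
    size = directory_size_bytes(REPO_DIR)
    status = "ALERT" if size >= LIMIT_BYTES else "OK"
    print(f"{status}: {REPO_DIR} uses {size} bytes (limit {LIMIT_BYTES})")
```

Run from cron or a similar scheduler, this only reports; what to do on an alert (stopping source processors, bleeding off data) is still a manual decision.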
