Hi Prabhu,

Thanks for sharing your experience with flow file archiving. The case where a single flow.xml.gz file exceeds archive.max.storage was not considered carefully when I implemented NIFI-2145.
By looking at the code, it currently works as follows:

1. The original conf/flow.xml.gz (> 1 MB) is archived to conf/archive.
2. NiFi checks whether any archived files have expired, and deletes them if so.
3. NiFi checks the total size of all archived files; if it exceeds the configured archive.max.storage, NiFi deletes the oldest archive, repeating until the total size is less than or equal to the limit.

In your case, at step 3, the newly created archive was deleted, because its size alone was greater than archive.max.storage. NiFi only logs an INFO-level message here, so it is hard for a user to tell what happened, as you reported.

I'm going to create a JIRA for this and fix the current behavior with one of the following solutions:

A. Treat archive.max.storage as a HARD limit. If the original flow.xml.gz exceeds the configured archive.max.storage in size, throw an IOException, which results in a WARN-level log message: "Unable to archive flow configuration as requested due to ...".

B. Treat archive.max.storage as a SOFT limit. Exclude the newly created archive file from steps 2 and 3 above, so that it can remain. A WARN-level message should probably still be logged.

For a better user experience, I'd prefer solution B: the flow can still be archived even when flow.xml.gz exceeds the archive storage size, since it was successfully written to disk, which means the physical disk had enough space.

What do you think?

Thanks!
Koji

On Wed, Jan 18, 2017 at 3:27 PM, prabhu Mahendran <[email protected]> wrote:
> I have checked the below properties used for the backup operations in
> NiFi 1.0.0, with respect to this JIRA:
>
> https://issues.apache.org/jira/browse/NIFI-2145
>
> nifi.flow.configuration.archive.max.time=1 hours
> nifi.flow.configuration.archive.max.storage=1 MB
>
> We have two backup locations: the first is "conf/flow.xml.gz" and the
> second is "conf/archive/flow.xml.gz".
>
> I have saved archived workflows (conf/archive/flow.xml.gz) every hour as
> per the "max.time" property.
>
> At a particular time I reached 1 MB [set as the default storage size].
> At that point it deleted the existing conf/archive/flow.xml.gz completely
> and did not write new flow files into conf/archive/flow.xml.gz because the
> size was exceeded.
>
> No logs showed that the new flow.xml.gz was larger than the specified
> storage.
>
> Why would it delete the existing flows and not write new flows because of
> the storage limit?
>
> In this case, has one backup operation failed or not?
>
> Thanks,
>
> prabhu
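P.S. To make the discussion above concrete, here is a minimal sketch of the retention pass (steps 2 and 3) and of the proposed soft limit. This is an illustrative model in Python, not NiFi's actual Java code; the names `Archive` and `prune` are hypothetical:

```python
from dataclasses import dataclass

@dataclass
class Archive:
    name: str
    size: int       # bytes
    mtime: float    # seconds since epoch

def prune(archives, max_storage, max_age_seconds, now, protect_newest=False):
    """Return the archives that survive one retention pass.

    protect_newest=False models the current behavior: the archive just
    written can itself be deleted when it alone exceeds max_storage.
    protect_newest=True models the proposed soft limit (solution B): the
    newest archive is excluded from the size check and always survives.
    """
    # Step 2: drop expired archives.
    kept = [a for a in archives if now - a.mtime <= max_age_seconds]
    kept.sort(key=lambda a: a.mtime)  # oldest first
    newest = kept[-1] if kept else None

    # Step 3: drop the oldest archives until the total size fits.
    while kept and sum(a.size for a in kept) > max_storage:
        if protect_newest and kept[0] is newest:
            break  # solution B: never delete the file we just wrote
        kept.pop(0)
    return kept
```

With a 1 MB limit and a 1.5 MB flow.xml.gz, the current behavior deletes the archive it just created, while the soft limit keeps it:

```python
now = 1000.0
archives = [Archive("old", 400_000, now - 7200),
            Archive("new", 1_500_000, now)]
print([a.name for a in prune(archives, 1_000_000, 3600, now)])
print([a.name for a in prune(archives, 1_000_000, 3600, now,
                             protect_newest=True)])
```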
