Hi Prabhu,

Thank you for the suggestion.
Keeping the latest N archives is nice, it's simple :) The max.time and
max.storage settings have other benefits, and since they are already
released, we should keep the existing behavior for those settings, too.
I've created a JIRA to add an archive.max.count property:
https://issues.apache.org/jira/browse/NIFI-3373

Thanks,
Koji

On Thu, Jan 19, 2017 at 2:21 PM, prabhu Mahendran <[email protected]> wrote:
> Hi Koji,
>
> Thanks for your reply.
>
> Yes, Solution B may meet what I require. Currently, when the storage
> size limit is reached, the complete folder is deleted and the new flow
> is not tracked in the archive folder. That behavior is the drawback
> here. I need at least the last workflow to be saved in the archive
> folder, and the user to be notified to increase the size. At the same
> time, until NiFi restarts, at least the last complete workflow should
> be backed up.
>
> I have another suggestion as well:
>
> Regardless of the max.time and max.storage properties, can we keep only
> a few files in the archive (say, 10 files)? Each action on the NiFi
> canvas would be tracked here; when the count of flow.xml.gz archive
> files reaches the limit, it would delete the oldest file and save the
> latest one, so that the count of 10 is maintained. This way the
> workflow is maintained properly, and backup is also achieved without
> the confusion of max.time and max.storage. The only remaining case is
> when the disk size is exceeded; we should notify the user about that.
>
> Many thanks.
>
> On Thu, Jan 19, 2017 at 6:36 AM, Koji Kawamura <[email protected]> wrote:
>>
>> Hi Prabhu,
>>
>> Thanks for sharing your experience with flow configuration archiving.
>> The case where a single flow.xml.gz file exceeds archive.max.storage
>> in size was not considered well when I implemented NIFI-2145.
>>
>> Looking at the code, it currently works as follows:
>> 1. The original conf/flow.xml.gz (> 1 MB) is archived to conf/archive.
>> 2. NiFi checks whether there are any expired archive files, and
>>    deletes them if so.
>> 3. NiFi checks the total size of all archived files, then deletes the
>>    oldest archive, repeating until the total size is less than or
>>    equal to the configured archive.max.storage.
>>
>> In your case, at step 3, the newly created archive is deleted because
>> its size is greater than archive.max.storage. When this happens, NiFi
>> only logs an INFO level message, so it's hard for the user to know
>> what happened, as you reported.
>>
>> I'm going to create a JIRA for this and fix the current behavior with
>> one of the following solutions:
>>
>> A. Treat archive.max.storage as a HARD limit. If the original
>> flow.xml.gz exceeds the configured archive.max.storage in size, then
>> throw an IOException, which results in a WARN level log message:
>> "Unable to archive flow configuration as requested due to ...".
>>
>> B. Treat archive.max.storage as a SOFT limit, by excluding the newly
>> created archive file from steps 2 and 3 above so that it can stay
>> there. A WARN level log message should probably still be logged.
>>
>> For a better user experience, I'd prefer solution B, so that the flow
>> can be archived even when flow.xml.gz exceeds the archive storage
>> size: since it could be written to disk at all, the physical disk
>> clearly had enough space.
>>
>> What do you think?
>>
>> Thanks!
>> Koji
>>
>> On Wed, Jan 18, 2017 at 3:27 PM, prabhu Mahendran
>> <[email protected]> wrote:
>> > I have checked the below properties, used for the backup operations
>> > in NiFi 1.0.0, with respect to this JIRA:
>> >
>> > https://issues.apache.org/jira/browse/NIFI-2145
>> >
>> > nifi.flow.configuration.archive.max.time=1 hours
>> > nifi.flow.configuration.archive.max.storage=1 MB
>> >
>> > We have two backup operations: the first one is "conf/flow.xml.gz"
>> > and the second is "conf/archive/flow.xml.gz".
>> >
>> > I have saved archived workflows (conf/archive/flow.xml.gz) hourly,
>> > as per the max.time property.
>> >
>> > At a particular time I reached the "1 MB" limit [set as the default
>> > storage size].
>> >
>> > So it deletes the existing conf/archive/flow.xml.gz completely and
>> > doesn't write new flow files into conf/archive/flow.xml.gz, because
>> > the size is exceeded.
>> >
>> > No log shows that the new flow.xml.gz is larger than the specified
>> > storage.
>> >
>> > Why does it delete existing flows and not write new flows due to the
>> > storage limit?
>> >
>> > In this case, has one of the backup operations failed or not?
>> >
>> > Thanks,
>> >
>> > prabhu
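For reference, the retention settings discussed in this thread would sit
together in nifi.properties roughly as below. This is a sketch: the first
two properties appear verbatim in the thread, while
nifi.flow.configuration.archive.max.count is the property proposed in
NIFI-3373, so its exact name and availability depend on the NiFi release
in use.

    nifi.flow.configuration.archive.max.time=1 hours
    nifi.flow.configuration.archive.max.storage=1 MB
    # Proposed in NIFI-3373: keep at most N archives, oldest deleted first.
    nifi.flow.configuration.archive.max.count=10

With a count cap like this, the archive never holds more than ten files,
which is the simple guarantee Prabhu asks for above.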

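To make the two pruning schemes discussed above concrete, below is a
minimal Java sketch of both: the count-based retention Prabhu suggests
and the "solution B" soft limit Koji prefers. The class and method names
are invented for illustration; this is not NiFi's actual implementation.

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.util.Comparator;
    import java.util.List;
    import java.util.stream.Collectors;
    import java.util.stream.Stream;

    public class ArchivePruner {

        // List archive files sorted oldest-first by last-modified time.
        static List<Path> listOldestFirst(Path archiveDir) throws IOException {
            try (Stream<Path> files = Files.list(archiveDir)) {
                return files.filter(Files::isRegularFile)
                            .sorted(Comparator.comparingLong(
                                    (Path p) -> p.toFile().lastModified()))
                            .collect(Collectors.toList());
            }
        }

        // Count-based retention as suggested above: keep only the newest
        // maxCount archives, deleting the oldest first.
        static void pruneByCount(Path archiveDir, int maxCount) throws IOException {
            List<Path> archives = listOldestFirst(archiveDir);
            for (int i = 0; i < archives.size() - maxCount; i++) {
                Files.delete(archives.get(i));
            }
        }

        // "Solution B" as a soft limit: shrink the archive below
        // maxStorageBytes, but never delete the newest file, so the latest
        // flow survives even when it alone exceeds the limit.
        static void pruneBySize(Path archiveDir, long maxStorageBytes) throws IOException {
            List<Path> archives = listOldestFirst(archiveDir);
            long total = 0;
            for (Path p : archives) {
                total += Files.size(p);
            }
            // Stop before the last (newest) entry, even if still over limit.
            for (int i = 0; i < archives.size() - 1 && total > maxStorageBytes; i++) {
                total -= Files.size(archives.get(i));
                Files.delete(archives.get(i));
            }
        }

        public static void main(String[] args) throws IOException {
            Path archiveDir = Paths.get("conf", "archive");
            pruneByCount(archiveDir, 10);           // keep at most 10 archives
            pruneBySize(archiveDir, 1024L * 1024L); // then enforce a 1 MB soft limit
        }
    }

Because pruneBySize never touches the newest file, the latest flow.xml.gz
archive always survives even when it alone exceeds the limit, which is
exactly the behavioral difference between solutions A and B.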