Hi Koji,

Thanks for your support.
Many thanks.

On Fri, Jan 20, 2017 at 11:03 AM, Koji Kawamura <[email protected]> wrote:
> Hi Prabhu,
>
> Thanks for the confirmation. I can't guarantee it will be included in the
> next release, but I'll try my best :) You can watch the JIRA to get
> updates as it proceeds.
> https://issues.apache.org/jira/browse/NIFI-3373
>
> Thanks,
> Koji
>
> On Fri, Jan 20, 2017 at 2:16 PM, prabhu Mahendran
> <[email protected]> wrote:
> > Hi Koji,
> >
> > Both simulations look perfect. I expected this exact behavior, and it
> > matches my requirement; it also sounds logical. Shall I expect these
> > changes in the next NiFi release version?
> >
> > Thank you so much for this tremendous support.
> >
> > On Fri, Jan 20, 2017 at 6:14 AM, Koji Kawamura <[email protected]>
> > wrote:
> >> Hi Prabhu,
> >>
> >> In that case, yes, as you assumed, even if the latest archive exceeds
> >> 500MB, the latest archive is saved, as long as it was written to disk
> >> successfully.
> >>
> >> After that, when the user updates the NiFi flow, the previous archive
> >> will be removed before the new archive is created, because max.storage
> >> is exceeded. Then the latest flow will be archived.
> >>
> >> Let's simulate the scenario with the to-be-updated logic of NIFI-3373,
> >> in which the size of flow.xml keeps increasing:
> >>
> >> # CASE-1
> >>
> >> archive.max.storage=10MB
> >> archive.max.count=5
> >>
> >> Time | flow.xml | archives   | archive total
> >> t1   | f1 5MB   | f1         | 5MB
> >> t2   | f2 5MB   | f1, f2     | 10MB
> >> t3   | f3 5MB   | f1, f2, f3 | 15MB
> >> t4   | f4 10MB  | f2, f3, f4 | 20MB
> >> t5   | f5 15MB  | f4, f5     | 25MB
> >> t6   | f6 20MB  | f6         | 20MB
> >> t7   | f7 25MB  | f7         | 25MB
> >>
> >> * t3: f3 is archived even though the total exceeds 10MB, because
> >> f1 + f2 <= 10MB. A WARN message starts to be logged from this point,
> >> because total archive size > 10MB.
> >> * t4: The oldest archive, f1, is removed, because f1 + f2 + f3 > 10MB.
> >> * t5: Even if the flow.xml size exceeds max.storage, the latest
> >> archive is created. f4 is kept because f4 <= 10MB.
> >> * t6: f4 and f5 are removed because f4 + f5 > 10MB, and also f5 > 10MB.
> >>
> >> In this case, NiFi will keep logging a WARN (or should it be ERROR?)
> >> message from t3 onward, indicating that the archive storage size
> >> exceeds the limit.
> >> After t6, even though archive.max.count = 5, NiFi will only keep the
> >> latest flow.xml.
> >>
> >> # CASE-2
> >>
> >> If you'd like to keep at least 5 archives no matter what, then leave
> >> max.storage and max.time blank:
> >>
> >> archive.max.storage=
> >> archive.max.time=
> >> archive.max.count=5 // Only limit archives by count
> >>
> >> Time | flow.xml | archives             | archive total
> >> t1   | f1 5MB   | f1                   | 5MB
> >> t2   | f2 5MB   | f1, f2               | 10MB
> >> t3   | f3 5MB   | f1, f2, f3           | 15MB
> >> t4   | f4 10MB  | f1, f2, f3, f4       | 25MB
> >> t5   | f5 15MB  | f1, f2, f3, f4, f5   | 40MB
> >> t6   | f6 20MB  | f2, f3, f4, f5, f6   | 55MB
> >> t7   | f7 25MB  | f3, f4, f5, f6, (f7) | 50MB, (75MB)
> >> t8   | f8 30MB  | f3, f4, f5, f6       | 50MB
> >>
> >> * From t6, the oldest archive is removed to keep the number of
> >> archives <= 5.
> >> * At t7, if the disk has only 60MB of free space, f7 won't be
> >> archived. After this point, the archive mechanism stops working (NiFi
> >> keeps trying to create a new archive, but keeps getting an exception:
> >> no space left on device).
> >>
> >> In either case above, once flow.xml has grown to that size, some human
> >> intervention would be needed.
> >> Do those simulations look reasonable?
> >>
> >> Thanks,
> >> Koji
> >>
> >> On Thu, Jan 19, 2017 at 5:48 PM, prabhu Mahendran
> >> <[email protected]> wrote:
> >> > Hi Koji,
> >> >
> >> > Thanks for your information.
> >> >
> >> > Actually, the task description looks fine. I have one question here:
> >> > consider a storage limit of 500MB, and suppose my latest workflow
> >> > exceeds this limit. Which behavior is performed with respect to the
> >> > properties (max.count, max.time and max.storage)?
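The pruning order Koji walks through in CASE-1 can be replayed with a short Python sketch. This is a simplified model of the behavior described for NIFI-3373 (soft limit: the newest archive is always kept), not NiFi's actual implementation; the function name and data shapes are illustrative.

```python
def prune_and_archive(archives, new_name, new_size, max_storage, max_count):
    """Model of the to-be-updated archive logic (NIFI-3373, soft limit).

    archives: list of (name, size_mb) tuples, oldest first.
    Oldest entries are dropped while the *existing* set exceeds either
    limit; the new archive is then always added, even if it alone is
    larger than max_storage (NiFi would log a WARN in that case).
    """
    kept = list(archives)
    while kept and (sum(size for _, size in kept) > max_storage
                    or len(kept) >= max_count):
        kept.pop(0)  # remove the oldest archive first
    kept.append((new_name, new_size))
    return kept


# Replay CASE-1: max.storage=10MB, max.count=5, flow.xml keeps growing.
flows = [("f1", 5), ("f2", 5), ("f3", 5), ("f4", 10), ("f5", 15),
         ("f6", 20), ("f7", 25)]
archives, history = [], []
for name, size in flows:
    archives = prune_and_archive(archives, name, size,
                                 max_storage=10, max_count=5)
    history.append([n for n, _ in archives])

print(history[-1])  # ['f7'] -- only the latest archive survives at t7
```

Running it reproduces the CASE-1 table row by row: `[f1]`, `[f1, f2]`, `[f1, f2, f3]`, `[f2, f3, f4]`, `[f4, f5]`, `[f6]`, `[f7]`.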
> >> > My assumption is that the latest archive is saved even if it
> >> > exceeds 500MB, so what happens from there? Will it keep saving only
> >> > the single latest, large archive, or will it notify the user to
> >> > increase the size and preserve the latest file until we restart the
> >> > flow?
> >> > If so, what happens if the size keeps increasing past 500MB? Will it
> >> > save archives based on count, or only the latest archive for as long
> >> > as NiFi is running?
> >> >
> >> > Many thanks
> >> >
> >> > On Thu, Jan 19, 2017 at 12:47 PM, Koji Kawamura
> >> > <[email protected]> wrote:
> >> >> Hi Prabhu,
> >> >>
> >> >> Thank you for the suggestion.
> >> >>
> >> >> Keeping the latest N archives is nice; it's simple :)
> >> >>
> >> >> The max.time and max.storage settings have other benefits, and
> >> >> since they are already released, we should keep the existing
> >> >> behavior with those settings, too.
> >> >> I've created a JIRA to add an archive.max.count property:
> >> >> https://issues.apache.org/jira/browse/NIFI-3373
> >> >>
> >> >> Thanks,
> >> >> Koji
> >> >>
> >> >> On Thu, Jan 19, 2017 at 2:21 PM, prabhu Mahendran
> >> >> <[email protected]> wrote:
> >> >> > Hi Koji,
> >> >> >
> >> >> > Thanks for your reply.
> >> >> >
> >> >> > Yes, solution B may meet my requirement. Currently, if the
> >> >> > storage limit is reached, the complete folder is deleted and the
> >> >> > new flow is not tracked in the archive folder. This behavior is
> >> >> > the drawback here. I need at least the last workflow to be saved
> >> >> > in the archive folder, and the user to be notified to increase
> >> >> > the size. At the same time, until NiFi restarts, at least the
> >> >> > last complete workflow should be backed up.
> >> >> >
> >> >> > My other suggestion is as follows:
> >> >> >
> >> >> > Regardless of the max.time and max.storage properties, can we
> >> >> > keep only a few files in the archive (say, 10 files)? Each action
> >> >> > on the NiFi canvas should be tracked here; if the flow.xml.gz
> >> >> > archive file count reaches the limit, the oldest file should be
> >> >> > deleted and the latest file saved, so that the count of 10 is
> >> >> > maintained. This way we can maintain the workflow properly, and
> >> >> > backup is also achieved without any confusion with max.time and
> >> >> > max.storage. The only remaining case is when the disk size is
> >> >> > exceeded; we should notify the user about that.
> >> >> >
> >> >> > Many thanks.
> >> >> >
> >> >> > On Thu, Jan 19, 2017 at 6:36 AM, Koji Kawamura
> >> >> > <[email protected]> wrote:
> >> >> >> Hi Prabhu,
> >> >> >>
> >> >> >> Thanks for sharing your experience with flow file archiving.
> >> >> >> The case where a single flow.xml.gz file exceeds
> >> >> >> archive.max.storage in size was not considered well when I
> >> >> >> implemented NIFI-2145.
> >> >> >>
> >> >> >> Looking at the code, it currently works as follows:
> >> >> >> 1. The original conf/flow.xml.gz (> 1MB) is archived to
> >> >> >> conf/archive
> >> >> >> 2. NiFi checks whether there are any expired archive files, and
> >> >> >> deletes them if so
> >> >> >> 3. NiFi checks the total size of all archived files, then
> >> >> >> deletes the oldest archive. It keeps doing this until the total
> >> >> >> size is less than or equal to the configured
> >> >> >> archive.max.storage.
> >> >> >>
> >> >> >> In your case, at step 3, the newly created archive is deleted,
> >> >> >> because its size was greater than archive.max.storage.
> >> >> >> In this case, NiFi only logs an INFO level message, and it's
> >> >> >> hard for the user to know what happened, as you reported.
> >> >> >>
> >> >> >> I'm going to create a JIRA for this, and fix the current
> >> >> >> behavior with one of the following solutions:
> >> >> >>
> >> >> >> A. Treat archive.max.storage as a HARD limit. If the original
> >> >> >> flow.xml.gz exceeds the configured archive.max.storage in size,
> >> >> >> then throw an IOException, which results in a WARN level log
> >> >> >> message "Unable to archive flow configuration as requested due
> >> >> >> to ...".
> >> >> >>
> >> >> >> B. Treat archive.max.storage as a SOFT limit, by not including
> >> >> >> the newly created archive file in steps 2 and 3 above, so that
> >> >> >> it can stay there. Maybe a WARN level log message should be
> >> >> >> logged as well.
> >> >> >>
> >> >> >> For a better user experience, I'd prefer solution B, so that
> >> >> >> the flow can be archived even when flow.xml.gz exceeds the
> >> >> >> archive storage size; since it could be written to disk at all,
> >> >> >> the physical disk had enough space.
> >> >> >>
> >> >> >> What do you think?
> >> >> >>
> >> >> >> Thanks!
> >> >> >> Koji
> >> >> >>
> >> >> >> On Wed, Jan 18, 2017 at 3:27 PM, prabhu Mahendran
> >> >> >> <[email protected]> wrote:
> >> >> >> > I have checked the below properties, used for the backup
> >> >> >> > operations in NiFi 1.0.0, with respect to this JIRA:
> >> >> >> >
> >> >> >> > https://issues.apache.org/jira/browse/NIFI-2145
> >> >> >> >
> >> >> >> > nifi.flow.configuration.archive.max.time=1 hours
> >> >> >> > nifi.flow.configuration.archive.max.storage=1 MB
> >> >> >> >
> >> >> >> > We have two backups: the first is "conf/flow.xml.gz" and the
> >> >> >> > second is "conf/archive/flow.xml.gz".
> >> >> >> >
> >> >> >> > I have saved archived workflows (conf/archive/flow.xml.gz)
> >> >> >> > hourly, per the "max.time" property.
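The three-step cleanup described in this thread explains why the archive folder can end up empty: step 3 does not exclude the archive that was just written, so when flow.xml.gz alone exceeds archive.max.storage, the fresh archive is deleted right after being created. A toy model of that pre-fix behavior, assumed from the description in the thread rather than taken from NiFi's actual code:

```python
def cleanup_current(archives, max_storage):
    """Toy model of the pre-NIFI-3373 cleanup (step 3 above).

    archives: list of (name, size_mb) tuples, oldest first, INCLUDING the
    newly written archive. Because the newest file is not excluded from
    the check, a flow.xml.gz larger than archive.max.storage is deleted
    immediately after being archived, leaving conf/archive empty.
    """
    kept = list(archives)
    while kept and sum(size for _, size in kept) > max_storage:
        kept.pop(0)  # delete oldest first; eventually the newest goes too
    return kept


# flow.xml.gz has grown past the 1 MB limit: the fresh archive is
# removed immediately and nothing remains.
after = cleanup_current([("old", 0.5), ("new", 1.5)], max_storage=1.0)
print(after)  # []
```

With solution B, the `("new", 1.5)` entry would simply be left out of the pruning loop, so it would survive even though it exceeds the limit on its own.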
> >> >> >> >
> >> >> >> > At a particular time, I reached "1 MB" [set as the default
> >> >> >> > storage size].
> >> >> >> >
> >> >> >> > At that point it deleted the existing conf/archive/flow.xml.gz
> >> >> >> > completely and didn't write new flow files to
> >> >> >> > conf/archive/flow.xml.gz, because the size was exceeded.
> >> >> >> >
> >> >> >> > No log showed that the new flow.xml.gz was larger than the
> >> >> >> > specified storage.
> >> >> >> >
> >> >> >> > Why did it delete the existing flows and not write the new
> >> >> >> > flows due to storage?
> >> >> >> >
> >> >> >> > In this case, has one of the backup operations failed or not?
> >> >> >> >
> >> >> >> > Thanks,
> >> >> >> >
> >> >> >> > prabhu
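For reference, the nifi.properties settings discussed in this thread might look like the sketch below. The max.time and max.storage property names appear in the thread itself; the enabled flag and the max.count property (the one proposed in NIFI-3373) are assumptions on my part, so check the Administration Guide for your NiFi version before relying on the exact names or availability.

```properties
# Archive the previous flow.xml.gz whenever the flow changes.
nifi.flow.configuration.archive.enabled=true

# Time- and size-based retention, as discussed in the thread.
nifi.flow.configuration.archive.max.time=30 days
nifi.flow.configuration.archive.max.storage=500 MB

# Count-based retention proposed in NIFI-3373 (CASE-2 above): leave
# time and storage blank and set only a count to keep the N most
# recent archives regardless of their size.
# nifi.flow.configuration.archive.max.time=
# nifi.flow.configuration.archive.max.storage=
# nifi.flow.configuration.archive.max.count=5
```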
