Hi Koji,

Thanks for your support.

Many thanks.

On Fri, Jan 20, 2017 at 11:03 AM, Koji Kawamura <[email protected]>
wrote:

> Hi Prabhu,
>
> Thanks for the confirmation. I can't guarantee that it will be included
> in the next release, but I'll try my best :) You can watch the JIRA to
> get updates as it proceeds.
> https://issues.apache.org/jira/browse/NIFI-3373
>
> Thanks,
> Koji
>
> On Fri, Jan 20, 2017 at 2:16 PM, prabhu Mahendran
> <[email protected]> wrote:
> > Hi Koji,
> >
> > Both simulations look perfect. I expected this exact behavior, and it
> > matches my requirement; it also sounds logical. Can I expect these
> > changes in the next NiFi release version?
> >
> >
> > Thank you so much for this tremendous support.
> >
> >
> > On Fri, Jan 20, 2017 at 6:14 AM, Koji Kawamura <[email protected]>
> > wrote:
> >>
> >> Hi Prabhu,
> >>
> >> In that case, yes, as you assumed: even if the latest archive exceeds
> >> 500MB, the latest archive is saved, as long as it was written to disk
> >> successfully.
> >>
> >> After that, when the user updates the NiFi flow, the previous archive
> >> will be removed before the new archive is created, because max.storage
> >> is exceeded. Then the latest flow will be archived.
> >>
> >> Let's simulate the scenario with the logic to be updated by NIFI-3373,
> >> in which the size of flow.xml keeps increasing:
> >>
> >> # CASE-1
> >>
> >> archive.max.storage=10MB
> >> archive.max.count=5
> >>
> >> Time | flow.xml | archives | archive total |
> >> t1 | f1 5MB  | f1 | 5MB
> >> t2 | f2 5MB  | f1, f2 | 10MB
> >> t3 | f3 5MB  | f1, f2, f3 | 15MB
> >> t4 | f4 10MB | f2, f3, f4 | 20MB
> >> t5 | f5 15MB | f4, f5 | 25MB
> >> t6 | f6 20MB | f6 | 20MB
> >> t7 | f7 25MB | f7 | 25MB
> >>
> >> * t3: f3 is archived even though the total exceeds 10MB, because
> >> f1 + f2 <= 10MB. A WARN message starts to be logged from this point,
> >> because the total archive size > 10MB.
> >> * t4: The oldest f1 is removed, because f1 + f2 + f3 > 10MB.
> >> * t5: Even if the flow.xml size exceeds max.storage, the latest archive
> >> is created. f4 is kept because f4 <= 10MB.
> >> * t6: f4 and f5 are removed because f4 + f5 > 10MB, and also f5 > 10MB.
> >>
> >> In this case, NiFi will keep logging a WARN (or should it be ERROR??)
> >> message from t3, indicating that the archive storage size exceeds the
> >> limit. After t6, even though archive.max.count = 5, NiFi will only keep
> >> the latest flow.xml.
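> >>
> >> To make the intended logic concrete, here is a rough Java sketch
> >> (hypothetical code, not the actual NiFi implementation; the file glob,
> >> method name and WARN wording are all made up for illustration):
> >>
> >> import java.io.IOException;
> >> import java.nio.file.*;
> >> import java.util.*;
> >>
> >> class ArchiveRetentionSketch {
> >>     // Prune old archives before writing the new one; the new archive
> >>     // is never a removal candidate (the "soft limit" of NIFI-3373).
> >>     static void pruneBeforeArchiving(Path archiveDir, long newSize,
> >>             long maxStorage, int maxCount) throws IOException {
> >>         List<Path> old = new ArrayList<>();
> >>         try (DirectoryStream<Path> ds =
> >>                 Files.newDirectoryStream(archiveDir, "*flow.xml.gz")) {
> >>             ds.forEach(old::add);
> >>         }
> >>         // Oldest first, judged by last-modified time.
> >>         old.sort(Comparator.comparingLong(
> >>                 (Path p) -> p.toFile().lastModified()));
> >>         long oldTotal = 0;
> >>         for (Path p : old) {
> >>             oldTotal += Files.size(p);
> >>         }
> >>         // Remove the oldest archives while the existing ones exceed
> >>         // max.storage, or while keeping them would push the count past
> >>         // max.count (+1 accounts for the archive about to be written).
> >>         Iterator<Path> it = old.iterator();
> >>         int count = old.size();
> >>         while (it.hasNext()
> >>                 && (oldTotal > maxStorage || count + 1 > maxCount)) {
> >>             Path victim = it.next();
> >>             oldTotal -= Files.size(victim);
> >>             count--;
> >>             Files.delete(victim);
> >>         }
> >>         if (oldTotal + newSize > maxStorage) {
> >>             // e.g. t3 in CASE-1: still archived, but warn.
> >>             System.err.println(
> >>                 "WARN: archive storage exceeds max.storage");
> >>         }
> >>     }
> >> }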
> >>
> >> # CASE-2
> >>
> >> If you'd like to keep at least 5 archives no matter what, then leave
> >> max.storage and max.time blank.
> >>
> >> archive.max.storage=
> >> archive.max.time=
> >> archive.max.count=5 // Only limit archives by count
> >>
> >> Time | flow.xml | archives | archive total |
> >> t1 | f1 5MB  | f1 | 5MB
> >> t2 | f2 5MB  | f1, f2 | 10MB
> >> t3 | f3 5MB  | f1, f2, f3 | 15MB
> >> t4 | f4 10MB | f1, f2, f3, f4 | 25MB
> >> t5 | f5 15MB | f1, f2, f3, f4, f5 | 40MB
> >> t6 | f6 20MB | f2, f3, f4, f5, f6 | 55MB
> >> t7 | f7 25MB | f3, f4, f5, f6, (f7) | 50MB, (75MB)
> >> t8 | f8 30MB | f3, f4, f5, f6 | 50MB
> >>
> >> * From t6, the oldest archive is removed to keep the number of
> >> archives <= 5.
> >> * At t7, if the disk has only 60MB of space, f7 won't be archived. And
> >> after this point, the archive mechanism stops working (it keeps trying
> >> to create a new archive, but keeps getting an exception: no space left
> >> on device).
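> >>
> >> With the sketch above, CASE-2 simply maps to "no storage or time
> >> limit". Assuming blank values were interpreted as unlimited (an
> >> assumption, not confirmed behavior), the call would look like this
> >> (archiveDir and newSize are hypothetical variables):
> >>
> >> // Count-only retention from CASE-2: blank max.storage/max.time
> >> // treated here as "unlimited".
> >> pruneBeforeArchiving(archiveDir, newSize, Long.MAX_VALUE, 5);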
> >>
> >> In either case above, once flow.xml has grown to that size, some human
> >> intervention would be needed.
> >> Do those simulations look reasonable?
> >>
> >> Thanks,
> >> Koji
> >>
> >> On Thu, Jan 19, 2017 at 5:48 PM, prabhu Mahendran
> >> <[email protected]> wrote:
> >> > Hi Koji,
> >> >
> >> > Thanks for your information.
> >> >
> >> > Actually, the task description looks fine. I have one question here:
> >> > consider that the storage limit is 500MB, and suppose my latest
> >> > workflow exceeds this limit. Which behavior is performed with respect
> >> > to the properties (max.count, max.time and max.storage)? My assumption
> >> > is that the latest archive is saved even if it exceeds 500MB, so what
> >> > happens from there? Will it keep saving only the single latest archive
> >> > with the large size, or will it notify the user to increase the size
> >> > and preserve the latest file until we restart the flow? And if the
> >> > size keeps increasing beyond 500MB, will it save archives based on
> >> > count, or only the latest archive for as long as NiFi is running?
> >> >
> >> > Many thanks
> >> >
> >> > On Thu, Jan 19, 2017 at 12:47 PM, Koji Kawamura <[email protected]>
> >> > wrote:
> >> >>
> >> >> Hi Prabhu,
> >> >>
> >> >> Thank you for the suggestion.
> >> >>
> >> >> Keeping the latest N archives is nice; it's simple :)
> >> >>
> >> >> The max.time and max.storage settings have other benefits, and since
> >> >> they are already released, we should keep the existing behavior for
> >> >> them, too. I've created a JIRA to add an archive.max.count property:
> >> >> https://issues.apache.org/jira/browse/NIFI-3373
> >> >>
> >> >> Thanks,
> >> >> Koji
> >> >>
> >> >> On Thu, Jan 19, 2017 at 2:21 PM, prabhu Mahendran
> >> >> <[email protected]> wrote:
> >> >> > Hi Koji,
> >> >> >
> >> >> >
> >> >> > Thanks for your reply,
> >> >> >
> >> >> > Yes, solution B may meet my requirement. Currently, once the
> >> >> > storage limit is reached, the complete folder gets deleted and the
> >> >> > new flow is not tracked in the archive folder. This behavior is the
> >> >> > drawback here. I need at least the last workflow to be saved in the
> >> >> > archive folder, and the user notified to increase the size. At the
> >> >> > same time, until NiFi restarts, at least the last complete workflow
> >> >> > should be backed up.
> >> >> >
> >> >> >
> >> >> > My other suggestion is as follows:
> >> >> >
> >> >> >
> >> >> > Regardless of the max.time and max.storage properties, can we keep
> >> >> > only a few files in the archive (say, only 10 files)? Each action
> >> >> > from the NiFi canvas would be tracked here; when the flow.xml.gz
> >> >> > archive file count is reached, the oldest file should be deleted
> >> >> > and the latest file saved, so that the count of 10 is maintained
> >> >> > (see the sketch below). This way we can maintain the workflow
> >> >> > properly, and backup is also achieved, without the confusion of
> >> >> > max.time and max.storage. The only remaining case is when the disk
> >> >> > size is exceeded, and we should notify the user about that.
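> >> >> >
> >> >> > Something like this, roughly (just a hypothetical sketch to show
> >> >> > the idea, not real NiFi code):
> >> >> >
> >> >> > import java.io.IOException;
> >> >> > import java.nio.file.*;
> >> >> > import java.util.List;
> >> >> >
> >> >> > class KeepLastNSketch {
> >> >> >     // Keep only the newest maxCount archives, dropping the oldest.
> >> >> >     static void rotate(List<Path> archivesOldestFirst, int maxCount)
> >> >> >             throws IOException {
> >> >> >         while (archivesOldestFirst.size() > maxCount) {
> >> >> >             // Delete the oldest until only maxCount remain.
> >> >> >             Files.delete(archivesOldestFirst.remove(0));
> >> >> >         }
> >> >> >     }
> >> >> > }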
> >> >> >
> >> >> >
> >> >> > Many thanks.
> >> >> >
> >> >> >
> >> >> > On Thu, Jan 19, 2017 at 6:36 AM, Koji Kawamura
> >> >> > <[email protected]>
> >> >> > wrote:
> >> >> >>
> >> >> >> Hi Prabhu,
> >> >> >>
> >> >> >> Thanks for sharing your experience with flow configuration
> >> >> >> archiving. The case where a single flow.xml.gz file exceeds
> >> >> >> archive.max.storage in size was not considered well when I
> >> >> >> implemented NIFI-2145.
> >> >> >>
> >> >> >> Looking at the code, it currently works as follows:
> >> >> >> 1. The original conf/flow.xml.gz (> 1MB) is archived to conf/archive
> >> >> >> 2. NiFi checks whether there are any expired archive files, and
> >> >> >> deletes them if any
> >> >> >> 3. NiFi checks the total size of all archived files, then deletes
> >> >> >> the oldest archive, repeating this until the total size becomes
> >> >> >> less than or equal to the configured archive.max.storage
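> >> >> >>
> >> >> >> As a simplified sketch of that step-3 logic (hypothetical Java,
> >> >> >> not the actual NiFi source; names are made up):
> >> >> >>
> >> >> >> import java.io.IOException;
> >> >> >> import java.nio.file.*;
> >> >> >> import java.util.List;
> >> >> >>
> >> >> >> class CurrentPruneSketch {
> >> >> >>     // The newly written archive is part of the candidate list, so
> >> >> >>     // if it alone exceeds max.storage, it gets deleted as well.
> >> >> >>     static void pruneBySize(List<Path> archivesOldestFirst,
> >> >> >>             long maxStorage) throws IOException {
> >> >> >>         long total = 0;
> >> >> >>         for (Path p : archivesOldestFirst) {
> >> >> >>             total += Files.size(p);
> >> >> >>         }
> >> >> >>         for (Path p : archivesOldestFirst) {
> >> >> >>             if (total <= maxStorage) {
> >> >> >>                 break;
> >> >> >>             }
> >> >> >>             total -= Files.size(p);
> >> >> >>             Files.delete(p); // may delete the archive just created
> >> >> >>         }
> >> >> >>     }
> >> >> >> }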
> >> >> >>
> >> >> >> In your case, at step 3, the newly created archive is deleted,
> >> >> >> because its size was greater than archive.max.storage.
> >> >> >> In this case, NiFi only logs an INFO level message, and it's hard
> >> >> >> for the user to know what happened, as you reported.
> >> >> >>
> >> >> >> I'm going to create a JIRA for this, and fix the current behavior
> >> >> >> with either one of the following solutions:
> >> >> >>
> >> >> >> A. Treat archive.max.storage as a HARD limit. If the original
> >> >> >> flow.xml.gz exceeds the configured archive.max.storage in size,
> >> >> >> then throw an IOException, which results in a WARN level log
> >> >> >> message "Unable to archive flow configuration as requested due
> >> >> >> to ...".
> >> >> >>
> >> >> >> B. Treat archive.max.storage as a SOFT limit, by not including the
> >> >> >> newly created archive file at steps 2 and 3 above, so that it can
> >> >> >> stay there. Maybe a WARN level log message should be logged.
> >> >> >>
> >> >> >> For a better user experience, I'd prefer solution B, so that the
> >> >> >> flow is archived even if flow.xml.gz exceeds the archive storage
> >> >> >> size; since it could be written to disk, the physical disk had
> >> >> >> enough space.
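> >> >> >>
> >> >> >> Roughly, solution B would just exempt the newly created archive
> >> >> >> from the pruning loop. A minimal sketch, assuming the same
> >> >> >> simplified logic as above (hypothetical, not a real patch):
> >> >> >>
> >> >> >> import java.io.IOException;
> >> >> >> import java.nio.file.*;
> >> >> >> import java.util.List;
> >> >> >>
> >> >> >> class SoftLimitPruneSketch {
> >> >> >>     // Same pruning as today, except the just-written archive is
> >> >> >>     // exempt, so it survives even when it alone exceeds the limit.
> >> >> >>     static void pruneBySize(List<Path> archivesOldestFirst,
> >> >> >>             Path justArchived, long maxStorage) throws IOException {
> >> >> >>         long total = 0;
> >> >> >>         for (Path p : archivesOldestFirst) {
> >> >> >>             total += Files.size(p);
> >> >> >>         }
> >> >> >>         for (Path p : archivesOldestFirst) {
> >> >> >>             if (total <= maxStorage) {
> >> >> >>                 break;
> >> >> >>             }
> >> >> >>             if (p.equals(justArchived)) {
> >> >> >>                 continue; // soft limit: never delete the newest
> >> >> >>             }
> >> >> >>             total -= Files.size(p);
> >> >> >>             Files.delete(p);
> >> >> >>         }
> >> >> >>         if (total > maxStorage) {
> >> >> >>             System.err.println(
> >> >> >>                 "WARN: archive storage exceeds max.storage");
> >> >> >>         }
> >> >> >>     }
> >> >> >> }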
> >> >> >>
> >> >> >> What do you think?
> >> >> >>
> >> >> >> Thanks!
> >> >> >> Koji
> >> >> >>
> >> >> >> On Wed, Jan 18, 2017 at 3:27 PM, prabhu Mahendran
> >> >> >> <[email protected]> wrote:
> >> >> >> > I have checked the below properties, used for the backup
> >> >> >> > operations in NiFi 1.0.0, with respect to this JIRA:
> >> >> >> >
> >> >> >> > https://issues.apache.org/jira/browse/NIFI-2145
> >> >> >> >
> >> >> >> > nifi.flow.configuration.archive.max.time=1 hours
> >> >> >> > nifi.flow.configuration.archive.max.storage=1 MB
> >> >> >> >
> >> >> >> > We have two backups: the first one is "conf/flow.xml.gz" and
> >> >> >> > the second is "conf/archive/flow.xml.gz".
> >> >> >> >
> >> >> >> > I have saved archived workflows (conf/archive/flow.xml.gz) per
> >> >> >> > the hours set in the "max.time" property.
> >> >> >> >
> >> >> >> > At a particular time I reached "1 MB" [set as the default
> >> >> >> > storage size].
> >> >> >> >
> >> >> >> > So it deleted the existing conf/archive/flow.xml.gz completely,
> >> >> >> > and doesn't write new flow files to conf/archive/flow.xml.gz,
> >> >> >> > because the size is exceeded.
> >> >> >> >
> >> >> >> > No logs show that the new flow.xml.gz is larger than the
> >> >> >> > specified storage.
> >> >> >> >
> >> >> >> > Why does it delete the existing flows, and not write new flows,
> >> >> >> > due to storage?
> >> >> >> >
> >> >> >> > In this case, has one of the backup operations failed or not?
> >> >> >> >
> >> >> >> > Thanks,
> >> >> >> >
> >> >> >> > prabhu
> >> >> >
> >> >> >
> >> >
> >> >
> >
> >
>
