Hi Koji,

Thanks for the information.

Actually, the task description looks fine. I have one question here:
suppose the storage limit is 500MB and my latest workflow exceeds this
limit. What behavior is performed with respect to the properties
(max.count, max.time and max.storage)? My assumption is that the latest
archive is saved even if it exceeds 500MB, so what happens from there?
Will it keep saving only the single latest archive with the large size,
or will it notify the user to increase the size and preserve the latest
file until we restart the flow? If so, what happens if the size keeps
increasing relative to the 500MB limit: will it save archives based on
count, or only the latest archive throughout the time NiFi is running?
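
To make my assumption concrete, here is a rough sketch of the pruning I
am picturing (purely my guess, not NiFi's actual code; all names are
hypothetical):

    import java.io.File;
    import java.util.Arrays;
    import java.util.Comparator;

    // My guess: a soft limit where the newest archive is always kept,
    // even when it alone exceeds max.storage; only older files are
    // deleted, oldest first.
    public class MyAssumption {
        static void prune(File archiveDir, long maxStorageBytes) {
            File[] archives =
                archiveDir.listFiles((d, n) -> n.endsWith(".xml.gz"));
            if (archives == null || archives.length < 2) {
                return; // only the latest archive exists; keep it
            }
            Arrays.sort(archives,
                Comparator.comparingLong(File::lastModified));
            long total =
                Arrays.stream(archives).mapToLong(File::length).sum();
            // Never touch the last (newest) element.
            for (int i = 0;
                 i < archives.length - 1 && total > maxStorageBytes;
                 i++) {
                total -= archives[i].length();
                archives[i].delete();
            }
        }
    }

Is something like this the intended behavior, or does the newest archive
also get deleted once it exceeds the limit?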

Many thanks

On Thu, Jan 19, 2017 at 12:47 PM, Koji Kawamura <[email protected]>
wrote:

> Hi Prabhu,
>
> Thank you for the suggestion.
>
> Keeping the latest N archives is nice; it's simple :)
>
> The max.time and max.storage properties have other benefits, and since
> they are already released, we should keep the existing behavior for
> those settings, too.
> I've created a JIRA to add an archive.max.count property:
> https://issues.apache.org/jira/browse/NIFI-3373
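>
> For example (the exact property name will be settled in the JIRA, so
> treat this line as illustrative only), nifi.properties could gain an
> entry along the lines of:
>
>     nifi.flow.configuration.archive.max.count=10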
>
> Thanks,
> Koji
>
> On Thu, Jan 19, 2017 at 2:21 PM, prabhu Mahendran
> <[email protected]> wrote:
> > Hi Koji,
> >
> >
> > Thanks for your reply,
> >
> > Yes, solution B may meet my requirement. Currently, once the storage
> > limit is reached, the complete folder gets deleted and the new flow is
> > no longer tracked in the archive folder; that behavior is the drawback
> > here. I need at least the last workflow to be saved in the archive
> > folder, and the user to be notified to increase the size. At the same
> > time, until NiFi restarts, at least the last complete workflow should
> > be backed up.
> >
> >
> > I have another suggestion, as follows:
> >
> >
> > Regardless of the max.time and max.storage properties, can we keep
> > only a few files in the archive (consider only 10 files)? Each action
> > from the NiFi canvas would be tracked here; once the flow.xml.gz
> > archive file count reaches the limit, the oldest file would be deleted
> > and the latest file saved, so that the count of 10 is maintained. This
> > way we can maintain the workflow history properly, and backup is also
> > achieved, without the confusion of max.time and max.storage. The only
> > remaining case is when the disk size is exceeded; we should notify the
> > user about this. A sketch of what I mean follows below.
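> >
> > To illustrate, a minimal sketch of the count-only rotation I have in
> > mind (names are hypothetical; this is not NiFi's actual code):
> >
> >     import java.io.File;
> >     import java.util.Arrays;
> >     import java.util.Comparator;
> >
> >     // Keep at most MAX_COUNT archives: delete oldest-first until the
> >     // count is back under the limit, ignoring max.time/max.storage.
> >     public class CountOnlyRotation {
> >         static final int MAX_COUNT = 10;
> >
> >         static void rotate(File archiveDir) {
> >             File[] archives =
> >                 archiveDir.listFiles((d, n) -> n.endsWith(".xml.gz"));
> >             if (archives == null || archives.length <= MAX_COUNT) {
> >                 return; // under the limit, nothing to delete
> >             }
> >             Arrays.sort(archives,
> >                 Comparator.comparingLong(File::lastModified));
> >             for (int i = 0; i < archives.length - MAX_COUNT; i++) {
> >                 archives[i].delete(); // oldest first
> >             }
> >         }
> >     }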
> >
> >
> > Many thanks.
> >
> >
> > On Thu, Jan 19, 2017 at 6:36 AM, Koji Kawamura <[email protected]>
> > wrote:
> >>
> >> Hi Prabhu,
> >>
> >> Thanks for sharing your experience with flow configuration
> >> archiving.
> >> The case where a single flow.xml.gz file's size exceeds
> >> archive.max.storage was not considered well when I implemented
> >> NIFI-2145.
> >>
> >> By looking at the code, it currently works as follows:
> >> 1. The original conf/flow.xml.gz (> 1MB) is archived to conf/archive
> >> 2. NiFi checks whether there are any expired archive files, and
> >> deletes them if so
> >> 3. NiFi checks the total size of all archived files, then deletes
> >> the oldest archive, and keeps doing this until the total size becomes
> >> less than or equal to the configured archive.max.storage
> >>
> >> In your case, at step 3, the newly created archive is deleted because
> >> its size was greater than archive.max.storage.
> >> In this case, NiFi only logs an INFO level message, and it's hard for
> >> the user to know what happened, as you reported. A sketch of this
> >> pruning loop follows below.
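> >>
> >> Roughly speaking (this is a simplified sketch, not the actual
> >> implementation; names are illustrative), step 3 behaves like:
> >>
> >>     import java.io.File;
> >>     import java.util.Arrays;
> >>     import java.util.Comparator;
> >>
> >>     // Simplified step 3: shrink the total archive size under
> >>     // max.storage by deleting oldest-first. Note the newly created
> >>     // archive is NOT excluded, so it too can be deleted right after
> >>     // being written.
> >>     public class CurrentPruning {
> >>         static void prune(File archiveDir, long maxStorageBytes) {
> >>             File[] archives =
> >>                 archiveDir.listFiles((d, n) -> n.endsWith(".xml.gz"));
> >>             if (archives == null) {
> >>                 return;
> >>             }
> >>             Arrays.sort(archives,
> >>                 Comparator.comparingLong(File::lastModified));
> >>             long total =
> >>                 Arrays.stream(archives).mapToLong(File::length).sum();
> >>             for (File f : archives) { // oldest first, newest included
> >>                 if (total <= maxStorageBytes) {
> >>                     break;
> >>                 }
> >>                 total -= f.length();
> >>                 f.delete();
> >>             }
> >>         }
> >>     }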
> >>
> >> I'm going to create a JIRA for this, and fix the current behavior
> >> with one of the following solutions:
> >>
> >> A. Treat archive.max.storage as a HARD limit. If the original
> >> flow.xml.gz exceeds the configured archive.max.storage in size, then
> >> throw an IOException, which results in a WARN level log message
> >> "Unable to archive flow configuration as requested due to ...".
> >>
> >> B. Treat archive.max.storage as a SOFT limit, by not including the
> >> newly created archive file at steps 2 and 3 above, so that it can
> >> stay there. Maybe a WARN level log message should be logged as well.
> >>
> >> For a better user experience, I'd prefer solution B, so that the
> >> flow can be archived even if flow.xml.gz exceeds the archive storage
> >> size; since it was able to be written to disk, the physical disk had
> >> enough space. The sketch below shows the small change I have in mind.
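> >>
> >> Relative to the simplified pruning sketch above (again illustrative
> >> only, not the actual code), solution B would be a drop-in variant of
> >> that prune method which stops the deletion loop before the newest
> >> file:
> >>
> >>     // Solution B: soft limit. Same oldest-first loop, but it stops
> >>     // before the last (newest) archive, so the newest file always
> >>     // survives, even when it alone is larger than max.storage.
> >>     static void pruneSoft(File[] oldestFirst, long total,
> >>                           long maxStorageBytes) {
> >>         for (int i = 0; i < oldestFirst.length - 1; i++) {
> >>             if (total <= maxStorageBytes) {
> >>                 break;
> >>             }
> >>             total -= oldestFirst[i].length();
> >>             oldestFirst[i].delete();
> >>         }
> >>     }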
> >>
> >> What do you think?
> >>
> >> Thanks!
> >> Koji
> >>
> >> On Wed, Jan 18, 2017 at 3:27 PM, prabhu Mahendran
> >> <[email protected]> wrote:
> >> > I have checked the below properties, used for the backup operations
> >> > in NiFi 1.0.0, with respect to this JIRA:
> >> >
> >> > https://issues.apache.org/jira/browse/NIFI-2145
> >> >
> >> > nifi.flow.configuration.archive.max.time=1 hours
> >> > nifi.flow.configuration.archive.max.storage=1 MB
> >> >
> >> > We have two backup operations: the first writes "conf/flow.xml.gz"
> >> > and the second writes "conf/archive/flow.xml.gz".
> >> >
> >> > I have saved archived workflows (conf/archive/flow.xml.gz) hourly,
> >> > as per the "max.time" property.
> >> >
> >> > At a particular time I reached "1 MB" [set as the default storage
> >> > size].
> >> >
> >> > So it deleted the existing conf/archive/flow.xml.gz completely and
> >> > did not write new flow files into conf/archive/flow.xml.gz, because
> >> > the size limit was exceeded.
> >> >
> >> > No log shows that the new flow.xml.gz is larger than the specified
> >> > storage.
> >> >
> >> > Why does it delete the existing flows and not write the new flows
> >> > when the storage limit is exceeded?
> >> >
> >> > In this case, has one of the backup operations failed or not?
> >> >
> >> > Thanks,
> >> >
> >> > prabhu
> >
> >
>
