Hi Koji,

Thanks for your reply.

Yes, Solution B should meet my requirement. Currently, once the storage
limit is reached, the complete archive folder gets deleted and the new
flow is not tracked in the archive folder; that behavior is the drawback
here. I need at least the last workflow to be saved in the archive
folder, and the user should be notified to increase the size. At the
same time, until NiFi restarts, at least the last complete workflow
should remain backed up.
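
Just to make that concrete, here is a rough sketch of how I picture the
Solution B behavior (the class and method names below are my own
illustration, not NiFi's actual code):

// Rough sketch only: the "soft limit" idea of Solution B.
// Names here are illustrative, not NiFi's real implementation.
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class SoftLimitArchiveSketch {

    static void enforceMaxStorage(Path archiveDir, Path newestArchive,
                                  long maxStorageBytes) throws IOException {
        // Collect older archives only; the newest archive is never a
        // deletion candidate.
        List<Path> older;
        try (Stream<Path> files = Files.list(archiveDir)) {
            older = files.filter(Files::isRegularFile)
                         .filter(p -> !p.equals(newestArchive))
                         .sorted(Comparator.comparingLong(
                                 (Path p) -> p.toFile().lastModified()))
                         .collect(Collectors.toList());
        }

        long total = Files.size(newestArchive);
        for (Path p : older) {
            total += Files.size(p);
        }

        // Delete oldest archives until within the limit; the newest one
        // always survives, so at least the last complete workflow stays
        // backed up.
        for (Path oldest : older) {
            if (total <= maxStorageBytes) {
                break;
            }
            total -= Files.size(oldest);
            Files.deleteIfExists(oldest);
        }

        if (total > maxStorageBytes) {
            // Only the newest archive is left and it alone exceeds the
            // limit: this is where the user should be warned to increase
            // the configured size.
            System.err.println("WARN: latest flow archive exceeds archive.max.storage");
        }
    }
}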


I have another suggestion, as follows:


Regardless of the max.time and max.storage properties, can we keep only
a fixed number of files in the archive (say, 10 files)? Each action from
the NiFi canvas would be tracked there; once the count of flow.xml.gz
archive files reaches the limit, the oldest file would be deleted and
the latest one saved, so that a count of 10 is maintained. This way the
workflow history is maintained properly and a backup is still achieved,
without the confusion between max.time and max.storage. The only
remaining case is when the disk size is exceeded, and we should notify
the user about that.
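
A rough sketch of what I mean by count-based retention (again, my own
names and a made-up limit of 10, not actual NiFi code) could run after
each new archive is written:

// Rough sketch only: count-based archive retention (keep the newest N files).
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class CountBasedArchiveSketch {

    static void pruneToCount(Path archiveDir, int maxCount) throws IOException {
        // List archived flows, oldest first by last-modified time.
        List<Path> archives;
        try (Stream<Path> files = Files.list(archiveDir)) {
            archives = files.filter(p -> p.getFileName().toString()
                                          .endsWith("flow.xml.gz"))
                            .sorted(Comparator.comparingLong(
                                    (Path p) -> p.toFile().lastModified()))
                            .collect(Collectors.toList());
        }

        // Delete the oldest files until only maxCount (e.g. 10) remain,
        // so every canvas action keeps a bounded history of archives.
        int excess = archives.size() - maxCount;
        for (int i = 0; i < excess; i++) {
            Files.deleteIfExists(archives.get(i));
        }
    }
}

With a fixed count, disk usage stays roughly bounded by the count times
the flow size, and only the disk-full case still needs a warning to the
user, as mentioned above.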


Many thanks.

On Thu, Jan 19, 2017 at 6:36 AM, Koji Kawamura <[email protected]>
wrote:

> Hi Prabhu,
>
> Thanks for sharing your experience with flow file archiving.
> The case where a single flow.xml.gz file exceeds archive.max.storage
> in size was not considered well when I implemented NIFI-2145.
>
> Looking at the code, it currently works as follows:
> 1. The original conf/flow.xml.gz (> 1 MB) is archived to conf/archive.
> 2. NiFi checks whether there are any expired archive files, and deletes
> them if so.
> 3. NiFi checks the total size of all archived files, then deletes the
> oldest archive, repeating until the total size becomes less than or
> equal to the configured archive.max.storage.
>
> In your case, at step 3, the newly created archive is deleted because
> its size is greater than archive.max.storage.
> In this case, NiFi only logs an INFO level message, and it's hard for
> the user to know what happened, as you reported.
>
> I'm going to create a JIRA for this and fix the current behavior with
> one of the following solutions:
>
> A. Treat archive.max.storage as a HARD limit. If the original
> flow.xml.gz exceeds the configured archive.max.storage in size, then
> throw an IOException, which results in a WARN level log message "Unable
> to archive flow configuration as requested due to ...".
>
> B. Treat archive.max.storage as a SOFT limit, by not including the
> newly created archive file in steps 2 and 3 above, so that it can stay
> there. Maybe a WARN level log message should be logged.
>
> For a better user experience, I'd prefer solution B, so that the flow
> can be archived even if flow.xml.gz exceeds the archive storage size;
> since it could be written to disk, the physical disk had enough space.
>
> What do you think?
>
> Thanks!
> Koji
>
> On Wed, Jan 18, 2017 at 3:27 PM, prabhu Mahendran
> <[email protected]> wrote:
> > I have checked the below properties, used for the backup operations
> > in NiFi 1.0.0, with respect to this JIRA:
> >
> > https://issues.apache.org/jira/browse/NIFI-2145
> >
> > nifi.flow.configuration.archive.max.time=1 hours
> > nifi.flow.configuration.archive.max.storage=1 MB
> >
> > We have two backup locations: the first is "conf/flow.xml.gz" and the
> > second is "conf/archive/flow.xml.gz".
> >
> > Archived workflows (conf/archive/flow.xml.gz) are saved hourly, as
> > per the "max.time" property.
> >
> > At a particular point the archive reached "1 MB" [set as the default
> > storage size].
> >
> > So it deletes the existing conf/archive/flow.xml.gz completely and
> > doesn't write the new flow file to conf/archive/flow.xml.gz because
> > the size is exceeded.
> >
> > No log shows that the new flow.xml.gz is larger than the specified
> > storage.
> >
> > Why does it delete the existing flows and not write new flows because
> > of the storage limit?
> >
> > In this case, has one of the backup operations failed or not?
> >
> > Thanks,
> >
> > prabhu
>
