Hi Prabhu,

Thank you for the suggestion.

Keeping the latest N archives is a nice idea, and it's simple :)

The max.time and max.storage settings have other benefits, and since they
are already released, we should keep the existing behavior for them, too.
I've created a JIRA to add an archive.max.count property:
https://issues.apache.org/jira/browse/NIFI-3373
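
To illustrate, the new setting would sit next to the existing archive
properties in nifi.properties. The property name and value below are only a
sketch until the JIRA is resolved (the existing values are taken from your
earlier mail):

nifi.flow.configuration.archive.max.time=1 hours
nifi.flow.configuration.archive.max.storage=1 MB
# proposed in NIFI-3373; name and default are not final
nifi.flow.configuration.archive.max.count=10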

Thanks,
Koji

On Thu, Jan 19, 2017 at 2:21 PM, prabhu Mahendran
<[email protected]> wrote:
> Hi Koji,
>
>
> Thanks for your reply,
>
> Yes, Solution B may meet what I require. Currently, once the storage size
> limit is reached, the whole archive folder gets deleted and the new flow is
> not tracked in the archive folder. That behavior is the drawback here. I
> need at least the last workflow to be saved in the archive folder, and the
> user to be notified to increase the size. At the same time, until NiFi
> restarts, at least the last complete workflow should be backed up.
>
>
> My other suggestion is as follows:
>
>
> Regardless of the max.time and max.storage properties, can we keep only a
> few files in the archive (say, 10 files)? Each action on the NiFi canvas
> would be tracked here; once the count of archived flow.xml.gz files reaches
> the limit, the oldest file is deleted and the latest file saved, so that the
> count of 10 is maintained. This way the workflow is maintained properly and
> the backup is also achieved, without the confusion of max.time and
> max.storage. The only special case is when the disk size is exceeded; we
> should notify the user about that (see the sketch below).
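>
> A rough Java sketch of that count-based idea (the directory, file naming
> and the constant are my assumptions, not NiFi's actual code):
>
> import java.io.File;
> import java.util.Arrays;
> import java.util.Comparator;
>
> // Keep only the newest MAX_COUNT archives in conf/archive, deleting the
> // oldest ones first. Rough sketch only.
> public class ArchiveCountPruner {
>     private static final int MAX_COUNT = 10;
>
>     public static void prune(File archiveDir) {
>         // Assumes archive files end with "flow.xml.gz".
>         File[] archives = archiveDir.listFiles((dir, name) -> name.endsWith("flow.xml.gz"));
>         if (archives == null || archives.length <= MAX_COUNT) {
>             return;
>         }
>         // Oldest first, based on last-modified time.
>         Arrays.sort(archives, Comparator.comparingLong(File::lastModified));
>         for (int i = 0; i < archives.length - MAX_COUNT; i++) {
>             if (!archives[i].delete()) {
>                 System.err.println("Could not delete " + archives[i]);
>             }
>         }
>     }
> }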
>
>
> Many thanks.
>
>
> On Thu, Jan 19, 2017 at 6:36 AM, Koji Kawamura <[email protected]>
> wrote:
>>
>> Hi Prabhu,
>>
>> Thanks for sharing your experience with flow.xml.gz archiving.
>> The case where a single flow.xml.gz file exceeds archive.max.storage in
>> size was not well considered when I implemented NIFI-2145.
>>
>> Looking at the code, it currently works as follows (a rough sketch follows
>> the steps):
>> 1. The original conf/flow.xml.gz (> 1MB) is archived to conf/archive.
>> 2. NiFi checks whether there are any expired archive files and deletes
>> them if so.
>> 3. NiFi checks the total size of all archived files, then deletes the
>> oldest archive, repeating until the total size is less than or equal to
>> the configured archive.max.storage.
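>>
>> In rough Java (simplified, names do not match the actual NiFi code), those
>> two pruning steps look like this:
>>
>> import java.io.File;
>> import java.util.Comparator;
>> import java.util.List;
>>
>> // Simplified sketch of steps 2 and 3 above.
>> class ArchivePruneSketch {
>>     static void prune(List<File> archives, long maxAgeMillis, long maxStorageBytes) {
>>         long now = System.currentTimeMillis();
>>         // Step 2: delete expired archives.
>>         archives.removeIf(f -> {
>>             boolean expired = now - f.lastModified() > maxAgeMillis;
>>             if (expired) {
>>                 f.delete();
>>             }
>>             return expired;
>>         });
>>         // Step 3: delete the oldest archives until the total size fits
>>         // within archive.max.storage.
>>         archives.sort(Comparator.comparingLong(File::lastModified));
>>         long total = archives.stream().mapToLong(File::length).sum();
>>         while (total > maxStorageBytes && !archives.isEmpty()) {
>>             File oldest = archives.remove(0);
>>             total -= oldest.length();
>>             oldest.delete();
>>         }
>>     }
>> }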
>>
>> In your case, at step 3, the newly created archive is deleted, because its
>> size is greater than archive.max.storage.
>> In this case, NiFi only logs an INFO level message, and it's hard for the
>> user to know what happened, as you reported.
>>
>> I'm going to create a JIRA for this and fix the current behavior with one
>> of the following solutions:
>>
>> A. Treat archive.max.storage as a HARD limit. If the original flow.xml.gz
>> exceeds the configured archive.max.storage in size, then throw an
>> IOException, which results in a WARN level log message "Unable to
>> archive flow configuration as requested due to ...".
>>
>> B. Treat archive.max.storage as a SOFT limit, by not including the newly
>> created archive file in steps 2 and 3 above, so that it can stay there. A
>> WARN level log message should probably be logged as well.
>>
>> For a better user experience, I'd prefer solution B, so that the flow is
>> archived even when flow.xml.gz exceeds the archive storage size; since it
>> could be written to disk, the physical disk clearly had enough space. A
>> rough sketch of B follows.
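>>
>> To make B concrete, building on the sketch above (again, a rough idea, not
>> the actual patch):
>>
>> import java.io.File;
>> import java.util.Comparator;
>> import java.util.List;
>>
>> // Solution B, roughly: the just-created archive is never counted against
>> // max.storage and never deleted; we only warn if it exceeds the limit.
>> class SoftLimitSketch {
>>     static void pruneKeepingNewest(List<File> archives, long maxAgeMillis, long maxStorageBytes) {
>>         if (archives.isEmpty()) {
>>             return;
>>         }
>>         // Take the newest archive out of the pruning candidates entirely.
>>         archives.sort(Comparator.comparingLong(File::lastModified));
>>         File justCreated = archives.remove(archives.size() - 1);
>>         ArchivePruneSketch.prune(archives, maxAgeMillis, maxStorageBytes);
>>         if (justCreated.length() > maxStorageBytes) {
>>             System.err.println("WARN: " + justCreated + " exceeds archive.max.storage");
>>         }
>>     }
>> }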
>>
>> What do you think?
>>
>> Thanks!
>> Koji
>>
>> On Wed, Jan 18, 2017 at 3:27 PM, prabhu Mahendran
>> <[email protected]> wrote:
>> > I have checked the below properties, used for the backup operations in
>> > NiFi 1.0.0, with respect to this JIRA:
>> >
>> > https://issues.apache.org/jira/browse/NIFI-2145
>> >
>> > nifi.flow.configuration.archive.max.time=1 hours
>> > nifi.flow.configuration.archive.max.storage=1 MB
>> >
>> > We have two backups: the first is "conf/flow.xml.gz" and the second is
>> > "conf/archive/flow.xml.gz".
>> >
>> > I have been saving archived workflows (conf/archive/flow.xml.gz) hourly,
>> > as per the "max.time" property.
>> >
>> > At a particular time the archive reached "1 MB" [set as the storage size
>> > above].
>> >
>> > So it deleted the existing conf/archive/flow.xml.gz completely and did
>> > not write the new flow file into conf/archive, because the size was
>> > exceeded.
>> >
>> > No log shows that the new flow.xml.gz is larger than the specified
>> > storage size.
>> >
>> > Why does it delete the existing flows without writing the new flow when
>> > the storage limit is exceeded?
>> >
>> > In this case, has one of the backup operations failed or not?
>> >
>> > Thanks,
>> >
>> > prabhu
>
>
