[ 
https://issues.apache.org/jira/browse/NIFI-3373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15869007#comment-15869007
 ] 

ASF GitHub Bot commented on NIFI-3373:
--------------------------------------

Github user ijokarumawak commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/1460#discussion_r101433687
  
    --- Diff: nifi-docs/src/main/asciidoc/administration-guide.adoc ---
    @@ -1934,9 +1934,10 @@ The first section of the _nifi.properties_ file is 
for the Core Properties. Thes
     |nifi.version|The version number of the current release. If upgrading but 
reusing this file, be sure to update this value.
     |nifi.flow.configuration.file*|The location of the flow configuration file 
(i.e., the file that contains what is currently displayed on the NiFi graph). 
The default value is ./conf/flow.xml.gz.
     |nifi.flow.configuration.archive.enabled*|Specifies whether NiFi creates a 
backup copy of the flow automatically when the flow is updated. The default 
value is _true_.
    -|nifi.flow.configuration.archive.dir*|The location of the archive 
directory where backup copies of the flow.xml are saved. The default value is 
./conf/archive. NiFi removes old archive files to limit disk usage based on 
file lifespan and total size, as specified with max.time and max.storage 
properties below. However, this cleanup mechanism takes into account only 
automatically created archived flow.xml files. That is, if there are other 
files or directories in this archive directory, NiFi will ignore them. 
Automatically created archives have filename with ISO 8601 format timestamp 
prefix followed by '_<original-filename>'. That is 
<year><month><day>T<hour><minute><second>+<timezone offset>_<original 
filename>. For example, `20160706T160719+0900_flow.xml.gz`. NiFi checks 
filenames when it cleans archive directory. If you would like to keep a 
particular archive in this directory without worrying about NiFi deleting it, 
you can do so by copying it with a different filename pattern.
    -|nifi.flow.configuration.archive.max.time*|The lifespan of archived 
flow.xml files. NiFi will delete expired archive files when it updates 
flow.xml. Expiration is determined based on current system time and the last 
modified timestamp of an archived flow.xml. The default value is 30 days.
    -|nifi.flow.configuration.archive.max.storage*|The total data size allowed 
for the archived flow.xml files. NiFi will delete the oldest archive files 
until the total archived file size becomes less than this configuration value. 
The default value is 500 MB.
    +|nifi.flow.configuration.archive.dir*|The location of the archive 
directory where backup copies of the flow.xml are saved. The default value is 
./conf/archive. NiFi removes old archive files to limit disk usage based on 
file lifespan total size, and number of files, as specified with max.time, 
max.storage and max.count properties described below. This cleanup mechanism 
takes into account only automatically created archived flow.xml files. If there 
are other files or directories in this archive directory, NiFi will ignore 
them. Automatically created archives have filename with ISO 8601 format 
timestamp prefix followed by '_<original-filename>'. That is 
<year><month><day>T<hour><minute><second>+<timezone offset>_<original 
filename>. For example, `20160706T160719+0900_flow.xml.gz`. NiFi checks 
filenames when it cleans archive directory. If you would like to keep a 
particular archive in this directory without worrying about NiFi deleting it, 
you can do so by copying it with a different filename pattern.
    --- End diff --
    
    Updated the admin doc to describe how the limit settings behave when no 
value is configured for them.
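
The filename convention described in the diff above can be sketched as a small check. This is only an illustration, not NiFi's actual implementation: the regular expression, the function name `is_auto_archive`, and the acceptance of a `-` timezone sign (the doc text shows only `+`) are all assumptions.

```python
import re

# Hypothetical pattern for <yyyyMMdd>T<HHmmss><+/-offset>_<original filename>.
# The "-" sign is an assumption; the doc text only shows "+".
ARCHIVE_NAME = re.compile(r"\d{8}T\d{6}[+-]\d{4}_(?P<original>.+)")


def is_auto_archive(filename, original="flow.xml.gz"):
    """Return True if filename looks like an automatically created archive."""
    match = ARCHIVE_NAME.fullmatch(filename)
    return match is not None and match.group("original") == original


print(is_auto_archive("20160706T160719+0900_flow.xml.gz"))
print(is_auto_archive("keep_20160706T160719+0900_flow.xml.gz"))
```

Under this sketch, a copy saved with any other filename pattern (such as the `keep_` prefix in the second call) does not match, so a cleanup that checks filenames this way would leave it alone.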


> Add nifi.flow.configuration.archive.max.count property
> ------------------------------------------------------
>
>                 Key: NIFI-3373
>                 URL: https://issues.apache.org/jira/browse/NIFI-3373
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Core Framework
>            Reporter: Koji Kawamura
>            Assignee: Koji Kawamura
>
> Currently we can limit the number of flow.xml.gz archive files by:
> * total archive size (nifi.flow.configuration.archive.max.storage)
> * archive file age (nifi.flow.configuration.archive.max.time)
> In addition to these conditions for managing old archives, there is demand for 
> simply limiting the number of archive files regardless of time or size constraints.
> https://lists.apache.org/thread.html/4d2d9cec46ee896318a5492bf020f60c28396e2850c077dad40d45d2@%3Cusers.nifi.apache.org%3E
> We can provide that by adding a new property, 
> 'nifi.flow.configuration.archive.max.count', so that, if specified, only the N 
> latest config files are kept in the archive.
> Make those properties optional, and process them in the following order:
> - If max.count is specified, any archive other than the latest (N-1) is 
> removed
> - If max.time is specified, any archive that is older than max.time is removed
> - If max.storage is specified, the oldest archives are deleted while the total size is 
> greater than the configured value
> - Create the new archive, keeping the latest archive regardless of the above limitations
> To illustrate how flow.xml archiving works, here are simulations with the 
> updated logic, where the size of flow.xml keeps increasing:
> h3. CASE-1
> archive.max.storage=10MB
> archive.max.count = 5
> ||Time || flow.xml || archives || archive total ||
> |t1 | f1 5MB  | f1 | 5MB|
> |t2 | f2 5MB  | f1, f2 | 10MB|
> |t3 | f3 5MB  | f2, f3 | 10MB|
> |t4 | f4 10MB | f4 | 10MB|
> |t5 | f5 15MB | f5 | 15MB|
> |t6 | f6 20MB | f6 | 20MB|
> |t7 | f7 25MB | f7 | 25MB|
> * t3: The oldest archive, f1, is removed because f1 + f2 + f3 > 10MB.
> * t5: Even though the flow.xml size exceeds max.storage, the latest archive is
> created. f4 is removed because f4 + f5 > 10MB. A WARN message is logged because 
> f5 is greater than 10MB.
> In this case, NiFi will keep logging a WARN message
> indicating that the archive storage size exceeds the limit, from t5 onward.
> After t5, NiFi will only keep the latest flow.xml.
> h3. CASE-2
> If at least 5 archives need to be kept no matter what, then leave
> max.storage and max.time blank.
> archive.max.storage=
> archive.max.time=
> archive.max.count = 5 // Only limit archives by count
> ||Time || flow.xml || archives || archive total ||
> |t1 | f1 5MB  | f1 | 5MB|
> |t2 | f2 5MB  | f1, f2 | 10MB|
> |t3 | f3 5MB  | f1, f2, f3 | 15MB|
> |t4 | f4 10MB | f1, f2, f3, f4 | 25MB|
> |t5 | f5 15MB | f1, f2, f3, f4, f5 | 40MB|
> |t6 | f6 20MB | f2, f3, f4, f5, f6 | 55MB|
> |t7 | f7 25MB | f3, f4, f5, f6, (f7) | 50MB, (75MB)|
> |t8 | f8 30MB | f3, f4, f5, f6 | 50MB|
> * From t6, the oldest archive is removed to keep the number of archives <= 5.
> * At t7, if the disk has only 60MB of space, f7 won't be archived. After
> this point, the archive mechanism stops working (it keeps trying to create a new
> archive but keeps getting an exception: no space left on device).
> In either case above, once flow.xml has grown to that size, some human
> intervention would be needed.
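
The processing order and the two simulations in the description above can be reproduced with a short sketch. It is a simplified model under stated assumptions, not NiFi's Java implementation: the names `archive_step` and `simulate` are hypothetical, sizes are plain MB integers, the max.time check is omitted because the simulations do not use it, and running out of disk space (CASE-2 at t7) is not modeled.

```python
def archive_step(archives, new, max_count=None, max_storage=None):
    """Apply the count and storage limits, then add the new archive.

    archives: list of (name, size_mb) tuples, oldest first.
    new: (name, size_mb) of the archive about to be created.
    """
    kept = list(archives)
    # max.count: keep only the latest (N-1) so the new archive makes N.
    if max_count is not None:
        keep_n = max_count - 1
        kept = kept[-keep_n:] if keep_n > 0 else []
    # max.storage: drop the oldest while existing + new exceed the limit.
    if max_storage is not None:
        while kept and sum(size for _, size in kept) + new[1] > max_storage:
            kept.pop(0)
    # The latest archive is always created, even if it alone exceeds the limit.
    kept.append(new)
    return kept


def simulate(sizes, **limits):
    """Return (archive names, total MB) after each flow.xml update."""
    archives, history = [], []
    for i, size in enumerate(sizes, start=1):
        archives = archive_step(archives, (f"f{i}", size), **limits)
        names = [name for name, _ in archives]
        history.append((names, sum(s for _, s in archives)))
    return history


sizes = [5, 5, 5, 10, 15, 20, 25]  # f1..f7 from the tables, in MB
for names, total in simulate(sizes, max_count=5, max_storage=10):
    print(names, total)  # CASE-1
for names, total in simulate(sizes, max_count=5):
    print(names, total)  # CASE-2, ignoring the disk-space failure at t7
```

With unlimited disk space, the CASE-2 run ends at t7 with f3 through f7 (75MB), which corresponds to the parenthesized values in the table.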



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
