Github user pvillard31 commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/980#discussion_r78652271
  
    --- Diff: 
nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/TailFile.java
 ---
    @@ -117,31 +173,78 @@
                 .allowableValues(LOCATION_LOCAL, LOCATION_REMOTE)
                 .defaultValue(LOCATION_LOCAL.getValue())
                 .build();
    +
         static final PropertyDescriptor START_POSITION = new 
PropertyDescriptor.Builder()
                 .name("Initial Start Position")
    -            .description("When the Processor first begins to tail data, 
this property specifies where the Processor should begin reading data. Once 
data has been ingested from the file, "
    +            .description("When the Processor first begins to tail data, 
this property specifies where the Processor should begin reading data. Once 
data has been ingested from a file, "
                         + "the Processor will continue from the last point 
from which it has received data.")
                 .allowableValues(START_BEGINNING_OF_TIME, START_CURRENT_FILE, 
START_CURRENT_TIME)
                 .defaultValue(START_CURRENT_FILE.getValue())
                 .required(true)
                 .build();
     
    +    static final PropertyDescriptor RECURSIVE = new 
PropertyDescriptor.Builder()
    +            .name("tailfile-recursive-lookup")
    +            .displayName("Recursive lookup")
    +            .description("When using Multiple files mode, this property 
defines if files must be listed recursively or not"
    +                    + " in the base directory.")
    +            .allowableValues("true", "false")
    +            .defaultValue("true")
    +            .required(true)
    +            .build();
    +
    +    static final PropertyDescriptor ROLLING_STRATEGY = new 
PropertyDescriptor.Builder()
    +            .name("tailfile-rolling-strategy")
    +            .displayName("Rolling Strategy")
    +            .description("Specifies if the files to tail have a fixed name 
or not.")
    +            .required(true)
    +            .allowableValues(FIXED_NAME, CHANGING_NAME)
    +            .defaultValue(FIXED_NAME.getValue())
    +            .build();
    +
    +    static final PropertyDescriptor LOOKUP_FREQUENCY = new 
PropertyDescriptor.Builder()
    +            .name("tailfile-lookup-frequency")
    +            .displayName("Lookup frequency")
    +            .description("Only used in Multiple files mode and Changing 
name rolling strategy, it specifies the minimum "
    +                    + "duration the processor will wait before listing 
again the files to tail.")
    +            .required(false)
    +            .addValidator(StandardValidators.TIME_PERIOD_VALIDATOR)
    +            .defaultValue("10 minutes")
    --- End diff --
    
    In my mind the lookup frequency would depend on the rolling configuration 
of the log files. Let's say there is a daily rollover, then the lookup 
frequency could be something like one hour (in fact the maximum duration the 
user is ready to wait before receiving log messages from a fresh new file). On 
the other side, the maximum age duration is the time we need to wait after the 
last received message in a log file before considering there won't be any more 
message in this file and that we can "forget" about its state (and if it 
happens that we do receive new messages, then we will duplicate data). With a 
daily rollover, it could be 24h.
    
    If we set the lookup frequency to infinity, it means that only the files 
that exist when the processor is started will be considered. I don't know if 
it's really what we want for the user to be the default, don't you think?
    
    On the other hand, I could agree that for the maximum age we set it to 
infinity by default to ensure there is no data duplication unless the user is 
overriding the property. But in any case, how would you set a default value to 
infinity in the property descriptor configuration? ``.defaultValue()`` is 
expecting a String, so I'm not sure to see how you would use Max Long in 
combination with the time period validator. Otherwise, I could use the standard 
'0' to represent this case. Let me know what you think.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---

Reply via email to