[ 
https://issues.apache.org/jira/browse/FLUME-2866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15163270#comment-15163270
 ] 

ASF GitHub Bot commented on FLUME-2866:
---------------------------------------

GitHub user phlantin opened a pull request:

    https://github.com/apache/flume/pull/38

    Flume 2866

    This resolves FLUME-2866.
    
    This adds a property "fileTimeMinOffsetSeconds" to the Spooling Directory
    Source. Files are skipped unless their last modified time is at least this
    many seconds older or newer than the current time. The default value is
    "0" seconds, which preserves current behavior.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/phlantin/flume flume-2866

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flume/pull/38.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #38
    
----
commit 7fc19ffc50f8fb0706c025cbcb69f99c00f2fc5a
Author: Lantin <[email protected]>
Date:   2016-02-23T23:07:09Z

    FLUME-2866: Add fileTimeMinOffsetSeconds property to Spooling Directory Source.

commit 7859d50ec8f8f56861dba49cdb3ed0c2fbb02ee9
Author: Lantin <[email protected]>
Date:   2016-02-24T16:01:38Z

    Use single current time reference.

----


> Add fileTimeMinOffsetSeconds property to Spooling Directory Source
> ------------------------------------------------------------------
>
>                 Key: FLUME-2866
>                 URL: https://issues.apache.org/jira/browse/FLUME-2866
>             Project: Flume
>          Issue Type: New Feature
>          Components: Sinks+Sources
>            Reporter: Philippe Lantin
>            Priority: Minor
>
> When using a spooling directory source, it would be useful to have the 
> ability to specify that a file's last modified timestamp must differ from 
> the current time by at least a configurable number of seconds, either in 
> the future or the past.
> For example, if I copy a large file to the spooling directory and it takes 
> several minutes to copy, I do not want my file to start being processed 
> before the copy has completed. A practical way to do this is by looking at 
> the last modified timestamp: a file that is still being transferred keeps 
> having this timestamp updated.
> In many filesystems, it is possible for clients to set the time in the 
> future, though this is usually done after a file has been completely 
> transferred, for example by "cp -p" on Linux.
> I propose a new property for the Spooling Directory Source: 
> fileTimeMinOffsetSeconds. The default would be "0", preserving current 
> behavior.
> If fileTimeMinOffsetSeconds=60, files will only be picked up if the last 
> modified time is at least 60 seconds in the past or in the future.
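
The eligibility rule described above could be sketched roughly as follows. This is not the code from the pull request: the class name FileTimeOffsetSketch and the method isEligible are hypothetical, and treating a file exactly at the offset as eligible is an assumption (chosen so that an offset of 0 accepts every file, matching the stated default). The current time is passed in once so that all files in a scan are compared against the same instant, in the spirit of the "Use single current time reference" commit.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; not the actual Flume implementation.
class FileTimeOffsetSketch {

    /**
     * Returns true when the file's last modified time is at least
     * minOffsetSeconds away from nowMillis, in the past or in the future.
     * Assumption: a file exactly at the offset counts as eligible.
     */
    static boolean isEligible(long lastModifiedMillis, long nowMillis, long minOffsetSeconds) {
        long distanceMillis = Math.abs(nowMillis - lastModifiedMillis);
        return distanceMillis >= TimeUnit.SECONDS.toMillis(minOffsetSeconds);
    }

    public static void main(String[] args) {
        // Take the current time once and reuse it for every file in the scan.
        long now = System.currentTimeMillis();

        System.out.println(isEligible(now - 61_000L, now, 60));  // modified 61 s ago
        System.out.println(isEligible(now - 5_000L, now, 60));   // modified 5 s ago, likely still being copied
        System.out.println(isEligible(now + 120_000L, now, 60)); // timestamp 2 min in the future
        System.out.println(isEligible(now, now, 0));             // default offset of 0
    }
}
```

In a real source the first argument would come from something like java.io.File.lastModified() for each candidate file in the spooling directory.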



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
