GitHub user turcsanyip opened a pull request:

    https://github.com/apache/flume/pull/240

    FLUME-3101 Add maxBatchCount config property to Taildir Source.

    If several files in the tailed path(s) need to be read and one of them is
    written to at a high rate, Taildir can read a full batch of batchSize
    events from that file on every attempt. This can lead to an endless loop
    in which Taildir reads only from the busy file and the other files are
    never processed.
    Another problem is that TaildirSource also becomes unresponsive to stop
    requests in this case.
    
    This commit handles the situation by introducing a new config property
    called maxBatchCount. It controls the number of batches read consecutively
    from the same file. After reading maxBatchCount batches from a file,
    Taildir switches to another file or takes a break in processing.
    
    This change is based on hunshenshi's patch.
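
The behaviour described above can be illustrated with a small, self-contained
simulation. This is only a sketch and not the code from the patch: the class
MaxBatchCountSketch, the method pollOnce and the in-memory queues standing in
for tailed files are all hypothetical; only the names batchSize and
maxBatchCount come from the description.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not the TaildirSource implementation.
public class MaxBatchCountSketch {

    /**
     * Makes one pass over all "files" (in-memory queues of pending events).
     * Each file is read in batches of at most batchSize events, and the loop
     * moves on once maxBatchCount consecutive full batches have been read.
     */
    static List<String> pollOnce(Map<String, Deque<String>> files,
                                 int batchSize, long maxBatchCount) {
        List<String> delivered = new ArrayList<>();
        for (Map.Entry<String, Deque<String>> file : files.entrySet()) {
            Deque<String> pending = file.getValue();
            long batchCount = 0;
            while (true) {
                int read = 0;
                while (read < batchSize && !pending.isEmpty()) {
                    delivered.add(file.getKey() + ":" + pending.poll());
                    read++;
                }
                if (read < batchSize) {
                    break; // partial batch: caught up with this file
                }
                if (++batchCount >= maxBatchCount) {
                    break; // limit reached: give the other files a turn
                }
            }
        }
        return delivered;
    }

    public static void main(String[] args) {
        Map<String, Deque<String>> files = new LinkedHashMap<>();
        files.put("busy.log",
            new ArrayDeque<>(Arrays.asList("b1", "b2", "b3", "b4", "b5", "b6")));
        files.put("quiet.log", new ArrayDeque<>(Arrays.asList("q1")));
        // batchSize = 2, maxBatchCount = 2: at most 4 events from busy.log
        System.out.println(pollOnce(files, 2, 2));
    }
}
```

With batchSize = 2 and maxBatchCount = 2, the pass reads four events from
busy.log, then q1 from quiet.log, and leaves b5 and b6 for the next pass.
Without the batch limit, a file that always yields a full batch would keep
the inner loop running and quiet.log would never be reached.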

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/turcsanyip/flume FLUME-3101

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flume/pull/240.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #240
    
----
commit 8ecb0ed1931d84e00962b996f89c6a5985b9d7c7
Author: turcsanyi <turcsanyi@...>
Date:   2018-11-21T15:06:04Z

    FLUME-3101 Add maxBatchCount config property to Taildir Source.
    
    If several files in the tailed path(s) need to be read and one of them is
    written to at a high rate, Taildir can read a full batch of batchSize
    events from that file on every attempt. This can lead to an endless loop
    in which Taildir reads only from the busy file and the other files are
    never processed.
    Another problem is that TaildirSource also becomes unresponsive to stop
    requests in this case.
    
    This commit handles the situation by introducing a new config property
    called maxBatchCount. It controls the number of batches read consecutively
    from the same file. After reading maxBatchCount batches from a file,
    Taildir switches to another file or takes a break in processing.
    
    This change is based on hunshenshi's patch.

----

