[ https://issues.apache.org/jira/browse/FLUME-3101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16696078#comment-16696078 ]
ASF subversion and git services commented on FLUME-3101:
--------------------------------------------------------
Commit b252267ed297b849a8c3d900f7263e4abe5101c9 in flume's branch
refs/heads/trunk from [~turcsanyip]
[ https://git-wip-us.apache.org/repos/asf?p=flume.git;h=b252267 ]
FLUME-3101 Add maxBatchCount config property to Taildir Source.
If there are multiple files in the path(s) that need to be tailed and one of
them is written at high frequency, Taildir can read a full batch of batchSize
events from that file every time. This can lead to an endless loop in which
Taildir reads data only from the busy file, while the other files are never
processed.
Another problem is that in this case TaildirSource also becomes unresponsive
to stop requests.
This commit handles this situation by introducing a new config property called
maxBatchCount. It controls the number of batches read consecutively from the
same file. After reading maxBatchCount batches from a file, Taildir switches
to another file or takes a break in processing.
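
The effect of maxBatchCount can be illustrated with a minimal, self-contained
sketch. This is not the code from the commit: the class MaxBatchCountSketch,
the readBatch helper and the in-memory queues standing in for tailed files are
all hypothetical, and only the names batchSize, maxBatchCount and
tailFileProcess come from the text above.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

public class MaxBatchCountSketch {

    // Hypothetical stand-in for reading from a tailed file:
    // takes at most batchSize pending events from the queue.
    static List<String> readBatch(Queue<String> file, int batchSize) {
        List<String> events = new ArrayList<>();
        while (events.size() < batchSize && !file.isEmpty()) {
            events.add(file.poll());
        }
        return events;
    }

    // Reads batches from one file and returns how many batches were read.
    // The loop ends when the file is caught up (short or empty batch) or
    // when maxBatchCount consecutive batches have been read from it.
    static int tailFileProcess(Queue<String> file, int batchSize,
                               long maxBatchCount, List<String> sink) {
        int batchCount = 0;
        while (true) {
            List<String> events = readBatch(file, batchSize);
            if (events.isEmpty()) {
                break; // nothing left to read
            }
            sink.addAll(events);
            batchCount++;
            if (events.size() < batchSize) {
                break; // caught up with the file
            }
            if (batchCount >= maxBatchCount) {
                break; // yield so that other files get a turn
            }
        }
        return batchCount;
    }

    public static void main(String[] args) {
        Queue<String> busy = new ArrayDeque<>();
        for (int i = 0; i < 10; i++) {
            busy.add("a" + i);
        }
        Queue<String> quiet = new ArrayDeque<>();
        quiet.add("b0");

        List<String> sink = new ArrayList<>();
        tailFileProcess(busy, 2, 2, sink);  // stops after 2 full batches
        tailFileProcess(quiet, 2, 2, sink); // the quiet file is now reached
        System.out.println(sink);           // prints [a0, a1, a2, a3, b0]
    }
}
```

Without the batchCount check, a file that always has at least batchSize
pending events would keep the loop running and the quiet file would never be
reached.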
This change is based on hunshenshi's patch.
This closes #240
Reviewers: Ferenc Szabo, Endre Major
(Peter Turcsanyi via Ferenc Szabo)
> taildir source may endless loop when tail a file
> ------------------------------------------------
>
> Key: FLUME-3101
> URL: https://issues.apache.org/jira/browse/FLUME-3101
> Project: Flume
> Issue Type: Bug
> Components: Sinks+Sources
> Affects Versions: 1.7.0
> Reporter: hunshenshi
> Assignee: Peter Turcsanyi
> Priority: Major
> Labels: patch, taildirsource
> Fix For: 1.9.0
>
> Attachments: FLUME-3101-0.patch, FLUME-3101-1.patch,
> FLUME-3101-2.patch, FLUME-3101-3.patch
>
>
> If there are many files in the path that need to be tailed, and one of them is
> written at *high frequency* (for example, the path contains file a, file b and
> file c, and file a is written at high frequency), *taildir can read batchSize
> events from file a every time*. Taildir will then read data only from file a
> and the other files will never be read, because
> TaildirSource.tailFileProcess goes into an endless loop.
> Code:
> {code:title=TaildirSource.java|borderStyle=solid}
> private void tailFileProcess(TailFile tf, boolean backoffWithoutNL)
>     throws IOException, InterruptedException {
>   while (true) {
>     // The loop exits only when a read returns fewer than batchSize events.
>     // If events.size() >= batchSize every time, it never breaks and keeps
>     // reading only from tf.
>     if (events.size() < batchSize) {
>       break;
>     }
>   }
> }
> {code}