[
https://issues.apache.org/jira/browse/FLUME-2309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13950403#comment-13950403
]
Muhammad Ehsan ul Haque commented on FLUME-2309:
------------------------------------------------
Your changes for selecting the YOUNGEST/OLDEST file are not deterministic. They
violate what is written in the documentation, and also the old behavior, which
used to select the OLDEST file.
The documented (and old) behavior was that, in case of a tie on timestamp, the
file with the lexicographically smallest name is picked. This tie-break is
missing from the patch that has been committed. It is also a reason the tests
are failing, as mentioned in FLUME-2350.
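
To make the expected tie-break concrete, here is a minimal sketch. It is not
the committed patch or the actual ReliableSpoolingFileEventReader code. The
Candidate class, the comparator names, and pick() are hypothetical stand-ins
for illustration. Applying the same smallest-name tie-break to YOUNGEST is
also an assumption here.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

public class TieBreakSketch {
    /** Hypothetical stand-in for a spooled file: a name and a modification time. */
    static final class Candidate {
        final String name;
        final long lastModified;

        Candidate(String name, long lastModified) {
            this.name = name;
            this.lastModified = lastModified;
        }
    }

    // Oldest timestamp first; ties fall back to the lexicographically smallest name.
    static final Comparator<Candidate> OLDEST =
        Comparator.<Candidate>comparingLong(c -> c.lastModified)
                  .thenComparing(c -> c.name);

    // Youngest timestamp first; ties still fall back to the smallest name (assumption).
    static final Comparator<Candidate> YOUNGEST =
        Comparator.<Candidate>comparingLong(c -> c.lastModified).reversed()
                  .thenComparing(c -> c.name);

    // Returns the candidate that sorts first under the given order, if any.
    // The result does not depend on the order in which the directory was listed.
    static Optional<Candidate> pick(List<Candidate> files, Comparator<Candidate> order) {
        return files.stream().min(order);
    }

    public static void main(String[] args) {
        List<Candidate> files = Arrays.asList(
            new Candidate("b.log", 1000L),
            new Candidate("a.log", 1000L),
            new Candidate("c.log", 2000L));
        System.out.println(pick(files, OLDEST).get().name);
        System.out.println(pick(files, YOUNGEST).get().name);
    }
}
```

With the sample list above, OLDEST picks a.log (it ties with b.log on timestamp
and has the smaller name) and YOUNGEST picks c.log, no matter how the list is
ordered.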
> Spooling directory should not always consume the oldest file first.
> -------------------------------------------------------------------
>
> Key: FLUME-2309
> URL: https://issues.apache.org/jira/browse/FLUME-2309
> Project: Flume
> Issue Type: New Feature
> Affects Versions: v1.4.0
> Reporter: Muhammad Ehsan ul Haque
> Assignee: Muhammad Ehsan ul Haque
> Priority: Minor
> Labels: feature, patch
> Fix For: v1.5.0
>
> Attachments: FLUME-2309-0.patch, FLUME-2309-0.patch,
> FLUME-2309-1.patch, FLUME-2309-commit.patch
>
>
> The ReliableSpoolingFileEventReader reads the oldest file in the spooling
> directory first. It does this by listing the directory contents and then
> sorting the file list by timestamp, which can be very slow when the directory
> holds a lot of files (on the order of 100K or more).
> However, this ordering is not always needed; in simple cases the order in
> which the files are consumed is not important.
> There should be an option to consume the files in arbitrary order, so that
> they can be consumed quickly without any delay.