[ https://issues.apache.org/jira/browse/FLUME-2309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13948604#comment-13948604 ]
Hari Shreedharan commented on FLUME-2309:
-----------------------------------------
FYI - This does not really solve the listing problem. The iterateFiles method
still seems to list all of the files anyway (I looked at the commons source
code), so I'd recommend using the older code for listing.

Otherwise, this looks good. I'd recommend only one change: the default order
must not change. It must stay random, since the additional overhead of
comparison is something that most people don't expect. Other than that, this
patch looks good to go. If you change the default back to the old one, I will
commit this.
> Spooling directory should not always consume the oldest file first.
> -------------------------------------------------------------------
>
> Key: FLUME-2309
> URL: https://issues.apache.org/jira/browse/FLUME-2309
> Project: Flume
> Issue Type: New Feature
> Affects Versions: v1.4.0
> Reporter: Muhammad Ehsan ul Haque
> Priority: Minor
> Labels: feature, patch
> Fix For: v1.4.0
>
> Attachments: FLUME-2309-0.patch, FLUME-2309-0.patch
>
>
> The ReliableSpoolingFileEventReader reads the oldest file in the spooling
> directory first. This is done by listing the directory contents and then
> sorting the file list by timestamp. This may be very slow if there are a
> lot of files (on the order of 100K or more) in the directory.
> However, this ordering is not always needed; in simple cases the order in
> which the files are consumed is not important.
> There should be an option to consume the files in arbitrary order, so that
> they can be consumed quickly, without the delay caused by sorting.
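
To make the trade-off in the description concrete, here is a minimal sketch in
Java. It is not taken from the attached patch: the class ConsumeOrderSketch,
the enum ConsumeOrder and the method nextFile are hypothetical names, and it
uses only java.io.File rather than Flume's own classes. It also picks the
oldest file with a linear scan instead of the full sort the description
mentions, so it only approximates the cost being discussed.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Optional;

public class ConsumeOrderSketch {

    /** Hypothetical option; the names are illustrative, not from the patch. */
    enum ConsumeOrder { OLDEST, ARBITRARY }

    /** Returns the next file to consume from spoolDir, or empty if there is none. */
    static Optional<File> nextFile(File spoolDir, ConsumeOrder order) {
        // Both modes still list the whole directory; only the selection differs.
        File[] candidates = spoolDir.listFiles(File::isFile);
        if (candidates == null || candidates.length == 0) {
            return Optional.empty();
        }
        if (order == ConsumeOrder.ARBITRARY) {
            // No timestamp comparisons: take whatever the listing returned first.
            return Optional.of(candidates[0]);
        }
        // OLDEST: every candidate's modification time has to be read and compared.
        return Arrays.stream(candidates)
                .min(Comparator.comparingLong(File::lastModified));
    }

    public static void main(String[] args) throws IOException {
        File dir = Files.createTempDirectory("spool").toFile();
        File a = new File(dir, "a.log");
        File b = new File(dir, "b.log");
        a.createNewFile();
        b.createNewFile();
        a.setLastModified(2_000_000L); // newer
        b.setLastModified(1_000_000L); // older
        System.out.println(nextFile(dir, ConsumeOrder.OLDEST).map(File::getName).orElse("none"));
        System.out.println(nextFile(dir, ConsumeOrder.ARBITRARY).map(File::getName).orElse("none"));
    }
}
```

The sketch leaves the choice of order to the caller; which value should be the
default is exactly the point under review in the comment above.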