[
https://issues.apache.org/jira/browse/FLUME-2309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13888865#comment-13888865
]
Muhammad Ehsan ul Haque commented on FLUME-2309:
------------------------------------------------
Perhaps we can use [Apache Commons IO
FileUtils.iterateFiles|http://commons.apache.org/proper/commons-io/javadocs/api-2.1/org/apache/commons/io/FileUtils.html#iterateFiles(java.io.File,
org.apache.commons.io.filefilter.IOFileFilter,
org.apache.commons.io.filefilter.IOFileFilter)].
If we want the files to be consumed in arbitrary order, then I believe an
iterator will be very cheap, since the directory listing no longer has to be
sorted by timestamp.
Okay, I will work on this.
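To make the idea concrete, here is a minimal sketch of picking the next file
without sorting. It is not the Flume implementation: the class name, the
nextFileAnyOrder helper and the completedSuffix parameter are made up for
illustration, and it uses the JDK's DirectoryStream instead of
FileUtils.iterateFiles only so that it needs no extra dependency.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Iterator;
import java.util.Optional;

public class ArbitraryOrderPicker {

    /**
     * Hypothetical helper: returns the first eligible file that the directory
     * stream yields. The order is whatever the file system returns, so no
     * full listing and no sort by timestamp is performed.
     */
    static Optional<Path> nextFileAnyOrder(Path spoolDir, String completedSuffix)
            throws IOException {
        // Skip directories, hidden files and files already marked as consumed.
        DirectoryStream.Filter<Path> candidates = p -> {
            String name = p.getFileName().toString();
            return Files.isRegularFile(p)
                    && !name.startsWith(".")
                    && !name.endsWith(completedSuffix);
        };
        try (DirectoryStream<Path> stream =
                     Files.newDirectoryStream(spoolDir, candidates)) {
            Iterator<Path> it = stream.iterator();
            return it.hasNext() ? Optional.of(it.next()) : Optional.empty();
        }
    }
}
```

A caller would invoke nextFileAnyOrder(spoolDir, ".COMPLETED") each time it
needs a new file. The cost per call then depends on how many entries are
skipped before the first match, not on the total number of files in the
directory.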
> Spooling directory should not always consume the oldest file first.
> -------------------------------------------------------------------
>
> Key: FLUME-2309
> URL: https://issues.apache.org/jira/browse/FLUME-2309
> Project: Flume
> Issue Type: New Feature
> Reporter: Muhammad Ehsan ul Haque
> Priority: Minor
>
> The ReliableSpoolingFileEventReader reads the oldest file in the spooling
> directory first. This is done by listing the directory contents and then
> sorting the file list by timestamp. This may be very slow if there are a
> lot of files (on the order of 100K or more) in the directory.
> However, this is not always needed. There are simple cases in which the
> order in which the files are consumed is not important.
> There should be an option to consume the files in arbitrary order, allowing
> the files to be consumed quickly without any delay.