My implementation synchronizes on the writer map, and the append and close operations on the BucketWriter are themselves synchronized. It is possible, though rare, for a writer to be closed just before an append, but that is harmless: the caller simply backs off and gets a fresh writer on the next cycle. Also, if possible, please add comments to the JIRA thread when the mail is generated from there :)
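For illustration, here is a minimal sketch of that scheme — not the actual Flume code; the class names (IdleBucketWriter, SketchSink) and methods are hypothetical. The writer map is the lock for lookup and removal, append/close are synchronized per writer, and an append that loses the race to a close backs off and retries with a fresh writer:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for Flume's BucketWriter.
class IdleBucketWriter {
    private boolean closed = false;

    // Returns false if this writer was already closed; the caller
    // retries with a fresh writer on the next cycle.
    synchronized boolean append(String event) {
        if (closed) {
            return false;
        }
        // ... write event to the open HDFS file ...
        return true;
    }

    synchronized void close() {
        closed = true;
        // ... flush and close the underlying HDFS stream ...
    }
}

// Hypothetical stand-in for the sink holding the writer map.
class SketchSink {
    private final Map<String, IdleBucketWriter> writers = new HashMap<>();

    void process(String bucket, String event) {
        while (true) {
            IdleBucketWriter w;
            synchronized (writers) { // map is the lock for lookup
                w = writers.computeIfAbsent(bucket, k -> new IdleBucketWriter());
            }
            if (w.append(event)) {
                return; // appended successfully
            }
            // The writer was closed between lookup and append: drop
            // this stale entry and back off to a fresh writer.
            synchronized (writers) {
                writers.remove(bucket, w);
            }
        }
    }

    // Called by a reaper/watcher thread for idle buckets.
    void reap(String bucket) {
        IdleBucketWriter w;
        synchronized (writers) {
            w = writers.remove(bucket);
        }
        if (w != null) {
            w.close();
        }
    }
}
```

The point of the retry loop is that the close-before-append race is tolerated rather than prevented, which keeps the locking simple.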

On 10/19/2012 05:13 AM, Roshan Naik wrote:
We will need to handle race conditions like: a thread resumes writing
immediately after the watcher thread decides to close the file handle. In
that sense a deterministic close is nicer than a timeout-based 'garbage
collection'.
-roshan


On Thu, Oct 18, 2012 at 12:04 PM, Mike Percy (JIRA) <[email protected]> wrote:

     [
https://issues.apache.org/jira/browse/FLUME-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13479255#comment-13479255]

Mike Percy commented on FLUME-1350:
-----------------------------------

Hi Juhani, something like a close-on-idle timeout makes sense. I'd be
happy to review it if you want to work on it.
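A close-on-idle timeout could be sketched roughly as below. This is only an illustration of the idea, not Flume's API; the class name and the idleTimeout parameter are invented for the example. Each append reschedules a deferred close, so the file closes only after a quiet period, and a late appender sees the closed flag and must fetch a fresh writer:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical writer that closes itself after an idle timeout.
class IdleAwareWriter {
    private final ScheduledExecutorService timer =
        Executors.newSingleThreadScheduledExecutor();
    private final long idleTimeoutMs;
    private ScheduledFuture<?> pendingClose;
    private boolean closed = false;

    IdleAwareWriter(long idleTimeoutMs) {
        this.idleTimeoutMs = idleTimeoutMs;
    }

    synchronized boolean append(String event) {
        if (closed) {
            return false; // caller must obtain a fresh writer
        }
        if (pendingClose != null) {
            pendingClose.cancel(false); // activity: push the deadline back
        }
        pendingClose = timer.schedule(
            this::close, idleTimeoutMs, TimeUnit.MILLISECONDS);
        // ... write event to the HDFS file ...
        return true;
    }

    synchronized void close() {
        if (closed) {
            return;
        }
        closed = true;
        timer.shutdownNow();
        // ... flush and close the HDFS file ...
    }
}
```

Because close() and append() share the writer's monitor, the race Roshan mentions resolves to one of two clean outcomes: the append lands before the close, or it fails and is retried on a fresh writer.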

HDFS file handle not closed properly when date bucketing
---------------------------------------------------------

                 Key: FLUME-1350
                 URL: https://issues.apache.org/jira/browse/FLUME-1350
             Project: Flume
          Issue Type: Bug
          Components: Sinks+Sources
    Affects Versions: v1.1.0, v1.2.0
            Reporter: Robert Mroczkowski
         Attachments: HDFSEventSink.java.patch


With configuration:
agent.sinks.hdfs-cafe-access.type = hdfs
agent.sinks.hdfs-cafe-access.hdfs.path = hdfs://nga/nga/apache/access/%y-%m-%d/
agent.sinks.hdfs-cafe-access.hdfs.fileType = DataStream
agent.sinks.hdfs-cafe-access.hdfs.filePrefix = cafe_access
agent.sinks.hdfs-cafe-access.hdfs.rollInterval = 21600
agent.sinks.hdfs-cafe-access.hdfs.rollSize = 10485760
agent.sinks.hdfs-cafe-access.hdfs.rollCount = 0
agent.sinks.hdfs-cafe-access.hdfs.txnEventMax = 1000
agent.sinks.hdfs-cafe-access.hdfs.batchSize = 1000
#agent.sinks.hdfs-cafe-access.hdfs.codeC = snappy
agent.sinks.hdfs-cafe-access.hdfs.maxOpenFiles = 5000
agent.sinks.hdfs-cafe-access.channel = memo-1
When a new directory is created, the previous file handle remains open.
The rollInterval setting is applied only to files in the current date bucket.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira

