[jira] Commented: (HDFS-895) Allow hflush/sync to occur in parallel with new writes to the file

Todd Lipcon (JIRA) Tue, 09 Nov 2010 15:02:49 -0800

    [ 
https://issues.apache.org/jira/browse/HDFS-895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12930340#action_12930340
 ]


Todd Lipcon commented on HDFS-895:
----------------------------------

The bug JD found is an NPE that occurs if close() is called concurrently with 
hflush(). I have a patch that changes this to an IOException, but Nicolas and 
I have been discussing whether it should be a no-op instead. The reasoning is 
that if you append something, then some other thread calls close(), and then 
you call hflush(), your data has indeed already been flushed (i.e. it is on 
disk). Right now hflush() first checks that the stream is open; should it 
instead just return if the stream was closed in a non-error state?
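
To make the no-op option concrete, here is a minimal sketch. The class and its 
fields (ClosableStreamSketch, closed, lastError, fail()) are hypothetical names 
made up for illustration; this is not the actual HDFS client code or the patch.

```java
import java.io.IOException;

/** Hypothetical sketch of the "no-op after a clean close" option; not HDFS code. */
class ClosableStreamSketch {
    private boolean closed = false;
    private IOException lastError = null; // non-null only if the stream failed

    /** Clean close: assumed to have flushed all buffered data already. */
    synchronized void close() {
        closed = true;
    }

    /** Error path: remember the failure so later calls can report it. */
    synchronized void fail(IOException e) {
        lastError = e;
        closed = true;
    }

    synchronized void hflush() throws IOException {
        if (closed) {
            if (lastError != null) {
                throw lastError; // closed because of an error: surface it
            }
            return; // closed cleanly: nothing left to flush, so do nothing
        }
        // ... otherwise flush buffered data and wait for the acks ...
    }
}
```

With this shape, a thread that loses the race against close() sees hflush() 
return normally, while a stream that died with an error still reports it.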

> Allow hflush/sync to occur in parallel with new writes to the file
> ------------------------------------------------------------------
>
>                 Key: HDFS-895
>                 URL: https://issues.apache.org/jira/browse/HDFS-895
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs client
>    Affects Versions: 0.22.0
>            Reporter: dhruba borthakur
>            Assignee: Todd Lipcon
>             Fix For: 0.22.0
>
>         Attachments: hdfs-895-0.20-append.txt, hdfs-895-20.txt, 
> hdfs-895-review.txt, hdfs-895-trunk.txt, hdfs-895.txt, hdfs-895.txt
>
>
> In the current trunk, the HDFS client methods writeChunk() and hflush/sync 
> are synchronized. This means that if an hflush/sync is in progress, an 
> application cannot write data to the HDFS client buffer. This reduces the 
> write throughput of the transaction log in HBase. 
> The hflush/sync should allow new writes to happen to the HDFS client even 
> when an hflush/sync is in progress. It can record the seqno of the message for 
> which it should receive the ack, tell the DataStream thread to start 
> flushing those messages, exit the synchronized section, and just wait for that 
> ack to arrive.
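
The approach in the description above (record a seqno under the lock, then wait 
for the ack outside it) might be sketched roughly as follows. All names here 
(AckTrackingStream, lastQueuedSeqno, lastAckedSeqno, ackReceived()) are 
assumptions for illustration, and the streamer thread is only implied; this is 
not the code in any of the attached patches.

```java
/** Hypothetical sketch of seqno-based flushing; not the actual HDFS client code. */
class AckTrackingStream {
    private final Object ackLock = new Object();
    private long lastQueuedSeqno = -1; // last packet handed to the streamer
    private long lastAckedSeqno = -1;  // last packet acknowledged by the pipeline

    /** Queue one packet; the stream lock is held only for the enqueue. */
    synchronized long write(byte[] data) {
        // A real client would buffer the data and hand it to a streamer thread.
        return ++lastQueuedSeqno;
    }

    /** Called by the (assumed) streamer thread when an ack arrives. */
    void ackReceived(long seqno) {
        synchronized (ackLock) {
            lastAckedSeqno = Math.max(lastAckedSeqno, seqno);
            ackLock.notifyAll();
        }
    }

    void hflush() throws InterruptedException {
        long toWaitFor;
        synchronized (this) {
            // Record which packet we need acked, then leave the synchronized
            // section so that new writes are not blocked while we wait.
            toWaitFor = lastQueuedSeqno;
        }
        synchronized (ackLock) {
            while (lastAckedSeqno < toWaitFor) {
                ackLock.wait();
            }
        }
    }
}
```

The point of the two locks is that write() never contends with a thread that 
is waiting for an ack; it only contends briefly while hflush() reads the seqno.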

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
