[ 
https://issues.apache.org/jira/browse/HDFS-895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12931280#action_12931280
 ] 

Todd Lipcon commented on HDFS-895:
----------------------------------

Hey Hairong. I had misremembered which part of that confusing code is new: 
only the "currentSeqno--" code is new, added to prevent skipping a 
sequence number. Here's a diff that ignores whitespace changes:

{code}
       // Flush only if we haven't already flushed till this offset.
       if (lastFlushOffset != bytesCurBlock) {
-
+          assert bytesCurBlock > lastFlushOffset;
         // record the valid offset of this flush
         lastFlushOffset = bytesCurBlock;
-
-        // wait for all packets to be sent and acknowledged
-        flushInternal();
+          queueCurrentPacket();
       } else {
         // just discard the current packet since it is already been sent.
+          if (oldCurrentPacket == null && currentPacket != null) {
+            // If we didn't previously have a packet queued, and now we do,
+            // but we don't plan on sending it, then we should not
+            // skip a sequence number for it!
+            currentSeqno--;
+          }
         currentPacket = null;
       }
{code}

As you can see, we already had the code that avoided duplicate packets.
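
To make the sequence-number bookkeeping concrete, here is a minimal standalone 
model of the branch above. It is only a sketch: the class FlushSeqnoSketch, 
the ensurePacket() helper, and the idea that a packet is represented by its 
seqno are assumptions made for illustration, not the actual client code. Only 
the field names that appear in the diff are taken from the patch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of the flush bookkeeping shown in the diff.
class FlushSeqnoSketch {
    long currentSeqno = 0;      // next sequence number to hand out
    long bytesCurBlock = 0;     // bytes written to the current block so far
    long lastFlushOffset = 0;   // offset covered by the last flush
    Long currentPacket = null;  // seqno of the packet being filled, or null
    final List<Long> queuedSeqnos = new ArrayList<>();

    // Assumption: allocating a packet consumes the next sequence number.
    private void ensurePacket() {
        if (currentPacket == null) {
            currentPacket = currentSeqno++;
        }
    }

    void write(int numBytes) {
        ensurePacket();
        bytesCurBlock += numBytes;
    }

    void flush() {
        Long oldCurrentPacket = currentPacket;
        // Assumption: flushing the buffer may allocate a packet even when
        // no new bytes have arrived since the last flush.
        ensurePacket();
        if (lastFlushOffset != bytesCurBlock) {
            // New data since the last flush: record the offset and queue it.
            lastFlushOffset = bytesCurBlock;
            queuedSeqnos.add(currentPacket);
            currentPacket = null;
        } else {
            // Nothing new to send. If this call allocated the packet,
            // give its sequence number back so none is skipped.
            if (oldCurrentPacket == null && currentPacket != null) {
                currentSeqno--;
            }
            currentPacket = null;
        }
    }

    public static void main(String[] args) {
        FlushSeqnoSketch s = new FlushSeqnoSketch();
        s.write(10);
        s.flush();   // queues the first packet
        s.flush();   // redundant flush: the packet is discarded
        s.write(5);
        s.flush();   // queues the second packet
        System.out.println(s.queuedSeqnos);
    }
}
```

In this model, dropping the "currentSeqno--" line would make the redundant 
flush consume a sequence number that is never sent, so the second queued 
packet would leave a gap in the sequence.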

> Allow hflush/sync to occur in parallel with new writes to the file
> ------------------------------------------------------------------
>
>                 Key: HDFS-895
>                 URL: https://issues.apache.org/jira/browse/HDFS-895
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs client
>    Affects Versions: 0.22.0
>            Reporter: dhruba borthakur
>            Assignee: Todd Lipcon
>             Fix For: 0.22.0
>
>         Attachments: 895-delta-for-review.txt, hdfs-895-0.20-append.txt, 
> hdfs-895-20.txt, hdfs-895-review.txt, hdfs-895-trunk.txt, hdfs-895.txt, 
> hdfs-895.txt, hdfs-895.txt
>
>
> In the current trunk, the HDFS client methods writeChunk() and hflush/sync 
> are synchronized. This means that if a hflush/sync is in progress, an 
> application cannot write data to the HDFS client buffer. This reduces the 
> write throughput of the transaction log in HBase. 
> The hflush/sync should allow new writes to happen to the HDFS client even 
> when a hflush/sync is in progress. It can record the seqno of the message for 
> which it should receive the ack, indicate to the DataStream thread to start 
> flushing those messages, exit the synchronized section, and just wait for that 
> ack to arrive.
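
The approach in the description above (record the seqno, leave the 
synchronized section, then wait for the ack) can be sketched roughly as 
follows. This is an illustration under stated assumptions, not the patch 
itself: the class ParallelFlushSketch, its two lock objects, and the methods 
writePacket() and ackReceived() are hypothetical names invented for this 
example.

```java
// Hypothetical sketch: hflush records a seqno under the write lock,
// releases it, and then waits for the ack on a separate monitor.
class ParallelFlushSketch {
    private final Object writeLock = new Object(); // guards lastQueuedSeqno
    private final Object ackLock = new Object();   // guards lastAckedSeqno
    private long lastQueuedSeqno = -1;
    private long lastAckedSeqno = -1;

    // Writer path: a short critical section that only assigns a seqno.
    long writePacket() {
        synchronized (writeLock) {
            return ++lastQueuedSeqno;
        }
    }

    // Assumed to be called by whichever thread processes acks.
    void ackReceived(long seqno) {
        synchronized (ackLock) {
            lastAckedSeqno = Math.max(lastAckedSeqno, seqno);
            ackLock.notifyAll();
        }
    }

    // Returns the seqno it waited for.
    long hflush() {
        long toWaitFor;
        synchronized (writeLock) {
            // Record which packet must be acknowledged, then leave the
            // critical section so that writers are not blocked.
            toWaitFor = lastQueuedSeqno;
        }
        synchronized (ackLock) {
            while (lastAckedSeqno < toWaitFor) {
                try {
                    ackLock.wait(); // releases ackLock while waiting
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted in hflush", e);
                }
            }
        }
        return toWaitFor;
    }
}
```

Because the wait happens on ackLock rather than writeLock, a call to 
writePacket() from another thread can proceed while hflush() is still 
blocked waiting for its ack.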

