[ 
https://issues.apache.org/jira/browse/HDFS-895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12929734#action_12929734
 ] 

Todd Lipcon commented on HDFS-895:
----------------------------------

Thanks for the review.

bq. Does this work with the heartbeat packet?

Lines 657-658 check for the heartbeat sequence number before we set 
lastAckedSeqno in ResponseProcessor.run(), so it should work as before.
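To make the heartbeat handling concrete, here is a toy model of that check. The names HEART_BEAT_SEQNO and lastAckedSeqno follow DFSOutputStream; everything else (the class, processAck) is illustrative, not the actual patch code.

```java
// Toy model: heartbeat packets carry a sentinel seqno and must not
// advance lastAckedSeqno, so the in-order assertion still holds for
// real data acks. Names beyond HEART_BEAT_SEQNO/lastAckedSeqno are
// assumptions for this sketch.
public class ResponseSketch {
    static final long HEART_BEAT_SEQNO = -1L; // sentinel used by heartbeats
    static long lastAckedSeqno = 0;

    static void processAck(long seqno) {
        if (seqno == HEART_BEAT_SEQNO) {
            return; // heartbeat: do not touch lastAckedSeqno
        }
        assert seqno == lastAckedSeqno + 1 : "acks must arrive in order";
        lastAckedSeqno = seqno;
    }

    public static void main(String[] args) {
        processAck(1);
        processAck(HEART_BEAT_SEQNO); // heartbeat interleaved with data acks
        processAck(2);
        System.out.println(lastAckedSeqno); // prints 2
    }
}
```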

bq. dataQueue.wait(1000); is it possible to use dataQueue.wait();

Yes, I think that would also work. In theory, the timeout isn't necessary, but 
I've seen bugs before where a slight race causes us to miss a notify. Perhaps 
we can switch the synchronized (dataQueue) and the while (!closed) in this 
function and avoid the race? It seems to me that waking up once a second just 
to be extra safe doesn't really harm us, but maybe it's a bit of a band-aid.
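The pattern under discussion looks roughly like the sketch below: check the condition under the lock, wait with a 1 s timeout so a missed notify() costs at most one second, and have close() flip the flag and notify under the same lock. Field and method names are assumptions for illustration, not the DFSOutputStream source.

```java
// Illustrative guarded-wait pattern only, not the actual HDFS code.
// The 1000 ms timeout is the "band-aid": even if a notify is missed,
// the loop re-checks the condition within a second.
public class WaitLoopSketch {
    final Object dataQueue = new Object();
    boolean closed = false;

    void run() {
        synchronized (dataQueue) {
            while (!closed) {
                try {
                    // Wake up at least once a second as a safety net.
                    dataQueue.wait(1000);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        }
    }

    void close() {
        synchronized (dataQueue) {
            closed = true;       // set the flag under the lock...
            dataQueue.notifyAll(); // ...then wake any waiter
        }
    }
}
```

Taking the lock once around the whole while loop, as suggested above, removes the window where closed flips between the check and the wait; the timeout then only guards against bugs elsewhere.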

bq. lines 1294-1299: when would this happen?

This is a bit subtle - it was one of the bugs in the original version of the 
patch. Here's the sequence of events that it covers:
- write 10 bytes to a new file (ie no packet yet)
- call hflush()
-- it calls flushBuffer, so that it enqueues a new "packet"
-- it sends that packet - now we have no currentPacket, and the 10 bytes are 
still in the checksum buffer, lastFlushOffset=10
- we call hflush() again without writing any more data
-- it calls flushBuffer, so it creates a new "packet" with the same 10 bytes
-- we notice that the bytesCurBlock is the same as lastFlushOffset, hence we 
don't want to actually re-send this packet (there's no new data, so it's a 
no-op flush)
-- hence, we need to get rid of the packet and also decrement the sequence 
number so we don't "skip" one

Without this fix, we were triggering 'assert seqno == lastAckedSeqno + 1;' when 
calling hflush() twice without writing any data in between.
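The sequence above can be modeled with a small toy class. Field names echo DFSOutputStream (currentSeqno, bytesCurBlock, lastFlushOffset, currentPacket) but the logic is a deliberate simplification of the real flushBuffer()/hflush() path.

```java
// Toy model of the double-hflush case: the second flush builds a packet
// over the same 10 bytes, notices bytesCurBlock == lastFlushOffset,
// drops the packet, and hands back its seqno so numbering stays
// contiguous for the 'seqno == lastAckedSeqno + 1' assertion.
public class HflushSketch {
    long currentSeqno = 0;
    long bytesCurBlock = 0;
    long lastFlushOffset = 0;
    Long currentPacket = null; // holds a seqno when a packet is pending

    void write(int nbytes) { bytesCurBlock += nbytes; }

    void hflush() {
        currentPacket = currentSeqno++;  // flushBuffer: enqueue a packet
        if (bytesCurBlock == lastFlushOffset) {
            currentPacket = null;        // no new data: drop the packet...
            currentSeqno--;              // ...and give back its seqno
            return;
        }
        lastFlushOffset = bytesCurBlock;
        currentPacket = null;            // "send" the packet downstream
    }

    public static void main(String[] args) {
        HflushSketch s = new HflushSketch();
        s.write(10);
        s.hflush(); // sends seqno 0 covering the 10 bytes
        s.hflush(); // no-op flush: nothing re-sent, no seqno skipped
        System.out.println(s.currentSeqno); // prints 1
    }
}
```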

> Allow hflush/sync to occur in parallel with new writes to the file
> ------------------------------------------------------------------
>
>                 Key: HDFS-895
>                 URL: https://issues.apache.org/jira/browse/HDFS-895
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs client
>    Affects Versions: 0.22.0
>            Reporter: dhruba borthakur
>            Assignee: Todd Lipcon
>             Fix For: 0.22.0
>
>         Attachments: hdfs-895-0.20-append.txt, hdfs-895-20.txt, 
> hdfs-895-review.txt, hdfs-895-trunk.txt, hdfs-895.txt, hdfs-895.txt
>
>
> In the current trunk, the HDFS client methods writeChunk() and hflush/sync 
> are synchronized. This means that if a hflush/sync is in progress, an 
> application cannot write data to the HDFS client buffer. This reduces the 
> write throughput of the transaction log in HBase. 
> The hflush/sync should allow new writes to happen to the HDFS client even 
> when a hflush/sync is in progress. It can record the seqno of the message for 
> which it should receive the ack, indicate to the DataStreamer thread to start 
> flushing those messages, exit the synchronized section, and just wait for that 
> ack to arrive.
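The scheme described in the issue can be sketched as follows. All names here (lastQueuedSeqno, ackUpTo, toWaitFor) are assumptions for illustration; only the shape matters: record the seqno to wait for inside the lock, then wait for its ack outside the synchronized section so writers can keep filling the buffer.

```java
// Sketch of the proposed parallel-flush scheme, not the actual patch:
// hflush() snapshots the last enqueued seqno under the lock, wakes the
// streamer thread, releases the lock, and only then blocks until the
// ack for that seqno arrives.
public class ParallelFlushSketch {
    final Object dataQueue = new Object();
    long lastQueuedSeqno = 0;
    long lastAckedSeqno = -1;

    void hflush() throws InterruptedException {
        long toWaitFor;
        synchronized (dataQueue) {
            // flushBuffer() would run here; remember what must be acked.
            toWaitFor = lastQueuedSeqno;
            dataQueue.notifyAll(); // tell the streamer to start sending
        }
        // Lock released: concurrent writeChunk() calls may now proceed.
        synchronized (dataQueue) {
            while (lastAckedSeqno < toWaitFor) {
                dataQueue.wait(1000);
            }
        }
    }

    void ackUpTo(long seqno) { // called by the response-processing thread
        synchronized (dataQueue) {
            lastAckedSeqno = seqno;
            dataQueue.notifyAll();
        }
    }
}
```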

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
