[jira] [Commented] (HBASE-14790) Implement a new DFSOutputStream for logging WAL only

Duo Zhang (JIRA) Fri, 05 Feb 2016 18:39:07 -0800

    [ https://issues.apache.org/jira/browse/HBASE-14790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15135485#comment-15135485 ]


Duo Zhang commented on HBASE-14790:
-----------------------------------

[~eclark] AFAIK, the ack for hflush only means that the datanode has received 
the packet. Here is the comment on the hflush method in DFSOutputStream:

{code}
   * Flushes out to all replicas of the block. The data is in the buffers
   * of the DNs but not necessarily in the DN's OS buffers.
{code}

And in BlockReceiver's receivePacket method, you can see this:
{code:title=BlockReceiver.java}
    // put in queue for pending acks, unless sync was requested
    if (responder != null && !syncBlock && !shouldVerifyChecksum()) {
      ((PacketResponder) responder.getRunnable()).enqueue(seqno,
          lastPacketInBlock, offsetInBlock, Status.SUCCESS);
    }
{code}

This happens before we write the data out, so theoretically it is possible 
that we get the ack back from the pipeline while the actual data has not been 
written out yet. In the real world, network latency is much greater than local 
disk IO latency, so by the time you get the ack back from the pipeline the data 
has usually already been written out. That's why 'kill -9' does not lose data 
most of the time. But theoretically, it could...
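
To make the ordering concrete, here is a toy model of the receive path, not the real BlockReceiver code. The class name, method names and event strings are all made up for illustration, and the sync branch (ack queued only after the data is written and synced) is my assumption based on the "unless sync was requested" comment quoted above.

```java
import java.util.ArrayList;
import java.util.List;

public class AckOrderSketch {
    /**
     * Returns the events in the order this toy receiver performs them.
     * Hypothetical model only; the event names are not HDFS identifiers.
     */
    static List<String> receivePacket(boolean syncBlock) {
        List<String> events = new ArrayList<>();
        // Mirrors the quoted condition: without sync, the ack is queued first.
        if (!syncBlock) {
            events.add("ENQUEUE_ACK");
        }
        events.add("WRITE_DATA");
        if (syncBlock) {
            // Assumed behaviour: with sync, the ack waits for the sync.
            events.add("SYNC_TO_DISK");
            events.add("ENQUEUE_ACK");
        }
        return events;
    }

    /** True if the ack is queued before the data write in this model. */
    static boolean ackCanPrecedeWrite(boolean syncBlock) {
        List<String> events = receivePacket(syncBlock);
        return events.indexOf("ENQUEUE_ACK") < events.indexOf("WRITE_DATA");
    }

    public static void main(String[] args) {
        System.out.println("hflush path: " + receivePacket(false));
        System.out.println("sync path:   " + receivePacket(true));
    }
}
```

In this model, a process killed between ENQUEUE_ACK and WRITE_DATA on the hflush path has already promised an ack for data it never wrote, which is the window described above.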

Thanks.

> Implement a new DFSOutputStream for logging WAL only
> ----------------------------------------------------
>
>                 Key: HBASE-14790
>                 URL: https://issues.apache.org/jira/browse/HBASE-14790
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Duo Zhang
>
> The original {{DFSOutputStream}} is very powerful and aims to serve all 
> purposes. But in fact, we do not need most of the features if we only want to 
> log WAL. For example, we do not need pipeline recovery since we could just 
> close the old logger and open a new one. And also, we do not need to write 
> multiple blocks since we could also open a new logger if the old file is too 
> large.
> And the most important thing is that it is hard to handle all the corner 
> cases to avoid data loss or data inconsistency (such as HBASE-14004) when 
> using the original DFSOutputStream due to its complicated logic. The 
> complicated logic also forces us to use some magical tricks to increase 
> performance. For example, we need to use multiple threads to call {{hflush}} 
> when logging, and now we use 5 threads. But why 5 and not 10 or 100?
> So here, I propose we should implement our own {{DFSOutputStream}} when 
> logging WAL. For correctness, and also for performance.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
