1.1 -> 2: don't forget about the block cache, which can eliminate the
need for any HDFS read.
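To make the block cache point concrete, here is a minimal read-through LRU sketch. It is not HBase's actual BlockCache API: the class name ReadThroughBlockCache, the readBlock method, and the hdfsReader function are all hypothetical stand-ins, and the real cache is considerably more involved. It only shows that a cache hit returns without ever touching the (simulated) HDFS read.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical read-through LRU cache; NOT HBase's BlockCache API. */
final class ReadThroughBlockCache {
    // Stand-in for the expensive HDFS read (assumption for illustration).
    private final Function<String, byte[]> hdfsReader;
    private final Map<String, byte[]> lru;
    private int hdfsReads = 0;

    ReadThroughBlockCache(int capacity, Function<String, byte[]> hdfsReader) {
        this.hdfsReader = hdfsReader;
        // accessOrder=true makes iteration order least-recently-used first.
        this.lru = new LinkedHashMap<String, byte[]>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, byte[]> eldest) {
                return size() > capacity; // evict once over capacity
            }
        };
    }

    /** Returns the block, going to the simulated HDFS only on a cache miss. */
    synchronized byte[] readBlock(String blockKey) {
        byte[] cached = lru.get(blockKey);
        if (cached != null) {
            return cached; // cache hit: no HDFS read at all
        }
        byte[] block = hdfsReader.apply(blockKey);
        hdfsReads++;
        lru.put(blockKey, block);
        return block;
    }

    /** Number of reads that actually reached the simulated HDFS. */
    synchronized int hdfsReads() {
        return hdfsReads;
    }
}
```

Reading the same key twice through this class reaches the simulated HDFS only once; the second read is served from memory.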
I think you're over-simplifying the write path quite a bit. I'm not sure
what you mean by an 'asynchronous write', but that doesn't exist at the
HBase RPC layer as that would invalidate the consistency guarantees (if
an RPC returns to the client that data was "put", then it is durable).
Going off of memory (sorry in advance if I misstate something): the
general way that data is written to the WAL is a "group commit". You
have many threads all trying to append data to the WAL -- performance
would be terrible if you serially applied all of these writes. Instead,
many writes can be accepted, and each caller receives a Future. The
caller must wait for the Future to complete. What's happening behind the
scenes is that the writes are being bundled together to reduce the number
of syncs to the WAL ("grouping" the writes together). When one caller's
Future completes, what really happened is that the write/sync which
included the caller's update was committed (along with others). All of
this is happening inside the RS's implementation of accepting an update.
https://github.com/apache/hbase/blob/55d6dcaf877cc5223e679736eb613173229c18be/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java#L74-L106
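A rough sketch of the group commit idea, in case it helps. This is not how FSHLog is implemented (see the link above for the real thing); the class GroupCommitLog, its append method, and the in-memory list standing in for the WAL file are all hypothetical. It only illustrates the shape: many callers append and get a Future, and a single syncer thread batches whatever is queued into one simulated sync before completing every Future in that batch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical group-commit illustration; NOT HBase's FSHLog. */
final class GroupCommitLog {
    private static final class Pending {
        final String edit;
        final CompletableFuture<Long> done = new CompletableFuture<>();
        Pending(String edit) { this.edit = edit; }
    }

    private final BlockingQueue<Pending> queue = new LinkedBlockingQueue<>();
    private final List<String> durable = new ArrayList<>(); // stands in for the WAL file
    private final AtomicInteger syncCount = new AtomicInteger();
    private final Thread syncer;
    private volatile boolean running = true;

    GroupCommitLog() {
        syncer = new Thread(this::syncLoop, "wal-syncer");
        syncer.setDaemon(true);
        syncer.start();
    }

    /** Queues an edit; the Future completes only after the sync that covers it. */
    CompletableFuture<Long> append(String edit) {
        Pending p = new Pending(edit);
        queue.add(p);
        return p.done;
    }

    private void syncLoop() {
        List<Pending> batch = new ArrayList<>();
        try {
            while (running) {
                batch.add(queue.take()); // block until at least one edit arrives
                queue.drainTo(batch);    // then grab everything else already queued
                long lastSeq;
                synchronized (durable) {
                    for (Pending p : batch) durable.add(p.edit);
                    lastSeq = durable.size();
                }
                syncCount.incrementAndGet(); // one simulated sync for the whole batch
                for (Pending p : batch) p.done.complete(lastSeq);
                batch.clear();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            // Fail anything still queued so that no caller waits forever.
            List<Pending> rest = new ArrayList<>();
            queue.drainTo(rest);
            for (Pending p : rest) {
                p.done.completeExceptionally(new IllegalStateException("log closed"));
            }
        }
    }

    int syncCount() { return syncCount.get(); }

    int durableCount() { synchronized (durable) { return durable.size(); } }

    void close() { running = false; syncer.interrupt(); }
}
```

If many threads call append at once and then wait on their Futures, syncCount() will typically come out smaller than the number of appends, which is the whole point: the cost of a sync is shared by every write in the batch, yet no caller is told its write succeeded before that sync has happened.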
杨苏立 Yang Su Li wrote:
The attachment can be found in the following URL:
http://pages.cs.wisc.edu/~suli/hbase.pdf
Sorry for the inconvenience...
On Mon, Mar 27, 2017 at 8:25 PM, Ted Yu<[email protected]> wrote:
Again, attachment didn't come thru.
Is it possible to formulate it as a Google Doc?
Thanks
On Mon, Mar 27, 2017 at 6:19 PM, 杨苏立 Yang Su Li<[email protected]>
wrote:
Hi,
I am a graduate student working on scheduling on storage systems, and we
are interested in how different threads in HBase interact with each other
and how it might affect scheduling.
I have written down my understanding on how HBase/HDFS works based on its
current thread architecture (attached). I am wondering if the developers of
HBase could take a look at it and let me know if anything is incorrect or
inaccurate, or if I have missed anything.
Thanks a lot for your help!
--
Suli Yang
Department of Physics
University of Wisconsin Madison
4257 Chamberlin Hall
Madison WI 53703