Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Lucene-hadoop Wiki" for 
change notification.

The following page has been changed by JimKellerman:
http://wiki.apache.org/lucene-hadoop/Hbase/HbaseArchitecture

The comment on the change is:
add issue: when is region server dead?

------------------------------------------------------------------------------
  
  The multi-machine components (the HMaster and the H!RegionServer) are 
actively being enhanced and debugged.
  
- Other related features and TODOs:
+ Issues and TODOs:
+  1. How do we know whether a region server is really dead, rather than cut 
off by a network partition or merely late in reporting in or renewing its 
lease? If we decide that a region server is dead when it is not, it may still 
be performing updates on behalf of clients and appending to its log. It does 
not learn that the master has "delisted" it until it next reports in 
successfully; only then does it start flushing the cache, finishing the log, 
etc. In the meantime, the master may be pulling the rug out from under it by 
trying to split its log file (the most recent of which will appear zero-length, 
because the file is visible but has no content until the region server closes 
it), and may already have reassigned the server's regions to another region 
server. At a minimum this loses data; in the worst case it corrupts the 
region. This issue is being addressed in 
[https://issues.apache.org/jira/browse/HADOOP-1937 HADOOP-1937].
   1. Vuk Ercegovac [[MailTo(vercego AT SPAMFREE us DOT ibm DOT com)]] of IBM 
Almaden Research pointed out that keeping HBase HRegion edit logs in HDFS is 
currently flawed.  HBase writes edits to logs and to a memcache.  The 'atomic' 
write to the log is meant to serve as insurance against abnormal !RegionServer 
exit: on startup, the log is replayed to reconstruct an HRegion's last 
consistent state. But files in HDFS do not 'exist' until they are cleanly 
closed -- something that will not happen if the !RegionServer exits without 
running its 'close'.
   1. The HMemcache lookup structure is relatively inefficient.
   1. Implement some kind of block caching in HRegion. Although the DFS isn't 
hitting the disk to fetch blocks, HRegion still makes an IPC call to DFS (via 
!MapFile) for each fetch.
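
The lease problem in the first item can be illustrated with a small simulation. This is only a sketch in Python, not HBase code: `LeaseTable`, `renew`, `expire`, the 30-second lease, and the timestamps are all hypothetical names and values chosen for illustration. It shows a server whose report is delayed: the master delists it, the server keeps logging edits, and it finds out only when its late report is rejected.

```python
class LeaseTable:
    """Master-side view of region server leases (hypothetical, for illustration)."""

    def __init__(self, lease_seconds):
        self.lease_seconds = lease_seconds
        self.expires_at = {}   # server name -> time at which its lease lapses
        self.delisted = set()  # servers the master has declared dead

    def renew(self, server, now):
        """Called when a server reports in. Returns False if it was delisted."""
        if server in self.delisted:
            return False
        self.expires_at[server] = now + self.lease_seconds
        return True

    def expire(self, now):
        """Delist every server whose lease has lapsed; return their names."""
        lapsed = sorted(s for s, t in self.expires_at.items() if t <= now)
        for s in lapsed:
            del self.expires_at[s]
            self.delisted.add(s)
        return lapsed


def simulate():
    master = LeaseTable(lease_seconds=30)
    master.renew("rs1", now=0)

    log = []                 # edits the region server appends to its log
    log.append("edit@10")

    # The server's next report is delayed (slow network, long pause, ...).
    declared_dead = master.expire(now=31)   # master: lease lapsed, server "dead"

    # The server is in fact alive and still accepts client updates.
    log.append("edit@35")

    # Only when the late report arrives does the server learn it was delisted.
    accepted = master.renew("rs1", now=40)
    return declared_dead, log, accepted
```

In this run the edit logged at time 35 lands after the master has already declared the server dead, which is the window in which a log split or region reassignment would lose that edit.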

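One possible shape for the block caching mentioned in the last item is a small least-recently-used cache in front of the block fetch. The sketch below is in Python and makes several assumptions: `BlockCache`, `fetch_block`, and the integer block ids are hypothetical, and LRU eviction is just one policy that could be used. `fetch_block` stands in for whatever call currently goes over IPC to the DFS.

```python
from collections import OrderedDict


class BlockCache:
    """LRU cache in front of an expensive block fetch (hypothetical sketch)."""

    def __init__(self, fetch_block, capacity):
        self.fetch_block = fetch_block  # stand-in for the IPC call to the DFS
        self.capacity = capacity        # maximum number of cached blocks
        self.blocks = OrderedDict()     # block id -> data, oldest first
        self.misses = 0

    def get(self, block_id):
        if block_id in self.blocks:
            # Cache hit: mark as most recently used, no remote call needed.
            self.blocks.move_to_end(block_id)
            return self.blocks[block_id]
        # Cache miss: fetch remotely, then evict the least recently used block
        # if the cache has grown past its capacity.
        self.misses += 1
        data = self.fetch_block(block_id)
        self.blocks[block_id] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)
        return data
```

A repeated read of a cached block then costs a dictionary lookup instead of a remote call; how large the cache should be, and how it interacts with compactions, is left open here.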