Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change 
notification.

The "Hbase/FAQ" page has been changed by JeffHammerbacher.
http://wiki.apache.org/hadoop/Hbase/FAQ?action=diff&rev1=62&rev2=63

--------------------------------------------------

   1. [[#21|How do I add/remove a node?]]
   1. [[#22|Why do servers have start codes?]]
   1. [[#23|What is the maximum recommended cell size?]]
+  1. [[#24|Why can't I iterate through the rows of a table in reverse order?]]
  
  == Answers ==
  
@@ -229, +230 @@

  
  A rough rule of thumb, with little empirical validation, is to keep the data 
in HDFS and store pointers to the data in HBase if you expect the cell size to 
be consistently above 10 MB. If you do expect large cell values and you still 
plan to use HBase for the storage of cell contents, you'll want to increase the 
block size and the maximum region size for the table to keep the index size 
reasonable and the split frequency acceptable.
  
+ '''24. <<Anchor(24)>> Why can't I iterate through the rows of a table in reverse 
order?'''
+ 
+ Because of the way 
[[http://hbase.apache.org/docs/current/api/org/apache/hadoop/hbase/io/hfile/HFile.html|HFile]]
 works: for efficiency, each column value is written to disk as its length 
followed by the bytes of the value itself. A reader can therefore only scan 
forward, reading each length and skipping ahead. To navigate these values in 
reverse order, the lengths would need to be stored twice (at the end of each 
value as well) or in a side file. A robust secondary index implementation is the 
more likely solution here, since it keeps the primary (forward-scan) use case 
fast.
+ 
