The following page has been changed by JimKellerman:
http://wiki.apache.org/lucene-hadoop/Hbase/HbaseArchitecture

The comment on the change is:
document how data is organized in a Hadoop MapFile

------------------------------------------------------------------------------
  the value for ''t6'' and the value for an ''"anchor"'' for "my.look.ca" if no
  time stamp is supplied is the value for time stamp ''t8''.
- 
+ 
+ === Example ===
+ 
+ The current unit test for HBase included in the patch on
+ [http://issues.apache.org/jira/browse/HADOOP-1045 Hadoop Jira Issue 1045]
+ first writes rows with row IDs of the form "row_[0-9]+" where the row
+ number goes from 0 to 999. It writes to two column families, using the
+ columns "contents:basic" and "anchor:anchornum-[0-9]+" (again the range of
+ numbers for the anchornum column goes from 0 to 999). It then writes
+ rows with row IDs of "row_vals1_nnn" where nnn is a three-digit,
+ zero-padded number from 000 to 999. Two columns are
+ written: "contents:firstcol" and "anchor:secondcol". After a
+ compaction, dumping the
+ !MapFile which contains the "anchor:" family we see that the keys,
+ displayed as column(row-key)/timestamp, are ordered as follows:
+ 
+ {{{
+ anchor:anchornum-0(row_0)/1174176403717
+ anchor:anchornum-1(row_1)/1174176403723
+ anchor:anchornum-10(row_10)/1174176403726
+ anchor:anchornum-100(row_100)/1174176403769
+ anchor:anchornum-101(row_101)/1174176403770
+ anchor:anchornum-102(row_102)/1174176403771
+ anchor:anchornum-103(row_103)/1174176403771
+ anchor:anchornum-104(row_104)/1174176403772
+ anchor:anchornum-105(row_105)/1174176403772
+ anchor:anchornum-106(row_106)/1174176403773
+ anchor:anchornum-107(row_107)/1174176403773
+ anchor:anchornum-108(row_108)/1174176403774
+ anchor:anchornum-109(row_109)/1174176403774
+ anchor:anchornum-11(row_11)/1174176403727
+ ...
+ anchor:anchornum-99(row_99)/1174176403769
+ anchor:anchornum-990(row_990)/1174176403966
+ anchor:anchornum-991(row_991)/1174176403966
+ anchor:anchornum-992(row_992)/1174176403966
+ anchor:anchornum-993(row_993)/1174176403966
+ anchor:anchornum-994(row_994)/1174176403966
+ anchor:anchornum-995(row_995)/1174176403966
+ anchor:anchornum-996(row_996)/1174176403966
+ anchor:anchornum-997(row_997)/1174176403966
+ anchor:anchornum-998(row_998)/1174176403966
+ anchor:anchornum-999(row_999)/1174176403966
+ anchor:secondcol(row_vals1_000)/1174176435765
+ anchor:secondcol(row_vals1_001)/1174176435766
+ anchor:secondcol(row_vals1_002)/1174176435767
+ anchor:secondcol(row_vals1_003)/1174176435767
+ anchor:secondcol(row_vals1_004)/1174176435767
+ anchor:secondcol(row_vals1_005)/1174176435767
+ anchor:secondcol(row_vals1_006)/1174176435768
+ anchor:secondcol(row_vals1_007)/1174176435768
+ anchor:secondcol(row_vals1_008)/1174176435769
+ anchor:secondcol(row_vals1_009)/1174176435769
+ anchor:secondcol(row_vals1_010)/1174176435770
+ ...
+ }}}
+ 
+ If both sets of rows had used the same row key format (say row_nnn),
+ a dump of the !MapFile would show:
+ 
+ {{{
+ anchor:anchornum-0(row_000)/1174176403717
+ anchor:secondcol(row_000)/1174176435765
+ anchor:anchornum-1(row_001)/1174176403723
+ anchor:secondcol(row_001)/1174176435766
+ ...
+ }}}
+ 
  [[Anchor(hregion)]]
  = HRegion (Tablet) Server =
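
The key ordering in the two dumps of the Example section above is consistent with keys being compared as plain strings, first by row key and then by column name. The following is only a minimal Python sketch of that reading, not HBase code: the function names `dump_order` and `fmt` are hypothetical, the timestamp `0` is a placeholder, and the sort rule is an assumption inferred from the dumps.

```python
def dump_order(cells):
    """Sort (row, column, timestamp) tuples by row key, then by column name.

    Assumption: both parts are compared as plain strings, which is what
    the dumps in the Example section suggest.
    """
    return sorted(cells, key=lambda cell: (cell[0], cell[1]))


def fmt(cell):
    """Render a cell as column(row-key)/timestamp, as in the dumps."""
    row, column, timestamp = cell
    return f"{column}({row})/{timestamp}"


# Hypothetical cells; the timestamp 0 is a placeholder, not real test output.
unpadded = [(f"row_{n}", f"anchor:anchornum-{n}", 0) for n in (2, 100, 11, 0, 10, 1)]
unpadded.append(("row_vals1_000", "anchor:secondcol", 0))

# The same two columns written under one zero-padded row key format.
padded = []
for n in (1, 0):
    padded.append((f"row_{n:03d}", "anchor:secondcol", 0))
    padded.append((f"row_{n:03d}", f"anchor:anchornum-{n}", 0))

for cell in dump_order(unpadded) + dump_order(padded):
    print(fmt(cell))
```

With unpadded row keys, "row_10" and "row_100" sort ahead of "row_2", and every "row_vals1_" key sorts after the numeric ones. With a shared zero-padded format, the two "anchor:" columns of each row end up next to each other.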