Shaheen <sbahauddin@...> writes:

> 
> 
> Stack <stack@...> writes:
> > 
> > You saw my previous set of questions about your issue? ('Wed, Mar 2,
> > 2011 at 10:39 AM')
> > St>Ack
> > 


Here is more information on what we are doing, along with answers to your questions.

Our "BulkLoader" calls createTable with a set of startkeys.
The startKeys are keys sampled from the data that goes into the table.
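
As an illustration only (this is not the actual BulkLoader code), here is a minimal sketch of how start keys might be sampled from the data before calling createTable. The class and method names are hypothetical, and row keys are modeled as plain ASCII strings so that String ordering matches HBase's byte-wise ordering. Only the boundaries between regions are produced, since the first region always starts at the empty key.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of HBase or of the BulkLoader.
public class SplitKeySampler {

    // Picks up to (numRegions - 1) evenly spaced keys from the data.
    // Each returned key would be the start key of one region after the first.
    static List<String> sampleStartKeys(List<String> rowKeys, int numRegions) {
        List<String> sorted = new ArrayList<>(rowKeys);
        Collections.sort(sorted); // assumes ASCII keys, so this matches byte order

        List<String> startKeys = new ArrayList<>();
        if (sorted.isEmpty()) {
            return startKeys;
        }
        for (int i = 1; i < numRegions; i++) {
            int index = (int) ((long) i * sorted.size() / numRegions);
            String candidate = sorted.get(index);
            // Skip duplicates: two regions cannot share a start key.
            if (startKeys.isEmpty()
                    || !startKeys.get(startKeys.size() - 1).equals(candidate)) {
                startKeys.add(candidate);
            }
        }
        return startKeys;
    }
}
```

The returned keys would still need to be converted to byte arrays before being handed to createTable.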


>>If you scan '.META.', do regions show for your just-added files?
>>hbase> scan ".META."

Yes. .META. shows the table I created, and the regions' startkeys are the
startkeys I passed to createTable.

Here is a row from .META.:

SB_TEST,U|C|C C||H||9|V R|S P R E B,
1299194954706.7ac2a0c323cd9fe965b974aac1a149c3.

column=info:regioninfo, timestamp=1299194954840, value=REGION => {NAME =>
'SB_TEST,,1299194954706.4b63f58a7aee013c884717562dff2c3f.', STARTKEY => '',
ENDKEY => 'U|C|C C||H||9|V R|S P R E B', ENCODED =>
4b63f58a7aee013c884717562dff2c3f, TABLE => {{NAME => 'SB_TEST', FAMILIES =>
[{NAME => 'entity_key', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0',
COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE =>
'65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}}

>>Take one of these regions and try 'getting' its start key:
>>hbase> get 'TABLENAME', 'STARTKEY'
>>Does that work?
I picked one of the startKeys and did a get; it returned 0 rows. A count
also returned 0.


>>Did you mess w/ timestamps when you were inserting?
We do pass in a timestamp when we call put.add. The timestamp is the current
time (currentTimeMillis).


>>You used the totalorderpartitioner or something else?
We use a TotalOrderPartitioner with HFileOutputFormat.configureIncrementalLoad.
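
For reference, a total-order partitioner sends each key to the partition whose key range contains it, so that each reducer's output covers one contiguous key range. The sketch below shows that lookup as a binary search over sorted split points. It is a simplified illustration with hypothetical names and String keys, not Hadoop's TotalOrderPartitioner source.

```java
import java.util.Arrays;

// Hypothetical illustration of range partitioning by sorted split points.
public class RegionLookup {

    // Returns the index of the partition whose [start, end) range holds rowKey.
    // splitPoints must be sorted ascending. Partition 0 holds every key below
    // splitPoints[0]; partition i (i >= 1) starts at splitPoints[i - 1].
    static int partitionFor(String[] splitPoints, String rowKey) {
        int pos = Arrays.binarySearch(splitPoints, rowKey);
        if (pos >= 0) {
            // Exact match: a start key is inclusive, so the key belongs
            // to the partition that begins at this split point.
            return pos + 1;
        }
        // No match: binarySearch returned -(insertionPoint) - 1, and the
        // insertion point equals the number of split points below the key.
        return -pos - 1;
    }
}
```

With split points {"g", "p"}, the key "a" falls in partition 0, "g" and "h" in partition 1, and "z" in partition 2.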


>>Try with a small subset of the data first?
Yes, we tried on a small subset of the data: 379 rows.

-----
To check whether the problem was with the startkeys passed to createTable, I
created the table first in the HBase shell and then ran the BulkLoader. Below
is the output of scanning .META. with the pre-created table.

>scan '.META.'
SB_TEST,,1299251737810.1cea0a172f273279744470411947698a.

column=info:regioninfo, timestamp=1299251737900, value=REGION => {NAME =>
'SB_TEST,,1299251737810.1cea0a172f273279744470411947698a.', STARTKEY => '',
ENDKEY => '', ENCODED => 1cea0a172f273279744470411947698a, TABLE => {{NAME =>
'SB_TEST', FAMILIES => [{NAME => 'entity_key', BLOOMFILTER => 'NONE',
REPLICATION_SCOPE => '0', COMPRESSION=> 'NONE', VERSIONS => '3', TTL =>
'2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 
'true'}]}}

>count 'SB_TEST'
0 row(s) in 0.0530 seconds

----
If I load the data directly into the table, here is what I get in .META.:
SB_TEST,,1299253653711.91771393876cb5701e7ddabe663cf3c1.

column=info:regioninfo, timestamp=1299253653763, value=REGION => {NAME =>
'SB_TEST,,1299253653711.91771393876cb5701e7ddabe663cf3c1.', STARTKEY => '',
ENDKEY => '', ENCODED => 91771393876cb5701e7ddabe663cf3c1, TABLE => {{NAME =>
'SB_TEST', FAMILIES => [{NAME => 'entity_key', BLOOMFILTER => 'NONE',
REPLICATION_SCOPE => '0', COMPRESSION=> 'NONE', VERSIONS => '3', TTL =>
'2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 
'true'}]}}

> count 'SB_TEST'
379 row(s) in 0.0600 seconds

