I have an MR job that sends streams of updates (puts and deletes) to an 
existing db, and all the tasks are crashing with exceptions similar to the 
following:



   Exception in thread "main" 
org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact 
region server Some server, retryOnlyOne=true, index=0, islastrow=false, 
tries=9, numtries=10, i=78, listsize=390, 
region=DocData,0000001013071992,1279835733117 for region 
DocData,0000001013071992,1279835733117, row '0000001013115520', but failed 
after 10 attempts.



I ran this job on 180 nodes with a maximum of 6 tasks per node. I thought the 
failures might be due to overload, so I reran it with just 2 tasks per node, 
but got similar exceptions again.

Then I tried issuing a put from the hbase shell, and it failed with the same 
error.

I checked the meta table entry for the region and it looks fine. I also 
checked the corresponding region server (web UI), and it is indeed hosting 
the region.



DocData,0000001013071992,1279835733117
  column=info:regioninfo, timestamp=1280305164242,
    value=REGION => {NAME => 'DocData,0000001013071992,1279835733117',
    STARTKEY => '0000001013071992', ENDKEY => '0000001013205991',
    ENCODED => 1962005300, TABLE => {{NAME => 'DocData',
    MAX_FILESIZE => '4402341480', FAMILIES => [{NAME => 'bigColumn',
    VERSIONS => '1', COMPRESSION => 'NONE', TTL => '2147483647',
    BLOCKSIZE => '1048576', IN_MEMORY => 'false', BLOCKCACHE => 'false'}]}}
  column=info:server, timestamp=1280317959911, value=63.250.207.87:60020
  column=info:serverstartcode, timestamp=1280317959911, value=1279926520261
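
As a sanity check that the failing row really belongs to this region, the 
row key from the exception can be compared against the region's STARTKEY and 
ENDKEY. This is only a minimal sketch in Python, assuming row keys are 
ordered as byte-wise lexicographic strings; row_in_region is a hypothetical 
helper written for illustration, not part of any HBase API.

```python
def row_in_region(row: bytes, start_key: bytes, end_key: bytes) -> bool:
    """Return True if row lies in [start_key, end_key).

    Assumption: keys sort byte-wise lexicographically, and an empty
    end_key means the region has no upper bound.
    """
    return start_key <= row and (end_key == b"" or row < end_key)


# Values copied from the exception and the meta table entry.
start_key = b"0000001013071992"
end_key = b"0000001013205991"
failing_row = b"0000001013115520"

in_range = row_in_region(failing_row, start_key, end_key)
print(in_range)
```

The row from the exception does fall between STARTKEY and ENDKEY, so the 
client appears to be contacting the right region for that row.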


Can you see what is wrong here?

Thank you
Vidhya
