I have a MapReduce job that sends a stream of updates (puts and deletes) to an
existing database, and all the tasks are crashing with exceptions similar to
the following:
Exception in thread "main"
org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact
region server Some server, retryOnlyOne=true, index=0, islastrow=false,
tries=9, numtries=10, i=78, listsize=390,
region=DocData,0000001013071992,1279835733117 for region
DocData,0000001013071992,1279835733117, row '0000001013115520', but failed
after 10 attempts.
I ran this job on 180 nodes with a maximum of 6 tasks per node. I thought the
failures might be due to overload, so I reran it with just 2 tasks per node,
but got similar exceptions again.
I then tried issuing a put from the HBase shell, and it failed with the same
error.
I checked the meta table entry for the region and it looks fine. I also
checked the corresponding region server (web UI), and it is indeed hosting the
region:
DocData,0000001013071992,1279835733117
  column=info:regioninfo, timestamp=1280305164242,
  value=REGION => {NAME => 'DocData,0000001013071992,1279835733117',
    STARTKEY => '0000001013071992', ENDKEY => '0000001013205991',
    ENCODED => 1962005300, TABLE => {{NAME => 'DocData',
    MAX_FILESIZE => '4402341480', FAMILIES => [{NAME => 'bigColumn',
    VERSIONS => '1', COMPRESSION => 'NONE', TTL => '2147483647',
    BLOCKSIZE => '1048576', IN_MEMORY => 'false',
    BLOCKCACHE => 'false'}]}}
DocData,0000001013071992,1279835733117
  column=info:server, timestamp=1280317959911,
  value=63.250.207.87:60020
DocData,0000001013071992,1279835733117
  column=info:serverstartcode, timestamp=1280317959911,
  value=1279926520261
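As a sanity check on the meta entry, the row named in the exception can be
compared against the region's key range. This is only a minimal sketch: it
assumes row keys are ordered as plain byte strings, with the start key
inclusive and the end key exclusive (an empty key meaning "unbounded").
row_in_region is a hypothetical helper written for this post, not an HBase
API; the three key values are copied from the exception and the meta entry.

```python
def row_in_region(row: bytes, start_key: bytes, end_key: bytes) -> bool:
    """Return True if row lies in [start_key, end_key).

    Assumption: keys compare lexicographically as raw bytes, and an empty
    start or end key means the range is open on that side.
    """
    after_start = start_key == b"" or row >= start_key
    before_end = end_key == b"" or row < end_key
    return after_start and before_end


# Values copied from the exception message and the meta table entry.
row = b"0000001013115520"
start_key = b"0000001013071992"
end_key = b"0000001013205991"

in_range = row_in_region(row, start_key, end_key)
print(in_range)
```

Under those assumptions the failing row does fall inside the region listed in
the meta table, so the lookup appears to be reaching the right region.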
Can you see what is wrong here?
Thank you
Vidhya