[ https://issues.apache.org/jira/browse/HBASE-15121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
stack updated HBASE-15121:
--------------------------
    Fix Version/s:     (was: 2.0.0)
                       3.0.0

> ConnectionImplementation#locateRegionInMeta() issue when master is restarted
> ----------------------------------------------------------------------------
>
>                 Key: HBASE-15121
>                 URL: https://issues.apache.org/jira/browse/HBASE-15121
>             Project: HBase
>          Issue Type: Bug
>          Components: Client
>    Affects Versions: 2.0.0
>            Reporter: Samir Ahmic
>            Assignee: Samir Ahmic
>            Priority: Major
>             Fix For: 3.0.0
>
>         Attachments: HBASE-15121-v0.patch, HBASE-15121-v0.patch
>
>
> I noticed this issue while running IntegrationTestMTTR#testRestartMaster(): the test was failing on a put operation.
> Here is the sequence of events from the logs leading to the failed put operation:
> Master restart
> {code}
> INFO [pool-5-thread-1] util.Shell: Executing full command [/usr/bin/ssh hnode2 "sudo -u hbase ps aux | grep proc_master | grep -v grep | tr -s ' ' | cut -d ' ' -f2 | xargs kill -s SIGKILL"]
> {code}
> Client trying to locate the region for row=70efdf2ec9b086079795c442636b55fb-17 (this is additional logging that inspects the metaKey used to search hbase:meta)
> {code}
> 2016-01-15 10:26:05,169 INFO [HBaseWriterThread_9] client.ConnectionImplementation: metaKey inspection: table=IntegrationTestMTTRLoadTestTool row= 70efdf2ec9b086079795c442636b55fb-17 metaKey= IntegrationTestMTTRLoadTestTool,70efdf2ec9b086079795c442636b55fb-17,99999999999999
> {code}
> Client throwing TableNotFoundException (the hbase:meta scan returned null)
> {code}
> 2016-01-15 10:32:58,154 INFO [HBaseWriterThread_5] client.ConnectionImplementation: regionInfo result is null: HBaseWriterThread_5 throwing TableNotFoundException logging details table=IntegrationTestMTTRLoadTestTool row=70efdf2ec9b086079795c442636b55fb-17 metaKey=IntegrationTestMTTRLoadTestTool,70efdf2ec9b086079795c442636b55fb-17,99999999999999
> 2016-01-15 10:32:58,154 ERROR [HBaseWriterThread_5] client.AsyncProcess: Failed to get region location
> org.apache.hadoop.hbase.TableNotFoundException: IntegrationTestMTTRLoadTestTool
>       at org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegionInMeta(ConnectionImplementation.java:890)
>       at org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegion(ConnectionImplementation.java:781)
>       at org.apache.hadoop.hbase.client.AsyncProcess.submit(AsyncProcess.java:396)
>       at org.apache.hadoop.hbase.client.AsyncProcess.submit(AsyncProcess.java:344)
>       at org.apache.hadoop.hbase.client.BufferedMutatorImpl.backgroundFlushCommits(BufferedMutatorImpl.java:239)
>       at org.apache.hadoop.hbase.client.BufferedMutatorImpl.flush(BufferedMutatorImpl.java:191)
>       at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:949)
>       at org.apache.hadoop.hbase.client.HTable.put(HTable.java:569)
>       at org.apache.hadoop.hbase.util.MultiThreadedWriter$HBaseWriterThread.insert(MultiThreadedWriter.java:146)
>       at org.apache.hadoop.hbase.util.MultiThreadedWriter$HBaseWriterThread.run(MultiThreadedWriter.java:111)
> {code}
> As a result, the insert operation fails:
> {code}
> 2016-01-15 10:32:58,179 ERROR [HBaseWriterThread_5] util.MultiThreadedWriter: Failed to insert: 17 after 60046ms; region information: cached: region=IntegrationTestMTTRLoadTestTool,66666660,1452849956427.05b437185a9437f178726a55a29a79b7., hostname=hnode4,16020,1452776418437, seqNum=5; cache is up to date; errors: exception from null for 70efdf2ec9b086079795c442636b55fb-17
> org.apache.hadoop.hbase.TableNotFoundException: IntegrationTestMTTRLoadTestTool
>       at org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegionInMeta(ConnectionImplementation.java:890)
>       at org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegion(ConnectionImplementation.java:781)
>       at org.apache.hadoop.hbase.client.AsyncProcess.submit(AsyncProcess.java:396)
>       at org.apache.hadoop.hbase.client.AsyncProcess.submit(AsyncProcess.java:344)
>       at org.apache.hadoop.hbase.client.BufferedMutatorImpl.backgroundFlushCommits(BufferedMutatorImpl.java:239)
>       at org.apache.hadoop.hbase.client.BufferedMutatorImpl.flush(BufferedMutatorImpl.java:191)
>       at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:949)
>       at org.apache.hadoop.hbase.client.HTable.put(HTable.java:569)
>       at org.apache.hadoop.hbase.util.MultiThreadedWriter$HBaseWriterThread.insert(MultiThreadedWriter.java:146)
>       at org.apache.hadoop.hbase.util.MultiThreadedWriter$HBaseWriterThread.run(MultiThreadedWriter.java:111)
> {code}
> which leads to the test failing:
> {code}
> Failed to write key: 17
> 2016-01-15 10:33:53,984 INFO [main] mttr.IntegrationTestMTTR: RestartMaster failed after 469878ms.
> java.util.concurrent.ExecutionException: java.lang.AssertionError: Load failed expected:<0> but was:<1>
> {code}
> Here is the snippet from ConnectionImplementation#locateRegionInMeta() that throws the exception:
> {code}
>         try {
>           Result regionInfoRow = null;
>           ReversedClientScanner rcs = null;
>           try {
>             rcs = new ClientSmallReversedScanner(conf, s, TableName.META_TABLE_NAME, this,
>               rpcCallerFactory, rpcControllerFactory, getMetaLookupPool(), 0);
>             regionInfoRow = rcs.next();
>           } finally {
>             if (rcs != null) {
>               rcs.close();
>             }
>           }
>           if (regionInfoRow == null) {
>             throw new TableNotFoundException(tableName);
> {code}
> I was able to avoid this issue by removing the throw statement and adding a continue, which allows the client to retry locating the region. This seems like the simplest solution here.
> Thoughts?
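
To illustrate the proposed change (retry on an empty hbase:meta scan instead of throwing immediately), here is a minimal, self-contained sketch. It is not the attached patch and does not use the HBase client API: `LocateRetrySketch`, `locateWithRetry`, `RegionNotFoundException`, and the `maxAttempts`/`pauseMillis` parameters are hypothetical names chosen for illustration, and the meta scan is simulated with a plain `Supplier`.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.function.Supplier;

public class LocateRetrySketch {

    /** Hypothetical exception, thrown only after every attempt came back empty. */
    public static class RegionNotFoundException extends RuntimeException {
        public RegionNotFoundException(String msg) {
            super(msg);
        }
    }

    /**
     * Calls metaLookup up to maxAttempts times. A null result is treated as
     * possibly transient (for example, meta not readable right after a master
     * restart) and retried after a pause, rather than failing on the first miss.
     */
    public static String locateWithRetry(Supplier<String> metaLookup,
                                         int maxAttempts,
                                         long pauseMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            String regionInfoRow = metaLookup.get();
            if (regionInfoRow != null) {
                return regionInfoRow;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(pauseMillis);
                } catch (InterruptedException e) {
                    // Restore the interrupt flag and stop retrying.
                    Thread.currentThread().interrupt();
                    throw new RegionNotFoundException("interrupted while retrying");
                }
            }
            // Equivalent of the "continue" in the description: fall through to the next attempt.
        }
        throw new RegionNotFoundException("no region row after " + maxAttempts + " attempts");
    }

    public static void main(String[] args) {
        // Simulated scan results: two empty scans, then a row.
        Iterator<String> results = Arrays.asList(null, null, "region-A").iterator();
        String row = locateWithRetry(results::next, 5, 10L);
        System.out.println(row); // prints "region-A"
    }
}
```

The sketch bounds the number of attempts so that a table that really does not exist still surfaces an error eventually. Whether the real client should distinguish that case from a transient empty scan is the open question raised above.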