[
https://issues.apache.org/jira/browse/HBASE-14533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14949463#comment-14949463
]
stack commented on HBASE-14533:
-------------------------------
Test fails like this if I back out Validimir's HBASE-14196 fix:
{code}
-------------------------------------------------------
Running org.apache.hadoop.hbase.thrift.TestThriftServer
2015-10-08 14:16:11.197 java[71831:570651] Unable to load realm info from
SCDynamicStore
Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 13.299 sec <<<
FAILURE! - in org.apache.hadoop.hbase.thrift.TestThriftServer
testConnectionCache(org.apache.hadoop.hbase.thrift.TestThriftServer) Time
elapsed: 3.341 sec <<< ERROR!
java.lang.RuntimeException:
IOError(message:org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException:
Failed 1 action: IOException: 1 time,
at
org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.makeException(AsyncProcess.java:228)
at
org.apache.hadoop.hbase.client.AsyncProcess$BatchErrors.access$1800(AsyncProcess.java:208)
at
org.apache.hadoop.hbase.client.AsyncProcess.waitForAllPreviousOpsAndReset(AsyncProcess.java:1695)
at
org.apache.hadoop.hbase.client.BufferedMutatorImpl.backgroundFlushCommits(BufferedMutatorImpl.java:208)
at
org.apache.hadoop.hbase.client.BufferedMutatorImpl.flush(BufferedMutatorImpl.java:183)
at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1430)
at org.apache.hadoop.hbase.client.HTable.put(HTable.java:1021)
at
org.apache.hadoop.hbase.thrift.ThriftServerRunner$HBaseHandler.mutateRowTs(ThriftServerRunner.java:1262)
at
org.apache.hadoop.hbase.thrift.ThriftServerRunner$HBaseHandler.mutateRow(ThriftServerRunner.java:1215)
at
org.apache.hadoop.hbase.thrift.TestThriftServer.testConnectionCache(TestThriftServer.java:126)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at
org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:275)
at
org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:173)
at
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:149)
at
org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:128)
at
org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:203)
at
org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:155)
at
org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)
)
at
org.apache.hadoop.hbase.thrift.ThriftServerRunner$HBaseHandler.mutateRowTs(ThriftServerRunner.java:1265)
at
org.apache.hadoop.hbase.thrift.ThriftServerRunner$HBaseHandler.mutateRow(ThriftServerRunner.java:1215)
at
org.apache.hadoop.hbase.thrift.TestThriftServer.testConnectionCache(TestThriftServer.java:126)
Results :
Tests in error:
TestThriftServer.testConnectionCache:129 Runtime
IOError(message:org.apache.ha...
Tests run: 1, Failures: 0, Errors: 1, Skipped: 0
{code}
... if I look at the log of the test I see...the above "2015-09-30 07:22:59,427
ERROR org.apache.hadoop.hbase.client.AsyncProcess: Failed to get region location
java.io.IOException: hconnection-0x4142991e closed"
> Thrift client gets "AsyncProcess: Failed to get region location .... closed"
> ----------------------------------------------------------------------------
>
> Key: HBASE-14533
> URL: https://issues.apache.org/jira/browse/HBASE-14533
> Project: HBase
> Issue Type: Bug
> Components: REST, Thrift
> Affects Versions: 1.0.0
> Reporter: stack
> Assignee: stack
> Attachments: 14533.test.patch, test.patch
>
>
> An internal python client has been getting below stack trace since
> HBASE-134347
> {code}
> 2015-09-30 11:27:31,670 runner ERROR : scheduler
> executor error
> 2015-09-30 11:27:31,674 runner ERROR : Traceback (most
> recent call last):
> File
> "/opt/cops/cops-related-ticket-info-fetcher/fetcher/.virtenv/lib/python2.6/site-packages/CopsRtiFetcher-0.1-py2.6.egg/cops_rti/fetcher/runner.py",
> line 82, in run
> fetch_list = self.__scheduler_executor.run()
> File
> "/opt/cops/cops-related-ticket-info-fetcher/fetcher/.virtenv/lib/python2.6/site-packages/CopsRtiFetcher-0.1-py2.6.egg/cops_rti/fetcher/scheduler.py",
> line 35, in run
> with self.__fetch_db_dao.get_scanner() as scanner:
> File
> "/opt/cops/cops-related-ticket-info-fetcher/fetcher/.virtenv/lib/python2.6/site-packages/CopsHbaseCommon-f796bf2929be11c26536c3e8f3e9c0b0ecb382b3-py2.6.egg/cops/hbase/common/hbase_dao.py",
> line 57, in get_scanner
> caching=caching, field_filter_list=field_filter_list)
> File
> "/opt/cops/cops-related-ticket-info-fetcher/fetcher/.virtenv/lib/python2.6/site-packages/CopsHbaseCommon-f796bf2929be11c26536c3e8f3e9c0b0ecb382b3-py2.6.egg/cops/hbase/common/hbase_client_template.py",
> line 104, in get_entity_scanner
> self.__fix_cfs(self.__filter_columns(field_filter_list)), caching)
> File
> "/opt/cops/cops-related-ticket-info-fetcher/fetcher/.virtenv/lib/python2.6/site-packages/CopsHbaseCommon-f796bf2929be11c26536c3e8f3e9c0b0ecb382b3-py2.6.egg/cops/hbase/common/hbase_entity_scanner.py",
> line 81, in open
> self.__scanner_id = client.scannerOpenWithScan(table_name, scan)
> File
> "/opt/cops/cops-related-ticket-info-fetcher/.crepo/cops-hbase-common/ext-py/hbase/Hbase.py",
> line 1494, in scannerOpenWithScan
> return self.recv_scannerOpenWithScan()
> File
> "/opt/cops/cops-related-ticket-info-fetcher/.crepo/cops-hbase-common/ext-py/hbase/Hbase.py",
> line 1518, in recv_scannerOpenWithScan
> raise result.io
> IOError:
> IOError(message="org.apache.hadoop.hbase.client.RetriesExhaustedException:
> Can't get the location\n\tat
> org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:308)\n\tat
>
> org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:149)\n\tat
>
> org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:57)\n\tat
>
> org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)\n\tat
>
> org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:293)\n\tat
>
> org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:268)\n\tat
>
> org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:140)\n\tat
>
> org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:135)\n\tat
> org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:888)\n\tat
> org.apache.hadoop.hbase.thrift.ThriftServerRunner$HBaseHandler.scannerOpenWithScan(ThriftServerRunner.java:1446)\n\tat
> sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)\n\tat
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\n\tat
> java.lang.reflect.Method.invoke(Method.java:606)\n\tat
> org.apache.hadoop.hbase.thrift.HbaseHandlerMetricsProxy.invoke(HbaseHandlerMetricsProxy.java:67)\n\tat
> com.sun.proxy.$Proxy14.scannerOpenWithScan(Unknown Source)\n\tat
> org.apache.hadoop.hbase.thrift.generated.Hbase$Processor$scannerOpenWithScan.getResult(Hbase.java:4609)\n\tat
>
> org.apache.hadoop.hbase.thrift.generated.Hbase$Processor$scannerOpenWithScan.getResult(Hbase.java:4593)\n\tat
> org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)\n\tat
> org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)\n\tat
> org.apache.hadoop.hbase.thrift.ThriftServerRunner$3.process(ThriftServerRunner.java:502)\n\tat
>
> org.apache.hadoop.hbase.thrift.TBoundedThreadPoolServer$ClientConnnection.run(TBoundedThreadPoolServer.java:289)\n\tat
>
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)\n\tat
>
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)\n\tat
> java.lang.Thread.run(Thread.java:745)\nCaused by: java.io.IOException:
> hconnection-0xa8e1bf9 closed\n\tat
> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1117)\n\tat
>
> org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:299)\n\t...
> 23 more\n")
> {code}
> On the thrift server side we see this:
> {code}
> 2015-09-30 07:22:59,427 ERROR org.apache.hadoop.hbase.client.AsyncProcess:
> Failed to get region location
> java.io.IOException: hconnection-0x4142991e closed
> at
> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1117)
> at
> org.apache.hadoop.hbase.client.AsyncProcess.submit(AsyncProcess.java:369)
> at
> org.apache.hadoop.hbase.client.AsyncProcess.submit(AsyncProcess.java:320)
> at
> org.apache.hadoop.hbase.client.BufferedMutatorImpl.backgroundFlushCommits(BufferedMutatorImpl.java:206)
> at
> org.apache.hadoop.hbase.client.BufferedMutatorImpl.flush(BufferedMutatorImpl.java:183)
> at
> org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1496)
> at org.apache.hadoop.hbase.client.HTable.put(HTable.java:1107)
> at
> org.apache.hadoop.hbase.thrift.ThriftServerRunner$HBaseHandler.mutateRowTs(ThriftServerRunner.java:1256)
> at
> org.apache.hadoop.hbase.thrift.ThriftServerRunner$HBaseHandler.mutateRow(ThriftServerRunner.java:1209)
> at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at
> org.apache.hadoop.hbase.thrift.HbaseHandlerMetricsProxy.invoke(HbaseHandlerMetricsProxy.java:67)
> at com.sun.proxy.$Proxy14.mutateRow(Unknown Source)
> at
> org.apache.hadoop.hbase.thrift.generated.Hbase$Processor$mutateRow.getResult(Hbase.java:4334)
> at
> org.apache.hadoop.hbase.thrift.generated.Hbase$Processor$mutateRow.getResult(Hbase.java:4318)
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
> at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
> at
> org.apache.hadoop.hbase.thrift.ThriftServerRunner$3.process(ThriftServerRunner.java:502)
> at
> org.apache.hadoop.hbase.thrift.TBoundedThreadPoolServer$ClientConnnection.run(TBoundedThreadPoolServer.java:289)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> HBASE-13437 has us actual execute a close on timeout -- before we'd mark
> connection closed but would never call close on it.
> A background chore is going around stamping Connections in the
> ConnectionCache as 'closed' if they have not been used in ten minutes. The
> 'close' can come in at any time..... In particular between the point at which
> we get the table/connection and when we go to use it: i.e. flush puts. It is
> at the flush puts point that we get the above 'AsyncProcess: Failed to get
> region location' (It is not a failure to find region location but rather our
> noticing that the connection has been closed).
> Attempts at reproducing this issue locally letting the Connection timeout can
> generate the above exception if a certain dance is done but it is hard to do;
> I am not reproducing the actual usage by the aforementioned client.
> Next steps would be setting up python client talking via thrift and then try
> using connection after it has been evicted from the connection cache. Another
> thing to try is a pool of connections on the python side...connections are
> identified by user and table.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)