[
https://issues.apache.org/jira/browse/HBASE-14533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15173527#comment-15173527
]
Will Hardman commented on HBASE-14533:
--------------------------------------
Hi Pankaj,
Sorry - completely neglected to follow this up! I'm using version
1.1.1.2.3.0.0-2557 on HDP 2.3. Is a newer version available on Hortonworks?
With thanks,
Will
> Thrift client gets "AsyncProcess: Failed to get region location .... closed"
> ----------------------------------------------------------------------------
>
> Key: HBASE-14533
> URL: https://issues.apache.org/jira/browse/HBASE-14533
> Project: HBase
> Issue Type: Bug
> Components: REST, Thrift
> Affects Versions: 1.0.0
> Reporter: stack
> Assignee: stack
> Attachments: 14533.test.patch, 14533v2.branch-1.patch, test.patch
>
>
> An internal Python client has been getting the stack trace below since
> HBASE-13437:
> {code}
> 2015-09-30 11:27:31,670 runner ERROR : scheduler executor error
> 2015-09-30 11:27:31,674 runner ERROR : Traceback (most recent call last):
>   File "/opt/cops/cops-related-ticket-info-fetcher/fetcher/.virtenv/lib/python2.6/site-packages/CopsRtiFetcher-0.1-py2.6.egg/cops_rti/fetcher/runner.py", line 82, in run
>     fetch_list = self.__scheduler_executor.run()
>   File "/opt/cops/cops-related-ticket-info-fetcher/fetcher/.virtenv/lib/python2.6/site-packages/CopsRtiFetcher-0.1-py2.6.egg/cops_rti/fetcher/scheduler.py", line 35, in run
>     with self.__fetch_db_dao.get_scanner() as scanner:
>   File "/opt/cops/cops-related-ticket-info-fetcher/fetcher/.virtenv/lib/python2.6/site-packages/CopsHbaseCommon-f796bf2929be11c26536c3e8f3e9c0b0ecb382b3-py2.6.egg/cops/hbase/common/hbase_dao.py", line 57, in get_scanner
>     caching=caching, field_filter_list=field_filter_list)
>   File "/opt/cops/cops-related-ticket-info-fetcher/fetcher/.virtenv/lib/python2.6/site-packages/CopsHbaseCommon-f796bf2929be11c26536c3e8f3e9c0b0ecb382b3-py2.6.egg/cops/hbase/common/hbase_client_template.py", line 104, in get_entity_scanner
>     self.__fix_cfs(self.__filter_columns(field_filter_list)), caching)
>   File "/opt/cops/cops-related-ticket-info-fetcher/fetcher/.virtenv/lib/python2.6/site-packages/CopsHbaseCommon-f796bf2929be11c26536c3e8f3e9c0b0ecb382b3-py2.6.egg/cops/hbase/common/hbase_entity_scanner.py", line 81, in open
>     self.__scanner_id = client.scannerOpenWithScan(table_name, scan)
>   File "/opt/cops/cops-related-ticket-info-fetcher/.crepo/cops-hbase-common/ext-py/hbase/Hbase.py", line 1494, in scannerOpenWithScan
>     return self.recv_scannerOpenWithScan()
>   File "/opt/cops/cops-related-ticket-info-fetcher/.crepo/cops-hbase-common/ext-py/hbase/Hbase.py", line 1518, in recv_scannerOpenWithScan
>     raise result.io
> IOError: IOError(message="org.apache.hadoop.hbase.client.RetriesExhaustedException: Can't get the location\n\tat
> org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:308)\n\tat
> org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:149)\n\tat
> org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:57)\n\tat
> org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)\n\tat
> org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:293)\n\tat
> org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:268)\n\tat
> org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:140)\n\tat
> org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:135)\n\tat
> org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:888)\n\tat
> org.apache.hadoop.hbase.thrift.ThriftServerRunner$HBaseHandler.scannerOpenWithScan(ThriftServerRunner.java:1446)\n\tat
> sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)\n\tat
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\n\tat
> java.lang.reflect.Method.invoke(Method.java:606)\n\tat
> org.apache.hadoop.hbase.thrift.HbaseHandlerMetricsProxy.invoke(HbaseHandlerMetricsProxy.java:67)\n\tat
> com.sun.proxy.$Proxy14.scannerOpenWithScan(Unknown Source)\n\tat
> org.apache.hadoop.hbase.thrift.generated.Hbase$Processor$scannerOpenWithScan.getResult(Hbase.java:4609)\n\tat
> org.apache.hadoop.hbase.thrift.generated.Hbase$Processor$scannerOpenWithScan.getResult(Hbase.java:4593)\n\tat
> org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)\n\tat
> org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)\n\tat
> org.apache.hadoop.hbase.thrift.ThriftServerRunner$3.process(ThriftServerRunner.java:502)\n\tat
> org.apache.hadoop.hbase.thrift.TBoundedThreadPoolServer$ClientConnnection.run(TBoundedThreadPoolServer.java:289)\n\tat
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)\n\tat
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)\n\tat
> java.lang.Thread.run(Thread.java:745)\nCaused by: java.io.IOException: hconnection-0xa8e1bf9 closed\n\tat
> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1117)\n\tat
> org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:299)\n\t... 23 more\n")
> {code}
> On the thrift server side we see this:
> {code}
> 2015-09-30 07:22:59,427 ERROR org.apache.hadoop.hbase.client.AsyncProcess: Failed to get region location
> java.io.IOException: hconnection-0x4142991e closed
>         at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1117)
>         at org.apache.hadoop.hbase.client.AsyncProcess.submit(AsyncProcess.java:369)
>         at org.apache.hadoop.hbase.client.AsyncProcess.submit(AsyncProcess.java:320)
>         at org.apache.hadoop.hbase.client.BufferedMutatorImpl.backgroundFlushCommits(BufferedMutatorImpl.java:206)
>         at org.apache.hadoop.hbase.client.BufferedMutatorImpl.flush(BufferedMutatorImpl.java:183)
>         at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1496)
>         at org.apache.hadoop.hbase.client.HTable.put(HTable.java:1107)
>         at org.apache.hadoop.hbase.thrift.ThriftServerRunner$HBaseHandler.mutateRowTs(ThriftServerRunner.java:1256)
>         at org.apache.hadoop.hbase.thrift.ThriftServerRunner$HBaseHandler.mutateRow(ThriftServerRunner.java:1209)
>         at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.hadoop.hbase.thrift.HbaseHandlerMetricsProxy.invoke(HbaseHandlerMetricsProxy.java:67)
>         at com.sun.proxy.$Proxy14.mutateRow(Unknown Source)
>         at org.apache.hadoop.hbase.thrift.generated.Hbase$Processor$mutateRow.getResult(Hbase.java:4334)
>         at org.apache.hadoop.hbase.thrift.generated.Hbase$Processor$mutateRow.getResult(Hbase.java:4318)
>         at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>         at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>         at org.apache.hadoop.hbase.thrift.ThriftServerRunner$3.process(ThriftServerRunner.java:502)
>         at org.apache.hadoop.hbase.thrift.TBoundedThreadPoolServer$ClientConnnection.run(TBoundedThreadPoolServer.java:289)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
> {code}
> HBASE-13437 has us actually execute a close on timeout; before, we would mark
> the connection closed but would never call close on it.
> A background chore goes around stamping Connections in the ConnectionCache as
> 'closed' if they have not been used in ten minutes. That close can come in at
> any time, in particular between the point at which we get the table/connection
> and the point at which we go to use it, i.e. when we flush puts. It is at the
> flush-puts point that we get the above 'AsyncProcess: Failed to get region
> location' (it is not a failure to find the region location but rather our
> noticing that the connection has been closed); a sketch of the window follows.
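> Below is a minimal sketch of that window, written against the plain HBase 1.x
> client API rather than the ConnectionCache/Thrift handler themselves; the table
> name, column and configuration are made up for illustration. Closing the
> Connection between obtaining the Table and issuing the put reproduces the same
> symptom: the write path notices the closed connection when it goes to locate
> the region.
> {code}
> // Sketch only: the ConnectionCache idle-timeout chore closing the cached
> // Connection is simulated by an explicit connection.close() between
> // getTable() and put(). Table/column names are invented.
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.hbase.HBaseConfiguration;
> import org.apache.hadoop.hbase.TableName;
> import org.apache.hadoop.hbase.client.Connection;
> import org.apache.hadoop.hbase.client.ConnectionFactory;
> import org.apache.hadoop.hbase.client.Put;
> import org.apache.hadoop.hbase.client.Table;
> import org.apache.hadoop.hbase.util.Bytes;
>
> public class ClosedConnectionRace {
>   public static void main(String[] args) throws Exception {
>     Configuration conf = HBaseConfiguration.create();
>     Connection connection = ConnectionFactory.createConnection(conf);
>     Table table = connection.getTable(TableName.valueOf("t1"));
>
>     Put put = new Put(Bytes.toBytes("row1"));
>     put.addColumn(Bytes.toBytes("f"), Bytes.toBytes("q"), Bytes.toBytes("v"));
>
>     // Stand-in for the idle-timeout chore: the handler already holds the
>     // Table, but the underlying Connection gets closed before the write.
>     connection.close();
>
>     // The write path now has to locate the region on a closed connection,
>     // surfacing the 'hconnection-0x... closed' IOException seen above.
>     table.put(put);
>   }
> }
> {code}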
> Attempts at reproducing this issue locally by letting the Connection time out
> can generate the above exception if a certain dance is done, but it is hard to
> do; I am not reproducing the actual usage by the aforementioned client.
> Next steps would be setting up a Python client talking via Thrift and then
> trying to use the connection after it has been evicted from the connection
> cache (see the sketch below). Another thing to try is a pool of connections on
> the Python side... connections are identified by user and table.
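> As a rough sketch of that first next step, here written as a Java Thrift1
> client rather than Python, and assuming the signatures of the generated
> Hbase.Client in this release; the host, port, table, column and the
> eleven-minute idle wait are all invented for illustration.
> {code}
> // Sketch only: exercise the Thrift gateway, sit idle past the ConnectionCache
> // timeout so the chore closes the cached connection server-side, then reuse
> // the same Thrift socket and watch for the 'hconnection-0x... closed' IOError.
> import java.nio.ByteBuffer;
> import java.util.ArrayList;
> import java.util.HashMap;
> import java.util.List;
> import java.util.Map;
>
> import org.apache.hadoop.hbase.thrift.generated.Hbase;
> import org.apache.hadoop.hbase.thrift.generated.Mutation;
> import org.apache.hadoop.hbase.thrift.generated.TScan;
> import org.apache.thrift.protocol.TBinaryProtocol;
> import org.apache.thrift.transport.TSocket;
> import org.apache.thrift.transport.TTransport;
>
> public class ThriftEvictionRepro {
>   public static void main(String[] args) throws Exception {
>     TTransport transport = new TSocket("thrift-host", 9090);
>     transport.open();
>     Hbase.Client client = new Hbase.Client(new TBinaryProtocol(transport));
>
>     ByteBuffer table = ByteBuffer.wrap("t1".getBytes("UTF-8"));
>     Map<ByteBuffer, ByteBuffer> attrs = new HashMap<ByteBuffer, ByteBuffer>();
>
>     // First use: has the Thrift server open (and cache) a connection
>     // for this user.
>     int scannerId = client.scannerOpenWithScan(table, new TScan(), attrs);
>     client.scannerClose(scannerId);
>
>     // Sit idle past the ConnectionCache timeout (default ten minutes) so the
>     // chore closes the cached connection on the Thrift server side.
>     Thread.sleep(11 * 60 * 1000L);
>
>     // Second use over the same Thrift socket: if the handler reuses the
>     // closed connection, the server should log "AsyncProcess: Failed to get
>     // region location ... hconnection-0x... closed" and the call fails.
>     List<Mutation> mutations = new ArrayList<Mutation>();
>     mutations.add(new Mutation(false, ByteBuffer.wrap("f:q".getBytes("UTF-8")),
>         ByteBuffer.wrap("v".getBytes("UTF-8")), true));
>     client.mutateRow(table, ByteBuffer.wrap("row1".getBytes("UTF-8")),
>         mutations, attrs);
>
>     transport.close();
>   }
> }
> {code}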
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)