[
https://issues.apache.org/jira/browse/HBASE-14533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14951299#comment-14951299
]
Hadoop QA commented on HBASE-14533:
-----------------------------------
{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12765884/14533v2.branch-1.patch
against branch-1 branch at commit c7efd14726e932235dff52bd73f76cb861b5b1b3.
ATTACHMENT ID: 12765884
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 3 new or modified tests.
{color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.6.1 2.7.0 2.7.1)
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:red}-1 checkstyle{color}. The applied patch generated 3780 checkstyle errors (more than the master's current 3779 errors).
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100.
{color:green}+1 site{color}. The mvn post-site goal succeeds with this patch.
{color:red}-1 core tests{color}. The patch failed these unit tests:
{color:red}-1 core zombie tests{color}. There are 1 zombie test(s):
  at org.apache.hadoop.hbase.thrift.TestThriftServer.testAll(TestThriftServer.java:157)
Test results:
https://builds.apache.org/job/PreCommit-HBASE-Build/15931//testReport/
Release Findbugs (version 2.0.3) warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/15931//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors:
https://builds.apache.org/job/PreCommit-HBASE-Build/15931//artifact/patchprocess/checkstyle-aggregate.html
Console output:
https://builds.apache.org/job/PreCommit-HBASE-Build/15931//console
This message is automatically generated.
> Thrift client gets "AsyncProcess: Failed to get region location .... closed"
> ----------------------------------------------------------------------------
>
> Key: HBASE-14533
> URL: https://issues.apache.org/jira/browse/HBASE-14533
> Project: HBase
> Issue Type: Bug
> Components: REST, Thrift
> Affects Versions: 1.0.0
> Reporter: stack
> Assignee: stack
> Attachments: 14533.test.patch, 14533v2.branch-1.patch, test.patch
>
>
> An internal Python client has been getting the below stack trace since HBASE-13437:
> {code}
> 2015-09-30 11:27:31,670 runner ERROR : scheduler executor error
> 2015-09-30 11:27:31,674 runner ERROR : Traceback (most recent call last):
>   File "/opt/cops/cops-related-ticket-info-fetcher/fetcher/.virtenv/lib/python2.6/site-packages/CopsRtiFetcher-0.1-py2.6.egg/cops_rti/fetcher/runner.py", line 82, in run
>     fetch_list = self.__scheduler_executor.run()
>   File "/opt/cops/cops-related-ticket-info-fetcher/fetcher/.virtenv/lib/python2.6/site-packages/CopsRtiFetcher-0.1-py2.6.egg/cops_rti/fetcher/scheduler.py", line 35, in run
>     with self.__fetch_db_dao.get_scanner() as scanner:
>   File "/opt/cops/cops-related-ticket-info-fetcher/fetcher/.virtenv/lib/python2.6/site-packages/CopsHbaseCommon-f796bf2929be11c26536c3e8f3e9c0b0ecb382b3-py2.6.egg/cops/hbase/common/hbase_dao.py", line 57, in get_scanner
>     caching=caching, field_filter_list=field_filter_list)
>   File "/opt/cops/cops-related-ticket-info-fetcher/fetcher/.virtenv/lib/python2.6/site-packages/CopsHbaseCommon-f796bf2929be11c26536c3e8f3e9c0b0ecb382b3-py2.6.egg/cops/hbase/common/hbase_client_template.py", line 104, in get_entity_scanner
>     self.__fix_cfs(self.__filter_columns(field_filter_list)), caching)
>   File "/opt/cops/cops-related-ticket-info-fetcher/fetcher/.virtenv/lib/python2.6/site-packages/CopsHbaseCommon-f796bf2929be11c26536c3e8f3e9c0b0ecb382b3-py2.6.egg/cops/hbase/common/hbase_entity_scanner.py", line 81, in open
>     self.__scanner_id = client.scannerOpenWithScan(table_name, scan)
>   File "/opt/cops/cops-related-ticket-info-fetcher/.crepo/cops-hbase-common/ext-py/hbase/Hbase.py", line 1494, in scannerOpenWithScan
>     return self.recv_scannerOpenWithScan()
>   File "/opt/cops/cops-related-ticket-info-fetcher/.crepo/cops-hbase-common/ext-py/hbase/Hbase.py", line 1518, in recv_scannerOpenWithScan
>     raise result.io
> IOError: IOError(message="org.apache.hadoop.hbase.client.RetriesExhaustedException: Can't get the location\n\tat
> org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:308)\n\tat
> org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:149)\n\tat
> org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:57)\n\tat
> org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)\n\tat
> org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:293)\n\tat
> org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:268)\n\tat
> org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:140)\n\tat
> org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:135)\n\tat
> org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:888)\n\tat
> org.apache.hadoop.hbase.thrift.ThriftServerRunner$HBaseHandler.scannerOpenWithScan(ThriftServerRunner.java:1446)\n\tat
> sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)\n\tat
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\n\tat
> java.lang.reflect.Method.invoke(Method.java:606)\n\tat
> org.apache.hadoop.hbase.thrift.HbaseHandlerMetricsProxy.invoke(HbaseHandlerMetricsProxy.java:67)\n\tat
> com.sun.proxy.$Proxy14.scannerOpenWithScan(Unknown Source)\n\tat
> org.apache.hadoop.hbase.thrift.generated.Hbase$Processor$scannerOpenWithScan.getResult(Hbase.java:4609)\n\tat
> org.apache.hadoop.hbase.thrift.generated.Hbase$Processor$scannerOpenWithScan.getResult(Hbase.java:4593)\n\tat
> org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)\n\tat
> org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)\n\tat
> org.apache.hadoop.hbase.thrift.ThriftServerRunner$3.process(ThriftServerRunner.java:502)\n\tat
> org.apache.hadoop.hbase.thrift.TBoundedThreadPoolServer$ClientConnnection.run(TBoundedThreadPoolServer.java:289)\n\tat
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)\n\tat
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)\n\tat
> java.lang.Thread.run(Thread.java:745)\nCaused by: java.io.IOException: hconnection-0xa8e1bf9 closed\n\tat
> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1117)\n\tat
> org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:299)\n\t... 23 more\n")
> {code}
> On the thrift server side we see this:
> {code}
> 2015-09-30 07:22:59,427 ERROR org.apache.hadoop.hbase.client.AsyncProcess: Failed to get region location
> java.io.IOException: hconnection-0x4142991e closed
>         at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1117)
>         at org.apache.hadoop.hbase.client.AsyncProcess.submit(AsyncProcess.java:369)
>         at org.apache.hadoop.hbase.client.AsyncProcess.submit(AsyncProcess.java:320)
>         at org.apache.hadoop.hbase.client.BufferedMutatorImpl.backgroundFlushCommits(BufferedMutatorImpl.java:206)
>         at org.apache.hadoop.hbase.client.BufferedMutatorImpl.flush(BufferedMutatorImpl.java:183)
>         at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1496)
>         at org.apache.hadoop.hbase.client.HTable.put(HTable.java:1107)
>         at org.apache.hadoop.hbase.thrift.ThriftServerRunner$HBaseHandler.mutateRowTs(ThriftServerRunner.java:1256)
>         at org.apache.hadoop.hbase.thrift.ThriftServerRunner$HBaseHandler.mutateRow(ThriftServerRunner.java:1209)
>         at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.hadoop.hbase.thrift.HbaseHandlerMetricsProxy.invoke(HbaseHandlerMetricsProxy.java:67)
>         at com.sun.proxy.$Proxy14.mutateRow(Unknown Source)
>         at org.apache.hadoop.hbase.thrift.generated.Hbase$Processor$mutateRow.getResult(Hbase.java:4334)
>         at org.apache.hadoop.hbase.thrift.generated.Hbase$Processor$mutateRow.getResult(Hbase.java:4318)
>         at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>         at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>         at org.apache.hadoop.hbase.thrift.ThriftServerRunner$3.process(ThriftServerRunner.java:502)
>         at org.apache.hadoop.hbase.thrift.TBoundedThreadPoolServer$ClientConnnection.run(TBoundedThreadPoolServer.java:289)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
> {code}
> HBASE-13437 has us actually execute a close on timeout; before, we would mark the connection closed but never call close on it.
> A background chore goes around stamping Connections in the ConnectionCache as 'closed' if they have not been used in ten minutes. That 'close' can come in at any time, in particular between the point at which we get the table/connection and the point at which we go to use it, i.e. when we flush puts. It is at the flush-puts point that we get the above 'AsyncProcess: Failed to get region location' (it is not a failure to find the region location, but rather our noticing that the connection has been closed).
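> The window is easiest to see as two actors racing on one shared connection. Below is a minimal, self-contained Java sketch of that race; FakeConnection and the scheduled task are hypothetical stand-ins, not the real ConnectionCache, ChoreService, or HConnection APIs, and the timings are arbitrary.
> {code}
> // Hypothetical sketch of the eviction race: a background task closes an
> // "idle" connection between the moment a handler obtains it and the moment
> // the handler flushes. None of these names come from HBase itself.
> import java.io.IOException;
> import java.util.concurrent.Executors;
> import java.util.concurrent.ScheduledExecutorService;
> import java.util.concurrent.TimeUnit;
> import java.util.concurrent.atomic.AtomicBoolean;
>
> public class ConnectionEvictionRace {
>   /** Stand-in for a cached connection: only tracks a closed flag. */
>   static class FakeConnection {
>     private final AtomicBoolean closed = new AtomicBoolean(false);
>     void close() { closed.set(true); }
>     void flush() throws IOException {
>       if (closed.get()) {
>         // Mirrors "java.io.IOException: hconnection-0x... closed"
>         throw new IOException("hconnection-0xdeadbeef closed");
>       }
>     }
>   }
>
>   public static void main(String[] args) throws Exception {
>     FakeConnection conn = new FakeConnection();
>
>     // The "chore": closes the connection once it is deemed idle.
>     ScheduledExecutorService chore = Executors.newSingleThreadScheduledExecutor();
>     chore.schedule(conn::close, 50, TimeUnit.MILLISECONDS);
>
>     // The handler: gets the connection, does some work, then flushes.
>     FakeConnection handlerView = conn;   // "get the table/connection"
>     Thread.sleep(100);                   // eviction lands in this gap
>     try {
>       handlerView.flush();               // "flush puts" -> fails
>     } catch (IOException e) {
>       System.out.println("AsyncProcess-style failure: " + e.getMessage());
>     } finally {
>       chore.shutdownNow();
>     }
>   }
> }
> {code}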
> Attempts at reproducing this issue locally by letting the Connection time out can generate the above exception if a certain dance is done, but it is hard to do; I am not reproducing the actual usage by the aforementioned client.
> Next steps would be setting up a Python client talking via Thrift and then trying to use the connection after it has been evicted from the connection cache. Another thing to try is a pool of connections on the Python side; connections are identified by user and table.
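> As a strawman for the pooling idea, here is a short sketch of keying a pool by (user, table). It is written in Java purely for illustration (the report suggests doing this on the Python client side), and PerUserTablePool and its methods are hypothetical, not part of any HBase or Thrift client API.
> {code}
> // Hypothetical client-side pool keyed by (user, table). A connection the
> // server has evicted/closed is dropped via invalidate() so the next get()
> // builds a fresh one from the supplied factory.
> import java.io.Closeable;
> import java.util.Map;
> import java.util.concurrent.ConcurrentHashMap;
> import java.util.function.Supplier;
>
> public class PerUserTablePool<C extends Closeable> {
>   private final Map<String, C> pool = new ConcurrentHashMap<>();
>   private final Supplier<C> factory;
>
>   public PerUserTablePool(Supplier<C> factory) {
>     this.factory = factory;
>   }
>
>   private static String key(String user, String table) {
>     return user + "\0" + table;
>   }
>
>   /** Returns (creating if absent) the connection for this user/table pair. */
>   public C get(String user, String table) {
>     return pool.computeIfAbsent(key(user, table), k -> factory.get());
>   }
>
>   /** Drops a connection that turned out to be closed; the next get() recreates it. */
>   public void invalidate(String user, String table) {
>     pool.remove(key(user, table));
>   }
> }
> {code}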