[jira] Commented: (HBASE-2481) Client is not getting UnknownScannerExceptions; they are being eaten

Jean-Daniel Cryans (JIRA) Fri, 23 Apr 2010 16:29:16 -0700

    [ https://issues.apache.org/jira/browse/HBASE-2481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12860417#action_12860417 ]


Jean-Daniel Cryans commented on HBASE-2481:
-------------------------------------------

This was caused by HBASE-1671, which changed ScannerCallable as follows:

{code} 
   public Result [] call() throws IOException { 
     if (scannerId != -1L && closed) { 
- server.close(scannerId); 
- scannerId = -1L; 
+ close(); 
     } else if (scannerId == -1L && !closed) { 
- // open the scanner 
- scannerId = openScanner(); 
+ this.scannerId = openScanner(); 
     } else { 
- Result [] rrs = server.next(scannerId, caching); 
+ Result [] rrs = null; 
+ try { 
+ rrs = server.next(scannerId, caching); 
+ } catch (IOException e) { 
+ IOException ioe = null; 
+ if (e instanceof RemoteException) { 
+ ioe = RemoteExceptionHandler.decodeRemoteException((RemoteException)e); 
+ } 
+ if (ioe != null && ioe instanceof NotServingRegionException) { 
+ // Throw a DNRE so that we break out of cycle of calling NSRE 
+ // when what we need is to open scanner against new location. 
+ // Attach NSRE to signal client that it needs to resetup scanner. 
+ throw new DoNotRetryIOException("Reset scanner", ioe); 
+ } 
+ } 
       return rrs == null || rrs.length == 0? null: rrs; 
     } 
      
{code} 
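
To make the effect of the new catch block concrete, here is a minimal sketch of the same shape. Everything in it is a stand-in written for illustration: the exception classes only mimic the hierarchy of the real HBase ones, the Server interface and the outcome helper are hypothetical, and the RemoteException decoding is left out. It shows that an UnknownScannerException from next() is dropped and the caller just sees null, and what rethrowing DoNotRetryIOException would look like.

```java
import java.io.IOException;

// Stand-in types for illustration only; these are NOT the real HBase classes.
class SwallowSketch {
    static class NotServingRegionException extends IOException {
        NotServingRegionException(String msg) { super(msg); }
    }
    static class DoNotRetryIOException extends IOException {
        DoNotRetryIOException(String msg) { super(msg); }
        DoNotRetryIOException(String msg, Throwable cause) { super(msg, cause); }
    }
    static class UnknownScannerException extends DoNotRetryIOException {
        UnknownScannerException(String msg) { super(msg); }
    }

    // Hypothetical stand-in for the region server's next() call.
    interface Server {
        String[] next() throws IOException;
    }

    // Same shape as the patched call(): only an NSRE is turned into a thrown
    // exception, every other IOException is silently dropped.
    static String[] nextSwallowing(Server server) throws IOException {
        String[] rrs = null;
        try {
            rrs = server.next();
        } catch (IOException e) {
            if (e instanceof NotServingRegionException) {
                throw new DoNotRetryIOException("Reset scanner", e);
            }
            // Anything else falls through here and is lost.
        }
        return rrs == null || rrs.length == 0 ? null : rrs;
    }

    // One possible repair (an assumption, not the committed patch):
    // also rethrow anything that is already a DoNotRetryIOException.
    static String[] nextRethrowing(Server server) throws IOException {
        String[] rrs = null;
        try {
            rrs = server.next();
        } catch (IOException e) {
            if (e instanceof NotServingRegionException) {
                throw new DoNotRetryIOException("Reset scanner", e);
            }
            if (e instanceof DoNotRetryIOException) {
                throw e;
            }
        }
        return rrs == null || rrs.length == 0 ? null : rrs;
    }

    // Helper: run one variant against a server whose scanner lease is gone
    // and describe what the caller observes.
    static String outcome(boolean rethrow) {
        Server expired = () -> { throw new UnknownScannerException("lease expired"); };
        try {
            String[] rrs = rethrow ? nextRethrowing(expired) : nextSwallowing(expired);
            return rrs == null ? "null result" : "rows";
        } catch (IOException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println("swallowing: " + outcome(false));
        System.out.println("rethrowing: " + outcome(true));
    }
}
```

In the swallowing variant the caller cannot tell an expired scanner from a finished scan, which matches the silently ending scan described in the issue.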

We now eat the exception if it's not an NSRE. Rethrowing it when it is a 
DoNotRetryIOException is the right thing to do, but even then the client code 
is still broken. In HTable.ClientScanner.next: 

{code} 
try { 
            // Server returns a null values if scanning is to stop. Else, 
            // returns an empty array if scanning is to go on and we've just 
            // exhausted current region. 
            values = getConnection().getRegionServerWithRetries(callable); 
            if (skipFirst) { 
              skipFirst = false; 
              // Reget. 
              values = getConnection().getRegionServerWithRetries(callable); 
            } 
          } catch (DoNotRetryIOException e) { 
            Throwable cause = e.getCause(); 
            if (cause == null || !(cause instanceof NotServingRegionException)) { 
              throw e; 
            } 
            // Else, its signal from depths of ScannerCallable that we got an 
            // NSRE on a next and that we need to reset the scanner. 
            if (this.lastResult != null) { 
              this.scan.setStartRow(this.lastResult.getRow()); 
              // Skip first row returned. We already let it out on previous 
              // invocation. 
              skipFirst = true; 
            } 
            // Clear region 
            this.currentRegion = null; 
            continue; 
          } catch (IOException e) { 
            if (e instanceof UnknownScannerException && 
                lastNext + scannerTimeout < System.currentTimeMillis()) { 
              ScannerTimeoutException ex = new ScannerTimeoutException(); 
              ex.initCause(e); 
              throw ex; 
            } 
            throw e; 
          } 
{code} 

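We catch DoNotRetryIOException first, and only in the later catch clause do we 
check for UnknownScannerException. Since UnknownScannerException extends 
DoNotRetryIOException, it is always handled by the first clause and never 
reaches the second... so ScannerTimeoutException is never thrown! Easy fix.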
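
A minimal sketch of the catch-ordering problem, again with stand-in exception classes that only mimic the hierarchy described above (they are not the real HBase types). The methods return a string naming the branch that ran instead of throwing, purely to make the control flow visible. The fixedOrder variant is one possible way to do it, assumed for illustration and not necessarily the patch that goes in.

```java
import java.io.IOException;

// Stand-in classes for illustration only; these are NOT the real HBase types.
class CatchOrderSketch {
    static class DoNotRetryIOException extends IOException {
        DoNotRetryIOException(String msg) { super(msg); }
        DoNotRetryIOException(String msg, Throwable cause) { super(msg, cause); }
    }
    static class UnknownScannerException extends DoNotRetryIOException {
        UnknownScannerException(String msg) { super(msg); }
    }
    static class NotServingRegionException extends IOException {
        NotServingRegionException(String msg) { super(msg); }
    }

    // Same clause order as the quoted client code: the DoNotRetryIOException
    // clause comes first, so it also receives UnknownScannerException.
    static String brokenOrder(IOException thrown, boolean timedOut) {
        try {
            throw thrown;
        } catch (DoNotRetryIOException e) {
            if (!(e.getCause() instanceof NotServingRegionException)) {
                return "rethrown as " + e.getClass().getSimpleName();
            }
            return "scanner reset";
        } catch (IOException e) {
            // Never true: an UnknownScannerException was caught above.
            if (e instanceof UnknownScannerException && timedOut) {
                return "ScannerTimeoutException";
            }
            return "rethrown as " + e.getClass().getSimpleName();
        }
    }

    // One possible fix: do the timeout check inside the first clause,
    // before deciding between rethrow and scanner reset.
    static String fixedOrder(IOException thrown, boolean timedOut) {
        try {
            throw thrown;
        } catch (DoNotRetryIOException e) {
            if (e instanceof UnknownScannerException && timedOut) {
                return "ScannerTimeoutException";
            }
            if (!(e.getCause() instanceof NotServingRegionException)) {
                return "rethrown as " + e.getClass().getSimpleName();
            }
            return "scanner reset";
        } catch (IOException e) {
            return "rethrown as " + e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        IOException use = new UnknownScannerException("lease expired");
        System.out.println("broken: " + brokenOrder(use, true));
        System.out.println("fixed:  " + fixedOrder(use, true));
    }
}
```

With the original ordering a timed-out UnknownScannerException is rethrown unchanged; with the check moved into the first clause it is reported as a scanner timeout, while the NSRE reset path behaves the same in both variants.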

> Client is not getting UnknownScannerExceptions; they are being eaten
> --------------------------------------------------------------------
>
>                 Key: HBASE-2481
>                 URL: https://issues.apache.org/jira/browse/HBASE-2481
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.20.4
>            Reporter: stack
>            Priority: Blocker
>
> This was reported by mudphone on IRC and confirmed by myself in a quick test.  
> If the client takes too long going back to the RS, the RS will throw an 
> UnknownScannerException but it doesn't get back to the client.  Instead, the 
> client scan silently ends.  Marking this blocker.  It's actually in 0.20.4.  
> That's what I was testing.  Mayhaps an RC sinker?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
