[ https://issues.apache.org/jira/browse/HBASE-26997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bryan Beaudreault updated HBASE-26997:
--------------------------------------
    Status: Open  (was: Patch Available)

> Auto renew scanner lease in TableRecordReader
> ---------------------------------------------
>
>                 Key: HBASE-26997
>                 URL: https://issues.apache.org/jira/browse/HBASE-26997
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Bryan Beaudreault
>            Assignee: Bryan Beaudreault
>            Priority: Major
>              Labels: patch-available
>
> A common problem with Hadoop jobs is a mapper that takes too long to 
> process individual inputs. This is especially problematic with 
> TableInputFormat: if you don't finish processing a scanner.next() batch 
> within the scanner timeout period, the job fails with UnknownScannerException.
> The usual fix is to lower Scan.setCaching so that fewer rows are 
> returned in each batch. This isn't always a good solution, because batches 
> may not be uniform in their processing time, and even a single row (the 
> smallest caching size) can take a while to process.
> We can improve this for users by providing a configurable period at which 
> TableRecordReader automatically calls scanner.renewLease() unless next() 
> was called recently.
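
The renewal rule described above could be sketched roughly as follows. This is
an illustration only, not the patch attached to this issue. LeaseRenewer,
RenewableScanner, onNext() and renewIfIdle() are hypothetical names;
RenewableScanner stands in for the single renewLease() call the description
mentions, so the sketch compiles without HBase on the classpath.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical helper: renews a scanner lease when next() has been idle too long. */
final class LeaseRenewer implements AutoCloseable {

    /** Assumed stand-in for the scanner; only renewLease() is needed here. */
    public interface RenewableScanner {
        boolean renewLease();
    }

    private final RenewableScanner scanner;
    private final long periodNanos;
    private final AtomicLong lastNextNanos = new AtomicLong(System.nanoTime());
    private final ScheduledExecutorService timer;

    LeaseRenewer(RenewableScanner scanner, long periodMillis) {
        if (periodMillis <= 0) {
            throw new IllegalArgumentException("periodMillis must be positive");
        }
        this.scanner = scanner;
        this.periodNanos = TimeUnit.MILLISECONDS.toNanos(periodMillis);
        // Daemon thread so the timer never keeps a finished task's JVM alive.
        this.timer = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "scanner-lease-renewer");
            t.setDaemon(true);
            return t;
        });
        // Check once per period whether a renewal is due.
        this.timer.scheduleAtFixedRate(
            () -> renewIfIdle(System.nanoTime()),
            periodMillis, periodMillis, TimeUnit.MILLISECONDS);
    }

    /** The record reader would call this every time it calls next() on the scanner. */
    void onNext() {
        lastNextNanos.set(System.nanoTime());
    }

    /**
     * Renews the lease only if next() has not been called within the period.
     * Returns true when a renewal was attempted and the scanner reported success.
     */
    boolean renewIfIdle(long nowNanos) {
        if (nowNanos - lastNextNanos.get() < periodNanos) {
            return false; // next() was called recently; nothing to do
        }
        return scanner.renewLease();
    }

    @Override
    public void close() {
        timer.shutdownNow();
    }
}
```

A reader built this way would call onNext() around each scanner.next() and
close() when the split is finished. The period would presumably need to be
shorter than the scanner timeout to be useful. The sketch does not address
whether renewLease() is safe to call from a second thread while next() is in
flight; a real implementation would have to settle that.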



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
