[ https://issues.apache.org/jira/browse/HBASE-26997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Bryan Beaudreault updated HBASE-26997:
--------------------------------------
    Status: Open  (was: Patch Available)

> Auto renew scanner lease in TableRecordReader
> ---------------------------------------------
>
>                 Key: HBASE-26997
>                 URL: https://issues.apache.org/jira/browse/HBASE-26997
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Bryan Beaudreault
>            Assignee: Bryan Beaudreault
>            Priority: Major
>              Labels: patch-available
>
> A common problem with hadoop jobs is when the mapper takes too long to
> process individual inputs. This is especially problematic with
> TableInputFormat because if you don't process a scanner.next() batch within
> the scanner timeout period your job will fail with UnknownScannerException.
> The fix here is usually to reduce Scan.setCaching, so that fewer rows are
> returned within each batch. This isn't always a great solution because maybe
> not all batches are uniform in their processing time, or maybe even
> processing a single row (the smallest caching size) might take a while.
> We can improve this for users by providing a configurable period at which the
> TableRecordReader will automatically call scanner.renewLease() unless next()
> was recently called.

--
This message was sent by Atlassian Jira
(v8.20.7#820007)
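
The renewal behaviour proposed in the description (call scanner.renewLease() on a configurable period unless next() was recently called) could be sketched roughly as below. This is only an illustration under stated assumptions, not the patch attached to the issue: `LeaseHolder` and `LeaseRenewer` are hypothetical names, `LeaseHolder` stands in for the single `renewLease()` method of the scanner so the sketch compiles without HBase on the classpath, and the period would come from whatever configuration key the real change introduces.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for the scanner; only renewLease() is needed here. */
interface LeaseHolder {
    boolean renewLease();
}

/** Hypothetical helper: renews the lease only when next() has been idle. */
final class LeaseRenewer implements AutoCloseable {
    private final LeaseHolder scanner;
    private final long periodMillis;
    // Time of the last next() call or successful renewal.
    private final AtomicLong lastActivityMillis = new AtomicLong();
    private ScheduledExecutorService timer;

    LeaseRenewer(LeaseHolder scanner, long periodMillis) {
        this.scanner = scanner;
        this.periodMillis = periodMillis;
    }

    /** The record reader would call this each time it calls scanner.next(). */
    void markNext(long nowMillis) {
        lastActivityMillis.set(nowMillis);
    }

    /** Renews the lease if no activity was seen within the period. */
    boolean renewIfIdle(long nowMillis) {
        if (nowMillis - lastActivityMillis.get() < periodMillis) {
            return false; // next() was called recently, the lease is fresh
        }
        boolean renewed = scanner.renewLease();
        if (renewed) {
            lastActivityMillis.set(nowMillis);
        }
        return renewed;
    }

    /** Starts a daemon timer that checks for idleness once per period. */
    synchronized void start() {
        if (timer != null) {
            return;
        }
        timer = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "scanner-lease-renewer");
            t.setDaemon(true);
            return t;
        });
        timer.scheduleAtFixedRate(
            () -> renewIfIdle(System.currentTimeMillis()),
            periodMillis, periodMillis, TimeUnit.MILLISECONDS);
    }

    @Override
    public synchronized void close() {
        if (timer != null) {
            timer.shutdownNow();
            timer = null;
        }
    }
}
```

In this sketch the period would need to be comfortably shorter than the scanner timeout, otherwise the lease could expire between checks. The timer thread also calls renewLease() from outside the mapper thread, so a real implementation would have to make sure that call is safe to run alongside an in-flight next().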