[jira] [Updated] (HBASE-26997) Auto renew scanner lease in TableRecordReader
[ https://issues.apache.org/jira/browse/HBASE-26997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bryan Beaudreault updated HBASE-26997:
--------------------------------------
    Status: Open  (was: Patch Available)

> Auto renew scanner lease in TableRecordReader
> ---------------------------------------------
>
>                 Key: HBASE-26997
>                 URL: https://issues.apache.org/jira/browse/HBASE-26997
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Bryan Beaudreault
>            Assignee: Bryan Beaudreault
>            Priority: Major
>              Labels: patch-available
>
> A common problem with hadoop jobs is when the mapper takes too long to
> process individual inputs. This is especially problematic with
> TableInputFormat because if you don't process a scanner.next() batch within
> the scanner timeout period your job will fail with UnknownScannerException.
> The fix here is usually to reduce Scan.setCaching, so that fewer rows are
> returned within each batch. This isn't always a great solution because maybe
> not all batches are uniform in their processing time, or maybe even
> processing a single row (the smallest caching size) might take a while.
> We can improve this for users by providing a configurable period at which the
> TableRecordReader will automatically call scanner.renewLease() unless next()
> was recently called.

--
This message was sent by Atlassian Jira
(v8.20.7#820007)
[jira] [Updated] (HBASE-26997) Auto renew scanner lease in TableRecordReader
[ https://issues.apache.org/jira/browse/HBASE-26997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bryan Beaudreault updated HBASE-26997:
--------------------------------------
    Labels: patch-available  (was: )
    Status: Patch Available  (was: Open)

> Auto renew scanner lease in TableRecordReader
> ---------------------------------------------
>
>                 Key: HBASE-26997
>                 URL: https://issues.apache.org/jira/browse/HBASE-26997
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Bryan Beaudreault
>            Assignee: Bryan Beaudreault
>            Priority: Major
>              Labels: patch-available
>
> A common problem with hadoop jobs is when the mapper takes too long to
> process individual inputs. This is especially problematic with
> TableInputFormat because if you don't process a scanner.next() batch within
> the scanner timeout period your job will fail with UnknownScannerException.
> The fix here is usually to reduce Scan.setCaching, so that fewer rows are
> returned within each batch. This isn't always a great solution because maybe
> not all batches are uniform in their processing time, or maybe even
> processing a single row (the smallest caching size) might take a while.
> We can improve this for users by providing a configurable period at which the
> TableRecordReader will automatically call scanner.renewLease() unless next()
> was recently called.
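
For readers following the issue, the proposed behaviour (renew the lease on a configurable period unless next() was recently called) could be sketched roughly as below. This is not the HBASE-26997 patch. The class name ScannerLeaseRenewer, its methods, and the injected clock are all hypothetical, and the scanner is represented only by a BooleanSupplier so that the sketch stays free of HBase dependencies; in real code that supplier would presumably wrap scanner.renewLease().

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.BooleanSupplier;
import java.util.function.LongSupplier;

/** Hypothetical helper: renews a scanner lease while the consumer is busy. */
final class ScannerLeaseRenewer implements AutoCloseable {
  private final BooleanSupplier renewLease; // assumption: wraps scanner.renewLease()
  private final LongSupplier clockNanos;    // injectable clock, e.g. System::nanoTime
  private final long periodNanos;
  private final AtomicLong lastNextNanos;
  private final ScheduledExecutorService pool;

  ScannerLeaseRenewer(BooleanSupplier renewLease, long periodMillis, LongSupplier clockNanos) {
    this.renewLease = renewLease;
    this.clockNanos = clockNanos;
    this.periodNanos = TimeUnit.MILLISECONDS.toNanos(periodMillis);
    this.lastNextNanos = new AtomicLong(clockNanos.getAsLong());
    // Daemon thread so an unfinished renewer never keeps the JVM alive.
    this.pool = Executors.newSingleThreadScheduledExecutor(r -> {
      Thread t = new Thread(r, "scanner-lease-renewer");
      t.setDaemon(true);
      return t;
    });
  }

  /** Call after every scanner.next(); a fetch already refreshes the lease. */
  void onNext() {
    lastNextNanos.set(clockNanos.getAsLong());
  }

  /** Renews only if next() has not been called within the last period. */
  boolean renewIfIdle() {
    long idleNanos = clockNanos.getAsLong() - lastNextNanos.get();
    if (idleNanos < periodNanos) {
      return false; // next() was recently called, nothing to do
    }
    return renewLease.getAsBoolean();
  }

  /** Starts the periodic check; an exception from renewLease stops later runs. */
  void start() {
    pool.scheduleAtFixedRate(this::renewIfIdle, periodNanos, periodNanos, TimeUnit.NANOSECONDS);
  }

  @Override
  public void close() {
    pool.shutdownNow();
  }
}
```

In this sketch a check that lands shortly after a next() call is skipped, so the lease can go untouched for almost two periods; the period would therefore need to be comfortably below half of the scanner timeout. A real implementation would also have to make sure the renewal does not run concurrently with next() on the same scanner, which this sketch does not attempt.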