[jira] [Commented] (HBASE-20618) Skip large rows instead of throwing an exception to client

2019-04-11 Thread Swapna (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16815712#comment-16815712
 ] 

Swapna commented on HBASE-20618:


This is done specific to our use case. We can get rid of server side filtering 
on a CF(large) and take benefit of JoinedScanner if we have some way to handle 
big rows  on server side.

In order to generalize this optimization to list of filters with MUST_PASS_ALL, 
filter api’s need to be modified and involves big effort.

Would love to hear if that will be useful for the community. 

Waiting to hear some suggestions. Will be happy to incorporate the changes. 
Otherwise can be closed if this is not useful for many users.Thanks.

> Skip large rows instead of throwing an exception to client
> --
>
> Key: HBASE-20618
> URL: https://issues.apache.org/jira/browse/HBASE-20618
> Project: HBase
>  Issue Type: New Feature
>Reporter: Swapna
>Priority: Minor
> Fix For: 3.0.0, 1.5.0, 2.3.0
>
> Attachments: HBASE-20618.hbasemaster.v01.patch, 
> HBASE-20618.hbasemaster.v02.patch, HBASE-20618.v1.branch-1.patch, 
> HBASE-20618.v1.branch-1.patch
>
>
> Currently HBase supports throwing RowTooBigException incase there is a row 
> with one of the column family data exceeds the configured maximum
> https://issues.apache.org/jira/browse/HBASE-10925?attachmentOrder=desc
> We have some bad rows growing very large. We need a way to skip these rows 
> for most of our jobs.
> Some of the options we considered:
> Option 1:
> Hbase client handle the exception and restart the scanner past bad row by 
> capturing the row key where it failed. Can be by adding the rowkey to the 
> exception stack trace, which seems brittle. Client would ignore the setting 
> if its upgraded before server.
> Option 2:
> Skip through big rows on Server.Go with server level config similar to 
> "hbase.table.max.rowsize" or request based by changing the scan request api. 
> If allowed to do per request, based on the scan request config, Client will 
> have to ignore the setting if its upgraded before server.
> {code}
> try {
>  populateResult(results, this.storeHeap, scannerContext, current);
>  } catch(RowTooBigException e) {
>  LOG.info("Row exceeded the limit in storeheap. Skipping row with 
> key:"+Bytes.toString(current.getRowArray()));
>  this.storeHeap.reseek(PrivateCellUtil.createLastOnRow(current));
>  results.clear();
>  scannerContext.clearProgress();
>  continue;
>  }
> {code}
> Prefer the option 2 with server level config. Please share your inputs



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20618) Skip large rows instead of throwing an exception to client

2019-04-11 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16815619#comment-16815619
 ] 

Andrew Purtell commented on HBASE-20618:


Any progress here? Or unschedule it? Or close it?

> Skip large rows instead of throwing an exception to client
> --
>
> Key: HBASE-20618
> URL: https://issues.apache.org/jira/browse/HBASE-20618
> Project: HBase
>  Issue Type: New Feature
>Reporter: Swapna
>Priority: Minor
> Fix For: 3.0.0, 1.5.0, 2.3.0
>
> Attachments: HBASE-20618.hbasemaster.v01.patch, 
> HBASE-20618.hbasemaster.v02.patch, HBASE-20618.v1.branch-1.patch, 
> HBASE-20618.v1.branch-1.patch
>
>
> Currently HBase supports throwing RowTooBigException incase there is a row 
> with one of the column family data exceeds the configured maximum
> https://issues.apache.org/jira/browse/HBASE-10925?attachmentOrder=desc
> We have some bad rows growing very large. We need a way to skip these rows 
> for most of our jobs.
> Some of the options we considered:
> Option 1:
> Hbase client handle the exception and restart the scanner past bad row by 
> capturing the row key where it failed. Can be by adding the rowkey to the 
> exception stack trace, which seems brittle. Client would ignore the setting 
> if its upgraded before server.
> Option 2:
> Skip through big rows on Server.Go with server level config similar to 
> "hbase.table.max.rowsize" or request based by changing the scan request api. 
> If allowed to do per request, based on the scan request config, Client will 
> have to ignore the setting if its upgraded before server.
> {code}
> try {
>  populateResult(results, this.storeHeap, scannerContext, current);
>  } catch(RowTooBigException e) {
>  LOG.info("Row exceeded the limit in storeheap. Skipping row with 
> key:"+Bytes.toString(current.getRowArray()));
>  this.storeHeap.reseek(PrivateCellUtil.createLastOnRow(current));
>  results.clear();
>  scannerContext.clearProgress();
>  continue;
>  }
> {code}
> Prefer the option 2 with server level config. Please share your inputs



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20618) Skip large rows instead of throwing an exception to client

2019-01-31 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16758010#comment-16758010
 ] 

Hadoop QA commented on HBASE-20618:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
19s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
1s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} branch-1 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  5m 
43s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
 6s{color} | {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
49s{color} | {color:green} branch-1 passed with JDK v1.8.0_201 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
54s{color} | {color:green} branch-1 passed with JDK v1.7.0_201 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
43s{color} | {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  2m 
31s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
46s{color} | {color:green} branch-1 passed with JDK v1.8.0_201 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
54s{color} | {color:green} branch-1 passed with JDK v1.7.0_201 {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
12s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed with JDK v1.8.0_201 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed with JDK v1.7.0_201 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  2m 
25s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  
1m 34s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed with JDK v1.8.0_201 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed with JDK v1.7.0_201 {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
24s{color} | {color:green} hbase-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}108m 
40s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
36s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}139m 46s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:61288f8 |
| 

[jira] [Commented] (HBASE-20618) Skip large rows instead of throwing an exception to client

2018-06-18 Thread churro morales (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516009#comment-16516009
 ] 

churro morales commented on HBASE-20618:


that would be ideal, but we don't have any nice way of sending back the rowkey 
of the large row back to the client.  I definitely don't want to parse the 
exception message for it.  

> Skip large rows instead of throwing an exception to client
> --
>
> Key: HBASE-20618
> URL: https://issues.apache.org/jira/browse/HBASE-20618
> Project: HBase
>  Issue Type: New Feature
>Reporter: Swapna
>Priority: Minor
> Fix For: 3.0.0, 2.0.1, 1.4.6
>
> Attachments: HBASE-20618.hbasemaster.v01.patch, 
> HBASE-20618.hbasemaster.v02.patch, HBASE-20618.v1.branch-1.patch, 
> HBASE-20618.v1.branch-1.patch
>
>
> Currently HBase supports throwing RowTooBigException incase there is a row 
> with one of the column family data exceeds the configured maximum
> https://issues.apache.org/jira/browse/HBASE-10925?attachmentOrder=desc
> We have some bad rows growing very large. We need a way to skip these rows 
> for most of our jobs.
> Some of the options we considered:
> Option 1:
> Hbase client handle the exception and restart the scanner past bad row by 
> capturing the row key where it failed. Can be by adding the rowkey to the 
> exception stack trace, which seems brittle. Client would ignore the setting 
> if its upgraded before server.
> Option 2:
> Skip through big rows on Server.Go with server level config similar to 
> "hbase.table.max.rowsize" or request based by changing the scan request api. 
> If allowed to do per request, based on the scan request config, Client will 
> have to ignore the setting if its upgraded before server.
> {code}
> try {
>  populateResult(results, this.storeHeap, scannerContext, current);
>  } catch(RowTooBigException e) {
>  LOG.info("Row exceeded the limit in storeheap. Skipping row with 
> key:"+Bytes.toString(current.getRowArray()));
>  this.storeHeap.reseek(PrivateCellUtil.createLastOnRow(current));
>  results.clear();
>  scannerContext.clearProgress();
>  continue;
>  }
> {code}
> Prefer the option 2 with server level config. Please share your inputs



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20618) Skip large rows instead of throwing an exception to client

2018-06-18 Thread Swapna (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516001#comment-16516001
 ] 

Swapna commented on HBASE-20618:


[~elserj] , [~eclark]

Any suggestions or alternatives? Do you prefer option 1 (Handling the same on 
client side ) ? 

 

> Skip large rows instead of throwing an exception to client
> --
>
> Key: HBASE-20618
> URL: https://issues.apache.org/jira/browse/HBASE-20618
> Project: HBase
>  Issue Type: New Feature
>Reporter: Swapna
>Priority: Minor
> Fix For: 3.0.0, 2.0.1, 1.4.6
>
> Attachments: HBASE-20618.hbasemaster.v01.patch, 
> HBASE-20618.hbasemaster.v02.patch, HBASE-20618.v1.branch-1.patch, 
> HBASE-20618.v1.branch-1.patch
>
>
> Currently HBase supports throwing RowTooBigException incase there is a row 
> with one of the column family data exceeds the configured maximum
> https://issues.apache.org/jira/browse/HBASE-10925?attachmentOrder=desc
> We have some bad rows growing very large. We need a way to skip these rows 
> for most of our jobs.
> Some of the options we considered:
> Option 1:
> Hbase client handle the exception and restart the scanner past bad row by 
> capturing the row key where it failed. Can be by adding the rowkey to the 
> exception stack trace, which seems brittle. Client would ignore the setting 
> if its upgraded before server.
> Option 2:
> Skip through big rows on Server.Go with server level config similar to 
> "hbase.table.max.rowsize" or request based by changing the scan request api. 
> If allowed to do per request, based on the scan request config, Client will 
> have to ignore the setting if its upgraded before server.
> {code}
> try {
>  populateResult(results, this.storeHeap, scannerContext, current);
>  } catch(RowTooBigException e) {
>  LOG.info("Row exceeded the limit in storeheap. Skipping row with 
> key:"+Bytes.toString(current.getRowArray()));
>  this.storeHeap.reseek(PrivateCellUtil.createLastOnRow(current));
>  results.clear();
>  scannerContext.clearProgress();
>  continue;
>  }
> {code}
> Prefer the option 2 with server level config. Please share your inputs



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20618) Skip large rows instead of throwing an exception to client

2018-06-08 Thread Swapna (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16506309#comment-16506309
 ] 

Swapna commented on HBASE-20618:


Some more context about our use case to discuss more on available options. 

We have a server side filter on 2 cf's  - A (small cf), B (large cf). B can go 
very large in some odd cases. Currently we filter rows on server side based on 
estimated size as we read through B to avoid OOM/ Exception to client.  We want 
to remove the filtering on B to take benefit of JoinedScanner and skip through 
B when rows are filtered out based on A. We also need filterRow() to filter 
rows with missing cells.

So we wanted a way to handle big rows outside the filter. 

> Skip large rows instead of throwing an exception to client
> --
>
> Key: HBASE-20618
> URL: https://issues.apache.org/jira/browse/HBASE-20618
> Project: HBase
>  Issue Type: New Feature
>Reporter: Swapna
>Priority: Minor
> Fix For: 3.0.0, 2.0.1, 1.4.6
>
> Attachments: HBASE-20618.hbasemaster.v01.patch, 
> HBASE-20618.hbasemaster.v02.patch, HBASE-20618.v1.branch-1.patch, 
> HBASE-20618.v1.branch-1.patch
>
>
> Currently HBase supports throwing RowTooBigException incase there is a row 
> with one of the column family data exceeds the configured maximum
> https://issues.apache.org/jira/browse/HBASE-10925?attachmentOrder=desc
> We have some bad rows growing very large. We need a way to skip these rows 
> for most of our jobs.
> Some of the options we considered:
> Option 1:
> Hbase client handle the exception and restart the scanner past bad row by 
> capturing the row key where it failed. Can be by adding the rowkey to the 
> exception stack trace, which seems brittle. Client would ignore the setting 
> if its upgraded before server.
> Option 2:
> Skip through big rows on Server.Go with server level config similar to 
> "hbase.table.max.rowsize" or request based by changing the scan request api. 
> If allowed to do per request, based on the scan request config, Client will 
> have to ignore the setting if its upgraded before server.
> {code}
> try {
>  populateResult(results, this.storeHeap, scannerContext, current);
>  } catch(RowTooBigException e) {
>  LOG.info("Row exceeded the limit in storeheap. Skipping row with 
> key:"+Bytes.toString(current.getRowArray()));
>  this.storeHeap.reseek(PrivateCellUtil.createLastOnRow(current));
>  results.clear();
>  scannerContext.clearProgress();
>  continue;
>  }
> {code}
> Prefer the option 2 with server level config. Please share your inputs



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20618) Skip large rows instead of throwing an exception to client

2018-06-07 Thread Josh Elser (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16504816#comment-16504816
 ] 

Josh Elser commented on HBASE-20618:


{quote}This seems like the wrong way to go about this. HBase has always been 
about strong consistency. We fail things rather than return the fastest easiest 
answer. That seems like the pattern we should take.
{quote}
Yeah, +1 to this. This seems like the wrong thing to encourage.
{quote}If a row is too big then we already provide the ability to allow partial 
results that can facilitate reading rows too large to send in one rpc.
{quote}
 
{quote}we have a server side filter with hasFilterRow set to true. We drop 
results based on some cells missing for a row. And this is incompatible with 
partial results as row boundaries are not known.
{quote}
So, the real problem is that your custom server-side filter can't work in 
conjunction with the existing functionality to chunk up a row? Shouldn't the 
fix be around that?

> Skip large rows instead of throwing an exception to client
> --
>
> Key: HBASE-20618
> URL: https://issues.apache.org/jira/browse/HBASE-20618
> Project: HBase
>  Issue Type: New Feature
>Reporter: Swapna
>Priority: Minor
> Fix For: 3.0.0, 2.0.1, 1.4.6
>
> Attachments: HBASE-20618.hbasemaster.v01.patch, 
> HBASE-20618.hbasemaster.v02.patch, HBASE-20618.v1.branch-1.patch, 
> HBASE-20618.v1.branch-1.patch
>
>
> Currently HBase supports throwing RowTooBigException incase there is a row 
> with one of the column family data exceeds the configured maximum
> https://issues.apache.org/jira/browse/HBASE-10925?attachmentOrder=desc
> We have some bad rows growing very large. We need a way to skip these rows 
> for most of our jobs.
> Some of the options we considered:
> Option 1:
> Hbase client handle the exception and restart the scanner past bad row by 
> capturing the row key where it failed. Can be by adding the rowkey to the 
> exception stack trace, which seems brittle. Client would ignore the setting 
> if its upgraded before server.
> Option 2:
> Skip through big rows on Server.Go with server level config similar to 
> "hbase.table.max.rowsize" or request based by changing the scan request api. 
> If allowed to do per request, based on the scan request config, Client will 
> have to ignore the setting if its upgraded before server.
> {code}
> try {
>  populateResult(results, this.storeHeap, scannerContext, current);
>  } catch(RowTooBigException e) {
>  LOG.info("Row exceeded the limit in storeheap. Skipping row with 
> key:"+Bytes.toString(current.getRowArray()));
>  this.storeHeap.reseek(PrivateCellUtil.createLastOnRow(current));
>  results.clear();
>  scannerContext.clearProgress();
>  continue;
>  }
> {code}
> Prefer the option 2 with server level config. Please share your inputs



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20618) Skip large rows instead of throwing an exception to client

2018-05-31 Thread Swapna (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16497492#comment-16497492
 ] 

Swapna commented on HBASE-20618:


Thanks [~eclark]

Looked into that option. But we have a server side filter with hasFilterRow set 
to true. We drop results based on some cells missing for a row. And this is 
incompatible with partial results as row boundaries are not known.

> Skip large rows instead of throwing an exception to client
> --
>
> Key: HBASE-20618
> URL: https://issues.apache.org/jira/browse/HBASE-20618
> Project: HBase
>  Issue Type: New Feature
>Reporter: Swapna
>Priority: Minor
> Fix For: 3.0.0, 2.0.1, 1.4.5
>
> Attachments: HBASE-20618.hbasemaster.v01.patch, 
> HBASE-20618.hbasemaster.v02.patch, HBASE-20618.v1.branch-1.patch, 
> HBASE-20618.v1.branch-1.patch
>
>
> Currently HBase supports throwing RowTooBigException incase there is a row 
> with one of the column family data exceeds the configured maximum
> https://issues.apache.org/jira/browse/HBASE-10925?attachmentOrder=desc
> We have some bad rows growing very large. We need a way to skip these rows 
> for most of our jobs.
> Some of the options we considered:
> Option 1:
> Hbase client handle the exception and restart the scanner past bad row by 
> capturing the row key where it failed. Can be by adding the rowkey to the 
> exception stack trace, which seems brittle. Client would ignore the setting 
> if its upgraded before server.
> Option 2:
> Skip through big rows on Server.Go with server level config similar to 
> "hbase.table.max.rowsize" or request based by changing the scan request api. 
> If allowed to do per request, based on the scan request config, Client will 
> have to ignore the setting if its upgraded before server.
> {code}
> try {
>  populateResult(results, this.storeHeap, scannerContext, current);
>  } catch(RowTooBigException e) {
>  LOG.info("Row exceeded the limit in storeheap. Skipping row with 
> key:"+Bytes.toString(current.getRowArray()));
>  this.storeHeap.reseek(PrivateCellUtil.createLastOnRow(current));
>  results.clear();
>  scannerContext.clearProgress();
>  continue;
>  }
> {code}
> Prefer the option 2 with server level config. Please share your inputs



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20618) Skip large rows instead of throwing an exception to client

2018-05-31 Thread Elliott Clark (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16497357#comment-16497357
 ] 

Elliott Clark commented on HBASE-20618:
---

This seems like the wrong way to go about this. HBase has always been about 
strong consistency. We fail things rather than return the fastest easiest 
answer. That seems like the pattern we should take.

If a row is too big then we already provide the ability to allow partial 
results that can facilitate reading rows too large to send in one rpc.

> Skip large rows instead of throwing an exception to client
> --
>
> Key: HBASE-20618
> URL: https://issues.apache.org/jira/browse/HBASE-20618
> Project: HBase
>  Issue Type: New Feature
>Reporter: Swapna
>Priority: Minor
> Fix For: 3.0.0, 2.0.1, 1.4.5
>
> Attachments: HBASE-20618.hbasemaster.v01.patch, 
> HBASE-20618.hbasemaster.v02.patch, HBASE-20618.v1.branch-1.patch, 
> HBASE-20618.v1.branch-1.patch
>
>
> Currently HBase supports throwing RowTooBigException incase there is a row 
> with one of the column family data exceeds the configured maximum
> https://issues.apache.org/jira/browse/HBASE-10925?attachmentOrder=desc
> We have some bad rows growing very large. We need a way to skip these rows 
> for most of our jobs.
> Some of the options we considered:
> Option 1:
> Hbase client handle the exception and restart the scanner past bad row by 
> capturing the row key where it failed. Can be by adding the rowkey to the 
> exception stack trace, which seems brittle. Client would ignore the setting 
> if its upgraded before server.
> Option 2:
> Skip through big rows on Server.Go with server level config similar to 
> "hbase.table.max.rowsize" or request based by changing the scan request api. 
> If allowed to do per request, based on the scan request config, Client will 
> have to ignore the setting if its upgraded before server.
> {code}
> try {
>  populateResult(results, this.storeHeap, scannerContext, current);
>  } catch(RowTooBigException e) {
>  LOG.info("Row exceeded the limit in storeheap. Skipping row with 
> key:"+Bytes.toString(current.getRowArray()));
>  this.storeHeap.reseek(PrivateCellUtil.createLastOnRow(current));
>  results.clear();
>  scannerContext.clearProgress();
>  continue;
>  }
> {code}
> Prefer the option 2 with server level config. Please share your inputs



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20618) Skip large rows instead of throwing an exception to client

2018-05-31 Thread Swapna (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16497347#comment-16497347
 ] 

Swapna commented on HBASE-20618:


Seems unrelated to my changes.

> Skip large rows instead of throwing an exception to client
> --
>
> Key: HBASE-20618
> URL: https://issues.apache.org/jira/browse/HBASE-20618
> Project: HBase
>  Issue Type: New Feature
>Reporter: Swapna
>Priority: Minor
> Fix For: 3.0.0, 2.0.1, 1.4.5
>
> Attachments: HBASE-20618.hbasemaster.v01.patch, 
> HBASE-20618.hbasemaster.v02.patch, HBASE-20618.v1.branch-1.patch, 
> HBASE-20618.v1.branch-1.patch
>
>
> Currently HBase supports throwing RowTooBigException incase there is a row 
> with one of the column family data exceeds the configured maximum
> https://issues.apache.org/jira/browse/HBASE-10925?attachmentOrder=desc
> We have some bad rows growing very large. We need a way to skip these rows 
> for most of our jobs.
> Some of the options we considered:
> Option 1:
> Hbase client handle the exception and restart the scanner past bad row by 
> capturing the row key where it failed. Can be by adding the rowkey to the 
> exception stack trace, which seems brittle. Client would ignore the setting 
> if its upgraded before server.
> Option 2:
> Skip through big rows on Server.Go with server level config similar to 
> "hbase.table.max.rowsize" or request based by changing the scan request api. 
> If allowed to do per request, based on the scan request config, Client will 
> have to ignore the setting if its upgraded before server.
> {code}
> try {
>  populateResult(results, this.storeHeap, scannerContext, current);
>  } catch(RowTooBigException e) {
>  LOG.info("Row exceeded the limit in storeheap. Skipping row with 
> key:"+Bytes.toString(current.getRowArray()));
>  this.storeHeap.reseek(PrivateCellUtil.createLastOnRow(current));
>  results.clear();
>  scannerContext.clearProgress();
>  continue;
>  }
> {code}
> Prefer the option 2 with server level config. Please share your inputs



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20618) Skip large rows instead of throwing an exception to client

2018-05-31 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16497313#comment-16497313
 ] 

Hadoop QA commented on HBASE-20618:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
15s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} branch-1 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
16s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
58s{color} | {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
50s{color} | {color:green} branch-1 passed with JDK v1.8.0_172 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
58s{color} | {color:green} branch-1 passed with JDK v1.7.0_181 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
48s{color} | {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  2m 
38s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
43s{color} | {color:green} branch-1 passed with JDK v1.8.0_172 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
53s{color} | {color:green} branch-1 passed with JDK v1.7.0_181 {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed with JDK v1.8.0_172 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed with JDK v1.7.0_181 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
24s{color} | {color:red} hbase-common: The patch generated 1 new + 9 unchanged 
- 1 fixed = 10 total (was 10) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 
22s{color} | {color:red} hbase-server: The patch generated 3 new + 313 
unchanged - 3 fixed = 316 total (was 316) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  2m 
36s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  
1m 33s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed with JDK v1.8.0_172 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed with JDK v1.7.0_181 {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m  
4s{color} | {color:green} hbase-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 97m 20s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
44s{color} | {color:green} The patch does not generate ASF 

[jira] [Commented] (HBASE-20618) Skip large rows instead of throwing an exception to client

2018-05-31 Thread churro morales (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16497176#comment-16497176
 ] 

churro morales commented on HBASE-20618:


after of course tests pass. 

> Skip large rows instead of throwing an exception to client
> --
>
> Key: HBASE-20618
> URL: https://issues.apache.org/jira/browse/HBASE-20618
> Project: HBase
>  Issue Type: New Feature
>Affects Versions: 3.0.0, 2.1.0, 1.4.5
>Reporter: Swapna
>Priority: Minor
> Attachments: HBASE-20618.hbasemaster.v01.patch, 
> HBASE-20618.hbasemaster.v02.patch, HBASE-20618.v1.branch-1.patch, 
> HBASE-20618.v1.branch-1.patch
>
>
> Currently HBase supports throwing RowTooBigException incase there is a row 
> with one of the column family data exceeds the configured maximum
> https://issues.apache.org/jira/browse/HBASE-10925?attachmentOrder=desc
> We have some bad rows growing very large. We need a way to skip these rows 
> for most of our jobs.
> Some of the options we considered:
> Option 1:
> Hbase client handle the exception and restart the scanner past bad row by 
> capturing the row key where it failed. Can be by adding the rowkey to the 
> exception stack trace, which seems brittle. Client would ignore the setting 
> if its upgraded before server.
> Option 2:
> Skip through big rows on Server.Go with server level config similar to 
> "hbase.table.max.rowsize" or request based by changing the scan request api. 
> If allowed to do per request, based on the scan request config, Client will 
> have to ignore the setting if its upgraded before server.
> {code}
> try {
>  populateResult(results, this.storeHeap, scannerContext, current);
>  } catch(RowTooBigException e) {
>  LOG.info("Row exceeded the limit in storeheap. Skipping row with 
> key:"+Bytes.toString(current.getRowArray()));
>  this.storeHeap.reseek(PrivateCellUtil.createLastOnRow(current));
>  results.clear();
>  scannerContext.clearProgress();
>  continue;
>  }
> {code}
> Prefer the option 2 with server level config. Please share your inputs



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20618) Skip large rows instead of throwing an exception to client

2018-05-31 Thread churro morales (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16497175#comment-16497175
 ] 

churro morales commented on HBASE-20618:


lgtm, anyone else watching have any objections?  I'll commit in a day or two if 
I don't hear any objections.  Thank you for the patch [~mswapna]

> Skip large rows instead of throwing an exception to client
> --
>
> Key: HBASE-20618
> URL: https://issues.apache.org/jira/browse/HBASE-20618
> Project: HBase
>  Issue Type: New Feature
>Affects Versions: 3.0.0, 2.1.0, 1.4.5
>Reporter: Swapna
>Priority: Minor
> Attachments: HBASE-20618.hbasemaster.v01.patch, 
> HBASE-20618.hbasemaster.v02.patch, HBASE-20618.v1.branch-1.patch, 
> HBASE-20618.v1.branch-1.patch
>
>
> Currently HBase supports throwing RowTooBigException incase there is a row 
> with one of the column family data exceeds the configured maximum
> https://issues.apache.org/jira/browse/HBASE-10925?attachmentOrder=desc
> We have some bad rows growing very large. We need a way to skip these rows 
> for most of our jobs.
> Some of the options we considered:
> Option 1:
> Hbase client handle the exception and restart the scanner past bad row by 
> capturing the row key where it failed. Can be by adding the rowkey to the 
> exception stack trace, which seems brittle. Client would ignore the setting 
> if its upgraded before server.
> Option 2:
> Skip through big rows on Server.Go with server level config similar to 
> "hbase.table.max.rowsize" or request based by changing the scan request api. 
> If allowed to do per request, based on the scan request config, Client will 
> have to ignore the setting if its upgraded before server.
> {code}
> try {
>  populateResult(results, this.storeHeap, scannerContext, current);
>  } catch(RowTooBigException e) {
>  LOG.info("Row exceeded the limit in storeheap. Skipping row with 
> key:"+Bytes.toString(current.getRowArray()));
>  this.storeHeap.reseek(PrivateCellUtil.createLastOnRow(current));
>  results.clear();
>  scannerContext.clearProgress();
>  continue;
>  }
> {code}
> Prefer the option 2 with server level config. Please share your inputs



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20618) Skip large rows instead of throwing an exception to client

2018-05-31 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16497107#comment-16497107
 ] 

Hadoop QA commented on HBASE-20618:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 19m 
33s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} branch-1 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
14s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
24s{color} | {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
41s{color} | {color:green} branch-1 passed with JDK v1.8.0_172 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
48s{color} | {color:green} branch-1 passed with JDK v1.7.0_181 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
30s{color} | {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  2m 
11s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
37s{color} | {color:green} branch-1 passed with JDK v1.8.0_172 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
44s{color} | {color:green} branch-1 passed with JDK v1.7.0_181 {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
11s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed with JDK v1.8.0_172 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed with JDK v1.7.0_181 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m  
9s{color} | {color:red} hbase-server: The patch generated 5 new + 313 unchanged 
- 3 fixed = 318 total (was 316) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  2m 
11s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  
1m 20s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed with JDK v1.8.0_172 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed with JDK v1.7.0_181 {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m  
5s{color} | {color:green} hbase-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}102m 50s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
26s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}149m 48s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | 

[jira] [Commented] (HBASE-20618) Skip large rows instead of throwing an exception to client

2018-05-30 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16495666#comment-16495666
 ] 

Hadoop QA commented on HBASE-20618:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
14s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
15s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
57s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
21s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
14s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  1m 
50s{color} | {color:blue} hbase-server in master has 2 extant Findbugs 
warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
42s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
14s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 1s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
 7s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
12m 50s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.5 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
22s{color} | {color:green} hbase-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}104m  
0s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
31s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}150m 43s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:d8b550f |
| JIRA Issue | HBASE-20618 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12925792/HBASE-20618.hbasemaster.v02.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux c8003282b26b 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 
13:48:03 UTC 2016 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / c17be2e622 |
| maven | version: Apache Maven 3.5.3 
(3383c37e1f9e9b3bc3df5050c29c8aff9f295297; 2018-02-24T19:49:05Z) 

[jira] [Commented] (HBASE-20618) Skip large rows instead of throwing an exception to client

2018-05-30 Thread churro morales (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16495445#comment-16495445
 ] 

churro morales commented on HBASE-20618:


Looks like the test failure is unrelated, lots of checkstyle issues in 
TestSkipBigRowScanner.  If you fix those issues, I am +1 on this patch.

> Skip large rows instead of throwing an exception to client
> --
>
> Key: HBASE-20618
> URL: https://issues.apache.org/jira/browse/HBASE-20618
> Project: HBase
>  Issue Type: New Feature
>Affects Versions: 3.0.0
>Reporter: Swapna
>Priority: Minor
> Attachments: HBASE-20618.hbasemaster.v01.patch
>
>
> Currently HBase supports throwing RowTooBigException incase there is a row 
> with one of the column family data exceeds the configured maximum
> https://issues.apache.org/jira/browse/HBASE-10925?attachmentOrder=desc
> We have some bad rows growing very large. We need a way to skip these rows 
> for most of our jobs.
> Some of the options we considered:
> Option 1:
> Hbase client handle the exception and restart the scanner past bad row by 
> capturing the row key where it failed. Can be by adding the rowkey to the 
> exception stack trace, which seems brittle. Client would ignore the setting 
> if its upgraded before server.
> Option 2:
> Skip through big rows on Server.Go with server level config similar to 
> "hbase.table.max.rowsize" or request based by changing the scan request api. 
> If allowed to do per request, based on the scan request config, Client will 
> have to ignore the setting if its upgraded before server.
> {code}
> try {
>  populateResult(results, this.storeHeap, scannerContext, current);
>  } catch(RowTooBigException e) {
>  LOG.info("Row exceeded the limit in storeheap. Skipping row with 
> key:"+Bytes.toString(current.getRowArray()));
>  this.storeHeap.reseek(PrivateCellUtil.createLastOnRow(current));
>  results.clear();
>  scannerContext.clearProgress();
>  continue;
>  }
> {code}
> Prefer the option 2 with server level config. Please share your inputs



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20618) Skip large rows instead of throwing an exception to client

2018-05-30 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16494857#comment-16494857
 ] 

Hadoop QA commented on HBASE-20618:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
22s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
26s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
 1s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
16s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
35s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
52s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
42s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
49s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 
15s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 
14s{color} | {color:red} hbase-server: The patch generated 105 new + 220 
unchanged - 0 fixed = 325 total (was 220) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  5m 
 2s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
15m 19s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.5 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
27s{color} | {color:green} hbase-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}122m 58s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  2m 
33s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}179m 56s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.regionserver.TestSkipBigRowScanner |
|   | hadoop.hbase.master.TestShutdownBackupMaster |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:d8b550f |
| JIRA Issue | HBASE-20618 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12925679/HBASE-20618.hbasemaster.v01.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 2328a7fdb000 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality |