[jira] [Commented] (HBASE-16463) Add new crypto provider with Commons CRYPTO for Transparent encryption

2016-08-21 Thread Dapeng Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15430090#comment-15430090
 ] 

Dapeng Sun commented on HBASE-16463:
---

The initial patch adds a new provider backed by Commons Crypto. We could make 
Commons Crypto the default provider in the future.

> Add new crypto provider with Commons CRYPTO for Transparent encryption
> --
>
> Key: HBASE-16463
> URL: https://issues.apache.org/jira/browse/HBASE-16463
> Project: HBase
>  Issue Type: New Feature
>  Components: encryption
>Affects Versions: 2.0.0
>Reporter: Dapeng Sun
> Attachments: HBASE-16463.001.patch
>
>
> Apache Commons Crypto is a cryptographic library optimized with AES-NI.
> https://commons.apache.org/proper/commons-crypto/index.html
> The jira will use Commons Crypto to accelerate the transparent encryption of 
> HBase.
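For context, the operation being accelerated is a plain AES/CTR transform over block bytes. The sketch below shows that transform using only the standard JCE API (javax.crypto); the class name and all-zero demo key are invented for illustration. Commons Crypto exposes a similar cipher API backed by OpenSSL/AES-NI rather than this code, so treat this as a reference for the semantics, not the provider itself.

```java
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class AesCtrRoundTrip {
    // Runs one AES/CTR/NoPadding transform over the buffer. CTR is a stream
    // mode, so the same operation with the same key/IV both encrypts and
    // decrypts; this is the transform an AES-NI-backed provider speeds up.
    static byte[] crypt(int mode, byte[] key, byte[] iv, byte[] data) throws Exception {
        Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
        cipher.init(mode, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
        return cipher.doFinal(data);
    }

    public static void main(String[] args) throws Exception {
        byte[] key = new byte[16]; // demo key: all zeros -- never do this in production
        byte[] iv = new byte[16];
        byte[] plain = "hfile block bytes".getBytes(StandardCharsets.UTF_8);
        byte[] enc = crypt(Cipher.ENCRYPT_MODE, key, iv, plain);
        byte[] dec = crypt(Cipher.DECRYPT_MODE, key, iv, enc);
        System.out.println(Arrays.equals(plain, dec)); // true: round trip restores the input
    }
}
```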



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15871) Memstore flush doesn't finish because of backwardseek() in memstore scanner.

2016-08-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15430086#comment-15430086
 ] 

Hadoop QA commented on HBASE-15871:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s {color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 2 new or modified test files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 18s {color} | {color:green} branch-1.1 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 29s {color} | {color:green} branch-1.1 passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 33s {color} | {color:green} branch-1.1 passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 33s {color} | {color:green} branch-1.1 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 22s {color} | {color:green} branch-1.1 passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 43s {color} | {color:red} hbase-server in branch-1.1 has 80 extant Findbugs warnings. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 31s {color} | {color:red} hbase-server in branch-1.1 failed with JDK v1.8.0_101. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 30s {color} | {color:green} branch-1.1 passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 39s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 30s {color} | {color:green} the patch passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 30s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 33s {color} | {color:green} the patch passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 33s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 21s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 16s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s {color} | {color:red} The patch has 14 line(s) that end in whitespace. Use git apply --whitespace=fix. {color} |
| {color:red}-1{color} | {color:red} hadoopcheck {color} | {color:red} 7m 36s {color} | {color:red} The patch causes 11 errors with Hadoop v2.6.1. {color} |
| {color:red}-1{color} | {color:red} hadoopcheck {color} | {color:red} 8m 48s {color} | {color:red} The patch causes 11 errors with Hadoop v2.6.2. {color} |
| {color:red}-1{color} | {color:red} hadoopcheck {color} | {color:red} 9m 59s {color} | {color:red} The patch causes 11 errors with Hadoop v2.6.3. {color} |
| {color:red}-1{color} | {color:red} hadoopcheck {color} | {color:red} 11m 10s {color} | {color:red} The patch causes 11 errors with Hadoop v2.7.1. {color} |
| {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green} 0m 14s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 58s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 23s {color} | {color:red} hbase-server in the patch failed with JDK v1.8.0_101. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 29s {color} | {color:red} hbase-server-jdk1.7.0_101 with JDK v1.7.0_101 generated 1 new + 16 unchanged - 0 fixed = 17 total (was 16) {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 84m 36s {color} | {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 29s {color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 115m 22s {color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.mapreduce.TestHFileOutputFormat |
|   | hadoop.hbase.master.TestDistributedLogSplitting |

[jira] [Updated] (HBASE-16463) Add new crypto provider with Commons CRYPTO for Transparent encryption

2016-08-21 Thread Dapeng Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun updated HBASE-16463:
---
Attachment: HBASE-16463.001.patch

> Add new crypto provider with Commons CRYPTO for Transparent encryption
> --
>
> Key: HBASE-16463
> URL: https://issues.apache.org/jira/browse/HBASE-16463
> Project: HBase
>  Issue Type: New Feature
>  Components: encryption
>Affects Versions: 2.0.0
>Reporter: Dapeng Sun
> Attachments: HBASE-16463.001.patch
>
>
> Apache Commons Crypto is a cryptographic library optimized with AES-NI.
> https://commons.apache.org/proper/commons-crypto/index.html
> The jira will use Commons Crypto to accelerate the transparent encryption of 
> HBase.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16463) Add new crypto provider with Commons CRYPTO for Transparent encryption

2016-08-21 Thread Dapeng Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dapeng Sun updated HBASE-16463:
---
Status: Patch Available  (was: Open)

> Add new crypto provider with Commons CRYPTO for Transparent encryption
> --
>
> Key: HBASE-16463
> URL: https://issues.apache.org/jira/browse/HBASE-16463
> Project: HBase
>  Issue Type: New Feature
>  Components: encryption
>Affects Versions: 2.0.0
>Reporter: Dapeng Sun
> Attachments: HBASE-16463.001.patch
>
>
> Apache Commons Crypto is a cryptographic library optimized with AES-NI.
> https://commons.apache.org/proper/commons-crypto/index.html
> The jira will use Commons Crypto to accelerate the transparent encryption of 
> HBase.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16444) CellUtil#getSumOfCellKeyElementLengths() should consider KEY_INFRASTRUCTURE_SIZE

2016-08-21 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15430080#comment-15430080
 ] 

Anoop Sam John commented on HBASE-16444:
---

Do you mean when we flush to an HFile and write the key length as part of that? 
If so, it must always be the key size required by the KeyValue serialization 
format, whatever the cell type. If that is not happening now, that is a bug!

> CellUtil#getSumOfCellKeyElementLengths() should consider 
> KEY_INFRASTRUCTURE_SIZE
> 
>
> Key: HBASE-16444
> URL: https://issues.apache.org/jira/browse/HBASE-16444
> Project: HBase
>  Issue Type: Bug
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>Priority: Minor
> Attachments: HBASE-16444.patch
>
>
> Currently CellUtil#getSumOfCellKeyElementLengths() considers 
> {code}
> return cell.getRowLength() + cell.getFamilyLength() +
> cell.getQualifierLength() +
> KeyValue.TIMESTAMP_TYPE_SIZE;
> {code}
> It can consider the 2 byte ROWLEN and 1 byte FAMILY_LEN also because with the 
> current way of things we are sure how our key is structured.
> But pls note that
> {code}
> // This will be a low estimate.  Will do for now.
> return getSumOfCellKeyElementLengths(cell);
> {code}
> It says clearly it is going to be a low estimate. But in the write path there 
> should be no harm in adding the complete KEY_INFRA_SIZE. 
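For illustration, a minimal sketch of the difference between the low estimate and the full serialized key size, assuming the standard KeyValue key layout (2-byte row length, 1-byte family length, then row/family/qualifier bytes, then 8-byte timestamp plus 1-byte type). The class and method names here are invented for the sketch, not HBase's actual code:

```java
public class KeyLength {
    // Sizes from the KeyValue key layout:
    // [2B row len][row][1B family len][family][qualifier][8B timestamp][1B type]
    static final int ROW_LENGTH_SIZE = 2;
    static final int FAMILY_LENGTH_SIZE = 1;
    static final int TIMESTAMP_TYPE_SIZE = 8 + 1;
    static final int KEY_INFRASTRUCTURE_SIZE =
        ROW_LENGTH_SIZE + FAMILY_LENGTH_SIZE + TIMESTAMP_TYPE_SIZE;

    // Low estimate, the way the current CellUtil code computes it:
    // it omits the 2-byte row-length and 1-byte family-length fields.
    static int lowEstimate(int rowLen, int famLen, int qualLen) {
        return rowLen + famLen + qualLen + TIMESTAMP_TYPE_SIZE;
    }

    // Exact serialized key size, counting the full key infrastructure.
    static int serializedKeySize(int rowLen, int famLen, int qualLen) {
        return rowLen + famLen + qualLen + KEY_INFRASTRUCTURE_SIZE;
    }

    public static void main(String[] args) {
        // A 3-byte row, 1-byte family, 4-byte qualifier:
        System.out.println(lowEstimate(3, 1, 4));       // 17
        System.out.println(serializedKeySize(3, 1, 4)); // 20: adds the 2B + 1B length fields
    }
}
```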



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16444) CellUtil#getSumOfCellKeyElementLengths() should consider KEY_INFRASTRUCTURE_SIZE

2016-08-21 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15430079#comment-15430079
 ] 

ramkrishna.s.vasudevan commented on HBASE-16444:
---

There is another API
{code}
estimatedHeapSizeOf()
{code}
For that one it is fine; estimates can be close enough. 

> CellUtil#getSumOfCellKeyElementLengths() should consider 
> KEY_INFRASTRUCTURE_SIZE
> 
>
> Key: HBASE-16444
> URL: https://issues.apache.org/jira/browse/HBASE-16444
> Project: HBase
>  Issue Type: Bug
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>Priority: Minor
> Attachments: HBASE-16444.patch
>
>
> Currently CellUtil#getSumOfCellKeyElementLengths() considers 
> {code}
> return cell.getRowLength() + cell.getFamilyLength() +
> cell.getQualifierLength() +
> KeyValue.TIMESTAMP_TYPE_SIZE;
> {code}
> It can consider the 2 byte ROWLEN and 1 byte FAMILY_LEN also because with the 
> current way of things we are sure how our key is structured.
> But pls note that
> {code}
> // This will be a low estimate.  Will do for now.
> return getSumOfCellKeyElementLengths(cell);
> {code}
> It says clearly it is going to be a low estimate. But in the write path there 
> should be no harm in adding the complete KEY_INFRA_SIZE. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16444) CellUtil#getSumOfCellKeyElementLengths() should consider KEY_INFRASTRUCTURE_SIZE

2016-08-21 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15430077#comment-15430077
 ] 

ramkrishna.s.vasudevan commented on HBASE-16444:
---

bq. So when we have non KV cells, this will make key heap size bigger. So what 
way it will impact? (Good or bad way)
I am not changing the key 'heap' size. It is the actual key size that we are 
going to write: whatever the cell type, this is the key size we will be writing, 
and I am not sure why it was not done uniformly. On 'heap' size I agree, but the 
actual key size should be the same as long as our key structure stays the same. 

> CellUtil#getSumOfCellKeyElementLengths() should consider 
> KEY_INFRASTRUCTURE_SIZE
> 
>
> Key: HBASE-16444
> URL: https://issues.apache.org/jira/browse/HBASE-16444
> Project: HBase
>  Issue Type: Bug
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>Priority: Minor
> Attachments: HBASE-16444.patch
>
>
> Currently CellUtil#getSumOfCellKeyElementLengths() considers 
> {code}
> return cell.getRowLength() + cell.getFamilyLength() +
> cell.getQualifierLength() +
> KeyValue.TIMESTAMP_TYPE_SIZE;
> {code}
> It can consider the 2 byte ROWLEN and 1 byte FAMILY_LEN also because with the 
> current way of things we are sure how our key is structured.
> But pls note that
> {code}
> // This will be a low estimate.  Will do for now.
> return getSumOfCellKeyElementLengths(cell);
> {code}
> It says clearly it is going to be a low estimate. But in the write path there 
> should be no harm in adding the complete KEY_INFRA_SIZE. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-16463) Add new crypto provider with Commons CRYPTO for Transparent encryption

2016-08-21 Thread Dapeng Sun (JIRA)
Dapeng Sun created HBASE-16463:
--

 Summary: Add new crypto provider with Commons CRYPTO for 
Transparent encryption
 Key: HBASE-16463
 URL: https://issues.apache.org/jira/browse/HBASE-16463
 Project: HBase
  Issue Type: New Feature
  Components: encryption
Affects Versions: 2.0.0
Reporter: Dapeng Sun


Apache Commons Crypto is a cryptographic library optimized with AES-NI.
https://commons.apache.org/proper/commons-crypto/index.html
The jira will use Commons Crypto to accelerate the transparent encryption of 
HBase.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16444) CellUtil#getSumOfCellKeyElementLengths() should consider KEY_INFRASTRUCTURE_SIZE

2016-08-21 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15430056#comment-15430056
 ] 

Anoop Sam John commented on HBASE-16444:
---

What will be the advantage if this change goes in? Just making the code uniform 
is not by itself a reason to add it, IMHO. Anything else? Sorry, I did not check 
the code in detail. You mentioned the write path: when we have non-KV cells, 
this will make the key heap size bigger. In what way will that impact us? (Good 
or bad?)

> CellUtil#getSumOfCellKeyElementLengths() should consider 
> KEY_INFRASTRUCTURE_SIZE
> 
>
> Key: HBASE-16444
> URL: https://issues.apache.org/jira/browse/HBASE-16444
> Project: HBase
>  Issue Type: Bug
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>Priority: Minor
> Attachments: HBASE-16444.patch
>
>
> Currently CellUtil#getSumOfCellKeyElementLengths() considers 
> {code}
> return cell.getRowLength() + cell.getFamilyLength() +
> cell.getQualifierLength() +
> KeyValue.TIMESTAMP_TYPE_SIZE;
> {code}
> It can consider the 2 byte ROWLEN and 1 byte FAMILY_LEN also because with the 
> current way of things we are sure how our key is structured.
> But pls note that
> {code}
> // This will be a low estimate.  Will do for now.
> return getSumOfCellKeyElementLengths(cell);
> {code}
> It says clearly it is going to be a low estimate. But in the write path there 
> should be no harm in adding the complete KEY_INFRA_SIZE. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16417) In-Memory MemStore Policy for Flattening and Compactions

2016-08-21 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15430054#comment-15430054
 ] 

Anoop Sam John commented on HBASE-16417:
---

Mostly looks good.  Though for the general use case (not many updates/deletes), 
why can't we flush all the segments in the pipeline together when a flush to 
disk arises? In that case too, doing an in-memory compaction for segments in the 
pipeline (e.g., you say when the segment count is > 3) is meant to reduce the 
number of files flushed to disk. Another way to achieve that is to flush the 
whole pipeline together. In fact, I feel that when a flush to file comes, we 
should be flushing all segments in the pipeline plus the active segment. Then it 
is just like the default memstore, apart from the in-between flush to the 
in-memory flattened structure.  With MSLAB in place, CellChunkMap would be 
ideal; for off heap we will need it anyway.  As a first step, CellArrayMap 
being the default is fine.
Good to see that your tests reveal the overhead of the scan for the compaction 
decision. And yes, we should measure that without any compaction-based test. It 
is up to the user to know the pros and cons of in-memory compaction and select 
it wisely; we should document that well.
Great.. We are mostly in sync now :-)

> In-Memory MemStore Policy for Flattening and Compactions
> 
>
> Key: HBASE-16417
> URL: https://issues.apache.org/jira/browse/HBASE-16417
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Anastasia Braginsky
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16444) CellUtil#getSumOfCellKeyElementLengths() should consider KEY_INFRASTRUCTURE_SIZE

2016-08-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15430052#comment-15430052
 ] 

Hadoop QA commented on HBASE-16444:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s {color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 30s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 16s {color} | {color:green} master passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 18s {color} | {color:green} master passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 24s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 11s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 50s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 15s {color} | {color:green} master passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 19s {color} | {color:green} master passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 18s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 14s {color} | {color:green} the patch passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 14s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 18s {color} | {color:green} the patch passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 18s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 23s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 12s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 29m 54s {color} | {color:green} Patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} |
| {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green} 0m 12s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 3s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 16s {color} | {color:green} the patch passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 19s {color} | {color:green} the patch passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 49s {color} | {color:green} hbase-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 9s {color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 41m 51s {color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=1.11.2 Server=1.11.2 Image:yetus/hbase:date2016-08-22 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12824763/HBASE-16444.patch |
| JIRA Issue | HBASE-16444 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux bc3c85530c5a 3.13.0-92-generic #139-Ubuntu SMP Tue Jun 28 20:42:26 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |

[jira] [Updated] (HBASE-16455) Provide API for obtaining highest file number among all the WAL files

2016-08-21 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-16455:
---
Attachment: 16455.branch-1.v4.txt

> Provide API for obtaining highest file number among all the WAL files
> -
>
> Key: HBASE-16455
> URL: https://issues.apache.org/jira/browse/HBASE-16455
> Project: HBase
>  Issue Type: Improvement
>Reporter: Ted Yu
>Assignee: Ted Yu
> Fix For: 2.0.0, 1.4.0
>
> Attachments: 16455.branch-1.v4.txt, 16455.v1.txt, 16455.v2.txt, 
> 16455.v3.txt, 16455.v4.txt
>
>
> Currently RegionServerServices has the following API:
> {code}
>   WAL getWAL(HRegionInfo regionInfo) throws IOException;
> {code}
> Caller can only obtain filenum for a specific WAL.
> When multi wal is in use, we should add API for obtaining highest file number 
> among all the outstanding WAL files.
> User can pass null to getWAL() method above, but the filenum for the returned 
> WAL may not be the highest among all the WAL files.
> See log snippet in the first comment.
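A minimal sketch of what such an API could compute, using an invented stand-in interface rather than HBase's actual WAL class; with multiwal there is one WAL per group, so the highest file number is a max over all outstanding WALs:

```java
import java.util.List;

public class HighestWalFileNum {
    // Stand-in for the WAL interface; HBase's real API differs.
    interface Wal {
        long getFilenum();
    }

    // Returns the highest file number among all outstanding WALs,
    // or -1 when no WAL has been rolled yet.
    static long highestFilenum(List<Wal> wals) {
        return wals.stream().mapToLong(Wal::getFilenum).max().orElse(-1L);
    }

    public static void main(String[] args) {
        // Three WAL groups with different current file numbers:
        List<Wal> wals = List.of(() -> 41L, () -> 57L, () -> 13L);
        System.out.println(highestFilenum(wals)); // 57
    }
}
```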



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16372) References to previous cell in read path should be avoided

2016-08-21 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15430036#comment-15430036
 ] 

ramkrishna.s.vasudevan commented on HBASE-16372:
---

bq. we change the curBlock and move the old cur block to prevBlock. If there 
was an already block pointed by prevBlock move that to oldBlocks. So when the 
call returnBlocks(boolean returnAll) comes, we will return only oldBlocks if 
param is false. If true we return from all 3 refs.
My initial implementation for this was to always return all but the last block 
in a shipped() call, and to return all blocks when shipped() is called with 
true. We would not have 3 references: the list holding the previous blocks would 
always have one element in it (the last block that was accessed), which 
internally means we still maintain a reference to the actual previous block.

In all the above approaches the problem remains that the block never gets a 
chance to be evicted; yes, even though it is LRU, this is still there. So I did 
not want to go that way. But we can certainly explore it, no problem.

bq.One more thing to note is that in read flow, when a seek or next call result 
in jumping out many blocks in btw, are we assigning curBlock with the in btw 
blocks?
I don't think we do. Only once we are sure we are going to read from a block do 
we update the curBlock reference. I can verify once again.
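The single-retained-previous-block scheme discussed above can be sketched roughly as follows. The class, method names, and types are invented stand-ins (strings in place of HFileBlock references), not the real scanner code:

```java
import java.util.ArrayList;
import java.util.List;

public class BlockRefTracker {
    private String curBlock;                              // block currently being read
    private final List<String> prevBlocks = new ArrayList<>(); // blocks pending release
    private final List<String> released = new ArrayList<>();   // blocks handed back so far

    // Called when the scanner moves to a new block: the old current block
    // is retained (it may still back the previously returned cell).
    void moveTo(String newBlock) {
        if (curBlock != null) {
            prevBlocks.add(curBlock);
        }
        curBlock = newBlock;
    }

    // returnBlocks(false): release everything except the current block.
    // returnBlocks(true): release all references, current block included.
    void returnBlocks(boolean returnAll) {
        if (returnAll && curBlock != null) {
            prevBlocks.add(curBlock);
            curBlock = null;
        }
        released.addAll(prevBlocks);
        prevBlocks.clear();
    }

    List<String> released() { return released; }
    String current() { return curBlock; }

    public static void main(String[] args) {
        BlockRefTracker t = new BlockRefTracker();
        t.moveTo("block-1");
        t.moveTo("block-2");
        t.moveTo("block-3");
        t.returnBlocks(false);            // releases block-1, block-2; block-3 stays pinned
        System.out.println(t.released()); // [block-1, block-2]
        t.returnBlocks(true);             // final shipped(): releases block-3 too
        System.out.println(t.current());  // null
    }
}
```

This also makes the trade-off visible: whichever block is `current()` stays pinned and cannot be evicted until the final `returnBlocks(true)`.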



> References to previous cell in read path should be avoided
> --
>
> Key: HBASE-16372
> URL: https://issues.apache.org/jira/browse/HBASE-16372
> Project: HBase
>  Issue Type: Sub-task
>  Components: Scanners
>Affects Versions: 2.0.0
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>Priority: Critical
> Fix For: 2.0.0
>
> Attachments: HBASE-16372_testcase.patch, HBASE-16372_testcase_1.patch
>
>
> Came as part of review discussion in HBASE-15554. If there are references 
> kept to previous cells in the read path, with the Ref count based eviction 
> mechanism in trunk, then chances are there to evict a block backing the 
> previous cell but the read path still does some operations on that garbage 
> collected previous cell leading to incorrect results.
> Areas to target
> -> Storescanner
> -> Bloom filters (particularly in compaction path)
> Thanks to [~anoop.hbase] to point out this in bloomfilter path. But we found 
> it could be in other areas also.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16444) CellUtil#getSumOfCellKeyElementLengths() should consider KEY_INFRASTRUCTURE_SIZE

2016-08-21 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-16444:
---
Status: Patch Available  (was: Open)

> CellUtil#getSumOfCellKeyElementLengths() should consider 
> KEY_INFRASTRUCTURE_SIZE
> 
>
> Key: HBASE-16444
> URL: https://issues.apache.org/jira/browse/HBASE-16444
> Project: HBase
>  Issue Type: Bug
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>Priority: Minor
> Attachments: HBASE-16444.patch
>
>
> Currently CellUtil#getSumOfCellKeyElementLengths() considers 
> {code}
> return cell.getRowLength() + cell.getFamilyLength() +
> cell.getQualifierLength() +
> KeyValue.TIMESTAMP_TYPE_SIZE;
> {code}
> It can consider the 2 byte ROWLEN and 1 byte FAMILY_LEN also because with the 
> current way of things we are sure how our key is structured.
> But pls note that
> {code}
> // This will be a low estimate.  Will do for now.
> return getSumOfCellKeyElementLengths(cell);
> {code}
> It says clearly it is going to be a low estimate. But in the write path there 
> should be no harm in adding the complete KEY_INFRA_SIZE. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-16444) CellUtil#getSumOfCellKeyElementLengths() should consider KEY_INFRASTRUCTURE_SIZE

2016-08-21 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-16444:
---
Attachment: HBASE-16444.patch

A simple patch. Will try QA.

> CellUtil#getSumOfCellKeyElementLengths() should consider 
> KEY_INFRASTRUCTURE_SIZE
> 
>
> Key: HBASE-16444
> URL: https://issues.apache.org/jira/browse/HBASE-16444
> Project: HBase
>  Issue Type: Bug
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>Priority: Minor
> Attachments: HBASE-16444.patch
>
>
> Currently CellUtil#getSumOfCellKeyElementLengths() considers 
> {code}
> return cell.getRowLength() + cell.getFamilyLength() +
> cell.getQualifierLength() +
> KeyValue.TIMESTAMP_TYPE_SIZE;
> {code}
> It can consider the 2 byte ROWLEN and 1 byte FAMILY_LEN also because with the 
> current way of things we are sure how our key is structured.
> But pls note that
> {code}
> // This will be a low estimate.  Will do for now.
> return getSumOfCellKeyElementLengths(cell);
> {code}
> It says clearly it is going to be a low estimate. But in the write path there 
> should be no harm in adding the complete KEY_INFRA_SIZE. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16455) Provide API for obtaining highest file number among all the WAL files

2016-08-21 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15430014#comment-15430014
 ] 

Jerry He commented on HBASE-16455:
--

+1 on v4.

It does not include the meta WAL/provider. If you don't want to include it, 
please note that in the method comment.

> Provide API for obtaining highest file number among all the WAL files
> -
>
> Key: HBASE-16455
> URL: https://issues.apache.org/jira/browse/HBASE-16455
> Project: HBase
>  Issue Type: Improvement
>Reporter: Ted Yu
>Assignee: Ted Yu
> Fix For: 2.0.0, 1.4.0
>
> Attachments: 16455.v1.txt, 16455.v2.txt, 16455.v3.txt, 16455.v4.txt
>
>
> Currently RegionServerServices has the following API:
> {code}
>   WAL getWAL(HRegionInfo regionInfo) throws IOException;
> {code}
> Caller can only obtain filenum for a specific WAL.
> When multi wal is in use, we should add API for obtaining highest file number 
> among all the outstanding WAL files.
> User can pass null to getWAL() method above, but the filenum for the returned 
> WAL may not be the highest among all the WAL files.
> See log snippet in the first comment.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15871) Memstore flush doesn't finish because of backwardseek() in memstore scanner.

2016-08-21 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15430006#comment-15430006
 ] 

ramkrishna.s.vasudevan commented on HBASE-15871:
---

I can take this forward then. 

> Memstore flush doesn't finish because of backwardseek() in memstore scanner.
> 
>
> Key: HBASE-15871
> URL: https://issues.apache.org/jira/browse/HBASE-15871
> Project: HBase
>  Issue Type: Bug
>  Components: Scanners
>Affects Versions: 1.1.2
>Reporter: Jeongdae Kim
> Fix For: 1.1.2
>
> Attachments: HBASE-15871.branch-1.1.001.patch, 
> HBASE-15871.branch-1.1.002.patch, HBASE-15871.branch-1.1.003.patch, 
> memstore_backwardSeek().PNG
>
>
> Sometimes in our production hbase cluster, it takes a long time to finish 
> memstore flush.( for about more than 30 minutes)
> the reason is that a memstore flusher thread calls 
> StoreScanner.updateReaders(), waits for acquiring a lock that store scanner 
> holds in StoreScanner.next() and backwardseek() in memstore scanner runs for 
> a long time.
> I think this condition can occur during a reverse scan through the following 
> process.
> 1) Create a reversed store scanner by requesting a reverse scan.
> 2) Flush a memstore in the same HStore.
> 3) Put a lot of cells into the memstore until it is almost full.
> 4) Call the reverse scanner's next(), which re-creates all scanners in this 
> store (because they were already closed by 2)'s flush()) and calls 
> backwardSeek() with the store's lastTop for all new scanners.
> 5) In this state, the memstore is almost full from 3), and all cells in the 
> memstore have a sequenceID greater than this scanner's readPoint because of 
> 2)'s flush(). This condition causes searching all cells in the memstore, and 
> seekToPreviousRow() repeatedly searches cells that were already searched if a 
> row has one column. (Described in more detail in an attached file.)
> 6) Flush the memstore again in the same HStore; it waits until the 4)-5) 
> process finishes in order to update the store files in the same HStore after 
> flushing.
> I searched the HBase JIRA and found a similar issue (HBASE-14497), but 
> HBASE-14497's fix can't solve this issue because that fix just changed a 
> recursive call to a loop (and it is already applied to our HBase version).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-16460) Can't rebuild the BucketAllocator's data structures when BucketCache use FileIOEngine

2016-08-21 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15430005#comment-15430005
 ] 

ramkrishna.s.vasudevan commented on HBASE-16460:


+1. Good catch. 
{code}
public static class HFileBlockPair {
{code}
This can be marked with the @VisibleForTesting tag. Rest looks good to me.

> Can't rebuild the BucketAllocator's data structures when BucketCache use 
> FileIOEngine
> -
>
> Key: HBASE-16460
> URL: https://issues.apache.org/jira/browse/HBASE-16460
> Project: HBase
>  Issue Type: Bug
>  Components: BucketCache
>Affects Versions: 2.0.0, 1.1.6, 1.3.1, 1.2.3, 0.98.22
>Reporter: Guanghao Zhang
>Assignee: Guanghao Zhang
> Attachments: HBASE-16460.patch
>
>
> When the bucket cache uses FileIOEngine, it rebuilds the bucket allocator's 
> data structures from a persisted map. So it should first read the map from 
> the persistence file, then use the map to construct a BucketAllocator. But 
> the code currently has the wrong sequence in the retrieveFromFile() method of 
> BucketCache.java.
> {code}
>   BucketAllocator allocator = new BucketAllocator(cacheCapacity, 
> bucketSizes, backingMap, realCacheSize);
>   backingMap = (ConcurrentHashMap<BlockCacheKey, BucketEntry>) 
> ois.readObject();
> {code}
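A minimal, HBase-free sketch of why the order matters (the `Allocator` and map names here are hypothetical stand-ins for BucketAllocator and the persisted backingMap): anything constructed from the map before the persisted contents are read only ever sees an empty map.

```java
import java.util.HashMap;
import java.util.Map;

public class RetrieveOrderSketch {
    // Stand-in for BucketAllocator: snapshots the map's contents at construction.
    static final class Allocator {
        final int entriesSeen;
        Allocator(Map<String, Integer> backingMap) {
            this.entriesSeen = backingMap.size();
        }
    }

    // Stand-in for ois.readObject(): the map persisted at the previous shutdown.
    public static Map<String, Integer> readPersistedMap() {
        Map<String, Integer> m = new HashMap<>();
        m.put("block-1", 4096);
        m.put("block-2", 8192);
        return m;
    }

    public static int buggyOrder() {
        Map<String, Integer> backingMap = new HashMap<>();
        Allocator allocator = new Allocator(backingMap); // built from the empty map
        backingMap = readPersistedMap();                 // too late for the allocator
        return allocator.entriesSeen;
    }

    public static int fixedOrder() {
        Map<String, Integer> backingMap = readPersistedMap(); // read first
        Allocator allocator = new Allocator(backingMap);      // then rebuild
        return allocator.entriesSeen;
    }

    public static void main(String[] args) {
        System.out.println(buggyOrder() + " vs " + fixedOrder()); // prints 0 vs 2
    }
}
```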





[jira] [Commented] (HBASE-16376) Document implicit side-effects on partial results when calling Scan#setBatch(int)

2016-08-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15429992#comment-15429992
 ] 

Hadoop QA commented on HBASE-16376:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 13s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 
3s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 14s 
{color} | {color:green} master passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 17s 
{color} | {color:green} master passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
24s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
11s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
57s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 15s 
{color} | {color:green} master passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 19s 
{color} | {color:green} master passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
19s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 14s 
{color} | {color:green} the patch passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 14s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 17s 
{color} | {color:green} the patch passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 17s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
24s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
11s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
27m 25s {color} | {color:green} Patch does not cause any errors with Hadoop 
2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} |
| {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green} 0m 
10s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
10s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 15s 
{color} | {color:green} the patch passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 19s 
{color} | {color:green} the patch passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 59s 
{color} | {color:green} hbase-client in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
8s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 38m 8s {color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=1.11.2 Server=1.11.2 Image:yetus/hbase:date2016-08-22 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12824756/HBASE-16376.001.patch 
|
| JIRA Issue | HBASE-16376 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  hadoopcheck  
hbaseanti  checkstyle  compile  |
| uname | Linux 068b2613c4d4 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 

[jira] [Updated] (HBASE-16376) Document implicit side-effects on partial results when calling Scan#setBatch(int)

2016-08-21 Thread Josh Elser (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser updated HBASE-16376:
---
Status: Patch Available  (was: Open)

> Document implicit side-effects on partial results when calling 
> Scan#setBatch(int)
> -
>
> Key: HBASE-16376
> URL: https://issues.apache.org/jira/browse/HBASE-16376
> Project: HBase
>  Issue Type: Task
>  Components: API, documentation
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Minor
>  Labels: beginner
> Fix For: 2.0.0, 1.3.0, 0.98.22, 1.2.4
>
> Attachments: HBASE-16376.001.patch
>
>
> It was brought to my attention that the javadoc on {{Scan#setBatch(int)}} 
> does not inform the user that calling this method has the implicit 
> side-effect that the user may see partial {{Result}}s.
> While the side-effect isn't necessarily surprising for developers who know 
> how it's implemented, for API users this might be a very jarring 
> implication.
> We should update the documentation on {{Scan#setBatch(int)}} to inform users 
> that they may see partial results if they call this method (and perhaps refer 
> them to the size-based {{Scan#setMaxResultSize(long)}} too).





[jira] [Updated] (HBASE-16376) Document implicit side-effects on partial results when calling Scan#setBatch(int)

2016-08-21 Thread Josh Elser (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Elser updated HBASE-16376:
---
Attachment: HBASE-16376.001.patch

.001 Trivial patch which clarifies that calling {{setBatch(int)}} also implies 
{{setAllowPartialResults(true)}}, and directs the user to use 
{{setMaxResultSize(long)}} instead.
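To make the implication concrete, here is a hypothetical, HBase-free simplification of what batching does to a row (not the client's actual code): with batch = b, one row's n cells come back split into ceil(n/b) Results, and every Result except possibly the last holds exactly b cells, so callers must reassemble rows themselves.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BatchPartialSketch {
    // Splits one row's cells into the chunks a batched scan would return.
    public static List<List<String>> chunkRow(List<String> cells, int batch) {
        List<List<String>> results = new ArrayList<>();
        for (int i = 0; i < cells.size(); i += batch) {
            results.add(new ArrayList<>(
                cells.subList(i, Math.min(i + batch, cells.size()))));
        }
        return results;
    }

    public static void main(String[] args) {
        List<String> row = Arrays.asList("c1", "c2", "c3", "c4", "c5");
        // batch=2 turns one 5-cell row into three Results (2 + 2 + 1 cells),
        // so a caller counting Results would over-count rows.
        System.out.println(chunkRow(row, 2).size()); // prints 3
    }
}
```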

> Document implicit side-effects on partial results when calling 
> Scan#setBatch(int)
> -
>
> Key: HBASE-16376
> URL: https://issues.apache.org/jira/browse/HBASE-16376
> Project: HBase
>  Issue Type: Task
>  Components: API, documentation
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Minor
>  Labels: beginner
> Fix For: 2.0.0, 1.3.0, 0.98.22, 1.2.4
>
> Attachments: HBASE-16376.001.patch
>
>
> It was brought to my attention that the javadoc on {{Scan#setBatch(int)}} 
> does not inform the user that calling this method has the implicit 
> side-effect that the user may see partial {{Result}}s.
> While the side-effect isn't necessarily surprising for developers who know 
> how it's implemented, for API users this might be a very jarring 
> implication.
> We should update the documentation on {{Scan#setBatch(int)}} to inform users 
> that they may see partial results if they call this method (and perhaps refer 
> them to the size-based {{Scan#setMaxResultSize(long)}} too).





[jira] [Commented] (HBASE-16455) Provide API for obtaining highest file number among all the WAL files

2016-08-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15429949#comment-15429949
 ] 

Hadoop QA commented on HBASE-16455:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 13s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:blue}0{color} | {color:blue} patch {color} | {color:blue} 0m 1s 
{color} | {color:blue} The patch file was not named according to hbase's naming 
conventions. Please see 
https://yetus.apache.org/documentation/0.3.0/precommit-patchnames for 
instructions. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 
3s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 32s 
{color} | {color:green} master passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 36s 
{color} | {color:green} master passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
49s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
18s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
54s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s 
{color} | {color:green} master passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 35s 
{color} | {color:green} master passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
46s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 32s 
{color} | {color:green} the patch passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 32s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 35s 
{color} | {color:green} the patch passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 35s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
50s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
17s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
27m 29s {color} | {color:green} Patch does not cause any errors with Hadoop 
2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} |
| {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green} 0m 
20s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
10s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s 
{color} | {color:green} the patch passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 34s 
{color} | {color:green} the patch passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 92m 35s 
{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
18s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 135m 40s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=1.11.2 Server=1.11.2 Image:yetus/hbase:date2016-08-21 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12824750/16455.v4.txt |
| JIRA Issue | HBASE-16455 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  hadoopcheck  
hbaseanti  checkstyle  compile  |
| uname | Linux 46787e9c12a9 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 

[jira] [Commented] (HBASE-16461) combine table into a new table

2016-08-21 Thread Allan Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15429945#comment-15429945
 ] 

Allan Yang commented on HBASE-16461:


No need for a new feature; if you want to move one table's data to another, 
just bulkload it.

> combine table into a new table
> --
>
> Key: HBASE-16461
> URL: https://issues.apache.org/jira/browse/HBASE-16461
> Project: HBase
>  Issue Type: Wish
>Reporter: Nick.han
>
> How about we create a new feature that combines two or more tables into one 
> new table? It's easy given HBase's data structure.





[jira] [Commented] (HBASE-16455) Provide API for obtaining highest file number among all the WAL files

2016-08-21 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15429931#comment-15429931
 ] 

Ted Yu commented on HBASE-16455:


See if patch v4 is better.
The following method is added:
{code}
  public List<WAL> getWALs() throws IOException {
{code}
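Given such a list, a caller can compute the highest file number as a simple max over the per-WAL filenums. A hedged pure-Java sketch (the numbers below are made-up illustrative values; in the real code the filenum would come from each WAL implementation):

```java
import java.util.Arrays;
import java.util.List;

public class HighestFilenumSketch {
    // Returns the largest filenum among all outstanding WALs, or -1 if none.
    public static long highestFilenum(List<Long> walFilenums) {
        long max = -1L;
        for (long filenum : walFilenums) {
            max = Math.max(max, filenum);
        }
        return max;
    }

    public static void main(String[] args) {
        // With multiwal, each WAL has its own filenum (illustrative values).
        List<Long> filenums =
            Arrays.asList(1471700000001L, 1471700000005L, 1471700000003L);
        System.out.println(highestFilenum(filenums)); // prints 1471700000005
    }
}
```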

> Provide API for obtaining highest file number among all the WAL files
> -
>
> Key: HBASE-16455
> URL: https://issues.apache.org/jira/browse/HBASE-16455
> Project: HBase
>  Issue Type: Improvement
>Reporter: Ted Yu
>Assignee: Ted Yu
> Fix For: 2.0.0, 1.4.0
>
> Attachments: 16455.v1.txt, 16455.v2.txt, 16455.v3.txt, 16455.v4.txt
>
>
> Currently RegionServerServices has the following API:
> {code}
>   WAL getWAL(HRegionInfo regionInfo) throws IOException;
> {code}
> A caller can only obtain the filenum for a specific WAL.
> When multi-WAL is in use, we should add an API for obtaining the highest file 
> number among all the outstanding WAL files.
> A user can pass null to the getWAL() method above, but the filenum for the 
> returned WAL may not be the highest among all the WAL files.
> See log snippet in the first comment.





[jira] [Updated] (HBASE-16455) Provide API for obtaining highest file number among all the WAL files

2016-08-21 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-16455:
---
Attachment: 16455.v4.txt

> Provide API for obtaining highest file number among all the WAL files
> -
>
> Key: HBASE-16455
> URL: https://issues.apache.org/jira/browse/HBASE-16455
> Project: HBase
>  Issue Type: Improvement
>Reporter: Ted Yu
>Assignee: Ted Yu
> Fix For: 2.0.0, 1.4.0
>
> Attachments: 16455.v1.txt, 16455.v2.txt, 16455.v3.txt, 16455.v4.txt
>
>
> Currently RegionServerServices has the following API:
> {code}
>   WAL getWAL(HRegionInfo regionInfo) throws IOException;
> {code}
> A caller can only obtain the filenum for a specific WAL.
> When multi-WAL is in use, we should add an API for obtaining the highest file 
> number among all the outstanding WAL files.
> A user can pass null to the getWAL() method above, but the filenum for the 
> returned WAL may not be the highest among all the WAL files.
> See log snippet in the first comment.





[jira] [Commented] (HBASE-16455) Provide API for obtaining highest file number among all the WAL files

2016-08-21 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15429921#comment-15429921
 ] 

Jerry He commented on HBASE-16455:
--

Hi, Ted

My feeling is that exposing the filenum as a region server level API does not 
seem to be a good fit.
filenum is an internal specific to the fs provider or the fs WAL level. At the 
region server level, we deal with the WALFactory or WAL interfaces.

How about we do this:
Have an API at the region server service level that returns a list/array of 
the WALs that belong to this region server. This makes sense given the support 
for multiwal.
You can overload the current getWAL(), or add a new API.
The current backup relies on FsWal and on knowing its internals, which is ok. 
It can cast WAL to FsWal and obtain and calculate what it needs.


> Provide API for obtaining highest file number among all the WAL files
> -
>
> Key: HBASE-16455
> URL: https://issues.apache.org/jira/browse/HBASE-16455
> Project: HBase
>  Issue Type: Improvement
>Reporter: Ted Yu
>Assignee: Ted Yu
> Fix For: 2.0.0, 1.4.0
>
> Attachments: 16455.v1.txt, 16455.v2.txt, 16455.v3.txt
>
>
> Currently RegionServerServices has the following API:
> {code}
>   WAL getWAL(HRegionInfo regionInfo) throws IOException;
> {code}
> A caller can only obtain the filenum for a specific WAL.
> When multi-WAL is in use, we should add an API for obtaining the highest file 
> number among all the outstanding WAL files.
> A user can pass null to the getWAL() method above, but the filenum for the 
> returned WAL may not be the highest among all the WAL files.
> See log snippet in the first comment.





[jira] [Commented] (HBASE-16462) Test failure TestRSGroupsBase.testGroupBalance

2016-08-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15429757#comment-15429757
 ] 

Hadoop QA commented on HBASE-16462:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 
4s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 11s 
{color} | {color:green} master passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 15s 
{color} | {color:green} master passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
11s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
15s {color} | {color:green} master passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 31s 
{color} | {color:red} hbase-rsgroup in master has 6 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 10s 
{color} | {color:green} master passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 14s 
{color} | {color:green} master passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
15s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 12s 
{color} | {color:green} the patch passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 12s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 15s 
{color} | {color:green} the patch passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 15s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
11s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
27m 31s {color} | {color:green} Patch does not cause any errors with Hadoop 
2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} |
| {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green} 0m 
12s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
43s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 10s 
{color} | {color:green} the patch passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 13s 
{color} | {color:green} the patch passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 20s 
{color} | {color:green} hbase-rsgroup in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
8s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 35m 50s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=1.11.2 Server=1.11.2 Image:yetus/hbase:date2016-08-21 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12824718/HBASE-16462-v1.patch |
| JIRA Issue | HBASE-16462 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  hadoopcheck  
hbaseanti  checkstyle  compile  |
| uname | Linux a7c006727aee 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / d077219 |
| Default Java | 1.7.0_101 |
| Multi-JDK 

[jira] [Updated] (HBASE-16462) Test failure TestRSGroupsBase.testGroupBalance

2016-08-21 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-16462:
---
Status: Patch Available  (was: Open)

> Test failure TestRSGroupsBase.testGroupBalance
> --
>
> Key: HBASE-16462
> URL: https://issues.apache.org/jira/browse/HBASE-16462
> Project: HBase
>  Issue Type: Bug
>Reporter: Guangxu Cheng
> Attachments: HBASE-16462-v1.patch
>
>
> Saw this failure when running TestRSGroupsBase:
> {code}
> testGroupBalance(org.apache.hadoop.hbase.rsgroup.TestRSGroups)  Time elapsed: 
> 309.517 sec  <<< FAILURE!
> java.lang.AssertionError: Waiting timed out after [300,000] msec
> at org.junit.Assert.fail(Assert.java:88)
> at org.apache.hadoop.hbase.Waiter.waitFor(Waiter.java:209)
> at org.apache.hadoop.hbase.Waiter.waitFor(Waiter.java:143)
> at 
> org.apache.hadoop.hbase.HBaseTestingUtility.waitFor(HBaseTestingUtility.java:3816)
> at 
> org.apache.hadoop.hbase.rsgroup.TestRSGroupsBase.testGroupBalance(TestRSGroupsBase.java:434)
> {code}
> The exception may be caused by a bug.
> {code:title=TestRSGroupsBase.java|borderStyle=solid}
> rsGroupAdmin.balanceRSGroup(newGroupName);
> TEST_UTIL.waitFor(WAIT_TIMEOUT, new Waiter.Predicate<Exception>() {
>   @Override
>   public boolean evaluate() throws Exception {
> for (List<String> regions : 
> getTableServerRegionMap().get(tableName).values()) {
>   if (2 != regions.size()) {
> return false;
>   }
> }
> return true;
>   }
> }); 
> {code}
> The new group has one table and three servers, and the table has six regions.
> In the beginning, all regions are located on a single server.
> After balancing, the regions are distributed across the three servers, 
> preferably two regions per server.
> However, this is not absolute. Maybe one server has one region and another 
> server has three regions.
> So, while waiting for the result of the balance, we only need to determine 
> whether each server holds some regions, without having to check the exact 
> number of regions.
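The relaxed wait condition described above can be sketched in plain Java (the server-to-regions map is a stand-in for getTableServerRegionMap().get(tableName)):

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class BalancePredicateSketch {
    // Old check: every server must hold exactly `expected` regions (can hang).
    public static boolean everyServerHasExactly(Map<String, List<String>> byServer,
                                                int expected) {
        for (List<String> regions : byServer.values()) {
            if (regions.size() != expected) {
                return false;
            }
        }
        return true;
    }

    // Relaxed check: every server just needs at least one region.
    public static boolean everyServerHasRegions(Map<String, List<String>> byServer) {
        for (List<String> regions : byServer.values()) {
            if (regions.isEmpty()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, List<String>> byServer = new HashMap<>();
        // A perfectly legal balance outcome: 1 + 3 + 2 regions.
        byServer.put("rs1", Arrays.asList("r1"));
        byServer.put("rs2", Arrays.asList("r2", "r3", "r4"));
        byServer.put("rs3", Arrays.asList("r5", "r6"));
        // The strict predicate rejects this outcome forever (hence the 300s
        // timeout); the relaxed predicate accepts it.
        System.out.println(everyServerHasExactly(byServer, 2) + " "
            + everyServerHasRegions(byServer)); // prints false true
    }
}
```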





[jira] [Commented] (HBASE-14918) In-Memory MemStore Flush and Compaction

2016-08-21 Thread Edward Bortnikov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15429728#comment-15429728
 ] 

Edward Bortnikov commented on HBASE-14918:
--

We've just attached a proposed simplified spec for in-memory flush 
configuration on HBASE-16417; please take a look and speak up (smile).

> In-Memory MemStore Flush and Compaction
> ---
>
> Key: HBASE-14918
> URL: https://issues.apache.org/jira/browse/HBASE-14918
> Project: HBase
>  Issue Type: Umbrella
>Affects Versions: 2.0.0
>Reporter: Eshcar Hillel
>Assignee: Eshcar Hillel
> Attachments: CellBlocksSegmentDesign.pdf, MSLABMove.patch
>
>
> A memstore serves as the in-memory component of a store unit, absorbing all 
> updates to the store. From time to time these updates are flushed to a file 
> on disk, where they are compacted (by eliminating redundancies) and 
> compressed (i.e., written in a compressed format to reduce their storage 
> size).
> We aim to speed up data access, and therefore suggest applying an in-memory 
> memstore flush: that is, flushing the active in-memory segment into an 
> intermediate buffer where it can still be accessed by the application. Data 
> in the buffer is subject to compaction and can be stored in any format that 
> allows it to take up less space in RAM. The less space the buffer consumes, 
> the longer it can reside in memory before data is flushed to disk, resulting 
> in better performance.
> Specifically, the optimization is beneficial for workloads with 
> medium-to-high key churn which incur many redundant cells, like persistent 
> messaging. 
> We suggest structuring the solution as 4 subtasks (one patch each). 
> (1) Infrastructure - refactoring of the MemStore hierarchy, introducing 
> segment (StoreSegment) as first-class citizen, and decoupling memstore 
> scanner from the memstore implementation;
> (2) Adding StoreServices facility at the region level to allow memstores 
> update region counters and access region level synchronization mechanism;
> (3) Implementation of a new memstore (CompactingMemstore) with non-optimized 
> immutable segment representation, and 
> (4) Memory optimization including compressed format representation and off 
> heap allocations.
> This Jira continues the discussion in HBASE-13408.
> Design documents, evaluation results and previous patches can be found in 
> HBASE-13408. 





[jira] [Comment Edited] (HBASE-16417) In-Memory MemStore Policy for Flattening and Compactions

2016-08-21 Thread Edward Bortnikov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15429726#comment-15429726
 ] 

Edward Bortnikov edited comment on HBASE-16417 at 8/21/16 1:43 PM:
---

Notes on the suggested definition: 
1. Speculative scans have been eliminated as a merge trigger - they turned out 
to be too costly. With option 3 (compact_data), the user takes responsibility 
for what he's doing. 

2. The default implementation for immutable sorted index is CellArrayMap (an 
array of references to Cell objects). The CellChunkMap implementation embeds 
the Cell objects into the sorted index array, and saves some space by doing so. 
The index implementation is orthogonal to in-memory flush policy. The 
CellChunkMap index only works with MSLAB data storage. The use cases for it are 
TBD. For example, if it is only planned to work with off-heap data, no separate 
configuration is required. Let's follow this up separately on HBASE-16421. 
 


was (Author: ebortnik):
Notes to the suggested definition: 
1. Speculative scans have been eliminated as merge trigger - turn to be to 
costly. With option 3 (compact_data), the user takes the responsibility for 
what he's doing. 

2. The default implementation for immutable sorted index is CellArrayMap (an 
array of references to Cell objects). The CellChunkMap implementation embeds 
the Cell objects into the sorted index array, and saves some space by doing so. 
The index implementation is orthogonal to in-memory flush policy. The 
CellChunkMap index only works with MSLAB data storage. The use cases for it are 
TBD. For example, if it is only planned to work with off-heap data, no separate 
configuration is required. Let's follow this up separately on HBase-16421. 
 

> In-Memory MemStore Policy for Flattening and Compactions
> 
>
> Key: HBASE-16417
> URL: https://issues.apache.org/jira/browse/HBASE-16417
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Anastasia Braginsky
>






[jira] [Commented] (HBASE-16417) In-Memory MemStore Policy for Flattening and Compactions

2016-08-21 Thread Edward Bortnikov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15429726#comment-15429726
 ] 

Edward Bortnikov commented on HBASE-16417:
--

Notes to the suggested definition: 
1. Speculative scans have been eliminated as a merge trigger - they turned out to be too 
costly. With option 3 (compact_data), the user takes responsibility for 
what they are doing. 

2. The default implementation for immutable sorted index is CellArrayMap (an 
array of references to Cell objects). The CellChunkMap implementation embeds 
the Cell objects into the sorted index array, and saves some space by doing so. 
The index implementation is orthogonal to in-memory flush policy. The 
CellChunkMap index only works with MSLAB data storage. The use cases for it are 
TBD. For example, if it is only planned to work with off-heap data, no separate 
configuration is required. Let's follow this up separately on HBASE-16421. 
 

> In-Memory MemStore Policy for Flattening and Compactions
> 
>
> Key: HBASE-16417
> URL: https://issues.apache.org/jira/browse/HBASE-16417
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Anastasia Braginsky
>






[jira] [Comment Edited] (HBASE-16417) In-Memory MemStore Policy for Flattening and Compactions

2016-08-21 Thread Edward Bortnikov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15429724#comment-15429724
 ] 

Edward Bortnikov edited comment on HBASE-16417 at 8/21/16 1:39 PM:
---

Suggestion for Flush Policy, feel free to comment (smile). 

A new configuration parameter, IN_MEMORY_FLUSH_POLICY, will encompass three 
levels of managing memory flush at the store (CF) level. 

1. “none”. Semantics: no in-memory flush - status quo before the project 
started. 

2. “compact_index” (default). Semantics: 
 a. When a MemStore overflows, it is transformed into an immutable segment. 
Namely, its index is flattened into a sorted array. 
 b. The new segment is pushed into the segment pipeline (list of immutable 
segments, sorted by creation time). The pipeline segments are used for serving 
reads, along with the new MemStore and the block cache. 
 c. A MemStore (disk) flush writes the oldest in-memory segment to a file. 
 d. When too many segments accumulate in the pipeline (e.g., above 3), 
their indices are merged to reduce the number of files created by disk flushes. 
The threshold is not available for end-user tuning. Implementation details: 
 - No copy happens below the index level - neither the Cell objects nor 
the binary data are relocated. 
 - No redundant cells are eliminated, to avoid the costly SQM scan. 

3. “compact_data”. This mode is targeted to use cases with high churn/locality 
of writes. Semantics (difference from 2d): 
 a. When too many segments accumulate in the pipeline, their indices and 
data are merged, to reduce the memory footprint and postpone the future I/O. 
 - Redundant cells are eliminated (SQM scan is applied). 
 - If MSLAB storage is used for binary data, then the data in the new 
segment created by merge is relocated to new chunks. 
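
The pipeline mechanics above can be sketched roughly as follows (a hypothetical simplification, not HBase code: the enum, the threshold constant, and the TreeMap-based segments are illustrative stand-ins). It models the compact_data case, where merging walks segments newest-first so older versions of a row are dropped:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;
import java.util.TreeMap;

public class SegmentPipelineSketch {
    // The three policy levels proposed above (the names are the proposal's;
    // the enum itself is illustrative).
    enum InMemoryFlushPolicy { NONE, COMPACT_INDEX, COMPACT_DATA }

    static final int MERGE_THRESHOLD = 3; // "e.g., above 3" in the proposal

    // Newest segment first; each segment modeled as a sorted row -> value map.
    final Deque<TreeMap<String, String>> pipeline = new ArrayDeque<>();
    final InMemoryFlushPolicy policy;

    SegmentPipelineSketch(InMemoryFlushPolicy policy) { this.policy = policy; }

    // Push a freshly flattened segment; merge once too many accumulate.
    void push(TreeMap<String, String> segment) {
        pipeline.addFirst(segment);
        if (policy == InMemoryFlushPolicy.COMPACT_DATA
                && pipeline.size() > MERGE_THRESHOLD) {
            mergeEliminatingRedundancy();
        }
    }

    // Walk segments newest to oldest so the newest version of each row wins;
    // older duplicates are dropped - the redundancy elimination of compact_data.
    // (compact_index would merge indices only, keeping every cell.)
    private void mergeEliminatingRedundancy() {
        TreeMap<String, String> merged = new TreeMap<>();
        for (TreeMap<String, String> seg : pipeline) {
            for (Map.Entry<String, String> e : seg.entrySet()) {
                merged.putIfAbsent(e.getKey(), e.getValue());
            }
        }
        pipeline.clear();
        pipeline.addFirst(merged);
    }

    public static void main(String[] args) {
        SegmentPipelineSketch p =
            new SegmentPipelineSketch(InMemoryFlushPolicy.COMPACT_DATA);
        for (int i = 1; i <= 4; i++) {
            TreeMap<String, String> seg = new TreeMap<>();
            seg.put("row", "v" + i); // same row rewritten in every segment
            p.push(seg);
        }
        // The 4th push exceeded the threshold, so the pipeline collapsed to
        // one segment holding only the newest version of the row.
        System.out.println(p.pipeline.size() + " " + p.pipeline.peek().get("row"));
    }
}
```

A disk flush would then write the oldest pipeline segment to a file; the fewer and larger the segments at that point, the fewer files flushes create.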



was (Author: ebortnik):
Suggestion for Flush Policy, feel free to comment (smile). 

A new configuration parameter, IN_MEMORY_FLUSH_POLICY, will encompass three 
levels of managing memory flush at the store (CF) level. 

1. “none”. Semantics: no in-memory flush - status quo before the project 
started. 
2. “compact_index” (default). Semantics: 
 a. When a MemStore overflows, it is transformed into an immutable segment. 
Namely, its index is flattened into a sorted array. 
 b. The new segment is pushed into the segment pipeline (list of immutable 
segments, sorted by creation time). The pipeline segments are used for serving 
reads, along with the new MemStore and the block cache. 
 c. A MemStore (disk) flush writes the oldest in-memory segment to a file. 
 d. When too many segments accumulate in the pipeline (e.g., above 3), 
their indices are merged to reduce the number of files created by disk flushes. 
The threshold is not available for end-user tuning. Implementation details: 
 - No copy happens below the index level - neither the Cell objects nor 
the binary data are relocated. 
 - No redundant cells are eliminated, to avoid the costly SQM scan. 

3. “compact_data”. This mode is targeted to use cases with high churn/locality 
of writes. Semantics (difference from 2d): 
 a. When too many segments accumulate in the pipeline, their indices and 
data are merged, to reduce the memory footprint and postpone the future I/O. 
 - Redundant cells are eliminated (SQM scan is applied). 
 - If MSLAB storage is used for binary data, then the data in the new 
segment created by merge is relocated to new chunks. 


> In-Memory MemStore Policy for Flattening and Compactions
> 
>
> Key: HBASE-16417
> URL: https://issues.apache.org/jira/browse/HBASE-16417
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Anastasia Braginsky
>






[jira] [Comment Edited] (HBASE-16417) In-Memory MemStore Policy for Flattening and Compactions

2016-08-21 Thread Edward Bortnikov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15429724#comment-15429724
 ] 

Edward Bortnikov edited comment on HBASE-16417 at 8/21/16 1:38 PM:
---

Suggestion for Flush Policy, feel free to comment (smile). 

A new configuration parameter, IN_MEMORY_FLUSH_POLICY, will encompass three 
levels of managing memory flush at the store (CF) level. 

1. “none”. Semantics: no in-memory flush - status quo before the project 
started. 
2. “compact_index” (default). Semantics: 
 a. When a MemStore overflows, it is transformed into an immutable segment. 
Namely, its index is flattened into a sorted array. 
 b. The new segment is pushed into the segment pipeline (list of immutable 
segments, sorted by creation time). The pipeline segments are used for serving 
reads, along with the new MemStore and the block cache. 
 c. A MemStore (disk) flush writes the oldest in-memory segment to a file. 
 d. When too many segments accumulate in the pipeline (e.g., above 3), 
their indices are merged to reduce the number of files created by disk flushes. 
The threshold is not available for end-user tuning. Implementation details: 
 - No copy happens below the index level - neither the Cell objects nor 
the binary data are relocated. 
 - No redundant cells are eliminated, to avoid the costly SQM scan. 

3. “compact_data”. This mode is targeted to use cases with high churn/locality 
of writes. Semantics (difference from 2d): 
 a. When too many segments accumulate in the pipeline, their indices and 
data are merged, to reduce the memory footprint and postpone the future I/O. 
 - Redundant cells are eliminated (SQM scan is applied). 
 - If MSLAB storage is used for binary data, then the data in the new 
segment created by merge is relocated to new chunks. 



was (Author: ebortnik):
Suggestion for Flush Policy, feel free to comment (smile). 

A new configuration parameter, IN_MEMORY_FLUSH_POLICY, will encompass three 
levels of managing memory flush at the store (CF) level. 

1. “none”. Semantics: no in-memory flush - status quo before the project 
started. 
2. “compact_index” (default). Semantics: 
 a. When a MemStore overflows, it is transformed into an immutable segment. 
Namely, its index is flattened into a sorted array. 
 b. The new segment is pushed into the segment pipeline (list of immutable 
segments, sorted by creation time). The pipeline segments are used for serving 
reads, along with the new MemStore and the block cache. 
 c. A MemStore (disk) flush writes the oldest in-memory segment to a file. 
 d. When too many segments accumulate in the pipeline (e.g., above 3), 
their indices are merged to reduce the number of files created by disk flushes. 
The threshold is not available for end-user tuning. Implementation details: 
 - No copy happens below the index level - neither the Cell objects nor 
the binary data are relocated. 
 - No redundant cells are eliminated, to avoid the costly SQM scan. 
3. “compact_data”. This mode is targeted to use cases with high churn/locality 
of writes. Semantics (difference from 2d): 
 a. When too many segments accumulate in the pipeline, their indices and 
data are merged, to reduce the memory footprint and postpone the future I/O. 
 - Redundant cells are eliminated (SQM scan is applied). 
 - If MSLAB storage is used for binary data, then the data in the new 
segment created by merge is relocated to new chunks. 


> In-Memory MemStore Policy for Flattening and Compactions
> 
>
> Key: HBASE-16417
> URL: https://issues.apache.org/jira/browse/HBASE-16417
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Anastasia Braginsky
>






[jira] [Commented] (HBASE-16417) In-Memory MemStore Policy for Flattening and Compactions

2016-08-21 Thread Edward Bortnikov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15429724#comment-15429724
 ] 

Edward Bortnikov commented on HBASE-16417:
--

Suggestion for Flush Policy, feel free to comment (smile). 

A new configuration parameter, IN_MEMORY_FLUSH_POLICY, will encompass three 
levels of managing memory flush at the store (CF) level. 

1. “none”. Semantics: no in-memory flush - status quo before the project 
started. 
2. “compact_index” (default). Semantics: 
 a. When a MemStore overflows, it is transformed into an immutable segment. 
Namely, its index is flattened into a sorted array. 
 b. The new segment is pushed into the segment pipeline (list of immutable 
segments, sorted by creation time). The pipeline segments are used for serving 
reads, along with the new MemStore and the block cache. 
 c. A MemStore (disk) flush writes the oldest in-memory segment to a file. 
 d. When too many segments accumulate in the pipeline (e.g., above 3), 
their indices are merged to reduce the number of files created by disk flushes. 
The threshold is not available for end-user tuning. Implementation details: 
 - No copy happens below the index level - neither the Cell objects nor 
the binary data are relocated. 
 - No redundant cells are eliminated, to avoid the costly SQM scan. 
3. “compact_data”. This mode is targeted to use cases with high churn/locality 
of writes. Semantics (difference from 2d): 
 a. When too many segments accumulate in the pipeline, their indices and 
data are merged, to reduce the memory footprint and postpone the future I/O. 
 - Redundant cells are eliminated (SQM scan is applied). 
 - If MSLAB storage is used for binary data, then the data in the new 
segment created by merge is relocated to new chunks. 


> In-Memory MemStore Policy for Flattening and Compactions
> 
>
> Key: HBASE-16417
> URL: https://issues.apache.org/jira/browse/HBASE-16417
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Anastasia Braginsky
>






[jira] [Updated] (HBASE-16461) combine table into a new table

2016-08-21 Thread Nick.han (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick.han updated HBASE-16461:
-
Priority: Major  (was: Minor)

> combine table into a new table
> --
>
> Key: HBASE-16461
> URL: https://issues.apache.org/jira/browse/HBASE-16461
> Project: HBase
>  Issue Type: Wish
>Reporter: Nick.han
>
> how about we create a new feature that combines two or more tables into one 
> new table? It's easy given the HBase data structure.


