subject:"\[jira\] \[Commented\] \(HBASE\-15180\) Reduce garbage created while reading Cells from Codec Decoder"

[jira] [Commented] (HBASE-15180) Reduce garbage created while reading Cells from Codec Decoder

2016-03-08 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15186625#comment-15186625
 ] 

stack commented on HBASE-15180:
---

The horrid names are fine by me. They are explicit and for internal use only. 
Otherwise patch looks good. Have you looked at moving what is common in IPCUtil 
up into hbase-common and then shutting down the access on IPCUtil  so only 
accessible by ipc client? Can move its guts to some codec util up in hbase 
common and the server-side could use that. Is that a load of work? Is it 
doable? Ok to file a separate issue to do this.

> Reduce garbage created while reading Cells from Codec Decoder
> -
>
> Key: HBASE-15180
> URL: https://issues.apache.org/jira/browse/HBASE-15180
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Affects Versions: 0.98.0
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0
>
> Attachments: HBASE-15180.patch, HBASE-15180_V2.patch, 
> HBASE-15180_V4.patch, HBASE-15180_V6.patch, HBASE-15180_V7.patch
>
>
> In KeyValueDecoder#parseCell (Default Codec decoder) we use 
> KeyValueUtil#iscreate to read cells from the InputStream. Here we 1st create 
> a byte[] of length 4 and read the cell length and then an array of Cell's 
> length and read in cell bytes into it and create a KV.
> Actually in server we read the reqs into a byte[] and CellScanner is created 
> on top of a ByteArrayInputStream on top of this. By default in write path, we 
> have MSLAB usage ON. So while adding Cells to memstore, we will copy the Cell 
> bytes to MSLAB memory chunks (default 2 MB size) and recreate Cells over that 
> bytes.  So there is no issue if we create Cells over the RPC read byte[] 
> directly here in Decoder.  No need for 2 byte[] creation and copy for every 
> Cell in request.
> My plan is to make a Cell aware ByteArrayInputStream which can read Cells 
> directly from it.  
> Same Codec path is used in client side also. There better we can avoid this 
> direct Cell create and continue to do the copy to smaller byte[]s path.  Plan 
> to introduce some thing like a CodecContext associated with every Codec 
> instance which can say the server/client context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-15180) Reduce garbage created while reading Cells from Codec Decoder

2016-03-08 Thread Anoop Sam John (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15186406#comment-15186406
 ] 

Anoop Sam John commented on HBASE-15180:


Ya to make sure the client uses only which it supposed to, we can rename method 
at least.  Any better name suggestion boss [~saint@gmail.com]?
The best way would have been that, the codec instance know where am I.  Whether 
it is in client context or server.  We need to set it while constructing Codec 
instance.   But ya at least a method name change we can do now.
I will fix ur suggestion abt explicit javadoc statement on commit.  Thanks for 
the review.
Once Stack also good abt the patch, will commit.

> Reduce garbage created while reading Cells from Codec Decoder
> -
>
> Key: HBASE-15180
> URL: https://issues.apache.org/jira/browse/HBASE-15180
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Affects Versions: 0.98.0
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0
>
> Attachments: HBASE-15180.patch, HBASE-15180_V2.patch, 
> HBASE-15180_V4.patch, HBASE-15180_V6.patch, HBASE-15180_V7.patch
>
>
> In KeyValueDecoder#parseCell (Default Codec decoder) we use 
> KeyValueUtil#iscreate to read cells from the InputStream. Here we 1st create 
> a byte[] of length 4 and read the cell length and then an array of Cell's 
> length and read in cell bytes into it and create a KV.
> Actually in server we read the reqs into a byte[] and CellScanner is created 
> on top of a ByteArrayInputStream on top of this. By default in write path, we 
> have MSLAB usage ON. So while adding Cells to memstore, we will copy the Cell 
> bytes to MSLAB memory chunks (default 2 MB size) and recreate Cells over that 
> bytes.  So there is no issue if we create Cells over the RPC read byte[] 
> directly here in Decoder.  No need for 2 byte[] creation and copy for every 
> Cell in request.
> My plan is to make a Cell aware ByteArrayInputStream which can read Cells 
> directly from it.  
> Same Codec path is used in client side also. There better we can avoid this 
> direct Cell create and continue to do the copy to smaller byte[]s path.  Plan 
> to introduce some thing like a CodecContext associated with every Codec 
> instance which can say the server/client context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-15180) Reduce garbage created while reading Cells from Codec Decoder

2016-03-08 Thread Enis Soztutar (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15186384#comment-15186384
 ] 

Enis Soztutar commented on HBASE-15180:
---

The patch looks much cleaner than v4 I think.  

Let's add a javadoc to this saying that the ownership / lifecycle of the passed 
BB is transferred to the Decoder and Cells returned from the decoder, so that 
the contract is explicit:  
{code}
+  Decoder getDecoder(ByteBuffer buf);
{code}

Again, the contract for the {{IPCUtil. createCellScanner()}} used by client 
side vs server side should be explicit. How do we even know that one is used 
only by the client or server? We can declare it in the method name explicitly, 
like {{createCellScannerReusingBuffers()}} (sorry for the horrible name). 

Otherwise looks good. [~saint@gmail.com] give it a quick glance? 

> Reduce garbage created while reading Cells from Codec Decoder
> -
>
> Key: HBASE-15180
> URL: https://issues.apache.org/jira/browse/HBASE-15180
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Affects Versions: 0.98.0
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0
>
> Attachments: HBASE-15180.patch, HBASE-15180_V2.patch, 
> HBASE-15180_V4.patch, HBASE-15180_V6.patch, HBASE-15180_V7.patch
>
>
> In KeyValueDecoder#parseCell (Default Codec decoder) we use 
> KeyValueUtil#iscreate to read cells from the InputStream. Here we 1st create 
> a byte[] of length 4 and read the cell length and then an array of Cell's 
> length and read in cell bytes into it and create a KV.
> Actually in server we read the reqs into a byte[] and CellScanner is created 
> on top of a ByteArrayInputStream on top of this. By default in write path, we 
> have MSLAB usage ON. So while adding Cells to memstore, we will copy the Cell 
> bytes to MSLAB memory chunks (default 2 MB size) and recreate Cells over that 
> bytes.  So there is no issue if we create Cells over the RPC read byte[] 
> directly here in Decoder.  No need for 2 byte[] creation and copy for every 
> Cell in request.
> My plan is to make a Cell aware ByteArrayInputStream which can read Cells 
> directly from it.  
> Same Codec path is used in client side also. There better we can avoid this 
> direct Cell create and continue to do the copy to smaller byte[]s path.  Plan 
> to introduce some thing like a CodecContext associated with every Codec 
> instance which can say the server/client context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-15180) Reduce garbage created while reading Cells from Codec Decoder

2016-03-08 Thread Enis Soztutar (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15186348#comment-15186348
 ] 

Enis Soztutar commented on HBASE-15180:
---

bq. Dropped branch-1 based fixed versions. Adding new method into Codec is fine 
there? Even if, we may change BB to new type in 2.0 later. So better we won't 
add any thing.
The only Codec that is used by outside is the Phoenix index codec as far as I 
know. The codec encodes some more cells for the index corresponding to the 
IndexedKeyValue's. We can modify Phoenix to compile with HBase-2.0 when time 
comes. Other than that, Codec's are not user-level classes, they are/should be 
framework level only. 


> Reduce garbage created while reading Cells from Codec Decoder
> -
>
> Key: HBASE-15180
> URL: https://issues.apache.org/jira/browse/HBASE-15180
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Affects Versions: 0.98.0
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0
>
> Attachments: HBASE-15180.patch, HBASE-15180_V2.patch, 
> HBASE-15180_V4.patch, HBASE-15180_V6.patch, HBASE-15180_V7.patch
>
>
> In KeyValueDecoder#parseCell (Default Codec decoder) we use 
> KeyValueUtil#iscreate to read cells from the InputStream. Here we 1st create 
> a byte[] of length 4 and read the cell length and then an array of Cell's 
> length and read in cell bytes into it and create a KV.
> Actually in server we read the reqs into a byte[] and CellScanner is created 
> on top of a ByteArrayInputStream on top of this. By default in write path, we 
> have MSLAB usage ON. So while adding Cells to memstore, we will copy the Cell 
> bytes to MSLAB memory chunks (default 2 MB size) and recreate Cells over that 
> bytes.  So there is no issue if we create Cells over the RPC read byte[] 
> directly here in Decoder.  No need for 2 byte[] creation and copy for every 
> Cell in request.
> My plan is to make a Cell aware ByteArrayInputStream which can read Cells 
> directly from it.  
> Same Codec path is used in client side also. There better we can avoid this 
> direct Cell create and continue to do the copy to smaller byte[]s path.  Plan 
> to introduce some thing like a CodecContext associated with every Codec 
> instance which can say the server/client context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-15180) Reduce garbage created while reading Cells from Codec Decoder

2016-03-01 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15173573#comment-15173573
 ] 

Hadoop QA commented on HBASE-15180:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 
0s {color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 
34s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 26s 
{color} | {color:green} master passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 15s 
{color} | {color:green} master passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 7m 
7s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
43s {color} | {color:green} master passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 59s 
{color} | {color:red} hbase-server in master has 1 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 15s 
{color} | {color:green} master passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 15s 
{color} | {color:green} master passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 15s 
{color} | {color:red} hbase-client in the patch failed. {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 19s 
{color} | {color:green} the patch passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 19s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 9s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 9s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 9s 
{color} | {color:red} Patch generated 1 new checkstyle issues in hbase-common 
(total was 60, now 61). {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
37s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
29m 5s {color} | {color:green} Patch does not cause any errors with Hadoop 
2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 
26s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 17s 
{color} | {color:green} the patch passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 23s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 7s 
{color} | {color:green} hbase-client in the patch passed with JDK v1.8.0_72. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 7s 
{color} | {color:green} hbase-common in the patch passed with JDK v1.8.0_72. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 146m 47s 
{color} | {color:red} hbase-server in the patch failed with JDK v1.8.0_72. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 23s 
{color} | {color:green} hbase-client in the patch passed with JDK v1.7.0_95. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 11s 
{color} | {color:green} hbase-common in the patch passed with JDK v1.7.0_95. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 80m 52s {color} 
| {color:red} hbase-server in the patch failed with JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 2m 
13s

[jira] [Commented] (HBASE-15180) Reduce garbage created while reading Cells from Codec Decoder

2016-02-29 Thread Anoop Sam John (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15173243#comment-15173243
 ] 

Anoop Sam John commented on HBASE-15180:


So in this patch made sure that on client side we wont go with optimization of 
Cells created directly over the response read buffer.   The cells will be 
handed over to user app and we can not be sure whether some of the cells will 
be cached or not.  Caching even single cell will make it such that we can not 
GC the bigger sized response buffer.  Server side we have MSLAB and copy to 
that before adding Cells to memstore.  [~enis]  Ping for ur review as well.

> Reduce garbage created while reading Cells from Codec Decoder
> -
>
> Key: HBASE-15180
> URL: https://issues.apache.org/jira/browse/HBASE-15180
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Affects Versions: 0.98.0
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0
>
> Attachments: HBASE-15180.patch, HBASE-15180_V2.patch, 
> HBASE-15180_V4.patch, HBASE-15180_V6.patch, HBASE-15180_V7.patch
>
>
> In KeyValueDecoder#parseCell (Default Codec decoder) we use 
> KeyValueUtil#iscreate to read cells from the InputStream. Here we 1st create 
> a byte[] of length 4 and read the cell length and then an array of Cell's 
> length and read in cell bytes into it and create a KV.
> Actually in server we read the reqs into a byte[] and CellScanner is created 
> on top of a ByteArrayInputStream on top of this. By default in write path, we 
> have MSLAB usage ON. So while adding Cells to memstore, we will copy the Cell 
> bytes to MSLAB memory chunks (default 2 MB size) and recreate Cells over that 
> bytes.  So there is no issue if we create Cells over the RPC read byte[] 
> directly here in Decoder.  No need for 2 byte[] creation and copy for every 
> Cell in request.
> My plan is to make a Cell aware ByteArrayInputStream which can read Cells 
> directly from it.  
> Same Codec path is used in client side also. There better we can avoid this 
> direct Cell create and continue to do the copy to smaller byte[]s path.  Plan 
> to introduce some thing like a CodecContext associated with every Codec 
> instance which can say the server/client context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-15180) Reduce garbage created while reading Cells from Codec Decoder

2016-02-23 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15160284#comment-15160284
 ] 

stack commented on HBASE-15180:
---

Looks better to me. All the messing is kept internal to Codec.

Hopefully we get to remove the tag-particular codecs and KeyValue ones one day 
(let the codec figure whether tags are present or not rather than upper layers 
-- make Tags first class).

> Reduce garbage created while reading Cells from Codec Decoder
> -
>
> Key: HBASE-15180
> URL: https://issues.apache.org/jira/browse/HBASE-15180
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Affects Versions: 0.98.0
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0
>
> Attachments: HBASE-15180.patch, HBASE-15180_V2.patch, 
> HBASE-15180_V4.patch, HBASE-15180_V6.patch
>
>
> In KeyValueDecoder#parseCell (Default Codec decoder) we use 
> KeyValueUtil#iscreate to read cells from the InputStream. Here we 1st create 
> a byte[] of length 4 and read the cell length and then an array of Cell's 
> length and read in cell bytes into it and create a KV.
> Actually in server we read the reqs into a byte[] and CellScanner is created 
> on top of a ByteArrayInputStream on top of this. By default in write path, we 
> have MSLAB usage ON. So while adding Cells to memstore, we will copy the Cell 
> bytes to MSLAB memory chunks (default 2 MB size) and recreate Cells over that 
> bytes.  So there is no issue if we create Cells over the RPC read byte[] 
> directly here in Decoder.  No need for 2 byte[] creation and copy for every 
> Cell in request.
> My plan is to make a Cell aware ByteArrayInputStream which can read Cells 
> directly from it.  
> Same Codec path is used in client side also. There better we can avoid this 
> direct Cell create and continue to do the copy to smaller byte[]s path.  Plan 
> to introduce some thing like a CodecContext associated with every Codec 
> instance which can say the server/client context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-15180) Reduce garbage created while reading Cells from Codec Decoder

2016-02-23 Thread Anoop Sam John (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15160261#comment-15160261
 ] 

Anoop Sam John commented on HBASE-15180:


wdyt of the latest patch [~saint@gmail.com], [~enis]

> Reduce garbage created while reading Cells from Codec Decoder
> -
>
> Key: HBASE-15180
> URL: https://issues.apache.org/jira/browse/HBASE-15180
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Affects Versions: 0.98.0
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0
>
> Attachments: HBASE-15180.patch, HBASE-15180_V2.patch, 
> HBASE-15180_V4.patch, HBASE-15180_V6.patch
>
>
> In KeyValueDecoder#parseCell (Default Codec decoder) we use 
> KeyValueUtil#iscreate to read cells from the InputStream. Here we 1st create 
> a byte[] of length 4 and read the cell length and then an array of Cell's 
> length and read in cell bytes into it and create a KV.
> Actually in server we read the reqs into a byte[] and CellScanner is created 
> on top of a ByteArrayInputStream on top of this. By default in write path, we 
> have MSLAB usage ON. So while adding Cells to memstore, we will copy the Cell 
> bytes to MSLAB memory chunks (default 2 MB size) and recreate Cells over that 
> bytes.  So there is no issue if we create Cells over the RPC read byte[] 
> directly here in Decoder.  No need for 2 byte[] creation and copy for every 
> Cell in request.
> My plan is to make a Cell aware ByteArrayInputStream which can read Cells 
> directly from it.  
> Same Codec path is used in client side also. There better we can avoid this 
> direct Cell create and continue to do the copy to smaller byte[]s path.  Plan 
> to introduce some thing like a CodecContext associated with every Codec 
> instance which can say the server/client context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-15180) Reduce garbage created while reading Cells from Codec Decoder

2016-02-19 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15153947#comment-15153947
 ] 

Hadoop QA commented on HBASE-15180:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 
0s {color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 
52s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 59s 
{color} | {color:green} master passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 9s 
{color} | {color:green} master passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 6m 
55s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
36s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 
32s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 0s 
{color} | {color:green} master passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 10s 
{color} | {color:green} master passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 16s 
{color} | {color:red} hbase-client in the patch failed. {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 7s 
{color} | {color:green} the patch passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 7s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 8s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 8s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 9s 
{color} | {color:red} Patch generated 1 new checkstyle issues in hbase-common 
(total was 71, now 72). {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
37s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
22m 15s {color} | {color:green} Patch does not cause any errors with Hadoop 
2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 
54s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 58s 
{color} | {color:green} the patch passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 9s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 48s 
{color} | {color:green} hbase-client in the patch passed with JDK v1.8.0_72. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 34s 
{color} | {color:green} hbase-common in the patch passed with JDK v1.8.0_72. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 86m 3s {color} 
| {color:red} hbase-server in the patch failed with JDK v1.8.0_72. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 3s 
{color} | {color:green} hbase-client in the patch passed with JDK v1.7.0_95. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 49s 
{color} | {color:green} hbase-common in the patch passed with JDK v1.7.0_95. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 80m 22s 
{color} | {color:green} hbase-server in the patch passed with JDK v1.7.0_95. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
44s {color} | {color:green} Patch does

[jira] [Commented] (HBASE-15180) Reduce garbage created while reading Cells from Codec Decoder

2016-02-18 Thread Anoop Sam John (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15153715#comment-15153715
 ] 

Anoop Sam John commented on HBASE-15180:


Dropped branch-1 based fixed versions.  Adding new method into Codec is fine 
there?  Even if, we may change BB to new type in 2.0 later.  So better we won't 
add any thing.

> Reduce garbage created while reading Cells from Codec Decoder
> -
>
> Key: HBASE-15180
> URL: https://issues.apache.org/jira/browse/HBASE-15180
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Affects Versions: 0.98.0
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0
>
> Attachments: HBASE-15180.patch, HBASE-15180_V2.patch, 
> HBASE-15180_V4.patch, HBASE-15180_V6.patch
>
>
> In KeyValueDecoder#parseCell (Default Codec decoder) we use 
> KeyValueUtil#iscreate to read cells from the InputStream. Here we 1st create 
> a byte[] of length 4 and read the cell length and then an array of Cell's 
> length and read in cell bytes into it and create a KV.
> Actually in server we read the reqs into a byte[] and CellScanner is created 
> on top of a ByteArrayInputStream on top of this. By default in write path, we 
> have MSLAB usage ON. So while adding Cells to memstore, we will copy the Cell 
> bytes to MSLAB memory chunks (default 2 MB size) and recreate Cells over that 
> bytes.  So there is no issue if we create Cells over the RPC read byte[] 
> directly here in Decoder.  No need for 2 byte[] creation and copy for every 
> Cell in request.
> My plan is to make a Cell aware ByteArrayInputStream which can read Cells 
> directly from it.  
> Same Codec path is used in client side also. There better we can avoid this 
> direct Cell create and continue to do the copy to smaller byte[]s path.  Plan 
> to introduce some thing like a CodecContext associated with every Codec 
> instance which can say the server/client context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-15180) Reduce garbage created while reading Cells from Codec Decoder

2016-02-18 Thread Anoop Sam John (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15153713#comment-15153713
 ] 

Anoop Sam John commented on HBASE-15180:


bq.maybe it is best to cut to BB? Or, your Interface that does BB – ByteBuff? 
Something to consider.
Ok..  Add getDecoder() with BB as of now.. Later we need change to new 
interface type, we can change then. So this fix will go into 2.0 only.  So 
before 2.0 is out, we will make the change of BB to new type as needed.

> Reduce garbage created while reading Cells from Codec Decoder
> -
>
> Key: HBASE-15180
> URL: https://issues.apache.org/jira/browse/HBASE-15180
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Affects Versions: 0.98.0
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0
>
> Attachments: HBASE-15180.patch, HBASE-15180_V2.patch, 
> HBASE-15180_V4.patch
>
>
> In KeyValueDecoder#parseCell (Default Codec decoder) we use 
> KeyValueUtil#iscreate to read cells from the InputStream. Here we 1st create 
> a byte[] of length 4 and read the cell length and then an array of Cell's 
> length and read in cell bytes into it and create a KV.
> Actually in server we read the reqs into a byte[] and CellScanner is created 
> on top of a ByteArrayInputStream on top of this. By default in write path, we 
> have MSLAB usage ON. So while adding Cells to memstore, we will copy the Cell 
> bytes to MSLAB memory chunks (default 2 MB size) and recreate Cells over that 
> bytes.  So there is no issue if we create Cells over the RPC read byte[] 
> directly here in Decoder.  No need for 2 byte[] creation and copy for every 
> Cell in request.
> My plan is to make a Cell aware ByteArrayInputStream which can read Cells 
> directly from it.  
> Same Codec path is used in client side also. There better we can avoid this 
> direct Cell create and continue to do the copy to smaller byte[]s path.  Plan 
> to introduce some thing like a CodecContext associated with every Codec 
> instance which can say the server/client context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-15180) Reduce garbage created while reading Cells from Codec Decoder

2016-02-17 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15151398#comment-15151398
 ] 

stack commented on HBASE-15180:
---

bq. Said that, continue with IS instead of a data structure like byte[]/BB is 
giving us all sort of flexibility.

One thought I had on this subsequently was that [~enis] probably made the BB 
suggestion because he noticed we were turning somersaults to make IS work. So, 
while IS might be more flexible, if it means we have to do messy stuff, maybe 
it is best to cut to BB? Or, your Interface that does BB -- ByteBuff? Something 
to consider.

> Reduce garbage created while reading Cells from Codec Decoder
> -
>
> Key: HBASE-15180
> URL: https://issues.apache.org/jira/browse/HBASE-15180
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Affects Versions: 0.98.0
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0, 1.3.0, 1.2.1
>
> Attachments: HBASE-15180.patch, HBASE-15180_V2.patch, 
> HBASE-15180_V4.patch
>
>
> In KeyValueDecoder#parseCell (Default Codec decoder) we use 
> KeyValueUtil#iscreate to read cells from the InputStream. Here we 1st create 
> a byte[] of length 4 and read the cell length and then an array of Cell's 
> length and read in cell bytes into it and create a KV.
> Actually in server we read the reqs into a byte[] and CellScanner is created 
> on top of a ByteArrayInputStream on top of this. By default in write path, we 
> have MSLAB usage ON. So while adding Cells to memstore, we will copy the Cell 
> bytes to MSLAB memory chunks (default 2 MB size) and recreate Cells over that 
> bytes.  So there is no issue if we create Cells over the RPC read byte[] 
> directly here in Decoder.  No need for 2 byte[] creation and copy for every 
> Cell in request.
> My plan is to make a Cell aware ByteArrayInputStream which can read Cells 
> directly from it.  
> Same Codec path is used in client side also. There better we can avoid this 
> direct Cell create and continue to do the copy to smaller byte[]s path.  Plan 
> to introduce some thing like a CodecContext associated with every Codec 
> instance which can say the server/client context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-15180) Reduce garbage created while reading Cells from Codec Decoder

2016-02-17 Thread Enis Soztutar (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15151327#comment-15151327
 ] 

Enis Soztutar commented on HBASE-15180:
---

bq. Yeah, best if could get away with not changing it but I'd say it is ok. Its 
2.0 and you are changing a class only used by hbase internals (i'd say)
Codec is LimitedPrivate, and should be used only by other frameworks (Phoenix, 
etc). It should be ok to change the interface in 2.0. But, if we feel that IS 
is more flexible, it is fine. 



> Reduce garbage created while reading Cells from Codec Decoder
> -
>
> Key: HBASE-15180
> URL: https://issues.apache.org/jira/browse/HBASE-15180
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Affects Versions: 0.98.0
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0, 1.3.0, 1.2.1
>
> Attachments: HBASE-15180.patch, HBASE-15180_V2.patch, 
> HBASE-15180_V4.patch
>
>
> In KeyValueDecoder#parseCell (Default Codec decoder) we use 
> KeyValueUtil#iscreate to read cells from the InputStream. Here we 1st create 
> a byte[] of length 4 and read the cell length and then an array of Cell's 
> length and read in cell bytes into it and create a KV.
> Actually in server we read the reqs into a byte[] and CellScanner is created 
> on top of a ByteArrayInputStream on top of this. By default in write path, we 
> have MSLAB usage ON. So while adding Cells to memstore, we will copy the Cell 
> bytes to MSLAB memory chunks (default 2 MB size) and recreate Cells over that 
> bytes.  So there is no issue if we create Cells over the RPC read byte[] 
> directly here in Decoder.  No need for 2 byte[] creation and copy for every 
> Cell in request.
> My plan is to make a Cell aware ByteArrayInputStream which can read Cells 
> directly from it.  
> Same Codec path is used in client side also. There better we can avoid this 
> direct Cell create and continue to do the copy to smaller byte[]s path.  Plan 
> to introduce some thing like a CodecContext associated with every Codec 
> instance which can say the server/client context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-15180) Reduce garbage created while reading Cells from Codec Decoder

2016-02-16 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15149962#comment-15149962
 ] 

stack commented on HBASE-15180:
---

bq. Codec being non private if we add new methods, users may have to change in 
their code when use new version..

Yeah, best if could get away with not changing it but I'd say it is ok. Its 2.0 
and you are changing a class only used by hbase internals (i'd say)

 bq. Said that, continue with IS instead of a data structure like byte[]/BB is 
giving us all sort of flexibility.

This is reasonable.

Could be your Interface that does a subset of ByteBuffer too?

But as you say, maybe while you are working stuff out, best to stay flexible.


> Reduce garbage created while reading Cells from Codec Decoder
> -
>
> Key: HBASE-15180
> URL: https://issues.apache.org/jira/browse/HBASE-15180
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Affects Versions: 0.98.0
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0, 1.3.0, 1.2.1
>
> Attachments: HBASE-15180.patch, HBASE-15180_V2.patch, 
> HBASE-15180_V4.patch
>
>
> In KeyValueDecoder#parseCell (Default Codec decoder) we use 
> KeyValueUtil#iscreate to read cells from the InputStream. Here we 1st create 
> a byte[] of length 4 and read the cell length and then an array of Cell's 
> length and read in cell bytes into it and create a KV.
> Actually in server we read the reqs into a byte[] and CellScanner is created 
> on top of a ByteArrayInputStream on top of this. By default in write path, we 
> have MSLAB usage ON. So while adding Cells to memstore, we will copy the Cell 
> bytes to MSLAB memory chunks (default 2 MB size) and recreate Cells over that 
> bytes.  So there is no issue if we create Cells over the RPC read byte[] 
> directly here in Decoder.  No need for 2 byte[] creation and copy for every 
> Cell in request.
> My plan is to make a Cell aware ByteArrayInputStream which can read Cells 
> directly from it.  
> Same Codec path is used in client side also. There better we can avoid this 
> direct Cell create and continue to do the copy to smaller byte[]s path.  Plan 
> to introduce some thing like a CodecContext associated with every Codec 
> instance which can say the server/client context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-15180) Reduce garbage created while reading Cells from Codec Decoder

2016-02-16 Thread Anoop Sam John (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15149932#comment-15149932
 ] 

Anoop Sam John commented on HBASE-15180:


2 reasons why did not go that way
1.  Codec being non private if we add new methods, users may have to change in 
their code when use new version.. Can we add method to Codec interface in a 
minor/patch release?  If we do fulfill our need with out changing the 
interface, I thought that is better,
2. Ya we have BB now..  Was thinking of doing PoC with reading data into more 
than one buffer when we use these buffers from a pool (We have BBBpool now).  
We know the size need to read the reqs.  As in BBBPool we can make have running 
avg of size and create bigger buffers.  But on demand creation of buffers off 
heap may create some -ve impact on full GC?   Any way all these are to be 
tested.   Said that, continue with IS instead of a data structure like 
byte[]/BB is giving us all sort of flexibility. 

> Reduce garbage created while reading Cells from Codec Decoder
> -
>
> Key: HBASE-15180
> URL: https://issues.apache.org/jira/browse/HBASE-15180
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Affects Versions: 0.98.0
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0, 1.3.0, 1.2.1
>
> Attachments: HBASE-15180.patch, HBASE-15180_V2.patch, 
> HBASE-15180_V4.patch
>
>
> In KeyValueDecoder#parseCell (Default Codec decoder) we use 
> KeyValueUtil#iscreate to read cells from the InputStream. Here we 1st create 
> a byte[] of length 4 and read the cell length and then an array of Cell's 
> length and read in cell bytes into it and create a KV.
> Actually in server we read the reqs into a byte[] and CellScanner is created 
> on top of a ByteArrayInputStream on top of this. By default in write path, we 
> have MSLAB usage ON. So while adding Cells to memstore, we will copy the Cell 
> bytes to MSLAB memory chunks (default 2 MB size) and recreate Cells over that 
> bytes.  So there is no issue if we create Cells over the RPC read byte[] 
> directly here in Decoder.  No need for 2 byte[] creation and copy for every 
> Cell in request.
> My plan is to make a Cell aware ByteArrayInputStream which can read Cells 
> directly from it.  
> Same Codec path is used in client side also. There better we can avoid this 
> direct Cell create and continue to do the copy to smaller byte[]s path.  Plan 
> to introduce some thing like a CodecContext associated with every Codec 
> instance which can say the server/client context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-15180) Reduce garbage created while reading Cells from Codec Decoder

2016-02-16 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15149319#comment-15149319
 ] 

stack commented on HBASE-15180:
---

bq. Maybe {{Codec}}s should be able to work on top of both IS and BBs instead 
of the BBIS wrapper.

I like that. So add to the Codec Interface a getDecoder(BB)? Would that work 
[~anoop.hbase]?

> Reduce garbage created while reading Cells from Codec Decoder
> -
>
> Key: HBASE-15180
> URL: https://issues.apache.org/jira/browse/HBASE-15180
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Affects Versions: 0.98.0
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0, 1.3.0, 1.2.1
>
> Attachments: HBASE-15180.patch, HBASE-15180_V2.patch, 
> HBASE-15180_V4.patch
>
>
> In KeyValueDecoder#parseCell (Default Codec decoder) we use 
> KeyValueUtil#iscreate to read cells from the InputStream. Here we 1st create 
> a byte[] of length 4 and read the cell length and then an array of Cell's 
> length and read in cell bytes into it and create a KV.
> Actually in server we read the reqs into a byte[] and CellScanner is created 
> on top of a ByteArrayInputStream on top of this. By default in write path, we 
> have MSLAB usage ON. So while adding Cells to memstore, we will copy the Cell 
> bytes to MSLAB memory chunks (default 2 MB size) and recreate Cells over that 
> bytes.  So there is no issue if we create Cells over the RPC read byte[] 
> directly here in Decoder.  No need for 2 byte[] creation and copy for every 
> Cell in request.
> My plan is to make a Cell aware ByteArrayInputStream which can read Cells 
> directly from it.  
> Same Codec path is used in client side also. There better we can avoid this 
> direct Cell create and continue to do the copy to smaller byte[]s path.  Plan 
> to introduce some thing like a CodecContext associated with every Codec 
> instance which can say the server/client context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-15180) Reduce garbage created while reading Cells from Codec Decoder

2016-02-16 Thread Enis Soztutar (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15149215#comment-15149215
 ] 

Enis Soztutar commented on HBASE-15180:
---

HBASE-14501 removed {{IS.available()}} dependency for the Codecs, but we are 
adding it back here because we can rely on BBIS.available()? Maybe {{Codec}}s 
should be able to work on top of both IS and BBs instead of the BBIS wrapper. 

> Reduce garbage created while reading Cells from Codec Decoder
> -
>
> Key: HBASE-15180
> URL: https://issues.apache.org/jira/browse/HBASE-15180
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Affects Versions: 0.98.0
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0, 1.3.0, 1.2.1
>
> Attachments: HBASE-15180.patch, HBASE-15180_V2.patch, 
> HBASE-15180_V4.patch
>
>
> In KeyValueDecoder#parseCell (Default Codec decoder) we use 
> KeyValueUtil#iscreate to read cells from the InputStream. Here we 1st create 
> a byte[] of length 4 and read the cell length and then an array of Cell's 
> length and read in cell bytes into it and create a KV.
> Actually in server we read the reqs into a byte[] and CellScanner is created 
> on top of a ByteArrayInputStream on top of this. By default in write path, we 
> have MSLAB usage ON. So while adding Cells to memstore, we will copy the Cell 
> bytes to MSLAB memory chunks (default 2 MB size) and recreate Cells over that 
> bytes.  So there is no issue if we create Cells over the RPC read byte[] 
> directly here in Decoder.  No need for 2 byte[] creation and copy for every 
> Cell in request.
> My plan is to make a Cell aware ByteArrayInputStream which can read Cells 
> directly from it.  
> Same Codec path is used in client side also. There better we can avoid this 
> direct Cell create and continue to do the copy to smaller byte[]s path.  Plan 
> to introduce some thing like a CodecContext associated with every Codec 
> instance which can say the server/client context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-15180) Reduce garbage created while reading Cells from Codec Decoder

2016-02-16 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15149211#comment-15149211
 ] 

stack commented on HBASE-15180:
---

s/remainAwareIS/reliableAvailable/ ?

Ok on the rest. Sounds good. If new patch, add to the EOF thrown how many bytes 
are missing.



> Reduce garbage created while reading Cells from Codec Decoder
> -
>
> Key: HBASE-15180
> URL: https://issues.apache.org/jira/browse/HBASE-15180
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Affects Versions: 0.98.0
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0, 1.3.0, 1.2.1
>
> Attachments: HBASE-15180.patch, HBASE-15180_V2.patch, 
> HBASE-15180_V4.patch
>
>
> In KeyValueDecoder#parseCell (Default Codec decoder) we use 
> KeyValueUtil#iscreate to read cells from the InputStream. Here we 1st create 
> a byte[] of length 4 and read the cell length and then an array of Cell's 
> length and read in cell bytes into it and create a KV.
> Actually in server we read the reqs into a byte[] and CellScanner is created 
> on top of a ByteArrayInputStream on top of this. By default in write path, we 
> have MSLAB usage ON. So while adding Cells to memstore, we will copy the Cell 
> bytes to MSLAB memory chunks (default 2 MB size) and recreate Cells over that 
> bytes.  So there is no issue if we create Cells over the RPC read byte[] 
> directly here in Decoder.  No need for 2 byte[] creation and copy for every 
> Cell in request.
> My plan is to make a Cell aware ByteArrayInputStream which can read Cells 
> directly from it.  
> Same Codec path is used in client side also. There better we can avoid this 
> direct Cell create and continue to do the copy to smaller byte[]s path.  Plan 
> to introduce some thing like a CodecContext associated with every Codec 
> instance which can say the server/client context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-15180) Reduce garbage created while reading Cells from Codec Decoder

2016-02-16 Thread Anoop Sam John (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148985#comment-15148985
 ] 

Anoop Sam John commented on HBASE-15180:


bq.Is this flag saying we can rely on available() giving a good answer?
Yes. Any suggestion?

The unsafe based BB read dont have boundary checks.. That is why thought of 
adding check in this Stream level.

Yes in StreamUtils it is doing the read of 4 bytes at a time
int n = in.read(readInBuf, bytesRead, readInBuf.length - bytesRead);
The while loop is to make sure it tried it best to read 4 bytes from stream.  
Even if the 1st go could do only 2 bytes. Still tried.  This was present in 
BaseDecoder. Just moved from there to here.

> Reduce garbage created while reading Cells from Codec Decoder
> -
>
> Key: HBASE-15180
> URL: https://issues.apache.org/jira/browse/HBASE-15180
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Affects Versions: 0.98.0
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0, 1.3.0, 1.2.1
>
> Attachments: HBASE-15180.patch, HBASE-15180_V2.patch, 
> HBASE-15180_V4.patch
>
>
> In KeyValueDecoder#parseCell (Default Codec decoder) we use 
> KeyValueUtil#iscreate to read cells from the InputStream. Here we 1st create 
> a byte[] of length 4 and read the cell length and then an array of Cell's 
> length and read in cell bytes into it and create a KV.
> Actually in server we read the reqs into a byte[] and CellScanner is created 
> on top of a ByteArrayInputStream on top of this. By default in write path, we 
> have MSLAB usage ON. So while adding Cells to memstore, we will copy the Cell 
> bytes to MSLAB memory chunks (default 2 MB size) and recreate Cells over that 
> bytes.  So there is no issue if we create Cells over the RPC read byte[] 
> directly here in Decoder.  No need for 2 byte[] creation and copy for every 
> Cell in request.
> My plan is to make a Cell aware ByteArrayInputStream which can read Cells 
> directly from it.  
> Same Codec path is used in client side also. There better we can avoid this 
> direct Cell create and continue to do the copy to smaller byte[]s path.  Plan 
> to introduce some thing like a CodecContext associated with every Codec 
> instance which can say the server/client context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-15180) Reduce garbage created while reading Cells from Codec Decoder

2016-02-16 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148960#comment-15148960
 ] 

stack commented on HBASE-15180:
---

bq.  I think a better question would be why the test coverage in hbase-common 
is not sufficient to capture the behavior expected by the hbase-server module.

This sounds good along w/ the general theme that modules should be 
self-contained (yeah, we need to break up hbase-server, move out mapreduce, 
etc.)

Can we have a better data member name than remainAwareIS? Is this flag saying 
we can rely on available() giving a good answer?

Why bother with this check? Just try reading? If EOF, it'll throw an EOFE 
anyways? Otherwise, if some other issue, we'll want to see that exception 
rather than the EOFE we create here:

66  if(this.available() Reduce garbage created while reading Cells from Codec Decoder
> -
>
> Key: HBASE-15180
> URL: https://issues.apache.org/jira/browse/HBASE-15180
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Affects Versions: 0.98.0
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0, 1.3.0, 1.2.1
>
> Attachments: HBASE-15180.patch, HBASE-15180_V2.patch, 
> HBASE-15180_V4.patch
>
>
> In KeyValueDecoder#parseCell (Default Codec decoder) we use 
> KeyValueUtil#iscreate to read cells from the InputStream. Here we 1st create 
> a byte[] of length 4 and read the cell length and then an array of Cell's 
> length and read in cell bytes into it and create a KV.
> Actually in server we read the reqs into a byte[] and CellScanner is created 
> on top of a ByteArrayInputStream on top of this. By default in write path, we 
> have MSLAB usage ON. So while adding Cells to memstore, we will copy the Cell 
> bytes to MSLAB memory chunks (default 2 MB size) and recreate Cells over that 
> bytes.  So there is no issue if we create Cells over the RPC read byte[] 
> directly here in Decoder.  No need for 2 byte[] creation and copy for every 
> Cell in request.
> My plan is to make a Cell aware ByteArrayInputStream which can read Cells 
> directly from it.  
> Same Codec path is used in client side also. There better we can avoid this 
> direct Cell create and continue to do the copy to smaller byte[]s path.  Plan 
> to introduce some thing like a CodecContext associated with every Codec 
> instance which can say the server/client context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-15180) Reduce garbage created while reading Cells from Codec Decoder

2016-02-16 Thread Sean Busbey (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148845#comment-15148845
 ] 

Sean Busbey commented on HBASE-15180:
-

{quote}
bq. UT case TestTag is in the hbase-server module which depends on 
hbase-common, but all changes in the latest patch lie in hbase-common, and it 
seems Yetus only checked UT of the hbase-common module and neglected the 
failure.

We need to make sure we run all test in all modules? Because our tests are not 
really UTs. Some are (mainly those in hbase-server module) are FTs. So change 
in hbase-common can cause issues with tests in hbase-server
{quote}

precommit testing is meant to be a quick first-pass correctness filter (or at 
the least, that's [the ideology Yetus subscribes 
to|http://yetus.apache.org/documentation/0.1.0/precommit-architecture/#some-philosophy]).
 It's not a replacement for robust nightly checks nor committer reviews. Based 
on those principles, Yetus presumes that modules are relatively self-contained 
wrt unit tests. In general, I think a new feature in yetus that leveraged the 
same module-dependency detecting used to determine build order in Yetus 0.2.0+ 
could also be used for an opt-in "build everything that depends on module X" 
mode.


In our case though, precommit tests already take an absurdly long time and 
hbase-server is the biggest offender (at > 1 hour per JVM). Always running the 
hbase-server unit tests is simply not sustainable for us. I think a better 
question would be why the test coverage in hbase-common is not sufficient to 
capture the behavior expected by the hbase-server module.

All that said, if you want to force a build across all modules (on the 
presumption that the "change in X means also build Y" generalizes) you could 
touch the root pom.xml in your change, or really any file in the top level 
directory.

> Reduce garbage created while reading Cells from Codec Decoder
> -
>
> Key: HBASE-15180
> URL: https://issues.apache.org/jira/browse/HBASE-15180
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Affects Versions: 0.98.0
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0, 1.3.0, 1.2.1
>
> Attachments: HBASE-15180.patch, HBASE-15180_V2.patch, 
> HBASE-15180_V4.patch
>
>
> In KeyValueDecoder#parseCell (Default Codec decoder) we use 
> KeyValueUtil#iscreate to read cells from the InputStream. Here we 1st create 
> a byte[] of length 4 and read the cell length and then an array of Cell's 
> length and read in cell bytes into it and create a KV.
> Actually in server we read the reqs into a byte[] and CellScanner is created 
> on top of a ByteArrayInputStream on top of this. By default in write path, we 
> have MSLAB usage ON. So while adding Cells to memstore, we will copy the Cell 
> bytes to MSLAB memory chunks (default 2 MB size) and recreate Cells over that 
> bytes.  So there is no issue if we create Cells over the RPC read byte[] 
> directly here in Decoder.  No need for 2 byte[] creation and copy for every 
> Cell in request.
> My plan is to make a Cell aware ByteArrayInputStream which can read Cells 
> directly from it.  
> Same Codec path is used in client side also. There better we can avoid this 
> direct Cell create and continue to do the copy to smaller byte[]s path.  Plan 
> to introduce some thing like a CodecContext associated with every Codec 
> instance which can say the server/client context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-15180) Reduce garbage created while reading Cells from Codec Decoder

2016-02-15 Thread Anoop Sam John (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148222#comment-15148222
 ] 

Anoop Sam John commented on HBASE-15180:


Ya the test code change looks fine.

> Reduce garbage created while reading Cells from Codec Decoder
> -
>
> Key: HBASE-15180
> URL: https://issues.apache.org/jira/browse/HBASE-15180
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Affects Versions: 0.98.0
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0, 1.3.0, 1.2.1
>
> Attachments: HBASE-15180.patch, HBASE-15180_V2.patch, 
> HBASE-15180_V4.patch
>
>
> In KeyValueDecoder#parseCell (Default Codec decoder) we use 
> KeyValueUtil#iscreate to read cells from the InputStream. Here we 1st create 
> a byte[] of length 4 and read the cell length and then an array of Cell's 
> length and read in cell bytes into it and create a KV.
> Actually in server we read the reqs into a byte[] and CellScanner is created 
> on top of a ByteArrayInputStream on top of this. By default in write path, we 
> have MSLAB usage ON. So while adding Cells to memstore, we will copy the Cell 
> bytes to MSLAB memory chunks (default 2 MB size) and recreate Cells over that 
> bytes.  So there is no issue if we create Cells over the RPC read byte[] 
> directly here in Decoder.  No need for 2 byte[] creation and copy for every 
> Cell in request.
> My plan is to make a Cell aware ByteArrayInputStream which can read Cells 
> directly from it.  
> Same Codec path is used in client side also. There better we can avoid this 
> direct Cell create and continue to do the copy to smaller byte[]s path.  Plan 
> to introduce some thing like a CodecContext associated with every Codec 
> instance which can say the server/client context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-15180) Reduce garbage created while reading Cells from Codec Decoder

2016-02-15 Thread Anoop Sam John (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148203#comment-15148203
 ] 

Anoop Sam John commented on HBASE-15180:


Thanks for checking the test.. I will check that..
bq. UT case TestTag is in the hbase-server module which depends on 
hbase-common, but all changes in the latest patch lie in hbase-common, and it 
seems Yetus only checked UT of the hbase-common module and neglected the 
failure.
We need to make sure we run all test in all modules?  Because our tests are not 
really UTs.  Some are (mainly those in hbase-server module) are FTs. So change 
in hbase-common can cause issues with tests in hbase-server

> Reduce garbage created while reading Cells from Codec Decoder
> -
>
> Key: HBASE-15180
> URL: https://issues.apache.org/jira/browse/HBASE-15180
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Affects Versions: 0.98.0
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0, 1.3.0, 1.2.1
>
> Attachments: HBASE-15180.patch, HBASE-15180_V2.patch, 
> HBASE-15180_V4.patch
>
>
> In KeyValueDecoder#parseCell (Default Codec decoder) we use 
> KeyValueUtil#iscreate to read cells from the InputStream. Here we 1st create 
> a byte[] of length 4 and read the cell length and then an array of Cell's 
> length and read in cell bytes into it and create a KV.
> Actually in server we read the reqs into a byte[] and CellScanner is created 
> on top of a ByteArrayInputStream on top of this. By default in write path, we 
> have MSLAB usage ON. So while adding Cells to memstore, we will copy the Cell 
> bytes to MSLAB memory chunks (default 2 MB size) and recreate Cells over that 
> bytes.  So there is no issue if we create Cells over the RPC read byte[] 
> directly here in Decoder.  No need for 2 byte[] creation and copy for every 
> Cell in request.
> My plan is to make a Cell aware ByteArrayInputStream which can read Cells 
> directly from it.  
> Same Codec path is used in client side also. There better we can avoid this 
> direct Cell create and continue to do the copy to smaller byte[]s path.  Plan 
> to introduce some thing like a CodecContext associated with every Codec 
> instance which can say the server/client context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-15180) Reduce garbage created while reading Cells from Codec Decoder

2016-02-15 Thread Yu Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15148194#comment-15148194
 ] 

Yu Li commented on HBASE-15180:
---

{{TestTags#testFlushAndCompactionWithoutTags}} will fail with the latest patch.

After some investigation, it's caused by the bad design of the test case, that 
it assumes the KeyValue offset in the backup array is always 0, while after the 
below change in {{KeyValueCodec}} in this patch, the assumption no longer 
stands, thus causing the assertion to fail.
{code}
+Cell c = createCell(buf.array(), buf.arrayOffset() + buf.position(), 
len);
{code}

The fix should be straight forward, that to change:
{code}
assertTrue(current.getValueOffset() + current.getValueLength() == 
current.getLength());
{code}
into:
{code}
assertTrue(current.getValueOffset() + current.getValueLength() == 
current.getOffset()
  + current.getLength());
{code}
in both line 222 and 240 of {{TestTags}}. [~anoop.hbase] mind take a look here?

What's more, the UT case TestTag is in the hbase-server module which depends on 
hbase-common, but all changes in the latest patch lie in hbase-common, and it 
seems Yetus only checked UT of the hbase-common module and neglected the 
failure. Is this something new to fix or a known issue [~busbey]? Thanks.

> Reduce garbage created while reading Cells from Codec Decoder
> -
>
> Key: HBASE-15180
> URL: https://issues.apache.org/jira/browse/HBASE-15180
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Affects Versions: 0.98.0
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0, 1.3.0, 1.2.1
>
> Attachments: HBASE-15180.patch, HBASE-15180_V2.patch, 
> HBASE-15180_V4.patch
>
>
> In KeyValueDecoder#parseCell (Default Codec decoder) we use 
> KeyValueUtil#iscreate to read cells from the InputStream. Here we 1st create 
> a byte[] of length 4 and read the cell length and then an array of Cell's 
> length and read in cell bytes into it and create a KV.
> Actually in server we read the reqs into a byte[] and CellScanner is created 
> on top of a ByteArrayInputStream on top of this. By default in write path, we 
> have MSLAB usage ON. So while adding Cells to memstore, we will copy the Cell 
> bytes to MSLAB memory chunks (default 2 MB size) and recreate Cells over that 
> bytes.  So there is no issue if we create Cells over the RPC read byte[] 
> directly here in Decoder.  No need for 2 byte[] creation and copy for every 
> Cell in request.
> My plan is to make a Cell aware ByteArrayInputStream which can read Cells 
> directly from it.  
> Same Codec path is used in client side also. There better we can avoid this 
> direct Cell create and continue to do the copy to smaller byte[]s path.  Plan 
> to introduce some thing like a CodecContext associated with every Codec 
> instance which can say the server/client context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-15180) Reduce garbage created while reading Cells from Codec Decoder

2016-02-14 Thread Anoop Sam John (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15146976#comment-15146976
 ] 

Anoop Sam John commented on HBASE-15180:


Ya my mistake in selecting Eclipse import suggestion ..  Will correct.  Thanks 
for the good eye.

> Reduce garbage created while reading Cells from Codec Decoder
> -
>
> Key: HBASE-15180
> URL: https://issues.apache.org/jira/browse/HBASE-15180
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Affects Versions: 0.98.0
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0, 1.3.0, 1.2.1
>
> Attachments: HBASE-15180.patch, HBASE-15180_V2.patch, 
> HBASE-15180_V4.patch
>
>
> In KeyValueDecoder#parseCell (Default Codec decoder) we use 
> KeyValueUtil#iscreate to read cells from the InputStream. Here we 1st create 
> a byte[] of length 4 and read the cell length and then an array of Cell's 
> length and read in cell bytes into it and create a KV.
> Actually in server we read the reqs into a byte[] and CellScanner is created 
> on top of a ByteArrayInputStream on top of this. By default in write path, we 
> have MSLAB usage ON. So while adding Cells to memstore, we will copy the Cell 
> bytes to MSLAB memory chunks (default 2 MB size) and recreate Cells over that 
> bytes.  So there is no issue if we create Cells over the RPC read byte[] 
> directly here in Decoder.  No need for 2 byte[] creation and copy for every 
> Cell in request.
> My plan is to make a Cell aware ByteArrayInputStream which can read Cells 
> directly from it.  
> Same Codec path is used in client side also. There better we can avoid this 
> direct Cell create and continue to do the copy to smaller byte[]s path.  Plan 
> to introduce some thing like a CodecContext associated with every Codec 
> instance which can say the server/client context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-15180) Reduce garbage created while reading Cells from Codec Decoder

2016-02-14 Thread Yu Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15146768#comment-15146768
 ] 

Yu Li commented on HBASE-15180:
---

Below change in the v4 patch seems like some typo
{code}
+import com.sun.xml.internal.rngom.parse.compact.EOFException
{code}
and we should change to use {{java.io.EOFException}}? [~anoop.hbase]

> Reduce garbage created while reading Cells from Codec Decoder
> -
>
> Key: HBASE-15180
> URL: https://issues.apache.org/jira/browse/HBASE-15180
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Affects Versions: 0.98.0
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0, 1.3.0, 1.2.1
>
> Attachments: HBASE-15180.patch, HBASE-15180_V2.patch, 
> HBASE-15180_V4.patch
>
>
> In KeyValueDecoder#parseCell (Default Codec decoder) we use 
> KeyValueUtil#iscreate to read cells from the InputStream. Here we 1st create 
> a byte[] of length 4 and read the cell length and then an array of Cell's 
> length and read in cell bytes into it and create a KV.
> Actually in server we read the reqs into a byte[] and CellScanner is created 
> on top of a ByteArrayInputStream on top of this. By default in write path, we 
> have MSLAB usage ON. So while adding Cells to memstore, we will copy the Cell 
> bytes to MSLAB memory chunks (default 2 MB size) and recreate Cells over that 
> bytes.  So there is no issue if we create Cells over the RPC read byte[] 
> directly here in Decoder.  No need for 2 byte[] creation and copy for every 
> Cell in request.
> My plan is to make a Cell aware ByteArrayInputStream which can read Cells 
> directly from it.  
> Same Codec path is used in client side also. There better we can avoid this 
> direct Cell create and continue to do the copy to smaller byte[]s path.  Plan 
> to introduce some thing like a CodecContext associated with every Codec 
> instance which can say the server/client context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-15180) Reduce garbage created while reading Cells from Codec Decoder

2016-02-11 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15143139#comment-15143139
 ] 

Hadoop QA commented on HBASE-15180:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 
0s {color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 
37s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 19s 
{color} | {color:green} master passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 18s 
{color} | {color:green} master passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
10s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
11s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
49s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s 
{color} | {color:green} master passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 19s 
{color} | {color:green} master passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
19s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 20s 
{color} | {color:green} the patch passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 20s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 20s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 20s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 13s 
{color} | {color:red} Patch generated 3 new checkstyle issues in hbase-common 
(total was 45, now 47). {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
11s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s 
{color} | {color:red} The patch has 2 line(s) that end in whitespace. Use git 
apply --whitespace=fix. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
24m 59s {color} | {color:green} Patch does not cause any errors with Hadoop 
2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 0s 
{color} | {color:red} hbase-common introduced 1 new FindBugs issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 22s 
{color} | {color:green} the patch passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 20s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 57s 
{color} | {color:green} hbase-common in the patch passed with JDK v1.8.0_72. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 50s 
{color} | {color:green} hbase-common in the patch passed with JDK v1.7.0_95. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
7s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 39m 27s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hbase-common |
|  |  org.apache.hadoop.hbase.codec.KeyValueCodec$KeyValueDecoder.parseCell() 
ignores result of org.apache.hadoop.hbase.io.ByteBufferInputStream.skip(long)  
At KeyValueCodec.java: At KeyValueCodec.java:[line 81] |
\\
\\
|| Subsystem || Report/Notes ||

[jira] [Commented] (HBASE-15180) Reduce garbage created while reading Cells from Codec Decoder

2016-02-03 Thread Anoop Sam John (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15130258#comment-15130258
 ] 

Anoop Sam John commented on HBASE-15180:


Have a new patch ready but need change once HBASE-15177 goes in.  We will be 
using BBIS instead BAIS then.  I dont need a new extension for BAIS to gran the 
byte[] it uses.  I can add required APIs in BBIS.

> Reduce garbage created while reading Cells from Codec Decoder
> -
>
> Key: HBASE-15180
> URL: https://issues.apache.org/jira/browse/HBASE-15180
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0
>
> Attachments: HBASE-15180.patch, HBASE-15180_V2.patch
>
>
> In KeyValueDecoder#parseCell (Default Codec decoder) we use 
> KeyValueUtil#iscreate to read cells from the InputStream. Here we 1st create 
> a byte[] of length 4 and read the cell length and then an array of Cell's 
> length and read in cell bytes into it and create a KV.
> Actually in server we read the reqs into a byte[] and CellScanner is created 
> on top of a ByteArrayInputStream on top of this. By default in write path, we 
> have MSLAB usage ON. So while adding Cells to memstore, we will copy the Cell 
> bytes to MSLAB memory chunks (default 2 MB size) and recreate Cells over that 
> bytes.  So there is no issue if we create Cells over the RPC read byte[] 
> directly here in Decoder.  No need for 2 byte[] creation and copy for every 
> Cell in request.
> My plan is to make a Cell aware ByteArrayInputStream which can read Cells 
> directly from it.  
> Same Codec path is used in client side also. There better we can avoid this 
> direct Cell create and continue to do the copy to smaller byte[]s path.  Plan 
> to introduce some thing like a CodecContext associated with every Codec 
> instance which can say the server/client context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-15180) Reduce garbage created while reading Cells from Codec Decoder

2016-02-01 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15127259#comment-15127259
 ] 

stack commented on HBASE-15180:
---

bq. Codec and its Decoder being non private, we can not directly change the 
params/return types.

Codec was done long time ago. It was kept minimal until use cases were better 
known. We are probably only user. Lets look at a case-by-case basis. At least 
we can add methods and deprecate the old.

bq. And same Decoder used in WAL reading as well. 

Good. That was intent that it would be used everywhere. Pity it is not used 
reading and writing hfiles.

bq. Now we are returning a BAIS object from this IpcUtil method.

I see what you are saying. Rather than BAIS, instead a CIS, one that does Cells 
more natively. That sounds good.  As long as the CellIS is an IS, we can use 
Codec Interfaces.

Rather than pass a boolean to the method to do direct or tags, could you return 
a different Codec implementation? A server-side Codec and/or tags-capable 
(would be better if seriialization figured if tags were present rather than a 
meta boolean passed in by the server)? Would we still need to do context if 
serverside codec and clientside codec?

Looking at CellUtil#createCell, its content is same as what is in 
CellReadableByteArrayInputStream; its duplicated code?

We are pivoting on the underlying Stream being a BAIS. Will it always be a 
BAIS? Will it ever be a DBB?


bq. While reading from WAL, same path of flow is executed and then the IS wont 
be CellInputStream type.

Why not and should it?


bq.  Only diff is instead of BAIS we make CellBAIS object so we save copy.

Could the createCell save a copy in same way? Look see if a BAIS and if so, use 
its buffer and offset creating the Cell?

bq. As u suggest we can have new config which can be turned on at server side 
and off at client side/

Pardon me. Config is a bad idea.  Does the caller have enough context to make a 
server vs client Decoder? How did we do this for tags? It had to know if server 
vs client?  We could also pull createCellScanner out of IPCUtil. Maybe 
createCellScanner does not belong in IPCUtil? We can then in RPCClient or 
RPCServer create different cellscanners.  Or, each creates an IPCUtil. We could 
pass in a 'Context' on construction of IPCUtil and in it it would say if Client 
or Server.

The logic in CellReadablePBIS is same as for PBIS?  We use PBIS just so we can 
figure if EOF.







> Reduce garbage created while reading Cells from Codec Decoder
> -
>
> Key: HBASE-15180
> URL: https://issues.apache.org/jira/browse/HBASE-15180
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0
>
> Attachments: HBASE-15180.patch, HBASE-15180_V2.patch
>
>
> In KeyValueDecoder#parseCell (Default Codec decoder) we use 
> KeyValueUtil#iscreate to read cells from the InputStream. Here we 1st create 
> a byte[] of length 4 and read the cell length and then an array of Cell's 
> length and read in cell bytes into it and create a KV.
> Actually in server we read the reqs into a byte[] and CellScanner is created 
> on top of a ByteArrayInputStream on top of this. By default in write path, we 
> have MSLAB usage ON. So while adding Cells to memstore, we will copy the Cell 
> bytes to MSLAB memory chunks (default 2 MB size) and recreate Cells over that 
> bytes.  So there is no issue if we create Cells over the RPC read byte[] 
> directly here in Decoder.  No need for 2 byte[] creation and copy for every 
> Cell in request.
> My plan is to make a Cell aware ByteArrayInputStream which can read Cells 
> directly from it.  
> Same Codec path is used in client side also. There better we can avoid this 
> direct Cell create and continue to do the copy to smaller byte[]s path.  Plan 
> to introduce some thing like a CodecContext associated with every Codec 
> instance which can say the server/client context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-15180) Reduce garbage created while reading Cells from Codec Decoder

2016-02-01 Thread Jingcheng Du (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15126122#comment-15126122
 ] 

Jingcheng Du commented on HBASE-15180:
--

To supplement more information.
+--+--+-+-+
||mslab-pool-on | mslab-on  | mslab-off   |
+--+--+-+-+
| throughput |  242801  |  240464| 225679 |
+--+--+-+-+
| latency(us) |  486|  483  | 704   
|
+--+--+-+-+
| gc pause(ms) |  8440  | 12837   |  17131  |
+--+--+-+-+
I used 2MB for the chunk size during the test.
Speaking of G1GC, G1GC splits the heap into 2000 regions. In my test (the heap 
size is 64GB), each region will be 32MB. 
If the chunk pool is off, there might be fragments after minor gc (2MB is a 
fragment comparing to 32MB, and the fragments are compacted in mixed gc and 
full gc).
If the chunk size is increased to 16MB or larger, I believe the performance can 
be improved comparing to 2MB chunk size.

> Reduce garbage created while reading Cells from Codec Decoder
> -
>
> Key: HBASE-15180
> URL: https://issues.apache.org/jira/browse/HBASE-15180
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0
>
> Attachments: HBASE-15180.patch, HBASE-15180_V2.patch
>
>
> In KeyValueDecoder#parseCell (Default Codec decoder) we use 
> KeyValueUtil#iscreate to read cells from the InputStream. Here we 1st create 
> a byte[] of length 4 and read the cell length and then an array of Cell's 
> length and read in cell bytes into it and create a KV.
> Actually in server we read the reqs into a byte[] and CellScanner is created 
> on top of a ByteArrayInputStream on top of this. By default in write path, we 
> have MSLAB usage ON. So while adding Cells to memstore, we will copy the Cell 
> bytes to MSLAB memory chunks (default 2 MB size) and recreate Cells over that 
> bytes.  So there is no issue if we create Cells over the RPC read byte[] 
> directly here in Decoder.  No need for 2 byte[] creation and copy for every 
> Cell in request.
> My plan is to make a Cell aware ByteArrayInputStream which can read Cells 
> directly from it.  
> Same Codec path is used in client side also. There better we can avoid this 
> direct Cell create and continue to do the copy to smaller byte[]s path.  Plan 
> to introduce some thing like a CodecContext associated with every Codec 
> instance which can say the server/client context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-15180) Reduce garbage created while reading Cells from Codec Decoder

2016-02-01 Thread Jingcheng Du (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15126131#comment-15126131
 ] 

Jingcheng Du commented on HBASE-15180:
--

And the gc pause was measured per minute.

> Reduce garbage created while reading Cells from Codec Decoder
> -
>
> Key: HBASE-15180
> URL: https://issues.apache.org/jira/browse/HBASE-15180
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0
>
> Attachments: HBASE-15180.patch, HBASE-15180_V2.patch
>
>
> In KeyValueDecoder#parseCell (Default Codec decoder) we use 
> KeyValueUtil#iscreate to read cells from the InputStream. Here we 1st create 
> a byte[] of length 4 and read the cell length and then an array of Cell's 
> length and read in cell bytes into it and create a KV.
> Actually in server we read the reqs into a byte[] and CellScanner is created 
> on top of a ByteArrayInputStream on top of this. By default in write path, we 
> have MSLAB usage ON. So while adding Cells to memstore, we will copy the Cell 
> bytes to MSLAB memory chunks (default 2 MB size) and recreate Cells over that 
> bytes.  So there is no issue if we create Cells over the RPC read byte[] 
> directly here in Decoder.  No need for 2 byte[] creation and copy for every 
> Cell in request.
> My plan is to make a Cell aware ByteArrayInputStream which can read Cells 
> directly from it.  
> Same Codec path is used in client side also. There better we can avoid this 
> direct Cell create and continue to do the copy to smaller byte[]s path.  Plan 
> to introduce some thing like a CodecContext associated with every Codec 
> instance which can say the server/client context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-15180) Reduce garbage created while reading Cells from Codec Decoder

2016-02-01 Thread Anoop Sam John (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15127620#comment-15127620
 ] 

Anoop Sam John commented on HBASE-15180:


bq.I see what you are saying. Rather than BAIS, instead a CIS, one that does 
Cells more natively. That sounds good. As long as the CellIS is an IS, we can 
use Codec Interfaces.
Yes
bq.Rather than pass a boolean to the method to do direct or tags, could you 
return a different Codec implementation? A server-side Codec and/or 
tags-capable (would be better if seriialization figured if tags were present 
rather than a meta boolean passed in by the server)? Would we still need to do 
context if serverside codec and clientside codec?
Ya let me see..  I agree that it is ugly passing the boolean throughout. As of 
now we dont support passing tags from client to server and reverse.  Codec has 
to serialize tags when it is Replication. So we have a new Codec 
(KVCodecWithTags).  But both these Codecs were using same BaseDecoder and Cell 
create paths.  Let me see how we can solve this.

bq.We are pivoting on the underlying Stream being a BAIS. Will it always be a 
BAIS? Will it ever be a DBB?
It can be DBB later. Once we start reading the request into DBB (pooled) - yes.
Said that we are not having any hard need for underlying IS to be BAIS.   That 
is the reason why I did not add some thing like a new API in Codec where asking 
Decoder to work on a byte[] or so.  COntinue that to be an IS based gives us 
the freedom to change the underlying data structure.  We can make ByteBufferIS 
which is a Cell readable. We can direct create cell (with out copy) over the 
underlying DBB. (We have OffheapCell now)

bq.Could the createCell save a copy in same way? Look see if a BAIS and if so, 
use its buffer and offset creating the Cell?
You mean the createCell in CellUtil? No. even if the IS is BAIS, we can not 
directly make a Cell (with out any copy). BAIS is not exposing its backing 
byte[] buffer. We will need indirect way of grabbing the byte[] from it.
That is why I made the CellBAIS extending BAIS which is having extra API to 
create a Cell directly from its underlying buffer(with out any copy)

{quote}
bq.While reading from WAL, same path of flow is executed and then the IS wont 
be CellInputStream type.
Why not and should it?
{quote}
>From that stream we can not make Cell directly with out any copy. The 
>underlying stream is the one from DFS. Said that CellInputStream is a stream 
>from which we can make cell DIRECTLY WITH OUT COPY.  The name is confusing?  
>From other streams also we can read cell but with copy.

Ya I agree new configs I also dont prefer. Actually we can avoid this and any 
context.  We can  have 2 paths of Decoder make in client and server. Ya we can 
even move it from IPCUtil also. Let me see.



> Reduce garbage created while reading Cells from Codec Decoder
> -
>
> Key: HBASE-15180
> URL: https://issues.apache.org/jira/browse/HBASE-15180
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0
>
> Attachments: HBASE-15180.patch, HBASE-15180_V2.patch
>
>
> In KeyValueDecoder#parseCell (Default Codec decoder) we use 
> KeyValueUtil#iscreate to read cells from the InputStream. Here we 1st create 
> a byte[] of length 4 and read the cell length and then an array of Cell's 
> length and read in cell bytes into it and create a KV.
> Actually in server we read the reqs into a byte[] and CellScanner is created 
> on top of a ByteArrayInputStream on top of this. By default in write path, we 
> have MSLAB usage ON. So while adding Cells to memstore, we will copy the Cell 
> bytes to MSLAB memory chunks (default 2 MB size) and recreate Cells over that 
> bytes.  So there is no issue if we create Cells over the RPC read byte[] 
> directly here in Decoder.  No need for 2 byte[] creation and copy for every 
> Cell in request.
> My plan is to make a Cell aware ByteArrayInputStream which can read Cells 
> directly from it.  
> Same Codec path is used in client side also. There better we can avoid this 
> direct Cell create and continue to do the copy to smaller byte[]s path.  Plan 
> to introduce some thing like a CodecContext associated with every Codec 
> instance which can say the server/client context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-15180) Reduce garbage created while reading Cells from Codec Decoder

2016-02-01 Thread Esteban Gutierrez (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15126709#comment-15126709
 ] 

Esteban Gutierrez commented on HBASE-15180:
---

Good stuff [~jingcheng...@intel.com]. Very interesting numbers, I expected 
further tuning to provide a throughput, but didn't tried with larger chucks (my 
test was around 24GB). Once I have a chance, I'm going to re-run my tests with 
JDK8 u80 and I let you know to let you know if I'm seeing the same behavior.

> Reduce garbage created while reading Cells from Codec Decoder
> -
>
> Key: HBASE-15180
> URL: https://issues.apache.org/jira/browse/HBASE-15180
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0
>
> Attachments: HBASE-15180.patch, HBASE-15180_V2.patch
>
>
> In KeyValueDecoder#parseCell (Default Codec decoder) we use 
> KeyValueUtil#iscreate to read cells from the InputStream. Here we 1st create 
> a byte[] of length 4 and read the cell length and then an array of Cell's 
> length and read in cell bytes into it and create a KV.
> Actually in server we read the reqs into a byte[] and CellScanner is created 
> on top of a ByteArrayInputStream on top of this. By default in write path, we 
> have MSLAB usage ON. So while adding Cells to memstore, we will copy the Cell 
> bytes to MSLAB memory chunks (default 2 MB size) and recreate Cells over that 
> bytes.  So there is no issue if we create Cells over the RPC read byte[] 
> directly here in Decoder.  No need for 2 byte[] creation and copy for every 
> Cell in request.
> My plan is to make a Cell aware ByteArrayInputStream which can read Cells 
> directly from it.  
> Same Codec path is used in client side also. There better we can avoid this 
> direct Cell create and continue to do the copy to smaller byte[]s path.  Plan 
> to introduce some thing like a CodecContext associated with every Codec 
> instance which can say the server/client context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-15180) Reduce garbage created while reading Cells from Codec Decoder

2016-01-31 Thread Jingcheng Du (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15125780#comment-15125780
 ] 

Jingcheng Du commented on HBASE-15180:
--

bq. I believe Jingcheng Du done some YCSB test with writes (G1GC we have) and 
he said it was giving bad results.
Thanks [~anoop.hbase].
I did some YCSB tests with HBase 1.0.0 release and G1GC (JDK is 1.8.0_51), and 
I used 64GB for the JVM heap in each RS, and test the cases with/without MSLAB.
I ran YCSB with the command "ycsb load hbase-10 -P 1T_workload -threads 200 -p 
columnfamily=family -p clientbuffering=true -s > 1T_workload.dat". I observed 
the performance (throughput, latency and gc pause time), the performance of the 
case when enabling MSLAB with chunk pool is better than the case when enabling 
MSLAB without chunk pool, and the case without MSLAB is worst.
The performance might be different with G1GC in different jre versions and 
different heap sizes. According to my tests, MSLAB with chunk was best.

> Reduce garbage created while reading Cells from Codec Decoder
> -
>
> Key: HBASE-15180
> URL: https://issues.apache.org/jira/browse/HBASE-15180
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0
>
> Attachments: HBASE-15180.patch, HBASE-15180_V2.patch
>
>
> In KeyValueDecoder#parseCell (Default Codec decoder) we use 
> KeyValueUtil#iscreate to read cells from the InputStream. Here we 1st create 
> a byte[] of length 4 and read the cell length and then an array of Cell's 
> length and read in cell bytes into it and create a KV.
> Actually in server we read the reqs into a byte[] and CellScanner is created 
> on top of a ByteArrayInputStream on top of this. By default in write path, we 
> have MSLAB usage ON. So while adding Cells to memstore, we will copy the Cell 
> bytes to MSLAB memory chunks (default 2 MB size) and recreate Cells over that 
> bytes.  So there is no issue if we create Cells over the RPC read byte[] 
> directly here in Decoder.  No need for 2 byte[] creation and copy for every 
> Cell in request.
> My plan is to make a Cell aware ByteArrayInputStream which can read Cells 
> directly from it.  
> Same Codec path is used in client side also. There better we can avoid this 
> direct Cell create and continue to do the copy to smaller byte[]s path.  Plan 
> to introduce some thing like a CodecContext associated with every Codec 
> instance which can say the server/client context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-15180) Reduce garbage created while reading Cells from Codec Decoder

2016-01-30 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15124875#comment-15124875
 ] 

stack commented on HBASE-15180:
---



bq. My plan is to make a Cell aware ByteArrayInputStream which can read Cells 
directly from it.

Where do we need this (trying to follow along). In current patch I see it being 
used inside in IPCUtils method that returns a CellScanner -- seems odd to use 
this new Stream in this method to give to the Codec which then does the 
CellScanner Interface.

bq. Plan to introduce some thing like a CodecContext associated with every 
Codec instance which can say the server/client context.

Why we need a Context? Don't we currently make a decoder per Cell type and/or 
context? Then we keep simple Codec API and any mess parsing is internal to the 
Codec implementation?

bq. SO u suggest renaming of the interface. That should be fine and looks 
better.

Yeah, I think suggested name is better but, lets spend some time on how this 
stuff will be used first.

I remember being here with this Codec stuff and I kept bumping into need for a 
CellInputStream but in end was able to make do with CellScanner; that was then 
and stuff may be different now.

bq. To avoid the overhead of parsing tagsLength every time this was done.

Yeah. Lets move away from passing these withTags flags in the code base.. When 
we decode, we should be able to cheaply figure if tags present or not; lets fix 
that rather than pass extra flag all over.

bq. This was needed because of the way we have this PushbackIS. 

Shouldn't we pass the length when we create the PBIS derivative?

bq. Now any way you suggest add a new config to decide this copy or not rather 
than rely on MSLAB. 

Can we ask our environment if we are on the serverside and if so, just do the 
non-copy and presume that MSLAB or something else, if MSLAB is off, will assume 
ownership of the Cells so we can let go of the buffer?  Doing this is a little 
more indirect but better I think than having MSLAB reference in RPC.


> Reduce garbage created while reading Cells from Codec Decoder
> -
>
> Key: HBASE-15180
> URL: https://issues.apache.org/jira/browse/HBASE-15180
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0
>
> Attachments: HBASE-15180.patch, HBASE-15180_V2.patch
>
>
> In KeyValueDecoder#parseCell (Default Codec decoder) we use 
> KeyValueUtil#iscreate to read cells from the InputStream. Here we 1st create 
> a byte[] of length 4 and read the cell length and then an array of Cell's 
> length and read in cell bytes into it and create a KV.
> Actually in server we read the reqs into a byte[] and CellScanner is created 
> on top of a ByteArrayInputStream on top of this. By default in write path, we 
> have MSLAB usage ON. So while adding Cells to memstore, we will copy the Cell 
> bytes to MSLAB memory chunks (default 2 MB size) and recreate Cells over that 
> bytes.  So there is no issue if we create Cells over the RPC read byte[] 
> directly here in Decoder.  No need for 2 byte[] creation and copy for every 
> Cell in request.
> My plan is to make a Cell aware ByteArrayInputStream which can read Cells 
> directly from it.  
> Same Codec path is used in client side also. There better we can avoid this 
> direct Cell create and continue to do the copy to smaller byte[]s path.  Plan 
> to introduce some thing like a CodecContext associated with every Codec 
> instance which can say the server/client context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-15180) Reduce garbage created while reading Cells from Codec Decoder

2016-01-30 Thread Anoop Sam John (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15124905#comment-15124905
 ] 

Anoop Sam John commented on HBASE-15180:


Codec and its Decoder being non private, we can not directly change the 
params/return types.And same Decoder used in WAL reading as well. You can note 
that the getDecoder still takes an InputStream.  Now we are returning a BAIS 
object from this IpcUtil method. So for reading a Cell, we need copy cell bytes 
into new byte[].  Here this actual Object type which we return only is getting 
Changed..  We extend BAIS and make a CellBAIS which implements CellInputStream. 
And while we read cells from Decoder, we have check like if it is 
CellInputStream, we read Cells directly from it.   While reading from WAL, same 
path of flow is executed and then the IS wont be CellInputStream type.  Then we 
will be doing copy from Stream and make Cells..   
So Codec gives a Decoder (which is of type CellScanner)  by taking an 
InputStream.  There is no change am making in this area.  Only diff is instead 
of BAIS we make CellBAIS object so we save copy.

I thought of making the Context so as to know the codec context where it is 
running (client or server)..   We use diff IpcUtil methods (extra boolean param 
for copy or not) and that is why I thought now Context is not needed.  As u 
suggest we can have new config which can be turned on at server side and off at 
client side/

bq.lets fix that rather than pass extra flag all over.
Let me try.

> Reduce garbage created while reading Cells from Codec Decoder
> -
>
> Key: HBASE-15180
> URL: https://issues.apache.org/jira/browse/HBASE-15180
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0
>
> Attachments: HBASE-15180.patch, HBASE-15180_V2.patch
>
>
> In KeyValueDecoder#parseCell (Default Codec decoder) we use 
> KeyValueUtil#iscreate to read cells from the InputStream. Here we 1st create 
> a byte[] of length 4 and read the cell length and then an array of Cell's 
> length and read in cell bytes into it and create a KV.
> Actually in server we read the reqs into a byte[] and CellScanner is created 
> on top of a ByteArrayInputStream on top of this. By default in write path, we 
> have MSLAB usage ON. So while adding Cells to memstore, we will copy the Cell 
> bytes to MSLAB memory chunks (default 2 MB size) and recreate Cells over that 
> bytes.  So there is no issue if we create Cells over the RPC read byte[] 
> directly here in Decoder.  No need for 2 byte[] creation and copy for every 
> Cell in request.
> My plan is to make a Cell aware ByteArrayInputStream which can read Cells 
> directly from it.  
> Same Codec path is used in client side also. There better we can avoid this 
> direct Cell create and continue to do the copy to smaller byte[]s path.  Plan 
> to introduce some thing like a CodecContext associated with every Codec 
> instance which can say the server/client context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-15180) Reduce garbage created while reading Cells from Codec Decoder

2016-01-30 Thread Anoop Sam John (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15124906#comment-15124906
 ] 

Anoop Sam John commented on HBASE-15180:


bq.Can we ask our environment if we are on the serverside and if so
IpcUtil we have the conf.  How we can know server/client side?

> Reduce garbage created while reading Cells from Codec Decoder
> -
>
> Key: HBASE-15180
> URL: https://issues.apache.org/jira/browse/HBASE-15180
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0
>
> Attachments: HBASE-15180.patch, HBASE-15180_V2.patch
>
>
> In KeyValueDecoder#parseCell (Default Codec decoder) we use 
> KeyValueUtil#iscreate to read cells from the InputStream. Here we 1st create 
> a byte[] of length 4 and read the cell length and then an array of Cell's 
> length and read in cell bytes into it and create a KV.
> Actually in server we read the reqs into a byte[] and CellScanner is created 
> on top of a ByteArrayInputStream on top of this. By default in write path, we 
> have MSLAB usage ON. So while adding Cells to memstore, we will copy the Cell 
> bytes to MSLAB memory chunks (default 2 MB size) and recreate Cells over that 
> bytes.  So there is no issue if we create Cells over the RPC read byte[] 
> directly here in Decoder.  No need for 2 byte[] creation and copy for every 
> Cell in request.
> My plan is to make a Cell aware ByteArrayInputStream which can read Cells 
> directly from it.  
> Same Codec path is used in client side also. There better we can avoid this 
> direct Cell create and continue to do the copy to smaller byte[]s path.  Plan 
> to introduce some thing like a CodecContext associated with every Codec 
> instance which can say the server/client context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-15180) Reduce garbage created while reading Cells from Codec Decoder

2016-01-30 Thread Anoop Sam John (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15124908#comment-15124908
 ] 

Anoop Sam John commented on HBASE-15180:


bq.Yes, I noticed about 5-10% improvement on GC times and CPU utilization after 
disabling MSLAB only if using G1GC. Tuning MSLAB helps a little but I don't see 
to much advantage to have it enabled when G1GC is there.
I believe [~jingcheng...@intel.com] done some YCSB test with writes (G1GC we 
have) and he said it was giving bad results.

bq.MSLAB is like an HBASE-14918 Segment. Would it make sense to make it a 
Segment?
Segment is a part of the memstore right?  MSLAB is a place for only keeping the 
Cell data.  So I am not sure whether MSLAB can be a Segment.

Any way our work in this umbrella Jira is to make the write path off heap and 
make an off heap MSLAB impl.

> Reduce garbage created while reading Cells from Codec Decoder
> -
>
> Key: HBASE-15180
> URL: https://issues.apache.org/jira/browse/HBASE-15180
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0
>
> Attachments: HBASE-15180.patch, HBASE-15180_V2.patch
>
>
> In KeyValueDecoder#parseCell (Default Codec decoder) we use 
> KeyValueUtil#iscreate to read cells from the InputStream. Here we 1st create 
> a byte[] of length 4 and read the cell length and then an array of Cell's 
> length and read in cell bytes into it and create a KV.
> Actually in server we read the reqs into a byte[] and CellScanner is created 
> on top of a ByteArrayInputStream on top of this. By default in write path, we 
> have MSLAB usage ON. So while adding Cells to memstore, we will copy the Cell 
> bytes to MSLAB memory chunks (default 2 MB size) and recreate Cells over that 
> bytes.  So there is no issue if we create Cells over the RPC read byte[] 
> directly here in Decoder.  No need for 2 byte[] creation and copy for every 
> Cell in request.
> My plan is to make a Cell aware ByteArrayInputStream which can read Cells 
> directly from it.  
> Same Codec path is used in client side also. There better we can avoid this 
> direct Cell create and continue to do the copy to smaller byte[]s path.  Plan 
> to introduce some thing like a CodecContext associated with every Codec 
> instance which can say the server/client context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-15180) Reduce garbage created while reading Cells from Codec Decoder

2016-01-29 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15123913#comment-15123913
 ] 

Hadoop QA commented on HBASE-15180:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 
0s {color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 
39s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 5s 
{color} | {color:green} master passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 37s 
{color} | {color:green} master passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 8m 
12s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
50s {color} | {color:green} master passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 0s 
{color} | {color:red} hbase-common in master has 1 extant Findbugs warnings. 
{color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 47s 
{color} | {color:red} hbase-server in master has 1 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 53s 
{color} | {color:green} master passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 46s 
{color} | {color:green} master passed with JDK v1.7.0_91 {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 21s 
{color} | {color:red} hbase-client in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 37s 
{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 43s 
{color} | {color:red} hbase-server in the patch failed with JDK v1.8.0_72. 
{color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 43s {color} 
| {color:red} hbase-server in the patch failed with JDK v1.8.0_72. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 35s 
{color} | {color:red} hbase-server in the patch failed with JDK v1.7.0_91. 
{color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 35s {color} 
| {color:red} hbase-server in the patch failed with JDK v1.7.0_91. {color} |
| {color:black}{color} | {color:black} checkstyle {color} | {color:black} 1m 
23s {color} | {color:black} Patch generated 2 new checkstyle issues in 
hbase-common (total was 100, now 102). {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
57s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
30m 36s {color} | {color:green} Patch does not cause any errors with Hadoop 
2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 29s 
{color} | {color:red} hbase-common introduced 1 new FindBugs issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 15s 
{color} | {color:green} the patch passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 3s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 37s 
{color} | {color:green} hbase-client in the patch passed with JDK v1.8.0_72. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 53s 
{color} | {color:green} hbase-common in the patch passed with JDK v1.8.0_72. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 49s {color} 
| {color:red} hbase-server in the patch failed with JDK v1.8.0_72. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 26s 
{color} | {color:green} hbase-client in the patch passed with JDK v1.7.0_91. 
{color} |

[jira] [Commented] (HBASE-15180) Reduce garbage created while reading Cells from Codec Decoder

2016-01-29 Thread Anoop Sam John (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15124586#comment-15124586
 ] 

Anoop Sam John commented on HBASE-15180:


CellScanner - Ya we made it like normal our scanner way of call advance and 
then get current.  You mean we should have been doing just getting next Cell 
from the Decoder?  
bq. If so, we have CellOutputStream, should your CellReadable be a 
CellInputStream with read methods that return Cells to mirror the write methods 
we have in CellOutputStream. Your CellReadableByteArrayInputStream would become 
CellByteArrayInputStream and would implement CellInputStream.
SO u suggest renaming of the interface.  That should be fine and looks better.

bq.I've asked this before I know but do we have to flag when tags and when 
without? Internally, when we read, the Cell will know if it has tags or not?
To avoid the overhead of parsing tagsLength every time this was done.  The old 
method we used to read cells from InputStream was
iscreate(final InputStream in, boolean withTags)
So KVCodecWithTags only pass this as true and we make KeyValue instance.  By 
default we have KVCodec which pass false and we make NoTagsKV on which 
getTagsLength is just return 0.  But there is still an issue more.  When we 
copy the Cell to MSLAB before adding to Memstore, we do copy and make a new 
KeyValue only. So this NoTagsKV becomes KV there !!!Need handle but another 
issue.
Ya as u said, we can avoid passing this boolean and after the Cell is been read 
out in new CellInputStream (KeyValue object then), we need to parse tags length 
once and see it is 0 or >0 and based on that recreate NoTagsKV if needed.  That 
is one parsing op extra and one Object creation extra. For that gone with 
passing the boolean at that time I believe.  What do u say?

{quote}
What is the length in the below?
Cell readCell(int length, boolean withTags) throws IOException;
Do we have to pass this in each time?
{quote}
This was needed because of the way we have this PushbackIS.  In Decoder impl, 
we wrap the actual IS with this PBIS.  The Decoder advance operation, reads one 
byte to know whether it is end.  And now the remaining 3 bytes and old already 
read byte to be considered to get the Cell's total length.  Said that the read 
of the Cell's length has to happen in the PBIS impl.  (there only we know the 
old already read single byte)..  And the read of the Cell (Construction of 
Cell) to be done in CellByteArrayInputStream.  That is why we have to pass the 
length. I agree it looks ugly... I had no other way..  This PBIS thing was done 
to read Cells in WAL decoder properly.   In case of reading Cells from byte[] 
from RPC, we dont really have such an issue.   Let me see some way we can solve 
this. May be we need special treatment for both cases.

bq.IPCUtil takes a Configuration? Can we not just read the Configuration on 
construction rather than pass this flag per call?
IPCUtil #createCellScanner(final Codec codec, final CompressionCodec 
compressor,  final byte [] cellBlock)  -  Used from RpcClient
new method called from RpcServer alone.  Yes I dont want this direct Cell 
create to happen in client side where we dont have an MSLAB copy.   Still as u 
said, if we read the conf property and set in IPCUtil and use that while 
construction of Stream, I can avoid this passing of boolean.   If any chance 
the config xml in server is used in the client and that says use MSLAB (Still 
we dont have as we dont have Memstore and all there), we will do it wrongly no. 
  That is I was afraid to do that.  

Now any way you suggest add a new config to decide this copy or not rather than 
rely on MSLAB. I think ya it is good. Only downside is again a new config!  But 
ya it looks better..  Let me do that.. 

> Reduce garbage created while reading Cells from Codec Decoder
> -
>
> Key: HBASE-15180
> URL: https://issues.apache.org/jira/browse/HBASE-15180
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0
>
> Attachments: HBASE-15180.patch, HBASE-15180_V2.patch
>
>
> In KeyValueDecoder#parseCell (Default Codec decoder) we use 
> KeyValueUtil#iscreate to read cells from the InputStream. Here we 1st create 
> a byte[] of length 4 and read the cell length and then an array of Cell's 
> length and read in cell bytes into it and create a KV.
> Actually in server we read the reqs into a byte[] and CellScanner is created 
> on top of a ByteArrayInputStream on top of this. By default in write path, we 
> have MSLAB usage ON. So while adding Cells to memstore, we will copy the Cell 
> bytes to MSLAB memory chunks (default 2 MB size) and recreate Cells over that 
> bytes.  So there is no

[jira] [Commented] (HBASE-15180) Reduce garbage created while reading Cells from Codec Decoder

2016-01-29 Thread ramkrishna.s.vasudevan (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15124771#comment-15124771
 ] 

ramkrishna.s.vasudevan commented on HBASE-15180:


Renaming the interface is better +1 on that. 
G1GC and MSLAB is interesting to see. 
We need a handoff from the incoming request handlers to the Server.  Going 
based on MSLAB on or off makes sense but if it is 'off' then creating  a BB 
pool on the Rpcserver request processing side may have to see for issues still 
the BBs are flushed to a hfile?

> Reduce garbage created while reading Cells from Codec Decoder
> -
>
> Key: HBASE-15180
> URL: https://issues.apache.org/jira/browse/HBASE-15180
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0
>
> Attachments: HBASE-15180.patch, HBASE-15180_V2.patch
>
>
> In KeyValueDecoder#parseCell (Default Codec decoder) we use 
> KeyValueUtil#iscreate to read cells from the InputStream. Here we 1st create 
> a byte[] of length 4 and read the cell length and then an array of Cell's 
> length and read in cell bytes into it and create a KV.
> Actually in server we read the reqs into a byte[] and CellScanner is created 
> on top of a ByteArrayInputStream on top of this. By default in write path, we 
> have MSLAB usage ON. So while adding Cells to memstore, we will copy the Cell 
> bytes to MSLAB memory chunks (default 2 MB size) and recreate Cells over that 
> bytes.  So there is no issue if we create Cells over the RPC read byte[] 
> directly here in Decoder.  No need for 2 byte[] creation and copy for every 
> Cell in request.
> My plan is to make a Cell aware ByteArrayInputStream which can read Cells 
> directly from it.  
> Same Codec path is used in client side also. There better we can avoid this 
> direct Cell create and continue to do the copy to smaller byte[]s path.  Plan 
> to introduce some thing like a CodecContext associated with every Codec 
> instance which can say the server/client context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-15180) Reduce garbage created while reading Cells from Codec Decoder

2016-01-29 Thread Enis Soztutar (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15124005#comment-15124005
 ] 

Enis Soztutar commented on HBASE-15180:
---

bq. May be BB based then. Again am trying to experiment with reading the req 
not just into one single large BB. From the pool we might be getting fixed 
sized smaller BBs (Say 64 KB or so) And we can read in to those many BBs. And 
the CellScanner need to work on a set of BBs (Like the MultiByteBuff stuff) 
Then again even this BB based API is an issue.. Continue with an InputStream 
based API gives us the freedom of experimenting with this different data 
structures.
We are still creating an IS per request. It is not the end of the world though. 
At the time of the request we do know the RPC buffer size required. I was 
trying to reuse the BBPool from Hadoop which can return a buffer at least as 
large as the request size.  

bq. Ya we have MSALB enabled by default. I agree that doing the MSLAB check in 
RPC layer looks ugly. Wanted to avoid we refer to the req read byte[] (from 
memstore cells) when some one turns MSLAB off. So what do you say? Remove this 
check?
We need to do a separate issue and remove the option for disabling MSLAB. Then 
we can assume in this patch that MSLAB is always enabled. 

> Reduce garbage created while reading Cells from Codec Decoder
> -
>
> Key: HBASE-15180
> URL: https://issues.apache.org/jira/browse/HBASE-15180
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0
>
> Attachments: HBASE-15180.patch, HBASE-15180_V2.patch
>
>
> In KeyValueDecoder#parseCell (Default Codec decoder) we use 
> KeyValueUtil#iscreate to read cells from the InputStream. Here we 1st create 
> a byte[] of length 4 and read the cell length and then an array of Cell's 
> length and read in cell bytes into it and create a KV.
> Actually in server we read the reqs into a byte[] and CellScanner is created 
> on top of a ByteArrayInputStream on top of this. By default in write path, we 
> have MSLAB usage ON. So while adding Cells to memstore, we will copy the Cell 
> bytes to MSLAB memory chunks (default 2 MB size) and recreate Cells over that 
> bytes.  So there is no issue if we create Cells over the RPC read byte[] 
> directly here in Decoder.  No need for 2 byte[] creation and copy for every 
> Cell in request.
> My plan is to make a Cell aware ByteArrayInputStream which can read Cells 
> directly from it.  
> Same Codec path is used in client side also. There better we can avoid this 
> direct Cell create and continue to do the copy to smaller byte[]s path.  Plan 
> to introduce some thing like a CodecContext associated with every Codec 
> instance which can say the server/client context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-15180) Reduce garbage created while reading Cells from Codec Decoder

2016-01-29 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15124024#comment-15124024
 ] 

stack commented on HBASE-15180:
---

On MSLAB on all the time, there is talk of turning it off in G1GC because there 
it does not help (this you [~esteban]?). TBD. On other hand, I can see need for 
a 'terminus', a 'handoff' from the reading handler so no longer a reference and 
so something like a copy into MSLAB makes some sense. MSLAB is like an 
HBASE-14918 Segment. Would it make sense to make it a Segment?

> Reduce garbage created while reading Cells from Codec Decoder
> -
>
> Key: HBASE-15180
> URL: https://issues.apache.org/jira/browse/HBASE-15180
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0
>
> Attachments: HBASE-15180.patch, HBASE-15180_V2.patch
>
>
> In KeyValueDecoder#parseCell (Default Codec decoder) we use 
> KeyValueUtil#iscreate to read cells from the InputStream. Here we 1st create 
> a byte[] of length 4 and read the cell length and then an array of Cell's 
> length and read in cell bytes into it and create a KV.
> Actually in server we read the reqs into a byte[] and CellScanner is created 
> on top of a ByteArrayInputStream on top of this. By default in write path, we 
> have MSLAB usage ON. So while adding Cells to memstore, we will copy the Cell 
> bytes to MSLAB memory chunks (default 2 MB size) and recreate Cells over that 
> bytes.  So there is no issue if we create Cells over the RPC read byte[] 
> directly here in Decoder.  No need for 2 byte[] creation and copy for every 
> Cell in request.
> My plan is to make a Cell aware ByteArrayInputStream which can read Cells 
> directly from it.  
> Same Codec path is used in client side also. There better we can avoid this 
> direct Cell create and continue to do the copy to smaller byte[]s path.  Plan 
> to introduce some thing like a CodecContext associated with every Codec 
> instance which can say the server/client context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-15180) Reduce garbage created while reading Cells from Codec Decoder

2016-01-29 Thread Esteban Gutierrez (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15124036#comment-15124036
 ] 

Esteban Gutierrez commented on HBASE-15180:
---

Yes, I noticed about 5-10% improvement on GC times and CPU utilization after 
disabling MSLAB only if using G1GC. Tuning MSLAB helps a little but I don't see 
to much advantage to have it enabled when G1GC is there.

> Reduce garbage created while reading Cells from Codec Decoder
> -
>
> Key: HBASE-15180
> URL: https://issues.apache.org/jira/browse/HBASE-15180
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0
>
> Attachments: HBASE-15180.patch, HBASE-15180_V2.patch
>
>
> In KeyValueDecoder#parseCell (Default Codec decoder) we use 
> KeyValueUtil#iscreate to read cells from the InputStream. Here we 1st create 
> a byte[] of length 4 and read the cell length and then an array of Cell's 
> length and read in cell bytes into it and create a KV.
> Actually in server we read the reqs into a byte[] and CellScanner is created 
> on top of a ByteArrayInputStream on top of this. By default in write path, we 
> have MSLAB usage ON. So while adding Cells to memstore, we will copy the Cell 
> bytes to MSLAB memory chunks (default 2 MB size) and recreate Cells over that 
> bytes.  So there is no issue if we create Cells over the RPC read byte[] 
> directly here in Decoder.  No need for 2 byte[] creation and copy for every 
> Cell in request.
> My plan is to make a Cell aware ByteArrayInputStream which can read Cells 
> directly from it.  
> Same Codec path is used in client side also. There better we can avoid this 
> direct Cell create and continue to do the copy to smaller byte[]s path.  Plan 
> to introduce some thing like a CodecContext associated with every Codec 
> instance which can say the server/client context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-15180) Reduce garbage created while reading Cells from Codec Decoder

2016-01-29 Thread Enis Soztutar (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15124091#comment-15124091
 ] 

Enis Soztutar commented on HBASE-15180:
---

bq. Yes, I noticed about 5-10% improvement on GC times and CPU utilization 
after disabling MSLAB only if using G1GC. 
Even with G1, this is unexpected. Is there a theoretical explanation? 

> Reduce garbage created while reading Cells from Codec Decoder
> -
>
> Key: HBASE-15180
> URL: https://issues.apache.org/jira/browse/HBASE-15180
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0
>
> Attachments: HBASE-15180.patch, HBASE-15180_V2.patch
>
>
> In KeyValueDecoder#parseCell (Default Codec decoder) we use 
> KeyValueUtil#iscreate to read cells from the InputStream. Here we 1st create 
> a byte[] of length 4 and read the cell length and then an array of Cell's 
> length and read in cell bytes into it and create a KV.
> Actually in server we read the reqs into a byte[] and CellScanner is created 
> on top of a ByteArrayInputStream on top of this. By default in write path, we 
> have MSLAB usage ON. So while adding Cells to memstore, we will copy the Cell 
> bytes to MSLAB memory chunks (default 2 MB size) and recreate Cells over that 
> bytes.  So there is no issue if we create Cells over the RPC read byte[] 
> directly here in Decoder.  No need for 2 byte[] creation and copy for every 
> Cell in request.
> My plan is to make a Cell aware ByteArrayInputStream which can read Cells 
> directly from it.  
> Same Codec path is used in client side also. There better we can avoid this 
> direct Cell create and continue to do the copy to smaller byte[]s path.  Plan 
> to introduce some thing like a CodecContext associated with every Codec 
> instance which can say the server/client context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-15180) Reduce garbage created while reading Cells from Codec Decoder

2016-01-29 Thread Esteban Gutierrez (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15124141#comment-15124141
 ] 

Esteban Gutierrez commented on HBASE-15180:
---

My guess at that time was that the rate the RSs where flushing the memstores 
aggressively, MSLAB was only creating extra work for the collector. If I 
remember correctly, I remember I started to notice the improvement of disabling 
MSLAB after going to higher throughputs (+100K puts/sec). At lower throughputs 
I don't think the contribution with our without was noticeable.  

> Reduce garbage created while reading Cells from Codec Decoder
> -
>
> Key: HBASE-15180
> URL: https://issues.apache.org/jira/browse/HBASE-15180
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0
>
> Attachments: HBASE-15180.patch, HBASE-15180_V2.patch
>
>
> In KeyValueDecoder#parseCell (Default Codec decoder) we use 
> KeyValueUtil#iscreate to read cells from the InputStream. Here we 1st create 
> a byte[] of length 4 and read the cell length and then an array of Cell's 
> length and read in cell bytes into it and create a KV.
> Actually in server we read the reqs into a byte[] and CellScanner is created 
> on top of a ByteArrayInputStream on top of this. By default in write path, we 
> have MSLAB usage ON. So while adding Cells to memstore, we will copy the Cell 
> bytes to MSLAB memory chunks (default 2 MB size) and recreate Cells over that 
> bytes.  So there is no issue if we create Cells over the RPC read byte[] 
> directly here in Decoder.  No need for 2 byte[] creation and copy for every 
> Cell in request.
> My plan is to make a Cell aware ByteArrayInputStream which can read Cells 
> directly from it.  
> Same Codec path is used in client side also. There better we can avoid this 
> direct Cell create and continue to do the copy to smaller byte[]s path.  Plan 
> to introduce some thing like a CodecContext associated with every Codec 
> instance which can say the server/client context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-15180) Reduce garbage created while reading Cells from Codec Decoder

2016-01-29 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15124145#comment-15124145
 ] 

stack commented on HBASE-15180:
---

bq. Even with G1, this is unexpected. Is there a theoretical explanation?

Why you say [~enis]? Here's a SWAG. MSLAB is to prevent fragmentation in CMS. 
Every Cell gets copied and there is the allocation of the SLABs themselves (no 
reuse). G1GC avoids fragmentation by copying to a new region if fragmention. 
Many small copies and SLAB allocations cost more than the relatively macro 
copies G1GC does. To be verified...

> Reduce garbage created while reading Cells from Codec Decoder
> -
>
> Key: HBASE-15180
> URL: https://issues.apache.org/jira/browse/HBASE-15180
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0
>
> Attachments: HBASE-15180.patch, HBASE-15180_V2.patch
>
>
> In KeyValueDecoder#parseCell (Default Codec decoder) we use 
> KeyValueUtil#iscreate to read cells from the InputStream. Here we 1st create 
> a byte[] of length 4 and read the cell length and then an array of Cell's 
> length and read in cell bytes into it and create a KV.
> Actually in server we read the reqs into a byte[] and CellScanner is created 
> on top of a ByteArrayInputStream on top of this. By default in write path, we 
> have MSLAB usage ON. So while adding Cells to memstore, we will copy the Cell 
> bytes to MSLAB memory chunks (default 2 MB size) and recreate Cells over that 
> bytes.  So there is no issue if we create Cells over the RPC read byte[] 
> directly here in Decoder.  No need for 2 byte[] creation and copy for every 
> Cell in request.
> My plan is to make a Cell aware ByteArrayInputStream which can read Cells 
> directly from it.  
> Same Codec path is used in client side also. There better we can avoid this 
> direct Cell create and continue to do the copy to smaller byte[]s path.  Plan 
> to introduce some thing like a CodecContext associated with every Codec 
> instance which can say the server/client context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-15180) Reduce garbage created while reading Cells from Codec Decoder

2016-01-29 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15124076#comment-15124076
 ] 

Hadoop QA commented on HBASE-15180:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 1s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 
0s {color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 
40s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 0s 
{color} | {color:green} master passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 7s 
{color} | {color:green} master passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 7m 
10s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
37s {color} | {color:green} master passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 42s 
{color} | {color:red} hbase-common in master has 1 extant Findbugs warnings. 
{color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 50s 
{color} | {color:red} hbase-server in master has 1 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 58s 
{color} | {color:green} master passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 11s 
{color} | {color:green} master passed with JDK v1.7.0_91 {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 14s 
{color} | {color:red} hbase-client in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 24s 
{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 20s 
{color} | {color:red} hbase-server in the patch failed with JDK v1.8.0_72. 
{color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 20s {color} 
| {color:red} hbase-server in the patch failed with JDK v1.8.0_72. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 25s 
{color} | {color:red} hbase-server in the patch failed with JDK v1.7.0_91. 
{color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 25s {color} 
| {color:red} hbase-server in the patch failed with JDK v1.7.0_91. {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 10s 
{color} | {color:red} Patch generated 2 new checkstyle issues in hbase-common 
(total was 100, now 102). {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
36s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
21m 55s {color} | {color:green} Patch does not cause any errors with Hadoop 
2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 
56s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 1s 
{color} | {color:green} the patch passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 14s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 57s 
{color} | {color:green} hbase-client in the patch passed with JDK v1.8.0_72. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 44s 
{color} | {color:green} hbase-common in the patch passed with JDK v1.8.0_72. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 20s {color} 
| {color:red} hbase-server in the patch failed with JDK v1.8.0_72. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 56s 
{color} | {color:green} hbase-client in the patch passed with JDK v1.7.0_91. 
{color} |
| {color:green}+1{color} |

[jira] [Commented] (HBASE-15180) Reduce garbage created while reading Cells from Codec Decoder

2016-01-29 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15124131#comment-15124131
 ] 

stack commented on HBASE-15180:
---

Why not use Cell Codec for CellReadableByteArrayInputStream? CellCodec.Decoded 
takes an InputStream and returns Cells via CellScanner implementation.

What is difference between a CellReadable and a CellScanner? You have to do 
advance and then current each time which is a little more awkward but it is a 
common pattern we want to institute throughout. I suppose it is awkward. That'd 
be your argument. If so, we have CellOutputStream, should your CellReadable be 
a CellInputStream with read methods that return Cells to mirror the write 
methods we have in CellOutputStream. Your CellReadableByteArrayInputStream 
would become CellByteArrayInputStream and would implement CellInputStream.

I've asked this before I know but do we have to flag when tags and when 
without? Internally, when we read, the Cell will know if it has tags or not?

What is the length in the below?

  Cell readCell(int length, boolean withTags) throws IOException;

Do we have to pass this in each time?

192* @param directCellRead
193*  Whether to make Cells directly from the cellBlock bytes 
or need to copy. Pass false
194*  while using from client side.

IPCUtil takes a Configuration? Can we not just read the Configuration on 
construction rather than pass this flag per call?

Seems like you want server-side and client-side to act different. Having 
RPCServer 'know' about MSLAB don't seem right. It is pollution of rpc with 
internals on how we do memstore. Can we have another property for when we 
should lean on the rpc buffer (we 'know' it safe when mslab is going on. 
perhaps a method that obscures the rationale for when to copy )

Patch looks great otherwise.



> Reduce garbage created while reading Cells from Codec Decoder
> -
>
> Key: HBASE-15180
> URL: https://issues.apache.org/jira/browse/HBASE-15180
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0
>
> Attachments: HBASE-15180.patch, HBASE-15180_V2.patch
>
>
> In KeyValueDecoder#parseCell (Default Codec decoder) we use 
> KeyValueUtil#iscreate to read cells from the InputStream. Here we 1st create 
> a byte[] of length 4 and read the cell length and then an array of Cell's 
> length and read in cell bytes into it and create a KV.
> Actually in server we read the reqs into a byte[] and CellScanner is created 
> on top of a ByteArrayInputStream on top of this. By default in write path, we 
> have MSLAB usage ON. So while adding Cells to memstore, we will copy the Cell 
> bytes to MSLAB memory chunks (default 2 MB size) and recreate Cells over that 
> bytes.  So there is no issue if we create Cells over the RPC read byte[] 
> directly here in Decoder.  No need for 2 byte[] creation and copy for every 
> Cell in request.
> My plan is to make a Cell aware ByteArrayInputStream which can read Cells 
> directly from it.  
> Same Codec path is used in client side also. There better we can avoid this 
> direct Cell create and continue to do the copy to smaller byte[]s path.  Plan 
> to introduce some thing like a CodecContext associated with every Codec 
> instance which can say the server/client context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-15180) Reduce garbage created while reading Cells from Codec Decoder

2016-01-28 Thread Enis Soztutar (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15122488#comment-15122488
 ] 

Enis Soztutar commented on HBASE-15180:
---

This is a good idea. Having all cells in the same RPC share the same byte[]. 

Is {{CellReadable}} really necessary? Isn't this the same thing as 
Codec.Decoder. I mean, from a layering perspective, I thought that we would 
instead change the Codec to be aware of byte[] directly, and return a 
CellScanner that can return KV's from the same buffer. I was thinking of doing 
a Codec at the RPC layer to do something like FAST_DIFF. Can that still be done 
with this patch? 

Should we default to MSLAB for good? I don't think anybody runs with MSLAB off. 

RPCServer reaching this is not right: 
{code}
+this.mslabEnabled = conf.getBoolean(HConstants.USEMSLAB_KEY, 
HConstants.USEMSLAB_DEFAULT);
{code}

Can the byte[4]'s be statically allocated?  

> Reduce garbage created while reading Cells from Codec Decoder
> -
>
> Key: HBASE-15180
> URL: https://issues.apache.org/jira/browse/HBASE-15180
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0
>
> Attachments: HBASE-15180.patch
>
>
> In KeyValueDecoder#parseCell (Default Codec decoder) we use 
> KeyValueUtil#iscreate to read cells from the InputStream. Here we 1st create 
> a byte[] of length 4 and read the cell length and then an array of Cell's 
> length and read in cell bytes into it and create a KV.
> Actually in server we read the reqs into a byte[] and CellScanner is created 
> on top of a ByteArrayInputStream on top of this. By default in write path, we 
> have MSLAB usage ON. So while adding Cells to memstore, we will copy the Cell 
> bytes to MSLAB memory chunks (default 2 MB size) and recreate Cells over that 
> bytes.  So there is no issue if we create Cells over the RPC read byte[] 
> directly here in Decoder.  No need for 2 byte[] creation and copy for every 
> Cell in request.
> My plan is to make a Cell aware ByteArrayInputStream which can read Cells 
> directly from it.  
> Same Codec path is used in client side also. There better we can avoid this 
> direct Cell create and continue to do the copy to smaller byte[]s path.  Plan 
> to introduce some thing like a CodecContext associated with every Codec 
> instance which can say the server/client context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-15180) Reduce garbage created while reading Cells from Codec Decoder

2016-01-28 Thread Anoop Sam John (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15122878#comment-15122878
 ] 

Anoop Sam John commented on HBASE-15180:


You mean to make Codec decoder to work over a byte[] also rather than on 
InputStream? That is nice. Why I did not go through that path is my next work 
is in E2E off heap in write path also.  So when the write req comes, we might 
be reading it into an off heap BB (from a pool). So then byte[] based decoding 
is not possible.  May be BB based then.   Again am trying to experiment with 
reading the req not just into one single large BB.   From the pool we might be 
getting fixed sized smaller BBs (Say 64 KB or so)  And we can read in to those 
many BBs.  And the CellScanner need to work on a set of BBs  (Like the 
MultiByteBuff stuff)Then again even this BB based API is an issue..  
Continue  with an InputStream based API gives us the freedom of experimenting 
with this different data structures.

bq.Should we default to MSLAB for good? I don't think anybody runs with MSLAB 
off.
Ya we have MSALB enabled by default.  I agree that doing the MSLAB check in RPC 
layer looks ugly.   Wanted to avoid we refer to the req read byte[] (from 
memstore cells) when some one turns MSLAB off.So what do you say? Remove 
this check? 

bq.Can the byte[4]'s be statically allocated?
No we can not. There will be many parallel req processing and many instances of 
this InputStream in action.  On one instance only we dont have multi threaded 
reads.  Making it static make all IS instances refer the same byte [] !

> Reduce garbage created while reading Cells from Codec Decoder
> -
>
> Key: HBASE-15180
> URL: https://issues.apache.org/jira/browse/HBASE-15180
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0
>
> Attachments: HBASE-15180.patch
>
>
> In KeyValueDecoder#parseCell (Default Codec decoder) we use 
> KeyValueUtil#iscreate to read cells from the InputStream. Here we 1st create 
> a byte[] of length 4 and read the cell length and then an array of Cell's 
> length and read in cell bytes into it and create a KV.
> Actually in server we read the reqs into a byte[] and CellScanner is created 
> on top of a ByteArrayInputStream on top of this. By default in write path, we 
> have MSLAB usage ON. So while adding Cells to memstore, we will copy the Cell 
> bytes to MSLAB memory chunks (default 2 MB size) and recreate Cells over that 
> bytes.  So there is no issue if we create Cells over the RPC read byte[] 
> directly here in Decoder.  No need for 2 byte[] creation and copy for every 
> Cell in request.
> My plan is to make a Cell aware ByteArrayInputStream which can read Cells 
> directly from it.  
> Same Codec path is used in client side also. There better we can avoid this 
> direct Cell create and continue to do the copy to smaller byte[]s path.  Plan 
> to introduce some thing like a CodecContext associated with every Codec 
> instance which can say the server/client context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-15180) Reduce garbage created while reading Cells from Codec Decoder

2016-01-27 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15120788#comment-15120788
 ] 

Hadoop QA commented on HBASE-15180:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 
0s {color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 
10s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 13s 
{color} | {color:green} master passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 17s 
{color} | {color:green} master passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 7m 
56s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
42s {color} | {color:green} master passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 50s 
{color} | {color:red} hbase-common in master has 1 extant Findbugs warnings. 
{color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 6s 
{color} | {color:red} hbase-server in master has 1 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 9s 
{color} | {color:green} master passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 18s 
{color} | {color:green} master passed with JDK v1.7.0_91 {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 16s 
{color} | {color:red} hbase-client in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 29s 
{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 26s 
{color} | {color:red} hbase-server in the patch failed with JDK v1.8.0_66. 
{color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 26s {color} 
| {color:red} hbase-server in the patch failed with JDK v1.8.0_66. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 29s 
{color} | {color:red} hbase-server in the patch failed with JDK v1.7.0_91. 
{color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 29s {color} 
| {color:red} hbase-server in the patch failed with JDK v1.7.0_91. {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 11s 
{color} | {color:red} Patch generated 2 new checkstyle issues in hbase-common 
(total was 100, now 102). {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
40s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
25m 35s {color} | {color:green} Patch does not cause any errors with Hadoop 
2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 
18s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 8s 
{color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 19s 
{color} | {color:green} the patch passed with JDK v1.7.0_91 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 52s 
{color} | {color:green} hbase-client in the patch passed with JDK v1.8.0_66. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 43s 
{color} | {color:green} hbase-common in the patch passed with JDK v1.8.0_66. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 29s {color} 
| {color:red} hbase-server in the patch failed with JDK v1.8.0_66. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 59s 
{color} | {color:green} hbase-client in the patch passed with JDK v1.7.0_91. 
{color} |
| {color:green}+1{color} |

[jira] [Commented] (HBASE-15180) Reduce garbage created while reading Cells from Codec Decoder

2016-01-27 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15120736#comment-15120736
 ] 

Ted Yu commented on HBASE-15180:


{code}
48public Cell readCell(boolean withTags) throws IOException {
49  if (intBuf == null) {
50// Lazy init. In real flow, we will use the readCell(int, 
boolean) API only
51intBuf = new byte[Bytes.SIZEOF_INT];
{code}
There are two places where readCell() is implemented in the same manner. Can 
code duplication be reduced ?
{code}
56protected static class CellReadablePBIS extends PBIS implements 
CellReadable {
{code}
What does PB mean above ?

> Reduce garbage created while reading Cells from Codec Decoder
> -
>
> Key: HBASE-15180
> URL: https://issues.apache.org/jira/browse/HBASE-15180
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0
>
> Attachments: HBASE-15180.patch
>
>
> In KeyValueDecoder#parseCell (Default Codec decoder) we use 
> KeyValueUtil#iscreate to read cells from the InputStream. Here we 1st create 
> a byte[] of length 4 and read the cell length and then an array of Cell's 
> length and read in cell bytes into it and create a KV.
> Actually in server we read the reqs into a byte[] and CellScanner is created 
> on top of a ByteArrayInputStream on top of this. By default in write path, we 
> have MSLAB usage ON. So while adding Cells to memstore, we will copy the Cell 
> bytes to MSLAB memory chunks (default 2 MB size) and recreate Cells over that 
> bytes.  So there is no issue if we create Cells over the RPC read byte[] 
> directly here in Decoder.  No need for 2 byte[] creation and copy for every 
> Cell in request.
> My plan is to make a Cell aware ByteArrayInputStream which can read Cells 
> directly from it.  
> Same Codec path is used in client side also. There better we can avoid this 
> direct Cell create and continue to do the copy to smaller byte[]s path.  Plan 
> to introduce some thing like a CodecContext associated with every Codec 
> instance which can say the server/client context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-15180) Reduce garbage created while reading Cells from Codec Decoder

2016-01-27 Thread Anoop Sam John (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15120791#comment-15120791
 ] 

Anoop Sam John commented on HBASE-15180:


bq.What does PB mean above ?
PushbackInputStream. We have PBIS  already here.

Let me see how can we avoid the duplication

> Reduce garbage created while reading Cells from Codec Decoder
> -
>
> Key: HBASE-15180
> URL: https://issues.apache.org/jira/browse/HBASE-15180
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0
>
> Attachments: HBASE-15180.patch
>
>
> In KeyValueDecoder#parseCell (Default Codec decoder) we use 
> KeyValueUtil#iscreate to read cells from the InputStream. Here we 1st create 
> a byte[] of length 4 and read the cell length and then an array of Cell's 
> length and read in cell bytes into it and create a KV.
> Actually in server we read the reqs into a byte[] and CellScanner is created 
> on top of a ByteArrayInputStream on top of this. By default in write path, we 
> have MSLAB usage ON. So while adding Cells to memstore, we will copy the Cell 
> bytes to MSLAB memory chunks (default 2 MB size) and recreate Cells over that 
> bytes.  So there is no issue if we create Cells over the RPC read byte[] 
> directly here in Decoder.  No need for 2 byte[] creation and copy for every 
> Cell in request.
> My plan is to make a Cell aware ByteArrayInputStream which can read Cells 
> directly from it.  
> Same Codec path is used in client side also. There better we can avoid this 
> direct Cell create and continue to do the copy to smaller byte[]s path.  Plan 
> to introduce some thing like a CodecContext associated with every Codec 
> instance which can say the server/client context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

54 matches

Mail list logo