date:20150531

[jira] [Commented] (HBASE-13686) Fail to limit rate in RateLimiter

2015-05-31 Thread Ashish Singhi (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-13686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14566984#comment-14566984
 ] 

Ashish Singhi commented on HBASE-13686:
---

bq. If set the RateLimiter is 10 res per minute. In the first minute, we can 
consume 30 res (10 from avail, 10 from above "return limit", 10 from refill).
How? In this case refill will return 10. If you then check the logic of 
RateLimiter#canExecute(long), avail is calculated with the code below in 
this case:
{code}
avail = Math.max(0, Math.min(avail + refillAmount, limit));
{code}
So in the first minute the calculation is avail = Math.max(0, 
Math.min(10 + 10, 10)), which leaves avail equal to 10.
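
The arithmetic above can be checked with a small standalone sketch. The class and method names (AvailClampSketch, nextAvail) are made up for illustration and are not part of HBase; only the clamp expression is taken from the snippet quoted above.

```java
final class AvailClampSketch {
  // Hypothetical helper: applies the clamp expression quoted in the comment above.
  static long nextAvail(long avail, long refillAmount, long limit) {
    return Math.max(0, Math.min(avail + refillAmount, limit));
  }

  public static void main(String[] args) {
    // limit = 10, avail = 10, refill returns 10: avail stays capped at the limit.
    System.out.println(nextAvail(10, 10, 10)); // prints 10
  }
}
```

Whatever refill returns, the clamp keeps avail between 0 and limit, so a single refill cannot push avail above 10 here.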

> Fail to limit rate in RateLimiter
> -
>
> Key: HBASE-13686
> URL: https://issues.apache.org/jira/browse/HBASE-13686
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0, 1.1.0
>Reporter: Guanghao Zhang
>Assignee: Ashish Singhi
> Fix For: 2.0.0, 1.2.0, 1.1.1
>
> Attachments: HBASE-13686-v1.patch, HBASE-13686-v2.patch, 
> HBASE-13686.patch
>
>
> While using the patch in HBASE-11598, I found that RateLimiter fails to 
> limit the rate correctly.
> {code} 
>   /**
>    * given the time interval, are there enough available resources to allow execution?
>    * @param now the current timestamp
>    * @param lastTs the timestamp of the last update
>    * @param amount the number of required resources
>    * @return true if there are enough available resources, otherwise false
>    */
>   public synchronized boolean canExecute(final long now, final long lastTs,
>       final long amount) {
>     return avail >= amount ? true : refill(now, lastTs) >= amount;
>   }
> {code}
> When avail >= amount, avail is not refilled. But by the next call to 
> canExecute, lastTs may have been updated, so the time that should have 
> counted toward refilling avail is lost. Even if we use a smaller rate than 
> the limit, canExecute will return false. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-13686) Fail to limit rate in RateLimiter

2015-05-31 Thread Guanghao Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-13686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14566975#comment-14566975
 ] 

Guanghao Zhang commented on HBASE-13686:


This is from refill in AverageIntervalRateLimiter:
{code}
if (nextRefillTime == -1) {
  // Till now no resource has been consumed.
  nextRefillTime = EnvironmentEdgeManager.currentTimeMillis();
  return limit;
}
{code}
If the RateLimiter is set to 10 resources per minute, then in the first minute 
we can consume 30 resources (10 from avail, 10 from the "return limit" above, 
10 from refill).
nextRefillTime should be a property of RateLimiter, initialized when the 
RateLimiter is created.

[jira] [Updated] (HBASE-13356) HBase should provide an InputFormat supporting multiple scans in mapreduce jobs over snapshots

2015-05-31 Thread Andrew Mains (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-13356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Mains updated HBASE-13356:
-
Attachment: HBASE-13356-branch-1.patch

> HBase should provide an InputFormat supporting multiple scans in mapreduce 
> jobs over snapshots
> --
>
> Key: HBASE-13356
> URL: https://issues.apache.org/jira/browse/HBASE-13356
> Project: HBase
>  Issue Type: New Feature
>  Components: mapreduce
>Reporter: Andrew Mains
>Assignee: Andrew Mains
>Priority: Minor
> Attachments: HBASE-13356-0.98.patch, HBASE-13356-branch-1.patch, 
> HBASE-13356.2.patch, HBASE-13356.3.patch, HBASE-13356.4.patch, 
> HBASE-13356.patch
>
>
> Currently, HBase supports the pushing of multiple scans to mapreduce jobs 
> over live tables (via MultiTableInputFormat) but only supports a single scan 
> for mapreduce jobs over table snapshots. It would be handy to support 
> multiple scans over snapshots as well, probably through another input format 
> (MultiTableSnapshotInputFormat?). To mimic the functionality present in 
> MultiTableInputFormat, the new input format would likely have to take in the 
> names of all snapshots used in addition to the scans.




[jira] [Created] (HBASE-13819) Make RPC layer CellBlock buffer a DirectByteBuffer

2015-05-31 Thread Anoop Sam John (JIRA)

Anoop Sam John created HBASE-13819:
--

 Summary: Make RPC layer CellBlock buffer a DirectByteBuffer
 Key: HBASE-13819
 URL: https://issues.apache.org/jira/browse/HBASE-13819
 Project: HBase
  Issue Type: Sub-task
Reporter: Anoop Sam John
Assignee: Anoop Sam John
 Fix For: 2.0.0


In the RPC layer, when we build a cellBlock to use as the RPC payload, we 
create an on-heap byte buffer (via BoundedByteBufferPool). The pool keeps up 
to a certain number of buffers. This JIRA aims to test the possibility of 
making these buffers off-heap ones (DirectByteBuffer, DBB). The advantages:
1. Unsafe-based writes to off heap are faster than those to on heap. We are 
not using Unsafe-based writes at all right now, but even if we add them, DBB 
will be better.
2. When Cells are backed by off heap (HBASE-11425), off-heap to off-heap 
writes will be better.
3. Looking at the SocketChannel impl, if we pass a HeapByteBuffer to the 
socket channel, it creates a temporary DBB, copies the data into it, and only 
DBBs are moved to sockets. If we create a DBB in the first place, we can 
avoid this extra level of copying.
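
For readers less familiar with the two buffer kinds, here is a minimal sketch using only the standard java.nio.ByteBuffer API. The class DirectBufferSketch and its allocate helper are hypothetical names for illustration; this is not the BoundedByteBufferPool code, and it does not measure performance.

```java
import java.nio.ByteBuffer;

final class DirectBufferSketch {
  // Hypothetical helper: allocate either an on-heap or an off-heap (direct) buffer.
  static ByteBuffer allocate(int capacity, boolean offHeap) {
    return offHeap ? ByteBuffer.allocateDirect(capacity) : ByteBuffer.allocate(capacity);
  }

  public static void main(String[] args) {
    ByteBuffer heap = allocate(1024, false);
    ByteBuffer direct = allocate(1024, true);
    // A heap buffer wraps a byte[]; a direct buffer lives outside the Java heap.
    System.out.println("heap.isDirect()   = " + heap.isDirect());
    System.out.println("heap.hasArray()   = " + heap.hasArray());
    System.out.println("direct.isDirect() = " + direct.isDirect());
  }
}
```

Both kinds share the same ByteBuffer interface, so a pool could in principle hand out either one; whether that is a net win is what the perf testing mentioned in this issue would have to show.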

Will run various perf tests with the change and report back.




[jira] [Updated] (HBASE-12988) [Replication]Parallel apply edits on row-level

2015-05-31 Thread Lars Hofhansl (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-12988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-12988:
--
Assignee: (was: hongyu bi)
  Status: Patch Available  (was: Open)

Ran some replication tests. Let's see if it breaks something unexpected.
I'll repeat: _not ready to be used at all_, will ship edits out of order.

> [Replication]Parallel apply edits on row-level
> --
>
> Key: HBASE-12988
> URL: https://issues.apache.org/jira/browse/HBASE-12988
> Project: HBase
>  Issue Type: Improvement
>  Components: Replication
>Reporter: hongyu bi
> Attachments: ParallelReplication-v2.txt
>
>
> We can apply edits to the slave cluster in parallel at table level to speed 
> up replication.
> Update: per the conversation below, it's better to apply edits at row level 
> in parallel.




[jira] [Commented] (HBASE-13344) Add enforcer rule that matches our JDK support statement

2015-05-31 Thread Sean Busbey (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-13344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14566947#comment-14566947
 ] 

Sean Busbey commented on HBASE-13344:
-

We only need the extra-enforcer-rules dependency on the enforcer plugin when 
the release profile is active, right? Could we move that to the profile as well?

> Add enforcer rule that matches our JDK support statement
> 
>
> Key: HBASE-13344
> URL: https://issues.apache.org/jira/browse/HBASE-13344
> Project: HBase
>  Issue Type: Improvement
>  Components: build
>Affects Versions: 2.0.0
>Reporter: Sean Busbey
>Assignee: Matt Warhaftig
>Priority: Minor
>  Labels: beginner, maven
> Fix For: 2.0.0
>
> Attachments: HBASE-13344-master.patch, 
> HBASE-13344-master_addendum_v1.patch, HBASE-13344-master_v2.patch
>
>
> The [ref guide gives a list of JDKs that we expect our hbase versions to work 
> with at runtime|http://hbase.apache.org/book.html#basic.prerequisites].
> Let's add in the extra-enforcer-rules mojo and start using [the bytecode 
> version  
> rule|http://mojo.codehaus.org/extra-enforcer-rules/enforceBytecodeVersion.html]
>  to make sure that the result of our builds on a given branch won't fail out 
> because of a misconfigured target jdk version (or a dependency that targets a 
> later jdk).




[jira] [Resolved] (HBASE-13803) Disable the MobCompactionChore when the interval is not larger than 0

2015-05-31 Thread ramkrishna.s.vasudevan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-13803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan resolved HBASE-13803.

  Resolution: Fixed
Hadoop Flags: Reviewed

Pushed to hbase-11339 branch. Thanks for the patch [~jingcheng...@intel.com].

> Disable the MobCompactionChore when the interval is not larger than 0
> -
>
> Key: HBASE-13803
> URL: https://issues.apache.org/jira/browse/HBASE-13803
> Project: HBase
>  Issue Type: Sub-task
>  Components: mob
>Affects Versions: hbase-11339
>Reporter: Jingcheng Du
>Assignee: Jingcheng Du
> Fix For: hbase-11339
>
> Attachments: HBASE-13803-V2.diff, HBASE-13803.diff
>
>
> If users set the interval of MobCompactionChore as a number that is not 
> larger than 0, we should disable the MobCompactionChore.
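
The rule described in this issue can be illustrated with a plain java.util.concurrent sketch. This is not the MobCompactionChore code or the patch: ChoreScheduleSketch and scheduleIfEnabled are hypothetical names, and a ScheduledExecutorService stands in for whatever chore service HBase uses.

```java
import java.util.Optional;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

final class ChoreScheduleSketch {
  // Hypothetical helper: schedule the chore only when the period is larger than 0.
  static Optional<ScheduledFuture<?>> scheduleIfEnabled(
      ScheduledExecutorService pool, Runnable chore, long periodSeconds) {
    if (periodSeconds <= 0) {
      return Optional.empty(); // a non-positive period is treated as "disabled"
    }
    ScheduledFuture<?> future =
        pool.scheduleAtFixedRate(chore, periodSeconds, periodSeconds, TimeUnit.SECONDS);
    return Optional.<ScheduledFuture<?>>of(future);
  }

  public static void main(String[] args) {
    ScheduledExecutorService pool = Executors.newSingleThreadScheduledExecutor();
    try {
      boolean scheduled = scheduleIfEnabled(pool, () -> { }, 0).isPresent();
      System.out.println("scheduled with period 0: " + scheduled); // false
    } finally {
      pool.shutdownNow();
    }
  }
}
```

The explicit check matters in this sketch because ScheduledExecutorService.scheduleAtFixedRate throws IllegalArgumentException for a period less than or equal to zero.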




[jira] [Resolved] (HBASE-13805) Use LimitInputStream in hbase-common instead of ProtobufUtil.LimitedInputStream

2015-05-31 Thread Anoop Sam John (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-13805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anoop Sam John resolved HBASE-13805.

  Resolution: Fixed
Hadoop Flags: Reviewed

Thanks Jingcheng for the patch.

> Use LimitInputStream in hbase-common instead of 
> ProtobufUtil.LimitedInputStream
> ---
>
> Key: HBASE-13805
> URL: https://issues.apache.org/jira/browse/HBASE-13805
> Project: HBase
>  Issue Type: Sub-task
>  Components: mob
>Affects Versions: hbase-11339
>Reporter: Jingcheng Du
>Assignee: Jingcheng Du
> Fix For: hbase-11339
>
> Attachments: HBASE-13805.diff
>
>
> Now we have a LimitedInputStream defined in ProtobufUtil.java. There is 
> similar code in LimitInputStream of hbase-common, so we can use that instead.
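
For context, a stream wrapper of this kind usually looks roughly like the sketch below. It is an assumption-laden illustration: BoundedStreamSketch is a made-up class, not the LimitInputStream in hbase-common and not ProtobufUtil.LimitedInputStream, and it only overrides the two read methods.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

final class BoundedStreamSketch extends FilterInputStream {
  private long remaining; // bytes that may still be read from the wrapped stream

  BoundedStreamSketch(InputStream in, long limit) {
    super(in);
    this.remaining = limit;
  }

  @Override
  public int read() throws IOException {
    if (remaining <= 0) {
      return -1; // limit reached: report end of stream
    }
    int b = in.read();
    if (b != -1) {
      remaining--;
    }
    return b;
  }

  @Override
  public int read(byte[] buf, int off, int len) throws IOException {
    if (len == 0) {
      return 0;
    }
    if (remaining <= 0) {
      return -1;
    }
    int n = in.read(buf, off, (int) Math.min(len, remaining));
    if (n > 0) {
      remaining -= n;
    }
    return n;
  }

  // Demo helper: counts how many bytes can be read through the wrapper.
  static int countBytes(byte[] data, long limit) {
    try (InputStream s = new BoundedStreamSketch(new ByteArrayInputStream(data), limit)) {
      int count = 0;
      while (s.read() != -1) {
        count++;
      }
      return count;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(countBytes(new byte[10], 4)); // prints 4
  }
}
```

A complete version would also need to bound skip(), available(), and mark/reset, which this sketch leaves out. Keeping one such class in a common module avoids maintaining two copies of that logic.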




[jira] [Commented] (HBASE-13805) Use LimitInputStream in hbase-common instead of ProtobufUtil.LimitedInputStream

2015-05-31 Thread Anoop Sam John (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-13805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14566924#comment-14566924
 ] 

Anoop Sam John commented on HBASE-13805:


I see. Yes, mergeDelimitedFrom() seems to have been added in this branch. +1 then.


[jira] [Commented] (HBASE-13356) HBase should provide an InputFormat supporting multiple scans in mapreduce jobs over snapshots

2015-05-31 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-13356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14566923#comment-14566923
 ] 

Ted Yu commented on HBASE-13356:


Looks pretty good.
Minor comments:
{code}
+ * MultiTableSnapshotInputFormat generalizes {@link org.apache.hadoop.hbase.mapred
+ * .TableSnapshotInputFormat}
{code}
Better to put '{@link ' on the second line so that the class name stays on one line.

In MultiTableSnapshotInputFormatImpl :
{code}
+  // TODO: these probably belong elsewhere/may already be implemented elsewhere.
+
{code}
The above can be removed, right ?


[jira] [Commented] (HBASE-13803) Disable the MobCompactionChore when the interval is not larger than 0

2015-05-31 Thread Anoop Sam John (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-13803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14566920#comment-14566920
 ] 

Anoop Sam John commented on HBASE-13803:


+1


[jira] [Updated] (HBASE-13818) manual region split from HBase shell, I found that split command acts incorrectly with hex split keys

2015-05-31 Thread zhangjg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-13818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangjg updated HBASE-13818:

Description: 
When manually splitting a region from the HBase shell, I found that the split 
command acts incorrectly with hex split keys.

hbase(main):001:0> split 
'sdb,\x00\x00+Ug\xD60\x00\x00\x01\x00\x10\xC0,1432909366893.6b601fa4eb9e1244d049bde93e340736.'
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/data/xiaoju/hbase-0.96.2-hadoop2/lib/phoenix-4.1.0-client-hadoop2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/data/xiaoju/hbase-0.96.2-hadoop2/lib/slf4j-log4j12-1.6.4.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/data/xiaoju/hadoop-2.2.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
2015-06-01 11:40:46,986 WARN  [main] util.NativeCodeLoader: Unable to load 
native-hadoop library for your platform... using builtin-java classes where 
applicable

ERROR: Illegal character code:44, <,> at 3. User-space table qualifiers can 
only contain 'alphanumeric characters': i.e. [a-zA-Z_0-9-.]: 
sdb,"\x00\x00+Ug\xD60\x00\x00\x01\x00\x10\xC0",1432909366893.6b601fa4eb9e1244d049bde93e340736.

Here is some help for this command:
Split entire table or pass a region to split individual region.  With the 
second parameter, you can specify an explicit split key for the region.  
Examples:
split 'tableName'
split 'namespace:tableName'
split 'regionName' # format: 'tableName,startKey,id'
split 'tableName', 'splitKey'
split 'regionName', 'splitKey'

  was:
When manually splitting a region from the HBase shell, I found that the split 
command acts incorrectly with hex split keys.

hbase(main):001:0> split 
'sdb,"\x00\x00+Ug\xD60\x00\x00\x01\x00\x10\xC0",1432909366893.6b601fa4eb9e1244d049bde93e340736.'
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/data/xiaoju/hbase-0.96.2-hadoop2/lib/phoenix-4.1.0-client-hadoop2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/data/xiaoju/hbase-0.96.2-hadoop2/lib/slf4j-log4j12-1.6.4.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/data/xiaoju/hadoop-2.2.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
2015-06-01 11:40:46,986 WARN  [main] util.NativeCodeLoader: Unable to load 
native-hadoop library for your platform... using builtin-java classes where 
applicable

ERROR: Illegal character code:44, <,> at 3. User-space table qualifiers can 
only contain 'alphanumeric characters': i.e. [a-zA-Z_0-9-.]: 
sdb,"\x00\x00+Ug\xD60\x00\x00\x01\x00\x10\xC0",1432909366893.6b601fa4eb9e1244d049bde93e340736.

Here is some help for this command:
Split entire table or pass a region to split individual region.  With the 
second parameter, you can specify an explicit split key for the region.  
Examples:
split 'tableName'
split 'namespace:tableName'
split 'regionName' # format: 'tableName,startKey,id'
split 'tableName', 'splitKey'
split 'regionName', 'splitKey'



[jira] [Created] (HBASE-13818) manual region split from HBase shell, I found that split command acts incorrectly with hex split keys

2015-05-31 Thread zhangjg (JIRA)

zhangjg created HBASE-13818:
---

 Summary: manual region split from HBase shell, I found that split 
command acts incorrectly with hex split keys
 Key: HBASE-13818
 URL: https://issues.apache.org/jira/browse/HBASE-13818
 Project: HBase
  Issue Type: Bug
  Components: shell
Affects Versions: 0.96.2
Reporter: zhangjg


When manually splitting a region from the HBase shell, I found that the split 
command acts incorrectly with hex split keys.

hbase(main):001:0> split 
'sdb,"\x00\x00+Ug\xD60\x00\x00\x01\x00\x10\xC0",1432909366893.6b601fa4eb9e1244d049bde93e340736.'
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/data/xiaoju/hbase-0.96.2-hadoop2/lib/phoenix-4.1.0-client-hadoop2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/data/xiaoju/hbase-0.96.2-hadoop2/lib/slf4j-log4j12-1.6.4.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/data/xiaoju/hadoop-2.2.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
2015-06-01 11:40:46,986 WARN  [main] util.NativeCodeLoader: Unable to load 
native-hadoop library for your platform... using builtin-java classes where 
applicable

ERROR: Illegal character code:44, <,> at 3. User-space table qualifiers can 
only contain 'alphanumeric characters': i.e. [a-zA-Z_0-9-.]: 
sdb,"\x00\x00+Ug\xD60\x00\x00\x01\x00\x10\xC0",1432909366893.6b601fa4eb9e1244d049bde93e340736.

Here is some help for this command:
Split entire table or pass a region to split individual region.  With the 
second parameter, you can specify an explicit split key for the region.  
Examples:
split 'tableName'
split 'namespace:tableName'
split 'regionName' # format: 'tableName,startKey,id'
split 'tableName', 'splitKey'
split 'regionName', 'splitKey'




[jira] [Commented] (HBASE-13356) HBase should provide an InputFormat supporting multiple scans in mapreduce jobs over snapshots

2015-05-31 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-13356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14566909#comment-14566909
 ] 

Hadoop QA commented on HBASE-13356:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12736478/HBASE-13356-0.98.patch
  against 0.98 branch at commit 0e6102a68cc95f0240fa72a5f86866c07b8744b7.
  ATTACHMENT ID: 12736478

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 17 new 
or modified tests.

{color:green}+1 hadoop versions{color}. The patch compiles with all 
supported hadoop versions (2.4.1 2.5.2 2.6.0)

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 protoc{color}.  The applied patch does not increase the 
total number of protoc compiler warnings.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 
24 warning messages.

{color:green}+1 checkstyle{color}.  The applied patch does not increase the 
total number of checkstyle errors

{color:green}+1 findbugs{color}.  The patch does not introduce any  new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.backup.TestHFileArchiving
  org.apache.hadoop.hbase.master.TestTableLockManager

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/14251//testReport/
Release Findbugs (version 2.0.3)warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/14251//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/14251//artifact/patchprocess/checkstyle-aggregate.html

  Javadoc warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/14251//artifact/patchprocess/patchJavadocWarnings.txt
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/14251//console

This message is automatically generated.


[jira] [Updated] (HBASE-13803) Disable the MobCompactionChore when the interval is not larger than 0

2015-05-31 Thread Jingcheng Du (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-13803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingcheng Du updated HBASE-13803:
-
Attachment: HBASE-13803-V2.diff

Update the patch according to Anoop's comment. Thanks.


[jira] [Commented] (HBASE-13804) Revert the changes in pom.xml

2015-05-31 Thread Jingcheng Du (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-13804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14566887#comment-14566887
 ] 

Jingcheng Du commented on HBASE-13804:
--

Hi Jon [~jmhsieh], do you have concerns on this patch? Please advise if any. 
Thanks.

> Revert the changes in pom.xml
> -
>
> Key: HBASE-13804
> URL: https://issues.apache.org/jira/browse/HBASE-13804
> Project: HBase
>  Issue Type: Sub-task
>  Components: mob
>Affects Versions: hbase-11339
>Reporter: Jingcheng Du
>Assignee: Jingcheng Du
> Fix For: hbase-11339
>
> Attachments: HBASE-13804.diff
>
>
> Some code was deleted in pom.xml.
> {noformat}
> 
>target/jacoco.exec
> 
> {noformat}
> We can revert the changes if this change is not necessary.




[jira] [Commented] (HBASE-13805) Use LimitInputStream in hbase-common instead of ProtobufUtil.LimitedInputStream

2015-05-31 Thread Jingcheng Du (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-13805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14566886#comment-14566886
 ] 

Jingcheng Du commented on HBASE-13805:
--

I think the code in ProtobufUtil was added intentionally for mob. We have to 
remove it explicitly.


[jira] [Commented] (HBASE-13806) Check the mob files when there are mob-enabled columns in HFileCorruptionChecker

2015-05-31 Thread Jingcheng Du (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-13806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14566883#comment-14566883
 ] 

Jingcheng Du commented on HBASE-13806:
--

Thanks, guys, for the comments.
bq. The idea here is that we read a hfile, and if we have the mob refs present 
on a cell, we verify that the corresponding mob file is present. Furthermore, 
we could verify if the mob value cell is present
I think this patch should do the same thing HFileCorruptionChecker already 
does, which is only to open a reader on an HFile. We can file another JIRA to 
detect missing mob files, which I am afraid needs a full table scan (the 
current implementation of HFile.main() does a full table scan; we would need 
to resolve the references to mob files in that scan).

bq. But this tool just try create a reader on the HFiles and not really reading 
Cells out of it. So may be we will have the ref cell -> mob cell check as an 
other tool or so? Or am I missing some thing?
You are right, Anoop. This one is just to find all the mob files and check 
whether they are corrupt. We need another JIRA to check for missing mob cells.

bq. But this would need a full table scan also? Are you just plan to check the 
integrity of the MOB files?
This is used to check for corruption of mob files. We need another JIRA to 
check integrity; I am afraid that requires a full table scan, like what has 
been implemented for normal tables (we need to resolve each reference cell to 
its mob cell, if any).

> Check the mob files when there are mob-enabled columns in 
> HFileCorruptionChecker
> 
>
> Key: HBASE-13806
> URL: https://issues.apache.org/jira/browse/HBASE-13806
> Project: HBase
>  Issue Type: Sub-task
>  Components: mob
>Affects Versions: hbase-11339
>Reporter: Jingcheng Du
>Assignee: Jingcheng Du
> Fix For: hbase-11339
>
>
> Currently HFileCorruptionChecker only checks the files in regions. We need 
> to check the mob files too if there are mob-enabled columns in that table.




[jira] [Commented] (HBASE-13356) HBase should provide an InputFormat supporting multiple scans in mapreduce jobs over snapshots

2015-05-31 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-13356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14566880#comment-14566880
 ] 

Ted Yu commented on HBASE-13356:


TestHFileArchiving test failure is not related to the patch.

> HBase should provide an InputFormat supporting multiple scans in mapreduce 
> jobs over snapshots
> --
>
> Key: HBASE-13356
> URL: https://issues.apache.org/jira/browse/HBASE-13356
> Project: HBase
>  Issue Type: New Feature
>  Components: mapreduce
>Reporter: Andrew Mains
>Assignee: Andrew Mains
>Priority: Minor
> Attachments: HBASE-13356-0.98.patch, HBASE-13356.2.patch, 
> HBASE-13356.3.patch, HBASE-13356.4.patch, HBASE-13356.patch
>
>
> Currently, HBase supports the pushing of multiple scans to mapreduce jobs 
> over live tables (via MultiTableInputFormat) but only supports a single scan 
> for mapreduce jobs over table snapshots. It would be handy to support 
> multiple scans over snapshots as well, probably through another input format 
> (MultiTableSnapshotInputFormat?). To mimic the functionality present in 
> MultiTableInputFormat, the new input format would likely have to take in the 
> names of all snapshots used in addition to the scans.




[jira] [Commented] (HBASE-13356) HBase should provide an InputFormat supporting multiple scans in mapreduce jobs over snapshots

2015-05-31 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-13356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14566877#comment-14566877
 ] 

Hadoop QA commented on HBASE-13356:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12736474/HBASE-13356.4.patch
  against master branch at commit 0e6102a68cc95f0240fa72a5f86866c07b8744b7.
  ATTACHMENT ID: 12736474

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 14 new 
or modified tests.

{color:green}+1 hadoop versions{color}. The patch compiles with all 
supported hadoop versions (2.4.1 2.5.2 2.6.0)

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 protoc{color}.  The applied patch does not increase the 
total number of protoc compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 checkstyle{color}.  The applied patch does not increase the 
total number of checkstyle errors

{color:green}+1 findbugs{color}.  The patch does not introduce any  new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.backup.TestHFileArchiving

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/14250//testReport/
Release Findbugs (version 2.0.3)warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/14250//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/14250//artifact/patchprocess/checkstyle-aggregate.html

  Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/14250//console

This message is automatically generated.

> HBase should provide an InputFormat supporting multiple scans in mapreduce 
> jobs over snapshots
> --
>
> Key: HBASE-13356
> URL: https://issues.apache.org/jira/browse/HBASE-13356
> Project: HBase
>  Issue Type: New Feature
>  Components: mapreduce
>Reporter: Andrew Mains
>Assignee: Andrew Mains
>Priority: Minor
> Attachments: HBASE-13356-0.98.patch, HBASE-13356.2.patch, 
> HBASE-13356.3.patch, HBASE-13356.4.patch, HBASE-13356.patch
>
>
> Currently, HBase supports the pushing of multiple scans to mapreduce jobs 
> over live tables (via MultiTableInputFormat) but only supports a single scan 
> for mapreduce jobs over table snapshots. It would be handy to support 
> multiple scans over snapshots as well, probably through another input format 
> (MultiTableSnapshotInputFormat?). To mimic the functionality present in 
> MultiTableInputFormat, the new input format would likely have to take in the 
> names of all snapshots used in addition to the scans.




[jira] [Commented] (HBASE-13356) HBase should provide an InputFormat supporting multiple scans in mapreduce jobs over snapshots

2015-05-31 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-13356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14566856#comment-14566856
 ] 

Ted Yu commented on HBASE-13356:


Do you mind attaching a patch for branch-1?

There are some conflicts in 
hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestMultiTableInputFormat.java

The patch name for branch-1 should contain branch-1, e.g. 13356-branch-1.patch

Thanks

> HBase should provide an InputFormat supporting multiple scans in mapreduce 
> jobs over snapshots
> --
>
> Key: HBASE-13356
> URL: https://issues.apache.org/jira/browse/HBASE-13356
> Project: HBase
>  Issue Type: New Feature
>  Components: mapreduce
>Reporter: Andrew Mains
>Assignee: Andrew Mains
>Priority: Minor
> Attachments: HBASE-13356-0.98.patch, HBASE-13356.2.patch, 
> HBASE-13356.3.patch, HBASE-13356.4.patch, HBASE-13356.patch
>
>
> Currently, HBase supports the pushing of multiple scans to mapreduce jobs 
> over live tables (via MultiTableInputFormat) but only supports a single scan 
> for mapreduce jobs over table snapshots. It would be handy to support 
> multiple scans over snapshots as well, probably through another input format 
> (MultiTableSnapshotInputFormat?). To mimic the functionality present in 
> MultiTableInputFormat, the new input format would likely have to take in the 
> names of all snapshots used in addition to the scans.




[jira] [Updated] (HBASE-13356) HBase should provide an InputFormat supporting multiple scans in mapreduce jobs over snapshots

2015-05-31 Thread Andrew Mains (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-13356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Mains updated HBASE-13356:
-
Attachment: HBASE-13356-0.98.patch

Attached a patch created against 0.98.12, in response to a request by 
[~Shaofengshi] on the mailing list. Hopefully the naming convention is correct; 
I went off of other tickets I could find, but couldn't find definitive 
documentation on how to best indicate that this patch is against an older 
version.

> HBase should provide an InputFormat supporting multiple scans in mapreduce 
> jobs over snapshots
> --
>
> Key: HBASE-13356
> URL: https://issues.apache.org/jira/browse/HBASE-13356
> Project: HBase
>  Issue Type: New Feature
>  Components: mapreduce
>Reporter: Andrew Mains
>Assignee: Andrew Mains
>Priority: Minor
> Attachments: HBASE-13356-0.98.patch, HBASE-13356.2.patch, 
> HBASE-13356.3.patch, HBASE-13356.4.patch, HBASE-13356.patch
>
>
> Currently, HBase supports the pushing of multiple scans to mapreduce jobs 
> over live tables (via MultiTableInputFormat) but only supports a single scan 
> for mapreduce jobs over table snapshots. It would be handy to support 
> multiple scans over snapshots as well, probably through another input format 
> (MultiTableSnapshotInputFormat?). To mimic the functionality present in 
> MultiTableInputFormat, the new input format would likely have to take in the 
> names of all snapshots used in addition to the scans.




[jira] [Commented] (HBASE-13356) HBase should provide an InputFormat supporting multiple scans in mapreduce jobs over snapshots

2015-05-31 Thread Andrew Mains (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-13356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14566827#comment-14566827
 ] 

Andrew Mains commented on HBASE-13356:
--

Just updated, and confirmed that the v4 patch applies using smart-apply-patch. 
Let me know if there are any other issues.

> HBase should provide an InputFormat supporting multiple scans in mapreduce 
> jobs over snapshots
> --
>
> Key: HBASE-13356
> URL: https://issues.apache.org/jira/browse/HBASE-13356
> Project: HBase
>  Issue Type: New Feature
>  Components: mapreduce
>Reporter: Andrew Mains
>Assignee: Andrew Mains
>Priority: Minor
> Attachments: HBASE-13356.2.patch, HBASE-13356.3.patch, 
> HBASE-13356.4.patch, HBASE-13356.patch
>
>
> Currently, HBase supports the pushing of multiple scans to mapreduce jobs 
> over live tables (via MultiTableInputFormat) but only supports a single scan 
> for mapreduce jobs over table snapshots. It would be handy to support 
> multiple scans over snapshots as well, probably through another input format 
> (MultiTableSnapshotInputFormat?). To mimic the functionality present in 
> MultiTableInputFormat, the new input format would likely have to take in the 
> names of all snapshots used in addition to the scans.




[jira] [Updated] (HBASE-13356) HBase should provide an InputFormat supporting multiple scans in mapreduce jobs over snapshots

2015-05-31 Thread Andrew Mains (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-13356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Mains updated HBASE-13356:
-
Attachment: HBASE-13356.4.patch

> HBase should provide an InputFormat supporting multiple scans in mapreduce 
> jobs over snapshots
> --
>
> Key: HBASE-13356
> URL: https://issues.apache.org/jira/browse/HBASE-13356
> Project: HBase
>  Issue Type: New Feature
>  Components: mapreduce
>Reporter: Andrew Mains
>Assignee: Andrew Mains
>Priority: Minor
> Attachments: HBASE-13356.2.patch, HBASE-13356.3.patch, 
> HBASE-13356.4.patch, HBASE-13356.patch
>
>
> Currently, HBase supports the pushing of multiple scans to mapreduce jobs 
> over live tables (via MultiTableInputFormat) but only supports a single scan 
> for mapreduce jobs over table snapshots. It would be handy to support 
> multiple scans over snapshots as well, probably through another input format 
> (MultiTableSnapshotInputFormat?). To mimic the functionality present in 
> MultiTableInputFormat, the new input format would likely have to take in the 
> names of all snapshots used in addition to the scans.




[jira] [Commented] (HBASE-13344) Add enforcer rule that matches our JDK support statement

2015-05-31 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-13344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14566799#comment-14566799
 ] 

Andrew Purtell commented on HBASE-13344:


+1 for moving to the release profile. 
I found myself building a release with JDK 8, bailed out part way and adjusted 
paths for JDK 7. This can happen.

> Add enforcer rule that matches our JDK support statement
> 
>
> Key: HBASE-13344
> URL: https://issues.apache.org/jira/browse/HBASE-13344
> Project: HBase
>  Issue Type: Improvement
>  Components: build
>Affects Versions: 2.0.0
>Reporter: Sean Busbey
>Assignee: Matt Warhaftig
>Priority: Minor
>  Labels: beginner, maven
> Fix For: 2.0.0
>
> Attachments: HBASE-13344-master.patch, 
> HBASE-13344-master_addendum_v1.patch, HBASE-13344-master_v2.patch
>
>
> The [ref guide gives a list of JDKs that we expect our hbase versions to work 
> with at runtime|http://hbase.apache.org/book.html#basic.prerequisites].
> Let's add in the extra-enforcer-rules mojo and start using [the bytecode 
> version  
> rule|http://mojo.codehaus.org/extra-enforcer-rules/enforceBytecodeVersion.html]
>  to make sure that the result of our builds on a given branch won't fail out 
> because of a misconfigured target jdk version (or a dependency that targets a 
> later jdk).




[jira] [Commented] (HBASE-13647) Default value for hbase.client.operation.timeout is too high

2015-05-31 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-13647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14566798#comment-14566798
 ] 

Andrew Purtell commented on HBASE-13647:


bq. I kept the operation timeout in the htable stuff (but it's not me who put 
it there :-) ), but now I wonder if we should not just remove it from this code 
path

Sounds reasonable, but that would be an incompatible change, so it would have 
to be for 1.2 and up.

> Default value for hbase.client.operation.timeout is too high
> 
>
> Key: HBASE-13647
> URL: https://issues.apache.org/jira/browse/HBASE-13647
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0, 1.0.1, 0.98.13, 1.2.0, 1.1.1
>Reporter: Andrey Stepachev
>Assignee: Andrey Stepachev
>Priority: Blocker
> Fix For: 2.0.0, 0.98.13, 1.0.2, 1.2.0, 1.1.1
>
> Attachments: HBASE-13647.patch, HBASE-13647.v2.patch
>
>
> The default value for hbase.client.operation.timeout is too high: it is LONG.Max.
> That value will block any service calls to coprocessor endpoints indefinitely.
> Should we introduce a better default value for that?




[jira] [Updated] (HBASE-12988) [Replication]Parallel apply edits on row-level

2015-05-31 Thread Lars Hofhansl (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-12988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-12988:
--
Attachment: ParallelReplication-v2.txt

Here's a simple patch. It is only meant for perf testing, to see whether we'll 
see any improvement.

* doesn't do any grouping by table or row, so with a badly timed compaction on 
the sink, deleted data can resurface
* fixed max parallelization; it would be better to scale it to the number of 
sink region servers available (i.e. 10% of selected servers, or something)

As I said, this is just for testing, so that we can see what improvement we can 
expect at best.
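
For reference, one way the missing row-level grouping could look is sketched 
below. This is only an illustration under assumptions, not what the attached 
patch does: RowGrouping and groupByRow are hypothetical names, and each edit is 
represented by its row key alone. Hashing the row key picks the batch, so all 
edits for one row stay in a single batch, in their original order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of HBase replication code.
class RowGrouping {
  // Distribute edits (represented here only by their row keys) over a fixed
  // number of batches. Edits for the same row always hash to the same batch
  // and keep their relative order inside it.
  static List<List<byte[]>> groupByRow(List<byte[]> rowKeys, int batches) {
    List<List<byte[]>> out = new ArrayList<>();
    for (int i = 0; i < batches; i++) {
      out.add(new ArrayList<>());
    }
    for (byte[] row : rowKeys) {
      // floorMod keeps the index non-negative for negative hash codes.
      int idx = Math.floorMod(Arrays.hashCode(row), batches);
      out.get(idx).add(row);
    }
    return out;
  }
}
```

Each batch could then be handed to its own sink call, since no two batches 
touch the same row.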

> [Replication]Parallel apply edits on row-level
> --
>
> Key: HBASE-12988
> URL: https://issues.apache.org/jira/browse/HBASE-12988
> Project: HBase
>  Issue Type: Improvement
>  Components: Replication
>Reporter: hongyu bi
>Assignee: hongyu bi
> Attachments: ParallelReplication-v2.txt
>
>
> We can apply edits to the slave cluster in parallel on table-level to speed 
> up replication.
> Update: per the conversation below, it's better to apply edits on row-level 
> in parallel.




[jira] [Commented] (HBASE-13782) RS stuck after FATAL ``FSHLog: Could not append.''

2015-05-31 Thread Lars Hofhansl (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-13782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14566659#comment-14566659
 ] 

Lars Hofhansl commented on HBASE-13782:
---

An RS _should_ shut down when encountering a FATAL problem. This is a bug.
Please list the exact version of HBase used, rather than a vendor specific 
version number.


> RS stuck after FATAL ``FSHLog: Could not append.''
> --
>
> Key: HBASE-13782
> URL: https://issues.apache.org/jira/browse/HBASE-13782
> Project: HBase
>  Issue Type: Bug
>  Components: wal
>Affects Versions: 1.0.1
> Environment: hbase version: 1.0.0-cdh5.4.0
> hadoop version: 2.6.0-cdh5.4.0 
>Reporter: Mingjie Lai
>Priority: Critical
> Attachments: hbase-rs.log, hbase-site.xml
>
>
> hbase version: 1.0.0-cdh5.4.0
> hadoop version: 2.6.0-cdh5.4.0 
> Environment: 40-node hadoop cluster shared with a 10-node hbase cluster and a 
> 30-node yarn.
> We started to see that one RS stopped serving any client requests since 
> 2015-05-26 01:05:33, while all other RSs were okay. I checked the RS log and 
> found some FATAL logs from when 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog tried to append() and sync():
> {code}
> 2015-05-26 01:05:33,700 FATAL 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog: Could not append. Requesting 
> close of wal
> java.io.IOException: Bad connect ack with firstBadLink as 10.28.1.17:50010
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1472)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1373)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:600)
> 2015-05-26 01:05:33,700 FATAL 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog: Could not append. Requesting 
> close of wal
> java.io.IOException: Bad connect ack with firstBadLink as 10.28.1.17:50010
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1472)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1373)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:600)
> 2015-05-26 01:05:33,700 FATAL 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog: Could not append. Requesting 
> close of wal
> java.io.IOException: Bad connect ack with firstBadLink as 10.28.1.17:50010
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1472)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1373)
>   at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:600)
> 2015-05-26 01:05:33,700 INFO org.apache.hadoop.hbase.regionserver.wal.FSHLog: 
> Archiving 
> hdfs://nameservice1/hbase/WALs/hbase08.company.com,60020,1431985722474/hbase08.company.com%2C60020%2C1431985722474.default.1432602140966
>  to 
> hdfs://nameservice1/hbase/oldWALs/hbase08.company.com%2C60020%2C1431985722474.default.1432602140966
> 2015-05-26 01:05:33,701 ERROR 
> org.apache.hadoop.hbase.regionserver.wal.FSHLog: Error syncing, request close 
> of wal 
> {code}
> Since the HDFS cluster is shared with a YARN cluster, some IO-heavy jobs were 
> running at the time and exhausted the xceivers on some of the DNs at the 
> exact same time. I think that is the reason why the RS got 
> ``java.io.IOException: Bad connect ack with firstBadLink''.
> The problem is that the RS has been stuck without any response since then. 
> flushQueueLength grew to the ceiling and stayed there. The only log entries 
> are from periodicFlusher:
> {code}
> 2015-05-26 02:06:26,742 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> regionserver/hbase08.company.com/10.28.1.6:60020.periodicFlusher requesting 
> flush for region 
> myns:mytable,3992+80bb1,1432526964367.c4906e519c1f8206a284c66a8eda2159. after 
> a delay of 11000
> 2015-05-26 02:06:26,742 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> regionserver/hbase08.company.com/10.28.1.6:60020.periodicFlusher requesting 
> flush for region 
> myns:mytable,0814+0416,1432541066864.cf42d5ab47e051d69e516971e82e84be. after 
> a delay of 7874
> 2015-05-26 02:06:26,742 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> regionserver/hbase08.company.com/10.28.1.6:60020.periodicFlusher requesting 
> flush for region 
> myns:mytable,2022+7a571,1432528246524.299c1d4bb28fda2a4d9f248c6c22153c. after 
> a delay of 22740
> 2015-05-26 02:06:26,742 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> regionserver/hbase08.company.com/10.28.1.6:60020.periodicFlusher requesting 
> flush for region 
> my

[jira] [Commented] (HBASE-13448) New Cell implementation with cached component offsets/lengths

2015-05-31 Thread Lars Hofhansl (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14566650#comment-14566650
 ] 

Lars Hofhansl commented on HBASE-13448:
---

Oh, I get this. Hence in my test I only see the disadvantage of the extra heap 
used. That's why I asked how to best test this. :)

So I'll test with multiple CFs (maybe one per column) and also not compact the 
table.

> New Cell implementation with cached component offsets/lengths
> -
>
> Key: HBASE-13448
> URL: https://issues.apache.org/jira/browse/HBASE-13448
> Project: HBase
>  Issue Type: Sub-task
>  Components: Scanners
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0
>
> Attachments: 13448-0.98.txt, HBASE-13448.patch, HBASE-13448_V2.patch, 
> HBASE-13448_V3.patch, gc.png, hits.png
>
>
> This can be an extension to KeyValue and can be instantiated and used in the 
> read path.




[jira] [Commented] (HBASE-13560) Large compaction queue should steal from small compaction queue when idle

2015-05-31 Thread Changgeng Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-13560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14566633#comment-14566633
 ] 

Changgeng Li commented on HBASE-13560:
--

I think the test is still valid. We just need to shut down one of the thread 
pools after the change introduced by this issue.


> Large compaction queue should steal from small compaction queue when idle
> -
>
> Key: HBASE-13560
> URL: https://issues.apache.org/jira/browse/HBASE-13560
> Project: HBase
>  Issue Type: Bug
>  Components: Compaction
>Affects Versions: 2.0.0
>Reporter: Elliott Clark
>Assignee: Changgeng Li
> Attachments: queuestealwork-v1.patch, queuestealwork-v4.patch, 
> queuestealwork-v5.patch, queuestealwork-v6.patch, queuestealwork-v7.patch
>
>
> If you tune compaction threads so that a server is never overcommitted when 
> large and small compaction threads are busy, then it should be possible to 
> have the large compactions steal work.




[jira] [Commented] (HBASE-13560) Large compaction queue should steal from small compaction queue when idle

2015-05-31 Thread Changgeng Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-13560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14566632#comment-14566632
 ] 

Changgeng Li commented on HBASE-13560:
--

I think the test is still valid. We just need to shut down one of the thread 
pools after the change introduced by this issue.


> Large compaction queue should steal from small compaction queue when idle
> -
>
> Key: HBASE-13560
> URL: https://issues.apache.org/jira/browse/HBASE-13560
> Project: HBase
>  Issue Type: Bug
>  Components: Compaction
>Affects Versions: 2.0.0
>Reporter: Elliott Clark
>Assignee: Changgeng Li
> Attachments: queuestealwork-v1.patch, queuestealwork-v4.patch, 
> queuestealwork-v5.patch, queuestealwork-v6.patch, queuestealwork-v7.patch
>
>
> If you tune compaction threads so that a server is never overcommitted when 
> large and small compaction threads are busy, then it should be possible to 
> have the large compactions steal work.




[jira] [Commented] (HBASE-13344) Add enforcer rule that matches our JDK support statement

2015-05-31 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-13344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14566625#comment-14566625
 ] 

Hadoop QA commented on HBASE-13344:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12736435/HBASE-13344-master_addendum_v1.patch
  against master branch at commit 0e6102a68cc95f0240fa72a5f86866c07b8744b7.
  ATTACHMENT ID: 12736435

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+0 tests included{color}.  The patch appears to be a 
documentation, build,
or dev-support patch that doesn't require tests.

{color:green}+1 hadoop versions{color}. The patch compiles with all 
supported hadoop versions (2.4.1 2.5.2 2.6.0)

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 protoc{color}.  The applied patch does not increase the 
total number of protoc compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 checkstyle{color}.  The applied patch does not increase the 
total number of checkstyle errors

{color:green}+1 findbugs{color}.  The patch does not introduce any  new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.backup.TestHFileArchiving

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/14249//testReport/
Release Findbugs (version 2.0.3)warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/14249//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/14249//artifact/patchprocess/checkstyle-aggregate.html

  Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/14249//console

This message is automatically generated.

> Add enforcer rule that matches our JDK support statement
> 
>
> Key: HBASE-13344
> URL: https://issues.apache.org/jira/browse/HBASE-13344
> Project: HBase
>  Issue Type: Improvement
>  Components: build
>Affects Versions: 2.0.0
>Reporter: Sean Busbey
>Assignee: Matt Warhaftig
>Priority: Minor
>  Labels: beginner, maven
> Fix For: 2.0.0
>
> Attachments: HBASE-13344-master.patch, 
> HBASE-13344-master_addendum_v1.patch, HBASE-13344-master_v2.patch
>
>
> The [ref guide gives a list of JDKs that we expect our hbase versions to work 
> with at runtime|http://hbase.apache.org/book.html#basic.prerequisites].
> Let's add in the extra-enforcer-rules mojo and start using [the bytecode 
> version  
> rule|http://mojo.codehaus.org/extra-enforcer-rules/enforceBytecodeVersion.html]
>  to make sure that the result of our builds on a given branch won't fail out 
> because of a misconfigured target jdk version (or a dependency that targets a 
> later jdk).




[jira] [Created] (HBASE-13817) ByteBufferOuputStream - add writeInt support

2015-05-31 Thread Anoop Sam John (JIRA)

Anoop Sam John created HBASE-13817:
--

 Summary: ByteBufferOuputStream - add writeInt support
 Key: HBASE-13817
 URL: https://issues.apache.org/jira/browse/HBASE-13817
 Project: HBase
  Issue Type: Sub-task
Reporter: Anoop Sam John
Assignee: Anoop Sam John
 Fix For: 2.0.0


While writing Cells to this stream to build the CellBlock ByteBuffer, we write 
the length of each cell as an int. We use StreamUtils to do this, which writes 
each byte one after the other, so there are 4 write calls on the stream 
(OutputStream has only this support). With ByteBufferOutputStream we have the 
overhead of checking the size limit, and possibly growing, on every write call. 
Internally this stream writes to a ByteBuffer, and inside the ByteBuffer 
implementations there are position limit checks as well. If we write the length 
as an int in one go, we can reduce this overhead.
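
A minimal sketch of the idea, assuming a simplified ByteBuffer-backed stream: 
GrowableBufferStream, ensureCapacity and toByteArray are hypothetical names 
for illustration and not the actual ByteBufferOutputStream code. write(int) 
pays one capacity check per byte, while writeInt(int) pays one check and one 
putInt for all four bytes.

```java
import java.io.OutputStream;
import java.nio.ByteBuffer;

// Hypothetical stand-in for a ByteBuffer-backed stream; not the HBase class.
class GrowableBufferStream extends OutputStream {
  private ByteBuffer buf = ByteBuffer.allocate(16);

  // Grow the backing buffer if fewer than 'extra' bytes remain.
  private void ensureCapacity(int extra) {
    if (buf.remaining() < extra) {
      int newSize = Math.max(buf.capacity() * 2, buf.position() + extra);
      ByteBuffer bigger = ByteBuffer.allocate(newSize);
      buf.flip();
      bigger.put(buf);
      buf = bigger;
    }
  }

  // Single-byte path: one capacity check per byte written.
  @Override
  public void write(int b) {
    ensureCapacity(1);
    buf.put((byte) b);
  }

  // One capacity check and one bulk put instead of four single-byte writes.
  // ByteBuffer is big-endian by default, so the byte order matches writing
  // the most significant byte first.
  public void writeInt(int v) {
    ensureCapacity(4);
    buf.putInt(v);
  }

  // Copy out the bytes written so far without disturbing the write position.
  public byte[] toByteArray() {
    byte[] out = new byte[buf.position()];
    ByteBuffer dup = buf.duplicate();
    dup.flip();
    dup.get(out);
    return out;
  }
}
```

Both paths produce the same bytes; only the number of bounds checks differs.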




[jira] [Updated] (HBASE-13344) Add enforcer rule that matches our JDK support statement

2015-05-31 Thread Matt Warhaftig (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-13344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Warhaftig updated HBASE-13344:
---
Attachment: HBASE-13344-master_addendum_v1.patch

OK, sounds reasonable, so I chose the addendum approach. The attached 
'HBASE-13344-master_addendum_v1.patch' moves the maxJdkVersion rule to the 
'release' profile.

> Add enforcer rule that matches our JDK support statement
> 
>
> Key: HBASE-13344
> URL: https://issues.apache.org/jira/browse/HBASE-13344
> Project: HBase
>  Issue Type: Improvement
>  Components: build
>Affects Versions: 2.0.0
>Reporter: Sean Busbey
>Assignee: Matt Warhaftig
>Priority: Minor
>  Labels: beginner, maven
> Fix For: 2.0.0
>
> Attachments: HBASE-13344-master.patch, 
> HBASE-13344-master_addendum_v1.patch, HBASE-13344-master_v2.patch
>
>
> The [ref guide gives a list of JDKs that we expect our hbase versions to work 
> with at runtime|http://hbase.apache.org/book.html#basic.prerequisites].
> Let's add in the extra-enforcer-rules mojo and start using [the bytecode 
> version  
> rule|http://mojo.codehaus.org/extra-enforcer-rules/enforceBytecodeVersion.html]
>  to make sure that the result of our builds on a given branch won't fail out 
> because of a misconfigured target jdk version (or a dependency that targets a 
> later jdk).




[jira] [Commented] (HBASE-12295) Prevent block eviction under us if reads are in progress from the BBs

2015-05-31 Thread Anoop Sam John (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-12295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14566406#comment-14566406
 ] 

Anoop Sam John commented on HBASE-12295:


bq.Don't these each have their own ref count?
In our impl, the ref count is incremented per scanner, so it is not like a 
normal Java ref count. When one HFileBlock's ref count is 2, that means 2 
active scanners are still referring to it. Those refs might be held in read 
path places such as comparators and/or by cells created from this block.

If we want something like a cell close(), we would also have to increment the 
count whenever a Cell is created. This creation is at the HFileScanner level. 
Then every cell, when closed, has to decrement the count. The perf disadvantage 
is that we would incr/decr the counter many times; it has to be an atomic long, 
so there will be some cost.

But more than that, the complications will be in the cell filtering area. When 
a Cell is created at the scanner level, we don't know whether it will get 
eliminated. The cell can be ignored at many places (due to 
version/ttl/acl/visibility/filter etc.), and all these impls would have to 
close the cell when it is ignored. There can be custom filters doing this 
elimination (by filterCell(Cell) or by filterRowCells(List)); all of these have 
to handle the close. A filter can also transform an incoming cell into another 
and have that used from then on (transformCell(Cell)). In the same way, CPs can 
do this filtering too. That is the complication I was talking about.

I actually liked the idea, since changes at the HRegion level could be avoided, 
but thinking it through I landed on all these practical issues. Just adding 
here FYI  [~stack]
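
To make the bookkeeping concrete, here is a rough sketch under assumptions: 
RefCountedBlock and its methods are hypothetical names, not the HFileBlock or 
block cache API. With per-scanner counting, retain()/release() run once per 
scanner; with the cell close() idea they would run once per created Cell, and 
every code path that drops a cell would need a matching release().

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch; not the HFileBlock or BlockCache API.
class RefCountedBlock {
  private final AtomicLong refs = new AtomicLong();

  // Called once per scanner (current approach) or once per created Cell
  // (the cell close() idea discussed above).
  void retain() {
    refs.incrementAndGet();
  }

  // Must be called exactly once for every retain(), including for cells
  // dropped by filters or coprocessors.
  void release() {
    if (refs.decrementAndGet() < 0) {
      throw new IllegalStateException("release() without matching retain()");
    }
  }

  // The block may be evicted only when nobody references it.
  boolean isEvictable() {
    return refs.get() == 0;
  }

  long getRefCount() {
    return refs.get();
  }
}
```

A filter that silently drops a cell without calling release() would leave the 
count above zero and the block pinned, which is the risk described above.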

> Prevent block eviction under us if reads are in progress from the BBs
> -
>
> Key: HBASE-12295
> URL: https://issues.apache.org/jira/browse/HBASE-12295
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver, Scanners
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0
>
> Attachments: HBASE-12295.pdf, HBASE-12295_trunk.patch
>
>
> While we try to serve the reads from the BBs directly from the block cache, 
> we need to ensure that the blocks do not get evicted under us while 
> reading.  This JIRA is to discuss and implement a strategy for the same.




[jira] [Comment Edited] (HBASE-13448) New Cell implementation with cached component offsets/lengths

2015-05-31 Thread Anoop Sam John (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14566404#comment-14566404
 ] 

Anoop Sam John edited comment on HBASE-13448 at 5/31/15 7:23 AM:
-

[~larsh] thanks for the comments

I was trying to explain why we won't see any improvement as such in the test, 
especially in 0.98. Sorry if I was not clear.
The test has 1 CF and a single file in it. Under the StoreScanner KVHeap we 
always have only a single file, so there is no comparison happening and no 
calls to getXXXOffset/Length there. There are get calls in StoreScanner (max 2 
times), and then in SQM we also need the component offset/length. But in SQM we 
don't make get calls on KeyValue to get offset/length. Instead we calculate 
them by parsing the KV buffer on our own (see code below). Then SQM skips these 
cells, so there are no further get calls on the cells. So in effect there are 2 
get calls on rowLength and just one on the others. This makes it clear why 
there is no advantage.
In a real case where Cells are not skipped (and in trunk especially) the calls 
happen many times, mainly on rowLength. When ExplicitColTracker is in use, 
there are also many calls to qualifier offset/length. For the other component 
lengths/offsets, the keyLength is parsed frequently. If you look at the table 
in the comments above, you can see how many times each call happens on a 
single Cell. Those numbers are for when cells are written back to the client 
side, so they cover all layers. But in that test I also had only 1 CF and one 
HFile. With more of these, comparison ops will happen in 2 KVHeaps and so the 
calls will be more. (We no longer pass the byte[], offset, length into 
Comparators but instead pass the Cell alone.)

So in the case of trunk we would see an advantage. If you can give us your 
test, I will run it on trunk.

{code}
byte [] bytes = kv.getBuffer();
int offset = kv.getOffset();

int keyLength = Bytes.toInt(bytes, offset, Bytes.SIZEOF_INT);
offset += KeyValue.ROW_OFFSET;

int initialOffset = offset;

short rowLength = Bytes.toShort(bytes, offset, Bytes.SIZEOF_SHORT);
offset += Bytes.SIZEOF_SHORT;

int ret = this.rowComparator.compareRows(row, this.rowOffset, this.rowLength,
    bytes, offset, rowLength);
...
...

//Passing rowLength
offset += rowLength;

//Skipping family
byte familyLength = bytes [offset];
offset += familyLength + 1;

int qualLength = keyLength -
  (offset - initialOffset) - KeyValue.TIMESTAMP_TYPE_SIZE;

long timestamp = Bytes.toLong(bytes, initialOffset + keyLength - KeyValue.TIMESTAMP_TYPE_SIZE);
...
...
byte type = bytes[initialOffset + keyLength - 1];
...
MatchCode colChecker = columns.checkColumn(bytes, offset, qualLength, type);
if (colChecker == MatchCode.INCLUDE) {
  ReturnCode filterResponse = ReturnCode.SKIP;
  // STEP 2: Yes, the column is part of the requested columns. Check if filter is present
  if (filter != null) {
    // STEP 3: Filter the key value and return if it filters out
    filterResponse = filter.filterKeyValue(kv);
{code}
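
For illustration only, here is a minimal, self-contained sketch of the idea in this issue: parse the component offsets/lengths once and keep them, instead of re-deriving them from the buffer on every call as the excerpt above does. It assumes the usual serialized KeyValue layout (4-byte key length, 4-byte value length, 2-byte row length, row, 1-byte family length, family, qualifier, 8-byte timestamp, 1-byte type, value). The names CachedOffsetsCell and encode are hypothetical and are not taken from the HBASE-13448 patch.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical illustration class; not part of HBase or the attached patch. */
final class CachedOffsetsCell {
    // Fixed-width pieces of the assumed serialized layout.
    static final int ROW_OFFSET = 2 * Integer.BYTES;        // key length + value length
    static final int TIMESTAMP_TYPE_SIZE = Long.BYTES + 1;  // timestamp + type byte

    final byte[] bytes;
    final int keyLength;
    final int rowOffset;
    final short rowLength;
    final int familyOffset;
    final byte familyLength;
    final int qualifierOffset;
    final int qualifierLength;
    final long timestamp;
    final byte type;

    /** Parses every component offset/length exactly once, at construction time. */
    CachedOffsetsCell(byte[] bytes, int offset) {
        ByteBuffer buf = ByteBuffer.wrap(bytes);             // big-endian by default
        this.bytes = bytes;
        this.keyLength = buf.getInt(offset);
        int keyStart = offset + ROW_OFFSET;
        this.rowLength = buf.getShort(keyStart);
        this.rowOffset = keyStart + Short.BYTES;
        this.familyLength = bytes[rowOffset + rowLength];
        this.familyOffset = rowOffset + rowLength + 1;
        this.qualifierOffset = familyOffset + familyLength;
        // Whatever remains of the key after row, family and the trailer is the qualifier.
        this.qualifierLength =
            keyLength - (qualifierOffset - keyStart) - TIMESTAMP_TYPE_SIZE;
        this.timestamp = buf.getLong(keyStart + keyLength - TIMESTAMP_TYPE_SIZE);
        this.type = bytes[keyStart + keyLength - 1];
    }

    /** Returns the qualifier using only the cached offset/length. */
    String qualifier() {
        return new String(bytes, qualifierOffset, qualifierLength, StandardCharsets.UTF_8);
    }

    /** Builds a buffer in the assumed layout; a demo helper only. */
    static byte[] encode(String row, String family, String qualifier,
                         long ts, byte type, String value) {
        byte[] r = row.getBytes(StandardCharsets.UTF_8);
        byte[] f = family.getBytes(StandardCharsets.UTF_8);
        byte[] q = qualifier.getBytes(StandardCharsets.UTF_8);
        byte[] v = value.getBytes(StandardCharsets.UTF_8);
        int keyLength = Short.BYTES + r.length + 1 + f.length + q.length + TIMESTAMP_TYPE_SIZE;
        ByteBuffer buf = ByteBuffer.allocate(ROW_OFFSET + keyLength + v.length);
        buf.putInt(keyLength).putInt(v.length);
        buf.putShort((short) r.length).put(r);
        buf.put((byte) f.length).put(f);
        buf.put(q);
        buf.putLong(ts).put(type);
        buf.put(v);
        return buf.array();
    }
}
```

With this shape, repeated reads of rowLength or the qualifier offset/length during comparisons are plain field reads. The trade-off is a few extra fields per cell object, which only pays off when the getters are called many times per cell.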




> New Cell implementation with cached component offsets/lengths
> -
>
> Key: HBASE-13448
> URL: https://issues.apache.org/jira/browse/HBASE-13448
> Project: HBase
>  Issue Type: Sub-task
>  Components: Scanners
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0
>
> Attachments: 13448-0.98.txt, HBASE-13448.patch, HBASE-13448_V2.patch, 
> HBASE-13448_V3.patch, gc.png, hits.png
>
>
> This can be an extension to KeyValue and can be instantiated and used in the 
> read path.



