[jira] [Created] (HBASE-15405) Fix PE logging and wrong defaults in help message

2016-03-05 Thread Appy (JIRA)
Appy created HBASE-15405:


 Summary: Fix PE logging and wrong defaults in help message
 Key: HBASE-15405
 URL: https://issues.apache.org/jira/browse/HBASE-15405
 Project: HBase
  Issue Type: Bug
  Components: Performance
Reporter: Appy
Assignee: Appy
Priority: Minor


Corrects wrong default values for a few options in the help message.

Final stats from multiple clients are intermingled, making them hard to 
understand. Also, the logged stats aren't very machine readable, which matters 
for a daily perf-testing rig that scrapes logs for results (see the sketch 
after the example below).

Example of logs before the change.
{noformat}
16/03/05 22:43:06 INFO hbase.PerformanceEvaluation: IncrementTest latency log 
(microseconds), on 1048576 measures
16/03/05 22:43:06 INFO hbase.PerformanceEvaluation: IncrementTest latency log 
(microseconds), on 1048576 measures
16/03/05 22:43:06 INFO hbase.PerformanceEvaluation: IncrementTest latency log 
(microseconds), on 1048576 measures
16/03/05 22:43:06 INFO hbase.PerformanceEvaluation: IncrementTest latency log 
(microseconds), on 1048576 measures
16/03/05 22:43:06 INFO hbase.PerformanceEvaluation: IncrementTest latency log 
(microseconds), on 1048576 measures
16/03/05 22:43:06 INFO hbase.PerformanceEvaluation: IncrementTest latency log 
(microseconds), on 1048576 measures
16/03/05 22:43:06 INFO hbase.PerformanceEvaluation: IncrementTest latency log 
(microseconds), on 1048576 measures
16/03/05 22:43:06 INFO hbase.PerformanceEvaluation: 0/1048570/1048576, latency 
mean=953.98, min=359.00, max=324050.00, stdDev=851.82, 95th=1368.00, 
99th=1625.00
16/03/05 22:43:06 INFO hbase.PerformanceEvaluation: 0/1048570/1048576, latency 
mean=953.92, min=356.00, max=323394.00, stdDev=817.55, 95th=1370.00, 
99th=1618.00
16/03/05 22:43:06 INFO hbase.PerformanceEvaluation: 0/1048570/1048576, latency 
mean=953.98, min=367.00, max=322745.00, stdDev=840.43, 95th=1369.00, 
99th=1622.00
16/03/05 22:43:06 INFO hbase.PerformanceEvaluation: IncrementTest latency log 
(microseconds), on 1048576 measures
16/03/05 22:43:06 INFO hbase.PerformanceEvaluation: IncrementTest latency log 
(microseconds), on 1048576 measures
16/03/05 22:43:06 INFO hbase.PerformanceEvaluation: IncrementTest latency log 
(microseconds), on 1048576 measures
16/03/05 22:43:06 INFO hbase.PerformanceEvaluation: IncrementTest Min  = 
375.0
16/03/05 22:43:06 INFO hbase.PerformanceEvaluation: IncrementTest Min  = 
363.0
16/03/05 22:43:06 INFO hbase.PerformanceEvaluation: IncrementTest Avg  = 
953.6624126434326
16/03/05 22:43:06 INFO hbase.PerformanceEvaluation: IncrementTest Avg  = 
953.4124526977539
16/03/05 22:43:06 INFO hbase.PerformanceEvaluation: IncrementTest StdDev   = 
781.3929776087633
16/03/05 22:43:06 INFO hbase.PerformanceEvaluation: IncrementTest StdDev   = 
742.8027916717297
16/03/05 22:43:06 INFO hbase.PerformanceEvaluation: IncrementTest 50th = 
894.0
16/03/05 22:43:06 INFO hbase.PerformanceEvaluation: IncrementTest 50th = 
894.0
16/03/05 22:43:06 INFO hbase.PerformanceEvaluation: IncrementTest 75th = 
1070.0
16/03/05 22:43:06 INFO hbase.PerformanceEvaluation: IncrementTest 75th = 
1071.0
16/03/05 22:43:06 INFO hbase.PerformanceEvaluation: IncrementTest 95th = 
1369.0
16/03/05 22:43:06 INFO hbase.PerformanceEvaluation: IncrementTest 95th = 
1369.0
16/03/05 22:43:06 INFO hbase.PerformanceEvaluation: IncrementTest 99th = 
1623.0
16/03/05 22:43:06 INFO hbase.PerformanceEvaluation: IncrementTest 99th = 
1624.0
16/03/05 22:43:06 INFO hbase.PerformanceEvaluation: IncrementTest Min  = 
372.0
16/03/05 22:43:06 INFO hbase.PerformanceEvaluation: IncrementTest 99.9th   = 
3013.998000214
16/03/05 22:43:06 INFO hbase.PerformanceEvaluation: IncrementTest Avg  = 
953.2451229095459
16/03/05 22:43:06 INFO hbase.PerformanceEvaluation: IncrementTest 99.9th   = 
3043.998000214
16/03/05 22:43:06 INFO hbase.PerformanceEvaluation: IncrementTest StdDev   = 
725.4744472152282
16/03/05 22:43:06 INFO hbase.PerformanceEvaluation: IncrementTest 99.99th  = 
25282.38016755
16/03/05 22:43:06 INFO hbase.PerformanceEvaluation: IncrementTest 50th = 
895.0
16/03/05 22:43:06 INFO hbase.PerformanceEvaluation: IncrementTest 99.99th  = 
25812.76334
16/03/05 22:43:06 INFO hbase.PerformanceEvaluation: IncrementTest 75th = 
1071.0
16/03/05 22:43:06 INFO hbase.PerformanceEvaluation: IncrementTest 99.999th = 
89772.78990004538
16/03/05 22:43:06 INFO hbase.PerformanceEvaluation: IncrementTest 95th = 
1369.0
16/03/05 22:43:06 INFO hbase.PerformanceEvaluation: IncrementTest 99.999th = 
122808.39587019826
{noformat}
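
A sketch of a more scrape-friendly format (hypothetical; the names and layout 
here are illustrative only, not necessarily what the patch does): emit each 
client's final stats as one line tagged with the test name and a client id, so 
concurrent clients' results stay separable and greppable.

{code}
// Hypothetical sketch: one parseable line per client instead of one log line
// per statistic, so a log scraper can split results by clientId.
public class StatsLineDemo {
  static String statsLine(String test, int clientId, double min, double mean,
      double max, double stdDev, double p95, double p99) {
    return String.format(
        "%s clientId=%d min=%.2f mean=%.2f max=%.2f stdDev=%.2f "
            + "95th=%.2f 99th=%.2f",
        test, clientId, min, mean, max, stdDev, p95, p99);
  }

  public static void main(String[] args) {
    // Prints e.g.:
    // IncrementTest clientId=3 min=359.00 mean=953.98 max=324050.00 ...
    System.out.println(statsLine("IncrementTest", 3,
        359.00, 953.98, 324050.00, 851.82, 1368.00, 1625.00));
  }
}
{code}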



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-15400) Using Multiple Output for Date Tiered Compaction

2016-03-05 Thread Clara Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Clara Xiong updated HBASE-15400:

Attachment: HBASE-15400.patch

> Using Multiple Output for Date Tiered Compaction
> 
>
> Key: HBASE-15400
> URL: https://issues.apache.org/jira/browse/HBASE-15400
> Project: HBase
>  Issue Type: Sub-task
>  Components: Compaction
>Reporter: Clara Xiong
>Assignee: Clara Xiong
> Fix For: 2.0.0
>
> Attachments: HBASE-15400.patch
>
>
> When we compact, we can output multiple files along the current window 
> boundaries. There are two use cases:
> 1. Major compaction: We want to output date tiered store files.
> 2. Bulk load files and the old file generated by major compaction before 
> upgrading to DTCP.
> Pros: 
> 1. Restore locality, process versioning, updates and deletes while 
> maintaining the tiered layout.
> 2. The best way to fix a skewed layout.
>  
> I am starting from a prototype of date tiered file writer from HBASE-15389 
> and will upload a patch soon. I have to call out a few design decisions:
> 1. We only want to output the files along all windows for major compaction. 
> 2. For minor compaction, we don't want to output too many files, which will 
> remain around because of the current restriction of contiguous compaction by 
> seq id. I will only output two files if all the files in the windows are 
> being combined: one for the data within the window and the other for the 
> out-of-window tail. If any file in the window is excluded from compaction, 
> only one file will be output from the compaction. As the windows are 
> promoted, the situation of out-of-order data will gradually improve.
> 3. We have to pass the boundaries with the list of store files as a complete 
> time snapshot, instead of making two separate calls, because the window 
> layout is determined by the time the computation is called. So we will need 
> a new type of compaction request. 
> 4. Since we will assign the same seq id to all output files, we need to sort 
> by maxTimestamp subsequently. Right now every compaction policy gets the 
> files sorted by StoreFileManager, which sorts by seq id and other criteria. 
> I will use this order for DTCP only, to avoid impacting other compaction 
> policies. 
> 5. We need some cleanup of the current design of StoreEngine and 
> CompactionPolicy.
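
As a rough illustration of the boundary-based output in item 2 above (a 
hypothetical sketch with made-up names, not the attached patch), cells can be 
routed to one writer per window by binary-searching the sorted boundary list 
on the cell timestamp, with anything older than the first boundary going to 
the out-of-window tail writer:

{code}
// Hypothetical sketch of boundary-based writer selection. 'boundaries' holds
// sorted window start times; window i spans [boundaries[i], boundaries[i+1]).
// Index -1 stands for the out-of-window tail (older than every window).
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class WindowRouter {
  private final long[] boundaries;
  final Map<Integer, List<Long>> outputs = new HashMap<>();

  WindowRouter(long[] boundaries) {
    this.boundaries = boundaries;
  }

  void route(long cellTimestamp) {
    int i = Arrays.binarySearch(boundaries, cellTimestamp);
    if (i < 0) {
      i = -i - 2;  // index of the last boundary <= cellTimestamp, or -1
    }
    outputs.computeIfAbsent(i, k -> new ArrayList<>()).add(cellTimestamp);
  }
}
{code}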



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15404) PE: Clients in append and increment are operating serially

2016-03-05 Thread Appy (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15182021#comment-15182021
 ] 

Appy commented on HBASE-15404:
--

I was working with version 1.2.0; however, it may also affect master.
[~stack] I think you'll know where the bug is since you worked on this recently.

> PE: Clients in append and increment are operating serially
> --
>
> Key: HBASE-15404
> URL: https://issues.apache.org/jira/browse/HBASE-15404
> Project: HBase
>  Issue Type: Bug
>  Components: Performance
>Affects Versions: 1.2.0
>Reporter: Appy
>
> On running hbase pe --nomapred increment/append 10, I see the following 
> output where it seems like threads are executing operations serially. In the 
> UI too, only one RS is getting requests at a time.
> {noformat}
> 16/03/05 22:26:15 INFO hbase.PerformanceEvaluation: Timed test starting in 
> thread TestClient-1
> 16/03/05 22:26:15 INFO hbase.PerformanceEvaluation: Timed test starting in 
> thread TestClient-2
> 16/03/05 22:26:15 INFO hbase.PerformanceEvaluation: Timed test starting in 
> thread TestClient-8
> 16/03/05 22:26:15 INFO hbase.PerformanceEvaluation: Timed test starting in 
> thread TestClient-0
> 16/03/05 22:26:15 INFO hbase.PerformanceEvaluation: Timed test starting in 
> thread TestClient-4
> 16/03/05 22:26:15 INFO hbase.PerformanceEvaluation: Timed test starting in 
> thread TestClient-6
> 16/03/05 22:26:15 INFO hbase.PerformanceEvaluation: Timed test starting in 
> thread TestClient-7
> 16/03/05 22:26:15 INFO hbase.PerformanceEvaluation: Timed test starting in 
> thread TestClient-5
> 16/03/05 22:26:15 INFO hbase.PerformanceEvaluation: Timed test starting in 
> thread TestClient-9
> 16/03/05 22:26:15 INFO hbase.PerformanceEvaluation: Timed test starting in 
> thread TestClient-3
> 16/03/05 22:27:54 INFO hbase.PerformanceEvaluation: 0/104857/1048576, latency 
> mean=942.48, min=390.00, max=163444.00, stdDev=892.64, 95th=1361.00, 
> 99th=1605.00
> 16/03/05 22:27:54 INFO hbase.PerformanceEvaluation: 0/104857/1048576, latency 
> mean=942.53, min=366.00, max=163400.00, stdDev=885.49, 95th=1361.00, 
> 99th=1605.00
> 16/03/05 22:27:54 INFO hbase.PerformanceEvaluation: 0/104857/1048576, latency 
> mean=942.41, min=402.00, max=163436.00, stdDev=891.54, 95th=1359.00, 
> 99th=1602.41
> 16/03/05 22:27:54 INFO hbase.PerformanceEvaluation: 0/104857/1048576, latency 
> mean=942.51, min=399.00, max=163610.00, stdDev=892.40, 95th=1360.00, 
> 99th=1600.00
> 16/03/05 22:27:54 INFO hbase.PerformanceEvaluation: 0/104857/1048576, latency 
> mean=942.59, min=393.00, max=162932.00, stdDev=887.65, 95th=1361.00, 
> 99th=1604.00
> 16/03/05 22:27:54 INFO hbase.PerformanceEvaluation: 0/104857/1048576, latency 
> mean=942.26, min=385.00, max=163482.00, stdDev=891.71, 95th=1358.00, 
> 99th=1599.00
> 16/03/05 22:27:54 INFO hbase.PerformanceEvaluation: 0/104857/1048576, latency 
> mean=942.51, min=383.00, max=163246.00, stdDev=888.07, 95th=1360.00, 
> 99th=1605.00
> 16/03/05 22:27:54 INFO hbase.PerformanceEvaluation: 0/104857/1048576, latency 
> mean=942.45, min=385.00, max=163405.00, stdDev=886.65, 95th=1359.00, 
> 99th=1604.00
> 16/03/05 22:27:54 INFO hbase.PerformanceEvaluation: 0/104857/1048576, latency 
> mean=942.38, min=400.00, max=163580.00, stdDev=887.28, 95th=1359.00, 
> 99th=1602.00
> 16/03/05 22:27:54 INFO hbase.PerformanceEvaluation: 0/104857/1048576, latency 
> mean=942.29, min=407.00, max=163403.00, stdDev=889.77, 95th=1357.00, 
> 99th=1597.00
> 16/03/05 22:29:35 INFO hbase.PerformanceEvaluation: 0/209714/1048576, latency 
> mean=950.75, min=366.00, max=163400.00, stdDev=817.84, 95th=1363.00, 
> 99th=1605.00
> 16/03/05 22:29:35 INFO hbase.PerformanceEvaluation: 0/209714/1048576, latency 
> mean=950.75, min=383.00, max=163246.00, stdDev=821.95, 95th=1363.00, 
> 99th=1604.00
> 16/03/05 22:29:35 INFO hbase.PerformanceEvaluation: 0/209714/1048576, latency 
> mean=950.75, min=389.00, max=163444.00, stdDev=824.03, 95th=1364.00, 
> 99th=1603.00
> 16/03/05 22:29:35 INFO hbase.PerformanceEvaluation: 0/209714/1048576, latency 
> mean=950.56, min=382.00, max=163403.00, stdDev=822.44, 95th=1363.00, 
> 99th=1603.00
> 16/03/05 22:29:35 INFO hbase.PerformanceEvaluation: 0/209714/1048576, latency 
> mean=950.79, min=393.00, max=162932.00, stdDev=818.75, 95th=1365.00, 
> 99th=1601.84
> 16/03/05 22:29:35 INFO hbase.PerformanceEvaluation: 0/209714/1048576, latency 
> mean=950.70, min=388.00, max=163436.00, stdDev=823.52, 95th=1364.00, 
> 99th=1606.00
> 16/03/05 22:29:35 INFO hbase.PerformanceEvaluation: 0/209714/1048576, latency 
> mean=950.72, min=376.00, max=163405.00, stdDev=820.65, 95th=1364.00, 
> 99th=1605.00
> 16/03/05 22:29:35 INFO hbase.PerformanceEvaluation: 0/209714/1048576, latency 
> mean=950.56, min=382.00, max=163482.00, stdDev=823.43, 95th=1363.00, 
> 

[jira] [Created] (HBASE-15404) PE: Clients in append and increment are operating serially

2016-03-05 Thread Appy (JIRA)
Appy created HBASE-15404:


 Summary: PE: Clients in append and increment are operating serially
 Key: HBASE-15404
 URL: https://issues.apache.org/jira/browse/HBASE-15404
 Project: HBase
  Issue Type: Bug
  Components: Performance
Affects Versions: 1.2.0
Reporter: Appy


On running hbase pe --nomapred increment/append 10, I see the following output 
where it seems like threads are executing operations serially. In the UI too, 
only one RS is getting requests at a time.
{noformat}
16/03/05 22:26:15 INFO hbase.PerformanceEvaluation: Timed test starting in 
thread TestClient-1
16/03/05 22:26:15 INFO hbase.PerformanceEvaluation: Timed test starting in 
thread TestClient-2
16/03/05 22:26:15 INFO hbase.PerformanceEvaluation: Timed test starting in 
thread TestClient-8
16/03/05 22:26:15 INFO hbase.PerformanceEvaluation: Timed test starting in 
thread TestClient-0
16/03/05 22:26:15 INFO hbase.PerformanceEvaluation: Timed test starting in 
thread TestClient-4
16/03/05 22:26:15 INFO hbase.PerformanceEvaluation: Timed test starting in 
thread TestClient-6
16/03/05 22:26:15 INFO hbase.PerformanceEvaluation: Timed test starting in 
thread TestClient-7
16/03/05 22:26:15 INFO hbase.PerformanceEvaluation: Timed test starting in 
thread TestClient-5
16/03/05 22:26:15 INFO hbase.PerformanceEvaluation: Timed test starting in 
thread TestClient-9
16/03/05 22:26:15 INFO hbase.PerformanceEvaluation: Timed test starting in 
thread TestClient-3
16/03/05 22:27:54 INFO hbase.PerformanceEvaluation: 0/104857/1048576, latency 
mean=942.48, min=390.00, max=163444.00, stdDev=892.64, 95th=1361.00, 
99th=1605.00
16/03/05 22:27:54 INFO hbase.PerformanceEvaluation: 0/104857/1048576, latency 
mean=942.53, min=366.00, max=163400.00, stdDev=885.49, 95th=1361.00, 
99th=1605.00
16/03/05 22:27:54 INFO hbase.PerformanceEvaluation: 0/104857/1048576, latency 
mean=942.41, min=402.00, max=163436.00, stdDev=891.54, 95th=1359.00, 
99th=1602.41
16/03/05 22:27:54 INFO hbase.PerformanceEvaluation: 0/104857/1048576, latency 
mean=942.51, min=399.00, max=163610.00, stdDev=892.40, 95th=1360.00, 
99th=1600.00
16/03/05 22:27:54 INFO hbase.PerformanceEvaluation: 0/104857/1048576, latency 
mean=942.59, min=393.00, max=162932.00, stdDev=887.65, 95th=1361.00, 
99th=1604.00
16/03/05 22:27:54 INFO hbase.PerformanceEvaluation: 0/104857/1048576, latency 
mean=942.26, min=385.00, max=163482.00, stdDev=891.71, 95th=1358.00, 
99th=1599.00
16/03/05 22:27:54 INFO hbase.PerformanceEvaluation: 0/104857/1048576, latency 
mean=942.51, min=383.00, max=163246.00, stdDev=888.07, 95th=1360.00, 
99th=1605.00
16/03/05 22:27:54 INFO hbase.PerformanceEvaluation: 0/104857/1048576, latency 
mean=942.45, min=385.00, max=163405.00, stdDev=886.65, 95th=1359.00, 
99th=1604.00
16/03/05 22:27:54 INFO hbase.PerformanceEvaluation: 0/104857/1048576, latency 
mean=942.38, min=400.00, max=163580.00, stdDev=887.28, 95th=1359.00, 
99th=1602.00
16/03/05 22:27:54 INFO hbase.PerformanceEvaluation: 0/104857/1048576, latency 
mean=942.29, min=407.00, max=163403.00, stdDev=889.77, 95th=1357.00, 
99th=1597.00
16/03/05 22:29:35 INFO hbase.PerformanceEvaluation: 0/209714/1048576, latency 
mean=950.75, min=366.00, max=163400.00, stdDev=817.84, 95th=1363.00, 
99th=1605.00
16/03/05 22:29:35 INFO hbase.PerformanceEvaluation: 0/209714/1048576, latency 
mean=950.75, min=383.00, max=163246.00, stdDev=821.95, 95th=1363.00, 
99th=1604.00
16/03/05 22:29:35 INFO hbase.PerformanceEvaluation: 0/209714/1048576, latency 
mean=950.75, min=389.00, max=163444.00, stdDev=824.03, 95th=1364.00, 
99th=1603.00
16/03/05 22:29:35 INFO hbase.PerformanceEvaluation: 0/209714/1048576, latency 
mean=950.56, min=382.00, max=163403.00, stdDev=822.44, 95th=1363.00, 
99th=1603.00
16/03/05 22:29:35 INFO hbase.PerformanceEvaluation: 0/209714/1048576, latency 
mean=950.79, min=393.00, max=162932.00, stdDev=818.75, 95th=1365.00, 
99th=1601.84
16/03/05 22:29:35 INFO hbase.PerformanceEvaluation: 0/209714/1048576, latency 
mean=950.70, min=388.00, max=163436.00, stdDev=823.52, 95th=1364.00, 
99th=1606.00
16/03/05 22:29:35 INFO hbase.PerformanceEvaluation: 0/209714/1048576, latency 
mean=950.72, min=376.00, max=163405.00, stdDev=820.65, 95th=1364.00, 
99th=1605.00
16/03/05 22:29:35 INFO hbase.PerformanceEvaluation: 0/209714/1048576, latency 
mean=950.56, min=382.00, max=163482.00, stdDev=823.43, 95th=1363.00, 
99th=1599.00
16/03/05 22:29:35 INFO hbase.PerformanceEvaluation: 0/209714/1048576, latency 
mean=950.67, min=376.00, max=163580.00, stdDev=821.59, 95th=1364.00, 
99th=1602.00
16/03/05 22:29:35 INFO hbase.PerformanceEvaluation: 0/209714/1048576, latency 
mean=950.77, min=390.00, max=163610.00, stdDev=823.88, 95th=1363.00, 
99th=1600.00
16/03/05 22:31:16 INFO hbase.PerformanceEvaluation: 0/314571/1048576, latency 
mean=952.21, min=369.00, max=162932.00, stdDev=787.36, 95th=1361.00, 
99th=1595.00
16/03/05 22:31:16 INFO 

[jira] [Commented] (HBASE-15400) Using Multiple Output for Date Tiered Compaction

2016-03-05 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15181990#comment-15181990
 ] 

Duo Zhang commented on HBASE-15400:
---

{quote}
4. Since we will assign the same seq id to all output files, we need to sort 
by maxTimestamp subsequently. Right now every compaction policy gets the 
files sorted by StoreFileManager, which sorts by seq id and other criteria. 
I will use this order for DTCP only, to avoid impacting other compaction 
policies.
{quote}
If we only write to all windows in major compaction and only write out two 
files in minor compaction, I think we could assign different seqIds to the 
output files?
For major compaction, let 'seqId' be the max seqId of all files; then we could 
use seqId, seqId - 1, seqId - 2... for the output files. And for minor 
compaction, we will have at least two input files, so just reusing the seqIds 
of these files would be enough?
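
As a tiny sketch of the major-compaction half of that suggestion (hypothetical 
names only): with maxSeqId the largest sequence id among the inputs, the k-th 
output file gets maxSeqId - k, so the outputs keep a total order without a 
later sort on maxTimestamp.

{code}
// Hypothetical sketch: give the k-th output file of a major compaction
// sequence id (maxSeqId - k), keeping a total order by sequence id without
// sorting by maxTimestamp afterwards.
public class SeqIdAssignment {
  static long[] assignSeqIds(long maxSeqId, int numOutputs) {
    long[] ids = new long[numOutputs];
    for (int k = 0; k < numOutputs; k++) {
      ids[k] = maxSeqId - k;  // seqId, seqId - 1, seqId - 2, ...
    }
    return ids;
  }
}
{code}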

{quote}
3. We have to pass the boundaries with the list of store files as a complete 
time snapshot, instead of making two separate calls, because the window 
layout is determined by the time the computation is called. So we will need 
a new type of compaction request.
{quote}
What about storing the new fields in CompactionContext? In particular, store 
'now' in the compaction context and use it in subsequent calls?

Thanks.

> Using Multiple Output for Date Tiered Compaction
> 
>
> Key: HBASE-15400
> URL: https://issues.apache.org/jira/browse/HBASE-15400
> Project: HBase
>  Issue Type: Sub-task
>  Components: Compaction
>Reporter: Clara Xiong
>Assignee: Clara Xiong
> Fix For: 2.0.0
>
>
> When we compact, we can output multiple files along the current window 
> boundaries. There are two use cases:
> 1. Major compaction: We want to output date tiered store files.
> 2. Bulk load files and the old file generated by major compaction before 
> upgrading to DTCP.
> Pros: 
> 1. Restore locality, process versioning, updates and deletes while 
> maintaining the tiered layout.
> 2. The best way to fix a skewed layout.
>  
> I am starting from a prototype of date tiered file writer from HBASE-15389 
> and will upload a patch soon. I have to call out a few design decisions:
> 1. We only want to output the files along all windows for major compaction. 
> 2. For minor compaction, we don't want to output too many files, which will 
> remain around because of the current restriction of contiguous compaction by 
> seq id. I will only output two files if all the files in the windows are 
> being combined: one for the data within the window and the other for the 
> out-of-window tail. If any file in the window is excluded from compaction, 
> only one file will be output from the compaction. As the windows are 
> promoted, the situation of out-of-order data will gradually improve.
> 3. We have to pass the boundaries with the list of store files as a complete 
> time snapshot, instead of making two separate calls, because the window 
> layout is determined by the time the computation is called. So we will need 
> a new type of compaction request. 
> 4. Since we will assign the same seq id to all output files, we need to sort 
> by maxTimestamp subsequently. Right now every compaction policy gets the 
> files sorted by StoreFileManager, which sorts by seq id and other criteria. 
> I will use this order for DTCP only, to avoid impacting other compaction 
> policies. 
> 5. We need some cleanup of the current design of StoreEngine and 
> CompactionPolicy.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15403) Performance Evaluation tool isn't working as expected

2016-03-05 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15181985#comment-15181985
 ] 

Anoop Sam John commented on HBASE-15403:


--rows doesn't mean the total rows that will be written. I see: this is the 
number of times one client thread's loop runs. Do the write tests allow 
multi? If so, this will be the number of multi requests, so the total comes 
out to #threads * --rows * puts per multi...
--rows means the same for scans. If we test with ScanRange1000, --rows = 1000 
and 10 threads, a total of 1000 * 1000 * 10 rows are read overall. Yes, there 
may be duplicate row keys coming out across different threads.

Need to check the code more.


> Performance Evaluation tool isn't working as expected
> -
>
> Key: HBASE-15403
> URL: https://issues.apache.org/jira/browse/HBASE-15403
> Project: HBase
>  Issue Type: Bug
>  Components: Performance
>Affects Versions: 1.2.0
>Reporter: Appy
>Priority: Critical
>
> hbase pe --nomapred --rows=100 --table='t4' randomWrite 10
> # count on t4 gives 620 rows
> hbase pe --nomapred --rows=200 --table='t5' randomWrite 10
> # count on t5 gives 1257 rows
> hbase pe --nomapred --table='t6' --rows=200 randomWrite 1
> # count on t6 gives 126 rows
> I was working with 1.2.0, but it's likely that it also affects master.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15392) Single Cell Get reads two HFileBlocks

2016-03-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15181977#comment-15181977
 ] 

Hadoop QA commented on HBASE-15392:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 
0s {color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 
4s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 37s 
{color} | {color:green} master passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 31s 
{color} | {color:green} master passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 4m 
24s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
15s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
52s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s 
{color} | {color:green} master passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 13s 
{color} | {color:green} master passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
42s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 44s 
{color} | {color:green} the patch passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 44s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 32s 
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 32s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 4m 6s 
{color} | {color:red} Patch generated 5 new checkstyle issues in hbase-server 
(total was 180, now 184). {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
15s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
23m 59s {color} | {color:green} Patch does not cause any errors with Hadoop 
2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
59s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s 
{color} | {color:green} the patch passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 31s 
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 88m 53s {color} 
| {color:red} hbase-server in the patch failed with JDK v1.8.0. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 87m 45s {color} 
| {color:red} hbase-server in the patch failed with JDK v1.7.0_79. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
15s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 222m 48s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0 Failed junit tests | hadoop.hbase.regionserver.TestBlocksRead |
| JDK v1.7.0_79 Failed junit tests | hadoop.hbase.regionserver.TestBlocksRead |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12791644/15392v3.wip.patch |
| JIRA Issue | HBASE-15392 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  hadoopcheck  
hbaseanti  checkstyle  compile  |
| uname | Linux asf907.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 

[jira] [Commented] (HBASE-15403) Performance Evaluation tool isn't working as expected

2016-03-05 Thread Jean-Marc Spaggiari (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15181976#comment-15181976
 ] 

Jean-Marc Spaggiari commented on HBASE-15403:
-

Is that not "normal"? Random can mean duplicates, no? Is you keep all versions 
and count the versions too, don't you get the expected number of cells?

100 rows 10 clients means 1 000 rows so you have about 380 duplicates. Sound 
realistic to me?
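
As a back-of-the-envelope check of the duplicate theory (assuming keys are 
drawn uniformly from a keyspace of size n = clients * rows, which may not be 
PE's exact key scheme): the expected number of distinct keys after n uniform 
draws is n * (1 - (1 - 1/n)^n), roughly 0.632 * n, which lines up with all 
three observed counts.

{code}
// Expected distinct keys after n uniform draws from a keyspace of size n.
// Assumption (hypothetical): randomWrite keyspace size = clients * rows.
public class ExpectedDistinct {
  static double expectedDistinct(long n) {
    return n * (1 - Math.pow(1 - 1.0 / n, n));
  }

  public static void main(String[] args) {
    System.out.println(expectedDistinct(1000)); // ~632  vs 620 counted on t4
    System.out.println(expectedDistinct(2000)); // ~1264 vs 1257 counted on t5
    System.out.println(expectedDistinct(200));  // ~127  vs 126 counted on t6
  }
}
{code}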

> Performance Evaluation tool isn't working as expected
> -
>
> Key: HBASE-15403
> URL: https://issues.apache.org/jira/browse/HBASE-15403
> Project: HBase
>  Issue Type: Bug
>  Components: Performance
>Affects Versions: 1.2.0
>Reporter: Appy
>Priority: Critical
>
> hbase pe --nomapred --rows=100 --table='t4' randomWrite 10
> # count on t4 gives 620 rows
> hbase pe --nomapred --rows=200 --table='t5' randomWrite 10
> # count on t5 gives 1257 rows
> hbase pe --nomapred --table='t6' --rows=200 randomWrite 1
> # count on t6 gives 126 rows
> I was working with 1.2.0, but it's likely that it also affects master.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-15403) Performance Evaluation tool isn't working as expected

2016-03-05 Thread Appy (JIRA)
Appy created HBASE-15403:


 Summary: Performance Evaluation tool isn't working as expected
 Key: HBASE-15403
 URL: https://issues.apache.org/jira/browse/HBASE-15403
 Project: HBase
  Issue Type: Bug
  Components: Performance
Affects Versions: 1.2.0
Reporter: Appy
Priority: Critical


hbase pe --nomapred --rows=100 --table='t4' randomWrite 10
# count on t4 gives 620 rows

hbase pe --nomapred --rows=200 --table='t5' randomWrite 10
# count on t5 gives 1257 rows

hbase pe --nomapred --table='t6' --rows=200 randomWrite 1
# count on t6 gives 126 rows

I was working with 1.2.0, but it's likely that it also affects master.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-15400) Using Multiple Output for Date Tiered Compaction

2016-03-05 Thread Clara Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Clara Xiong updated HBASE-15400:

Issue Type: Sub-task  (was: Improvement)
Parent: HBASE-15339

> Using Multiple Output for Date Tiered Compaction
> 
>
> Key: HBASE-15400
> URL: https://issues.apache.org/jira/browse/HBASE-15400
> Project: HBase
>  Issue Type: Sub-task
>  Components: Compaction
>Reporter: Clara Xiong
>Assignee: Clara Xiong
> Fix For: 2.0.0
>
>
> When we compact, we can output multiple files along the current window 
> boundaries. There are two use cases:
> 1. Major compaction: We want to output date tiered store files.
> 2. Bulk load files and the old file generated by major compaction before 
> upgrading to DTCP.
> Pros: 
> 1. Restore locality, process versioning, updates and deletes while 
> maintaining the tiered layout.
> 2. The best way to fix a skewed layout.
>  
> I am starting from a prototype of date tiered file writer from HBASE-15389 
> and will upload a patch soon. I have to call out a few design decisions:
> 1. We only want to output the files along all windows for major compaction. 
> 2. For minor compaction, we don't want to output too many files, which will 
> remain around because of the current restriction of contiguous compaction by 
> seq id. I will only output two files if all the files in the windows are 
> being combined: one for the data within the window and the other for the 
> out-of-window tail. If any file in the window is excluded from compaction, 
> only one file will be output from the compaction. As the windows are 
> promoted, the situation of out-of-order data will gradually improve.
> 3. We have to pass the boundaries with the list of store files as a complete 
> time snapshot, instead of making two separate calls, because the window 
> layout is determined by the time the computation is called. So we will need 
> a new type of compaction request. 
> 4. Since we will assign the same seq id to all output files, we need to sort 
> by maxTimestamp subsequently. Right now every compaction policy gets the 
> files sorted by StoreFileManager, which sorts by seq id and other criteria. 
> I will use this order for DTCP only, to avoid impacting other compaction 
> policies. 
> 5. We need some cleanup of the current design of StoreEngine and 
> CompactionPolicy.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-11792) Organize PerformanceEvaluation usage output

2016-03-05 Thread Appy (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-11792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15181966#comment-15181966
 ] 

Appy commented on HBASE-11792:
--

This patch really simplifies the PerformanceEvaluation tool help. I see 
patches for other branches too; should we get them committed as well?

> Organize PerformanceEvaluation usage output
> ---
>
> Key: HBASE-11792
> URL: https://issues.apache.org/jira/browse/HBASE-11792
> Project: HBase
>  Issue Type: Improvement
>  Components: Performance, test
>Reporter: Nick Dimiduk
>Assignee: Misty Stanley-Jones
>Priority: Minor
>  Labels: beginner
> Fix For: 2.0.0
>
> Attachments: HBASE-11792-0.98.patch, HBASE-11792-branch-1.0.patch, 
> HBASE-11792-branch-1.1.patch, HBASE-11792-branch-1.2.patch, 
> HBASE-11792-branch-1.patch, HBASE-11792.patch
>
>
> PerformanceEvaluation has enjoyed a good bit of attention recently. All the 
> new features are muddled together. It would be nice to organize the output of 
> the Options list according to some scheme. I was thinking you'd group 
> entries by when they're used. For example:
> *General options*
> - nomapred
> - rows
> - oneCon
> - ...
> *Table Creation/Write tests*
> - compress
> - flushCommits
> - valueZipf
> - ...
> *Read tests*
> - filterAll
> - multiGet
> - replicas
> - ...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14703) HTable.mutateRow does not collect stats

2016-03-05 Thread Jesse Yates (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15181944#comment-15181944
 ] 

Jesse Yates commented on HBASE-14703:
-

And FWIW, looks like the build failure started in #746 with the same cause, so 
I think we are good on the commit.

> HTable.mutateRow does not collect stats
> ---
>
> Key: HBASE-14703
> URL: https://issues.apache.org/jira/browse/HBASE-14703
> Project: HBase
>  Issue Type: Bug
>Reporter: Heng Chen
>Assignee: Heng Chen
> Fix For: 2.0.0
>
> Attachments: HBASE-14702_v5.2_addendum-addendum.patch, 
> HBASE-14703-5.2-addendum.patch, HBASE-14703-async.patch, 
> HBASE-14703-start.patch, HBASE-14703-v4.1.patch, HBASE-14703-v4.patch, 
> HBASE-14703-v6_with-check-and-mutate.patch, HBASE-14703.patch, 
> HBASE-14703_v1.patch, HBASE-14703_v10.patch, HBASE-14703_v10.patch, 
> HBASE-14703_v11.patch, HBASE-14703_v12.patch, HBASE-14703_v13.patch, 
> HBASE-14703_v2.patch, HBASE-14703_v3.patch, HBASE-14703_v5.1.patch, 
> HBASE-14703_v5.2.patch, HBASE-14703_v5.patch, HBASE-14703_v6-addendum.patch, 
> HBASE-14703_v6.patch, HBASE-14703_v7.patch, HBASE-14703_v8.patch, 
> HBASE-14703_v9.patch
>
>
> We are trying to fix the stats implementation, by moving it out of the Result 
> object and into an Rpc payload (but not the 'cell payload', just as part of 
> the values returned from the request). This change will also let us easily 
> switch to AsyncProcess as the executor and support stats for nearly all the
> rpc calls. However, that means when you upgrade the client or server, you 
> will lose stats visibility until the other side is upgraded. We could keep 
> around the Result based stats storage to accommodate the old api and send 
> both stats back from the server (in each result and in the rpc payload).
> Note that we will still be wire compatible - protobufs mean we can just ride 
> over the lack of information.
> The other tricky part of this is that Result has a 
> non-InterfaceAudience.Private getStatistics() method (along with two 
> InterfaceAudience.Private addResults and setStatistics methods), so we might 
> need a release to deprecate the getStats() method before throwing it out?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14703) HTable.mutateRow does not collect stats

2016-03-05 Thread Jesse Yates (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15181943#comment-15181943
 ] 

Jesse Yates commented on HBASE-14703:
-

[~busbey] yes, that is still the case. However, it is a temporary outage, 
until upgrade, of an AFAIK not widely used feature that only impacts 
performance, not correctness. I'm leaning toward not backporting b/c it is 
technically a feature incompatibility (though still wire compatible - thanks 
protobuf!). Hence, kicking it back to the RMs :)

Thanks for the pointer on branch-2 - I'm a bit behind in the latest dev state.

> HTable.mutateRow does not collect stats
> ---
>
> Key: HBASE-14703
> URL: https://issues.apache.org/jira/browse/HBASE-14703
> Project: HBase
>  Issue Type: Bug
>Reporter: Heng Chen
>Assignee: Heng Chen
> Fix For: 2.0.0
>
> Attachments: HBASE-14702_v5.2_addendum-addendum.patch, 
> HBASE-14703-5.2-addendum.patch, HBASE-14703-async.patch, 
> HBASE-14703-start.patch, HBASE-14703-v4.1.patch, HBASE-14703-v4.patch, 
> HBASE-14703-v6_with-check-and-mutate.patch, HBASE-14703.patch, 
> HBASE-14703_v1.patch, HBASE-14703_v10.patch, HBASE-14703_v10.patch, 
> HBASE-14703_v11.patch, HBASE-14703_v12.patch, HBASE-14703_v13.patch, 
> HBASE-14703_v2.patch, HBASE-14703_v3.patch, HBASE-14703_v5.1.patch, 
> HBASE-14703_v5.2.patch, HBASE-14703_v5.patch, HBASE-14703_v6-addendum.patch, 
> HBASE-14703_v6.patch, HBASE-14703_v7.patch, HBASE-14703_v8.patch, 
> HBASE-14703_v9.patch
>
>
> We are trying to fix the stats implementation, by moving it out of the Result 
> object and into an Rpc payload (but not the 'cell payload', just as part of 
> the values returned from the request). This change will also let us easily 
> switch to AsyncProcess as the executor and support stats for nearly all the
> rpc calls. However, that means when you upgrade the client or server, you 
> will lose stats visibility until the other side is upgraded. We could keep 
> around the Result based stats storage to accommodate the old api and send 
> both stats back from the server (in each result and in the rpc payload).
> Note that we will still be wire compatible - protobufs mean we can just ride 
> over the lack of information.
> The other tricky part of this is that Result has a 
> non-InterfaceAudience.Private getStatistics() method (along with two 
> InterfaceAudience.Private addResults and setStatistics methods), so we might 
> need a release to deprecate the getStats() method before throwing it out?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-7743) Replace *SortReducers with Hadoop Secondary Sort

2016-03-05 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15181935#comment-15181935
 ] 

Jerry He commented on HBASE-7743:
-

Linked to HBASE-13897 as well.

> Replace *SortReducers with Hadoop Secondary Sort
> 
>
> Key: HBASE-7743
> URL: https://issues.apache.org/jira/browse/HBASE-7743
> Project: HBase
>  Issue Type: Sub-task
>  Components: mapreduce, Performance
>Reporter: Nick Dimiduk
> Fix For: 2.0.0
>
>
> The mapreduce package provides two Reducer implementations, 
> KeyValueSortReducer and PutSortReducer, which are used by Import, ImportTsv, 
> and WALPlayer in conjunction with the HFileOutputFormat. Both of these 
> implementations make use of a TreeSet to sort values matching a key. This 
> reducer will OOM when rows are large.
> A better solution would be to implement secondary sort of the values. That 
> way hadoop sorts the records, spilling to disk when necessary.
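
For reference, the standard Hadoop secondary-sort recipe the description 
points at, as a generic sketch (simplified types and key layout assumed here, 
not HBase code): move family and qualifier into the composite map output key, 
partition and group on the row alone, and the shuffle hands each row's cells 
to reduce() already sorted, with no in-memory TreeSet needed.

{code}
// Generic secondary-sort sketch (assumed composite-key layout, not HBase
// code): partition on the row only while the shuffle sorts on the full
// composite key "row\tfamily:qualifier".
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;

public class RowPartitioner extends Partitioner<Text, Text> {
  @Override
  public int getPartition(Text compositeKey, Text value, int numPartitions) {
    String row = compositeKey.toString().split("\t", 2)[0];
    return (row.hashCode() & Integer.MAX_VALUE) % numPartitions;
  }
}
// A matching grouping comparator that compares only the row part sends all of
// a row's (already sorted) entries into a single reduce() call.
{code}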



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14339) HBase Bulk Load and super wide rows

2016-03-05 Thread Jerry He (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15181934#comment-15181934
 ] 

Jerry He commented on HBASE-14339:
--

Also see HBASE-13897. Something has been done there, where the Import tool 
writes to HFileOutputFormat.

> HBase Bulk Load and super wide rows
> ---
>
> Key: HBASE-14339
> URL: https://issues.apache.org/jira/browse/HBASE-14339
> Project: HBase
>  Issue Type: Bug
>Reporter: Ted Malaska
>Priority: Minor
>
> This may not be a huge issue but it does come up. If the number of columns 
> in a row is too large, then KeyValueSortReducer will blow up with an out of 
> memory exception, because it uses a TreeMap to sort the columns within the 
> memory of the reducer.
> A solution would be to add the column family and qualifier to the key so the 
> shuffle would handle the sort.
> The partitioner would only partition on the rowKey but ordering would apply 
> to the RowKey, Column Family, and Column Qualifier.
> Look at the Spark bulk load as an example: HBASE-14150.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-15392) Single Cell Get reads two HFileBlocks

2016-03-05 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-15392:
--
Attachment: 15392v3.wip.patch

So, here is v3. Still does not have tests (though it sounds like we need lots 
of tests in this area if we are to prevent regression going forward; I'll add 
them).

v3 enjoys the benefit of Ram/Anoop insight.

In ScanQueryMatcher, we make use of isGetScan to save on unnecessary row 
compares.

In StoreScanner#optimize, if it is a Get Scan, we skip out early -- optimize is 
for long-scans.

TODO: If the scan has a stop row, we will overread. Let me fix in v4.
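
The shape of the change, as a stand-in sketch with made-up types (the real 
diff is the attached wip patch): optimize() bails out immediately for Gets, so 
a single-Cell Get takes the SEEK and stops instead of reading on into the next 
HFileBlock.

{code}
// Stand-in sketch, not the real HBase signatures: bail out of the
// SEEK-vs-SKIP optimization when the scan is a Get.
public class OptimizeSketch {
  enum MatchCode { INCLUDE, SEEK_NEXT_COL, SEEK_NEXT_ROW }

  static MatchCode optimize(MatchCode qcode, boolean isGetScan) {
    if (isGetScan) {
      return qcode;  // Gets are short: keep the SEEK and stop early
    }
    // ... long-scan logic would go here: downgrade a SEEK_* to INCLUDE when
    // the seek target still falls inside the current block ...
    return qcode;
  }
}
{code}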

> Single Cell Get reads two HFileBlocks
> -
>
> Key: HBASE-15392
> URL: https://issues.apache.org/jira/browse/HBASE-15392
> Project: HBase
>  Issue Type: Sub-task
>  Components: BucketCache
>Reporter: stack
>Assignee: stack
> Attachments: 15392.wip.patch, 15392v2.wip.patch, 15392v3.wip.patch, 
> no_optimize.patch, no_optimize.patch, two_seeks.txt
>
>
> As found by Daniel "SystemTap" Pol, a simple Get results in our reading two 
> HFileBlocks, the one that contains the wanted Cell, and the block that 
> follows.
> Here is a bit of custom logging that logs a stack trace on each HFileBlock 
> read so you can see the call stack responsible:
> {code}
> 2016-03-03 22:20:30,191 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: 
> START LOOP
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: 
> QCODE SEEK_NEXT_COL
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileBlockIndex: 
> STARTED WHILE
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.CombinedBlockCache: 
> OUT OF L2
> 2016-03-03 22:20:30,192 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.BucketCache: Read 
> offset=31409152, len=2103
> 2016-03-03 22:20:30,192 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.FileIOEngine: 
> offset=31409152, length=2103
> 2016-03-03 22:20:30,193 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: 
> From Cache [blockType=DATA, fileOffset=2055421, headerSize=33, 
> onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, 
> prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, 
> getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, 
> buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], 
> dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
> includesTags=false, compressAlgo=NONE, compressTags=false, 
> cryptoContext=[cipher=NONE keyHash=NONE]]]
> 2016-03-03 22:20:30,193 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: 
> Cache hit return [blockType=DATA, fileOffset=2055421, headerSize=33, 
> onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, 
> prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, 
> getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, 
> buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], 
> dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
> includesTags=false, compressAlgo=NONE, compressTags=false, 
> cryptoContext=[cipher=NONE keyHash=NONE]]]
> java.lang.Throwable
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.readBlock(HFileReaderImpl.java:1515)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$CellBasedKeyBlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:324)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.seekTo(HFileReaderImpl.java:831)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.reseekTo(HFileReaderImpl.java:812)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:288)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:198)
> at 
> org.apache.hadoop.hbase.regionserver.NonLazyKeyValueScanner.doRealSeek(NonLazyKeyValueScanner.java:54)
> at 
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:321)
> at 
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.requestSeek(KeyValueHeap.java:279)
> at 
> 

[jira] [Updated] (HBASE-15392) Single Cell Get reads two HFileBlocks

2016-03-05 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-15392:
--
Attachment: no_optimize.patch
no_optimize.patch

Ok. If I comment out the optimize method, I get the proper behavior (for gets). 
 Attached is my hacked up logging. We only fetch one datablock.

The optimize came in here:

commit 464e7ce685486e3ede13ec2351b45b0a0b65696c
Author: Lars Hofhansl 
Date:   Wed Mar 4 14:03:47 2015 -0800

HBASE-13109 Make better SEEK vs SKIP decisions during scanning.

It optimizes for scans but F's random reads (smile). Discrimination!

So, I think I have a bit of a clue now and understand what you fellows were on 
about. I was just going to add "if isGetScan, don't optimize", but looking at 
the optimize, I think it broke the case where there is a stoprow... i.e. 
we'll keep reading till we hit the next row, though we might be able to 
return early out of the last row in a long scan with a stoprow.
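
For readers following along, a simplified stand-in for the HBASE-13109 
decision under discussion (not the real code): a SEEK only pays off when the 
seek target lies at or beyond the first key of the next block; otherwise 
plain SKIPs through the current block are cheaper. The trouble described 
above is that when the target is past the data actually wanted (a Get, or a 
stop row), 'optimizing' the SEEK away keeps the scanner reading into the 
next block.

{code}
// Simplified stand-in for the SEEK-vs-SKIP decision (not the real code).
// nextIndexedKey is the first key of the following block; if the seek target
// sorts before it, the target is still in the current block and SKIP wins.
import java.util.Comparator;

public class SeekVsSkip {
  static boolean worthSeeking(byte[] seekTarget, byte[] nextIndexedKey,
      Comparator<byte[]> keyComparator) {
    return keyComparator.compare(seekTarget, nextIndexedKey) >= 0;
  }
}
{code}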

> Single Cell Get reads two HFileBlocks
> -
>
> Key: HBASE-15392
> URL: https://issues.apache.org/jira/browse/HBASE-15392
> Project: HBase
>  Issue Type: Sub-task
>  Components: BucketCache
>Reporter: stack
>Assignee: stack
> Attachments: 15392.wip.patch, 15392v2.wip.patch, no_optimize.patch, 
> no_optimize.patch, two_seeks.txt
>
>
> As found by Daniel "SystemTap" Pol, a simple Get results in our reading two 
> HFileBlocks, the one that contains the wanted Cell, and the block that 
> follows.
> Here is a bit of custom logging that logs a stack trace on each HFileBlock 
> read so you can see the call stack responsible:
> {code}
> 2016-03-03 22:20:30,191 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: 
> START LOOP
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: 
> QCODE SEEK_NEXT_COL
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileBlockIndex: 
> STARTED WHILE
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.CombinedBlockCache: 
> OUT OF L2
> 2016-03-03 22:20:30,192 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.BucketCache: Read 
> offset=31409152, len=2103
> 2016-03-03 22:20:30,192 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.FileIOEngine: 
> offset=31409152, length=2103
> 2016-03-03 22:20:30,193 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: 
> From Cache [blockType=DATA, fileOffset=2055421, headerSize=33, 
> onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, 
> prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, 
> getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, 
> buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], 
> dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
> includesTags=false, compressAlgo=NONE, compressTags=false, 
> cryptoContext=[cipher=NONE keyHash=NONE]]]
> 2016-03-03 22:20:30,193 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: 
> Cache hit return [blockType=DATA, fileOffset=2055421, headerSize=33, 
> onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, 
> prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, 
> getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, 
> buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], 
> dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
> includesTags=false, compressAlgo=NONE, compressTags=false, 
> cryptoContext=[cipher=NONE keyHash=NONE]]]
> java.lang.Throwable
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.readBlock(HFileReaderImpl.java:1515)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$CellBasedKeyBlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:324)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.seekTo(HFileReaderImpl.java:831)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.reseekTo(HFileReaderImpl.java:812)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:288)
> at 
> 

[jira] [Commented] (HBASE-15392) Single Cell Get reads two HFileBlocks

2016-03-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15181896#comment-15181896
 ] 

Hadoop QA commented on HBASE-15392:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 3s {color} 
| {color:red} HBASE-15392 does not apply to master. Rebase required? Wrong 
Branch? See https://yetus.apache.org/documentation/latest/precommit-patchnames 
for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12791642/no_optimize.patch |
| JIRA Issue | HBASE-15392 |
| Powered by | Apache Yetus 0.1.0   http://yetus.apache.org |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/856/console |


This message was automatically generated.



> Single Cell Get reads two HFileBlocks
> -
>
> Key: HBASE-15392
> URL: https://issues.apache.org/jira/browse/HBASE-15392
> Project: HBase
>  Issue Type: Sub-task
>  Components: BucketCache
>Reporter: stack
>Assignee: stack
> Attachments: 15392.wip.patch, 15392v2.wip.patch, no_optimize.patch, 
> no_optimize.patch, two_seeks.txt
>
>
> As found by Daniel "SystemTap" Pol, a simple Get results in our reading two 
> HFileBlocks, the one that contains the wanted Cell, and the block that 
> follows.
> Here is a bit of custom logging that logs a stack trace on each HFileBlock 
> read so you can see the call stack responsible:
> {code}
> 2016-03-03 22:20:30,191 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: 
> START LOOP
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: 
> QCODE SEEK_NEXT_COL
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileBlockIndex: 
> STARTED WHILE
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.CombinedBlockCache: 
> OUT OF L2
> 2016-03-03 22:20:30,192 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.BucketCache: Read 
> offset=31409152, len=2103
> 2016-03-03 22:20:30,192 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.FileIOEngine: 
> offset=31409152, length=2103
> 2016-03-03 22:20:30,193 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: 
> From Cache [blockType=DATA, fileOffset=2055421, headerSize=33, 
> onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, 
> prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, 
> getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, 
> buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], 
> dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
> includesTags=false, compressAlgo=NONE, compressTags=false, 
> cryptoContext=[cipher=NONE keyHash=NONE]]]
> 2016-03-03 22:20:30,193 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: 
> Cache hit return [blockType=DATA, fileOffset=2055421, headerSize=33, 
> onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, 
> prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, 
> getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, 
> buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], 
> dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
> includesTags=false, compressAlgo=NONE, compressTags=false, 
> cryptoContext=[cipher=NONE keyHash=NONE]]]
> java.lang.Throwable
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.readBlock(HFileReaderImpl.java:1515)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$CellBasedKeyBlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:324)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.seekTo(HFileReaderImpl.java:831)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.reseekTo(HFileReaderImpl.java:812)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:288)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:198)
> at 
> 

[jira] [Commented] (HBASE-14703) HTable.mutateRow does not collect stats

2016-03-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15181880#comment-15181880
 ] 

Hudson commented on HBASE-14703:


FAILURE: Integrated in HBase-Trunk_matrix #759 (See 
[https://builds.apache.org/job/HBase-Trunk_matrix/759/])
HBASE-14703 HTable.mutateRow does not collect stats (Heng Chen) (jyates: rev 
ef712df944b0745892bc13bcecdfd6e358a71b66)
* hbase-client/src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java
* 
hbase-client/src/main/java/org/apache/hadoop/hbase/client/StatsTrackingRpcRetryingCaller.java
* hbase-client/src/main/java/org/apache/hadoop/hbase/client/AsyncProcess.java
* 
hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerFactory.java
* hbase-client/src/main/java/org/apache/hadoop/hbase/client/HTable.java
* hbase-client/src/main/java/org/apache/hadoop/hbase/client/MultiResponse.java
* hbase-protocol/src/main/protobuf/Client.proto
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestCheckAndMutate.java
* 
hbase-client/src/main/java/org/apache/hadoop/hbase/client/RetryingTimeTracker.java
* 
hbase-client/src/main/java/org/apache/hadoop/hbase/client/StatisticTrackable.java
* 
hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerImpl.java
* 
hbase-protocol/src/main/java/org/apache/hadoop/hbase/protobuf/generated/ClientProtos.java
* 
hbase-client/src/test/java/org/apache/hadoop/hbase/client/TestAsyncProcess.java
* 
hbase-client/src/main/java/org/apache/hadoop/hbase/client/ServerStatisticTracker.java
* 
hbase-client/src/main/java/org/apache/hadoop/hbase/client/MultiServerCallable.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestReplicasClient.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java
* hbase-client/src/main/java/org/apache/hadoop/hbase/client/ResultStatsUtil.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestClientPushback.java
* 
hbase-client/src/main/java/org/apache/hadoop/hbase/client/PayloadCarryingServerCallable.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* 
hbase-client/src/main/java/org/apache/hadoop/hbase/protobuf/ResponseConverter.java
* 
hbase-client/src/main/java/org/apache/hadoop/hbase/client/MetricsConnection.java


> HTable.mutateRow does not collect stats
> ---
>
> Key: HBASE-14703
> URL: https://issues.apache.org/jira/browse/HBASE-14703
> Project: HBase
>  Issue Type: Bug
>Reporter: Heng Chen
>Assignee: Heng Chen
> Fix For: 2.0.0
>
> Attachments: HBASE-14702_v5.2_addendum-addendum.patch, 
> HBASE-14703-5.2-addendum.patch, HBASE-14703-async.patch, 
> HBASE-14703-start.patch, HBASE-14703-v4.1.patch, HBASE-14703-v4.patch, 
> HBASE-14703-v6_with-check-and-mutate.patch, HBASE-14703.patch, 
> HBASE-14703_v1.patch, HBASE-14703_v10.patch, HBASE-14703_v10.patch, 
> HBASE-14703_v11.patch, HBASE-14703_v12.patch, HBASE-14703_v13.patch, 
> HBASE-14703_v2.patch, HBASE-14703_v3.patch, HBASE-14703_v5.1.patch, 
> HBASE-14703_v5.2.patch, HBASE-14703_v5.patch, HBASE-14703_v6-addendum.patch, 
> HBASE-14703_v6.patch, HBASE-14703_v7.patch, HBASE-14703_v8.patch, 
> HBASE-14703_v9.patch
>
>
> We are trying to fix the stats implementation, by moving it out of the Result 
> object and into an Rpc payload (but not the 'cell payload', just as part of 
> the values returned from the request). This change will also let us easily 
> switch to AsyncProcess as the executor, and support stats for nearly all the 
> rpc calls. However, that means when you upgrade the client or server, you 
> will lose stats visibility until the other side is upgraded. We could keep 
> around the Result based stats storage to accommodate the old api and send 
> both stats back from the server (in each result and in the rpc payload).
> Note that we will still be wire compatible - protobufs mean we can just ride 
> over the lack of information.
> The other tricky part of this is that Result has a 
> non-InterfaceAudience.Private getStatistics() method (along with two 
> InterfaceAudience.Private addResults and setStatistics methods), so we might 
> need a release to deprecate the getStats() method before throwing it out?
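A hedged sketch of the dual-reporting compatibility idea above; the accessor 
and tracker method names are illustrative assumptions, not the committed 
client API:

{code}
import org.apache.hadoop.hbase.ServerName;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ServerStatisticTracker;
import org.apache.hadoop.hbase.protobuf.generated.ClientProtos.RegionLoadStats;

// Hedged sketch, not the committed API: stats may arrive at the RPC level
// (new) or inside the Result (legacy); feed the tracker from whichever side
// is present. An old peer simply leaves the new protobuf field absent.
void updateStats(ServerStatisticTracker tracker, ServerName server,
    byte[] regionName, RegionLoadStats rpcLevelStats, Result result) {
  RegionLoadStats stats =
      rpcLevelStats != null ? rpcLevelStats : result.getStats(); // assumed accessor
  if (stats != null) {
    tracker.updateRegionStats(server, regionName, stats); // assumed signature
  }
}
{code}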



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-15392) Single Cell Get reads two HFileBlocks

2016-03-05 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-15392:
--
Attachment: two_seeks.txt

Here is the log when stuff is broke -- we are getting two blocks. The query is:

{code}
hbase(main):009:0> get 'ycsb', 'user991012814165691998', {COLUMN => 
['family:field0', 'family:field1']}
{code}

i.e. get two adjacent fields.

I don't expect anyone to make sense of my hacked log output (though Ram, you 
seem to be doing a good job), but yeah, as you speculate, the 
INCLUDE_AND_SEEK_NEXT_ROW gets 'optimized' to be INCLUDE only.

Now I'm thinking my 'fix' is not as good as I thought. I think the isGetScan is 
a good addition to moreRowsMayExistAfter -- we save on compares. Let me see 
about this optimize method.
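For readers following along, a hedged sketch of the short circuit being 
discussed (the idea, not the committed patch; 'scan' and 'stopRow' stand for 
the matcher's existing fields):

{code}
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.util.Bytes;

// Hedged sketch of the isGetScan short circuit, not the committed patch.
// A Get has startRow == stopRow, so when asked whether any row after the
// current one may still match, we can answer "no" outright, sparing the row
// compare and the seek that would drag in the next HFileBlock.
public boolean moreRowsMayExistAfter(Cell kv) {
  if (scan.isGetScan()) {
    return false; // single-row scan: nothing after this row can match
  }
  // General forward-scan path: more rows may exist while row(kv) < stopRow
  // (an empty stopRow means "scan to the end of the table").
  byte[] row = CellUtil.cloneRow(kv);
  return stopRow.length == 0 || Bytes.compareTo(row, stopRow) < 0;
}
{code}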

> Single Cell Get reads two HFileBlocks
> -
>
> Key: HBASE-15392
> URL: https://issues.apache.org/jira/browse/HBASE-15392
> Project: HBase
>  Issue Type: Sub-task
>  Components: BucketCache
>Reporter: stack
>Assignee: stack
> Attachments: 15392.wip.patch, 15392v2.wip.patch, two_seeks.txt
>
>
> As found by Daniel "SystemTap" Pol, a simple Get results in our reading two 
> HFileBlocks, the one that contains the wanted Cell, and the block that 
> follows.
> Here is a bit of custom logging that logs a stack trace on each HFileBlock 
> read so you can see the call stack responsible:
> {code}
> 2016-03-03 22:20:30,191 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: 
> START LOOP
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: 
> QCODE SEEK_NEXT_COL
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileBlockIndex: 
> STARTED WHILE
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.CombinedBlockCache: 
> OUT OF L2
> 2016-03-03 22:20:30,192 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.BucketCache: Read 
> offset=31409152, len=2103
> 2016-03-03 22:20:30,192 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.FileIOEngine: 
> offset=31409152, length=2103
> 2016-03-03 22:20:30,193 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: 
> From Cache [blockType=DATA, fileOffset=2055421, headerSize=33, 
> onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, 
> prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, 
> getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, 
> buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], 
> dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
> includesTags=false, compressAlgo=NONE, compressTags=false, 
> cryptoContext=[cipher=NONE keyHash=NONE]]]
> 2016-03-03 22:20:30,193 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: 
> Cache hit return [blockType=DATA, fileOffset=2055421, headerSize=33, 
> onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, 
> prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, 
> getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, 
> buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], 
> dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
> includesTags=false, compressAlgo=NONE, compressTags=false, 
> cryptoContext=[cipher=NONE keyHash=NONE]]]
> java.lang.Throwable
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.readBlock(HFileReaderImpl.java:1515)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$CellBasedKeyBlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:324)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.seekTo(HFileReaderImpl.java:831)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.reseekTo(HFileReaderImpl.java:812)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:288)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:198)
> at 
> org.apache.hadoop.hbase.regionserver.NonLazyKeyValueScanner.doRealSeek(NonLazyKeyValueScanner.java:54)
> at 
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:321)
> at 
> 

[jira] [Commented] (HBASE-14703) HTable.mutateRow does not collect stats

2016-03-05 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15181837#comment-15181837
 ] 

Sean Busbey commented on HBASE-14703:
-

{quote}
However, that means when you upgrade the client or server, you will lose stats 
visibility until the other side is upgraded. 
{quote}

Is this true in the final implementation that went in?

> HTable.mutateRow does not collect stats
> ---
>
> Key: HBASE-14703
> URL: https://issues.apache.org/jira/browse/HBASE-14703
> Project: HBase
>  Issue Type: Bug
>Reporter: Heng Chen
>Assignee: Heng Chen
> Fix For: 2.0.0
>
> Attachments: HBASE-14702_v5.2_addendum-addendum.patch, 
> HBASE-14703-5.2-addendum.patch, HBASE-14703-async.patch, 
> HBASE-14703-start.patch, HBASE-14703-v4.1.patch, HBASE-14703-v4.patch, 
> HBASE-14703-v6_with-check-and-mutate.patch, HBASE-14703.patch, 
> HBASE-14703_v1.patch, HBASE-14703_v10.patch, HBASE-14703_v10.patch, 
> HBASE-14703_v11.patch, HBASE-14703_v12.patch, HBASE-14703_v13.patch, 
> HBASE-14703_v2.patch, HBASE-14703_v3.patch, HBASE-14703_v5.1.patch, 
> HBASE-14703_v5.2.patch, HBASE-14703_v5.patch, HBASE-14703_v6-addendum.patch, 
> HBASE-14703_v6.patch, HBASE-14703_v7.patch, HBASE-14703_v8.patch, 
> HBASE-14703_v9.patch
>
>
> We are trying to fix the stats implementation, by moving it out of the Result 
> object and into an Rpc payload (but not the 'cell payload', just as part of 
> the values returned from the request). This change will also let us easily 
> switch to AsyncProcess as the executor, and support stats for nearly all the 
> rpc calls. However, that means when you upgrade the client or server, you 
> will lose stats visibility until the other side is upgraded. We could keep 
> around the Result based stats storage to accommodate the old api and send 
> both stats back from the server (in each result and in the rpc payload).
> Note that we will still be wire compatible - protobufs mean we can just ride 
> over the lack of information.
> The other tricky part of this is that Result has a 
> non-InterfaceAudience.Private getStatistics() method (along with two 
> InterfaceAudience.Private addResults and setStatistics methods), so we might 
> need a release to deprecate the getStats() method before throwing it out?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-15222) Use less contended classes for metrics

2016-03-05 Thread Sean Busbey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated HBASE-15222:

Issue Type: Improvement  (was: Bug)

> Use less contended classes for metrics
> --
>
> Key: HBASE-15222
> URL: https://issues.apache.org/jira/browse/HBASE-15222
> Project: HBase
>  Issue Type: Improvement
>  Components: metrics
>Affects Versions: 2.0.0, 1.2.0, 1.3.0
>Reporter: Elliott Clark
>Assignee: Elliott Clark
>Priority: Critical
> Fix For: 2.0.0, 1.3.0
>
> Attachments: HBASE-15222-ADD-0.patch, HBASE-15222-v1.patch, 
> HBASE-15222-v10.patch, HBASE-15222-v11.patch, HBASE-15222-v12.patch, 
> HBASE-15222-v13.patch, HBASE-15222-v2.patch, HBASE-15222-v3.patch, 
> HBASE-15222-v5.patch, HBASE-15222-v6.patch, HBASE-15222-v8.patch, 
> HBASE-15222-v9.patch, HBASE-15222.patch
>
>
> Running the benchmarks now, but it looks like the results are pretty extreme. 
> The locking in our histograms is pretty heavy.
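As a hedged illustration of the direction the summary names (LongAdder here 
stands in for whatever low-contention primitive the patch settles on):

{code}
import java.util.concurrent.atomic.LongAdder;

// Hedged illustration only: a striped accumulator keeps the hot path free of
// a shared lock; the cost moves to read time, when the cells are summed.
final LongAdder opCount = new LongAdder();

void onOperation() {
  opCount.increment(); // no synchronized block on the hot path
}

long snapshot() {
  return opCount.sum(); // reconciled only when a reader asks
}
{code}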



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15354) Use same criteria for clearing meta cache for all operations

2016-03-05 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15181836#comment-15181836
 ] 

Sean Busbey commented on HBASE-15354:
-

Unless someone is opposed, I'm fine with this one going into branch-1.2. Please 
release note the change in behavior.

> Use same criteria for clearing meta cache for all operations
> 
>
> Key: HBASE-15354
> URL: https://issues.apache.org/jira/browse/HBASE-15354
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 2.0.0, 1.2.0, 1.3.0
>Reporter: Ashu Pachauri
>Assignee: Ashu Pachauri
> Attachments: HBASE-15354-V0.patch, HBASE-15354-V1.patch, 
> HBASE-15354-V2.patch, HBASE-15354-V3.patch, HBASE-15354-V4.patch, 
> HBASE-15354-V5.patch
>
>
> Currently we do not clear/update the meta cache for some special exceptions 
> if the operation went through AsyncProcess#submit, like HTable#put calls. But 
> we clear the meta cache without checking for these special exceptions in the 
> case of other operations like gets, deletes, etc., because they go directly 
> through RpcRetryingCaller#callWithRetries instead of the AsyncProcess. 
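A hedged sketch of the unification this implies; the predicate and its 
placement are illustrative assumptions, not the patch itself:

{code}
import java.io.IOException;

import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.HConnection;

// Hedged sketch: both call paths (AsyncProcess and RpcRetryingCaller) consult
// one shared rule before touching the meta cache, so puts and gets/deletes
// behave the same. isMetaClearingException is an assumed helper standing in
// for whatever predicate the patch settles on.
void onRemoteException(HConnection conn, TableName table, IOException error) {
  if (isMetaClearingException(error)) {
    conn.clearRegionCache(table); // drop cached region locations for the table
  }
  // Special exceptions (e.g. pushback/throttling) skip the cache clear.
}
{code}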



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14703) HTable.mutateRow does not collect stats

2016-03-05 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15181832#comment-15181832
 ] 

Sean Busbey commented on HBASE-14703:
-

That's not a legitimate branch; it's slated for removal in HBASE-15006. I 
would recommend you avoid making a patch for it.

> HTable.mutateRow does not collect stats
> ---
>
> Key: HBASE-14703
> URL: https://issues.apache.org/jira/browse/HBASE-14703
> Project: HBase
>  Issue Type: Bug
>Reporter: Heng Chen
>Assignee: Heng Chen
> Fix For: 2.0.0
>
> Attachments: HBASE-14702_v5.2_addendum-addendum.patch, 
> HBASE-14703-5.2-addendum.patch, HBASE-14703-async.patch, 
> HBASE-14703-start.patch, HBASE-14703-v4.1.patch, HBASE-14703-v4.patch, 
> HBASE-14703-v6_with-check-and-mutate.patch, HBASE-14703.patch, 
> HBASE-14703_v1.patch, HBASE-14703_v10.patch, HBASE-14703_v10.patch, 
> HBASE-14703_v11.patch, HBASE-14703_v12.patch, HBASE-14703_v13.patch, 
> HBASE-14703_v2.patch, HBASE-14703_v3.patch, HBASE-14703_v5.1.patch, 
> HBASE-14703_v5.2.patch, HBASE-14703_v5.patch, HBASE-14703_v6-addendum.patch, 
> HBASE-14703_v6.patch, HBASE-14703_v7.patch, HBASE-14703_v8.patch, 
> HBASE-14703_v9.patch
>
>
> We are trying to fix the stats implementation, by moving it out of the Result 
> object and into an Rpc payload (but not the 'cell payload', just as part of 
> the values returned from the request). This change will also let us easily 
> switch to AsyncProcess as the executor, and support stats for nearly all the 
> rpc calls. However, that means when you upgrade the client or server, you 
> will lose stats visibility until the other side is upgraded. We could keep 
> around the Result based stats storage to accommodate the old api and send 
> both stats back from the server (in each result and in the rpc payload).
> Note that we will still be wire compatible - protobufs mean we can just ride 
> over the lack of information.
> The other tricky part of this is that Result has a 
> non-InterfaceAudience.Private getStatistics() method (along with two 
> InterfaceAudience.Private addResults and setStatistics methods), so we might 
> need a release to deprecate the getStats() method before throwing it out?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15006) clean up defunct git branches

2016-03-05 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15181831#comment-15181831
 ] 

Sean Busbey commented on HBASE-15006:
-

I can confirm branch-2 was a mispush in Sep: [commits@hbase mail for branch 
creation|http://mail-archives.apache.org/mod_mbox/hbase-commits/201509.mbox/%3C853cc5c68e674688aa608f3e38d1bd41%40git.apache.org%3E].

The branch appears to come off of the 0.99 line and currently claims to be 1.2.

> clean up defunct git branches
> -
>
> Key: HBASE-15006
> URL: https://issues.apache.org/jira/browse/HBASE-15006
> Project: HBase
>  Issue Type: Task
>  Components: build
>Reporter: Sean Busbey
>
> When the ASF ban on branch deletion is lifted, clean up the mistakenly pushed 
> stuff we have sitting in git.
> This issue should track the discussion of which branches should go.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14703) HTable.mutateRow does not collect stats

2016-03-05 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15181830#comment-15181830
 ] 

Andrew Purtell commented on HBASE-14703:


In 0.98 please. We are evolving the client pushback feature there. Thanks. 

> HTable.mutateRow does not collect stats
> ---
>
> Key: HBASE-14703
> URL: https://issues.apache.org/jira/browse/HBASE-14703
> Project: HBase
>  Issue Type: Bug
>Reporter: Heng Chen
>Assignee: Heng Chen
> Fix For: 2.0.0
>
> Attachments: HBASE-14702_v5.2_addendum-addendum.patch, 
> HBASE-14703-5.2-addendum.patch, HBASE-14703-async.patch, 
> HBASE-14703-start.patch, HBASE-14703-v4.1.patch, HBASE-14703-v4.patch, 
> HBASE-14703-v6_with-check-and-mutate.patch, HBASE-14703.patch, 
> HBASE-14703_v1.patch, HBASE-14703_v10.patch, HBASE-14703_v10.patch, 
> HBASE-14703_v11.patch, HBASE-14703_v12.patch, HBASE-14703_v13.patch, 
> HBASE-14703_v2.patch, HBASE-14703_v3.patch, HBASE-14703_v5.1.patch, 
> HBASE-14703_v5.2.patch, HBASE-14703_v5.patch, HBASE-14703_v6-addendum.patch, 
> HBASE-14703_v6.patch, HBASE-14703_v7.patch, HBASE-14703_v8.patch, 
> HBASE-14703_v9.patch
>
>
> We are trying to fix the stats implementation, by moving it out of the Result 
> object and into an Rpc payload (but not the 'cell payload', just as part of 
> the values returned from the request). This change will also let us easily 
> switch to AsyncProcess as the executor, and support stats for nearly all the 
> rpc calls. However, that means when you upgrade the client or server, you 
> will lose stats visibility until the other side is upgraded. We could keep 
> around the Result based stats storage to accommodate the old api and send 
> both stats back from the server (in each result and in the rpc payload).
> Note that we will still be wire compatible - protobufs mean we can just ride 
> over the lack of information.
> The other tricky part of this is that Result has a 
> non-InterfaceAudience.Private getStatistics() method (along with two 
> InterfaceAudience.Private addResults and setStatistics methods), so we might 
> need a release to deprecate the getStats() method before throwing it out?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14703) HTable.mutateRow does not collect stats

2016-03-05 Thread Jesse Yates (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15181829#comment-15181829
 ] 

Jesse Yates commented on HBASE-14703:
-

I think we do have a [branch-2|https://github.com/apache/hbase/tree/branch-2] 
and it's not exactly master, so it's going to take me a little bit to cut a new 
version of the patch.

> HTable.mutateRow does not collect stats
> ---
>
> Key: HBASE-14703
> URL: https://issues.apache.org/jira/browse/HBASE-14703
> Project: HBase
>  Issue Type: Bug
>Reporter: Heng Chen
>Assignee: Heng Chen
> Fix For: 2.0.0
>
> Attachments: HBASE-14702_v5.2_addendum-addendum.patch, 
> HBASE-14703-5.2-addendum.patch, HBASE-14703-async.patch, 
> HBASE-14703-start.patch, HBASE-14703-v4.1.patch, HBASE-14703-v4.patch, 
> HBASE-14703-v6_with-check-and-mutate.patch, HBASE-14703.patch, 
> HBASE-14703_v1.patch, HBASE-14703_v10.patch, HBASE-14703_v10.patch, 
> HBASE-14703_v11.patch, HBASE-14703_v12.patch, HBASE-14703_v13.patch, 
> HBASE-14703_v2.patch, HBASE-14703_v3.patch, HBASE-14703_v5.1.patch, 
> HBASE-14703_v5.2.patch, HBASE-14703_v5.patch, HBASE-14703_v6-addendum.patch, 
> HBASE-14703_v6.patch, HBASE-14703_v7.patch, HBASE-14703_v8.patch, 
> HBASE-14703_v9.patch
>
>
> We are trying to fix the stats implementation, by moving it out of the Result 
> object and into an Rpc payload (but not the 'cell payload', just as part of 
> the values returned from the request). This change will also let us easily 
> switch to AsyncProcess as the executor, and support stats for nearly all the 
> rpc calls. However, that means when you upgrade the client or server, you 
> will lose stats visibility until the other side is upgraded. We could keep 
> around the Result based stats storage to accommodate the old api and send 
> both stats back from the server (in each result and in the rpc payload).
> Note that we will still be wire compatible - protobufs mean we can just ride 
> over the lack of information.
> The other tricky part of this is that Result has a 
> non-InterfaceAudience.Private getStatistics() method (along with two 
> InterfaceAudience.Private addResults and setStatistics methods), so we might 
> need a release to deprecate the getStats() method before throwing it out?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14703) HTable.mutateRow does not collect stats

2016-03-05 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15181828#comment-15181828
 ] 

Sean Busbey commented on HBASE-14703:
-

IIRC, we don't have a branch-2 yet. master should be sufficient. I'll have to 
review the extent of changes before giving an opinion on branch-1.2 
appropriateness.

> HTable.mutateRow does not collect stats
> ---
>
> Key: HBASE-14703
> URL: https://issues.apache.org/jira/browse/HBASE-14703
> Project: HBase
>  Issue Type: Bug
>Reporter: Heng Chen
>Assignee: Heng Chen
> Fix For: 2.0.0
>
> Attachments: HBASE-14702_v5.2_addendum-addendum.patch, 
> HBASE-14703-5.2-addendum.patch, HBASE-14703-async.patch, 
> HBASE-14703-start.patch, HBASE-14703-v4.1.patch, HBASE-14703-v4.patch, 
> HBASE-14703-v6_with-check-and-mutate.patch, HBASE-14703.patch, 
> HBASE-14703_v1.patch, HBASE-14703_v10.patch, HBASE-14703_v10.patch, 
> HBASE-14703_v11.patch, HBASE-14703_v12.patch, HBASE-14703_v13.patch, 
> HBASE-14703_v2.patch, HBASE-14703_v3.patch, HBASE-14703_v5.1.patch, 
> HBASE-14703_v5.2.patch, HBASE-14703_v5.patch, HBASE-14703_v6-addendum.patch, 
> HBASE-14703_v6.patch, HBASE-14703_v7.patch, HBASE-14703_v8.patch, 
> HBASE-14703_v9.patch
>
>
> We are trying to fix the stats implementation, by moving it out of the Result 
> object and into an Rpc payload (but not the 'cell payload', just as part of 
> the values returned from the request). This change will also let us easily 
> switch to AsyncProcess as the executor, and support stats for nearly all the 
> rpc calls. However, that means when you upgrade the client or server, you 
> will lose stats visibility until the other side is upgraded. We could keep 
> around the Result based stats storage to accommodate the old api and send 
> both stats back from the server (in each result and in the rpc payload).
> Note that we will still be wire compatible - protobufs mean we can just ride 
> over the lack of information.
> The other tricky part of this is that Result has a 
> non-InterfaceAudience.Private getStatistics() method (along with two 
> InterfaceAudience.Private addResults and setStatistics methods), so we might 
> need a release to deprecate the getStats() method before throwing it out?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14703) HTable.mutateRow does not collect stats

2016-03-05 Thread Sean Busbey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Busbey updated HBASE-14703:

Fix Version/s: (was: 3.0.0)

> HTable.mutateRow does not collect stats
> ---
>
> Key: HBASE-14703
> URL: https://issues.apache.org/jira/browse/HBASE-14703
> Project: HBase
>  Issue Type: Bug
>Reporter: Heng Chen
>Assignee: Heng Chen
> Fix For: 2.0.0
>
> Attachments: HBASE-14702_v5.2_addendum-addendum.patch, 
> HBASE-14703-5.2-addendum.patch, HBASE-14703-async.patch, 
> HBASE-14703-start.patch, HBASE-14703-v4.1.patch, HBASE-14703-v4.patch, 
> HBASE-14703-v6_with-check-and-mutate.patch, HBASE-14703.patch, 
> HBASE-14703_v1.patch, HBASE-14703_v10.patch, HBASE-14703_v10.patch, 
> HBASE-14703_v11.patch, HBASE-14703_v12.patch, HBASE-14703_v13.patch, 
> HBASE-14703_v2.patch, HBASE-14703_v3.patch, HBASE-14703_v5.1.patch, 
> HBASE-14703_v5.2.patch, HBASE-14703_v5.patch, HBASE-14703_v6-addendum.patch, 
> HBASE-14703_v6.patch, HBASE-14703_v7.patch, HBASE-14703_v8.patch, 
> HBASE-14703_v9.patch
>
>
> We are trying to fix the stats implementation, by moving it out of the Result 
> object and into an Rpc payload (but not the 'cell payload', just as part of 
> the values returned from the request). This change will also let us easily 
> switch to AsyncProcess as the executor, and support stats for nearly all the 
> rpc calls. However, that means when you upgrade the client or server, you 
> will lose stats visibility until the other side is upgraded. We could keep 
> around the Result based stats storage to accommodate the old api and send 
> both stats back from the server (in each result and in the rpc payload).
> Note that we will still be wire compatible - protobufs mean we can just ride 
> over the lack of information.
> The other tricky part of this is that Result has a 
> non-InterfaceAudience.Private getStatistics() method (along with two 
> InterfaceAudience.Private addResults and setStatistics methods), so we might 
> need a release to deprecate the getStats() method before throwing it out?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14703) HTable.mutateRow does not collect stats

2016-03-05 Thread Jesse Yates (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15181826#comment-15181826
 ] 

Jesse Yates commented on HBASE-14703:
-

I committed to master and will work on a version for branch-2.

I'm hesitant to move this any further back due to the recent discussion on the 
dev list about what should and shouldn't be moved back. It's a nice improvement, 
but it does introduce some changes to the AsyncProcess. I'll leave it up to the 
RMs whether they want it (but I'll do my best to do the work if they want it 
backported). Thoughts [~apurtell] [~ndimiduk] [~mantonov] [~busbey]?

> HTable.mutateRow does not collect stats
> ---
>
> Key: HBASE-14703
> URL: https://issues.apache.org/jira/browse/HBASE-14703
> Project: HBase
>  Issue Type: Bug
>Reporter: Heng Chen
>Assignee: Heng Chen
> Fix For: 2.0.0, 3.0.0
>
> Attachments: HBASE-14702_v5.2_addendum-addendum.patch, 
> HBASE-14703-5.2-addendum.patch, HBASE-14703-async.patch, 
> HBASE-14703-start.patch, HBASE-14703-v4.1.patch, HBASE-14703-v4.patch, 
> HBASE-14703-v6_with-check-and-mutate.patch, HBASE-14703.patch, 
> HBASE-14703_v1.patch, HBASE-14703_v10.patch, HBASE-14703_v10.patch, 
> HBASE-14703_v11.patch, HBASE-14703_v12.patch, HBASE-14703_v13.patch, 
> HBASE-14703_v2.patch, HBASE-14703_v3.patch, HBASE-14703_v5.1.patch, 
> HBASE-14703_v5.2.patch, HBASE-14703_v5.patch, HBASE-14703_v6-addendum.patch, 
> HBASE-14703_v6.patch, HBASE-14703_v7.patch, HBASE-14703_v8.patch, 
> HBASE-14703_v9.patch
>
>
> We are trying to fix the stats implementation, by moving it out of the Result 
> object and into an Rpc payload (but not the 'cell payload', just as part of 
> the values returned from the request). This change will also let us easily 
> switch to AsyncProcess as the executor, and support stats for nearly all the 
> rpc calls. However, that means when you upgrade the client or server, you 
> will lose stats visibility until the other side is upgraded. We could keep 
> around the Result based stats storage to accommodate the old api and send 
> both stats back from the server (in each result and in the rpc payload).
> Note that we will still be wire compatible - protobufs mean we can just ride 
> over the lack of information.
> The other tricky part of this is that Result has a 
> non-InterfaceAudience.Private getStatistics() method (along with two 
> InterfaceAudience.Private addResults and setStatistics methods), so we might 
> need a release to deprecate the getStats() method before throwing it out?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14703) HTable.mutateRow does not collect stats

2016-03-05 Thread Jesse Yates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesse Yates updated HBASE-14703:

Fix Version/s: (was: 1.4.0)
   (was: 0.98.18)
   (was: 1.1.4)
   (was: 1.2.1)
   (was: 1.3.0)

> HTable.mutateRow does not collect stats
> ---
>
> Key: HBASE-14703
> URL: https://issues.apache.org/jira/browse/HBASE-14703
> Project: HBase
>  Issue Type: Bug
>Reporter: Heng Chen
>Assignee: Heng Chen
> Fix For: 2.0.0, 3.0.0
>
> Attachments: HBASE-14702_v5.2_addendum-addendum.patch, 
> HBASE-14703-5.2-addendum.patch, HBASE-14703-async.patch, 
> HBASE-14703-start.patch, HBASE-14703-v4.1.patch, HBASE-14703-v4.patch, 
> HBASE-14703-v6_with-check-and-mutate.patch, HBASE-14703.patch, 
> HBASE-14703_v1.patch, HBASE-14703_v10.patch, HBASE-14703_v10.patch, 
> HBASE-14703_v11.patch, HBASE-14703_v12.patch, HBASE-14703_v13.patch, 
> HBASE-14703_v2.patch, HBASE-14703_v3.patch, HBASE-14703_v5.1.patch, 
> HBASE-14703_v5.2.patch, HBASE-14703_v5.patch, HBASE-14703_v6-addendum.patch, 
> HBASE-14703_v6.patch, HBASE-14703_v7.patch, HBASE-14703_v8.patch, 
> HBASE-14703_v9.patch
>
>
> We are trying to fix the stats implementation, by moving it out of the Result 
> object and into an Rpc payload (but not the 'cell payload', just as part of 
> the values returned from the request). This change will also let us easily 
> switch to AsyncProcess as the executor, and support stats for nearly all the 
> rpc calls. However, that means when you upgrade the client or server, you 
> will lose stats visibility until the other side is upgraded. We could keep 
> around the Result based stats storage to accommodate the old api and send 
> both stats back from the server (in each result and in the rpc payload).
> Note that we will still be wire compatible - protobufs mean we can just ride 
> over the lack of information.
> The other tricky part of this is that Result has a 
> non-InterfaceAudience.Private getStatistics() method (along with two 
> InterfaceAudience.Private addResults and setStatistics methods), so we might 
> need a release to deprecate the getStats() method before throwing it out?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15393) Enable table replication command will fail when parent znode is not default in peer cluster

2016-03-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15181825#comment-15181825
 ] 

Hudson commented on HBASE-15393:


FAILURE: Integrated in HBase-Trunk_matrix #758 (See 
[https://builds.apache.org/job/HBase-Trunk_matrix/758/])
HBASE-15393 Enable table replication command will fail when parent znode 
(tedyu: rev d083e4f29f5ae5698be7cce8414a2c97b6800af4)
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/client/replication/TestReplicationAdmin.java
* 
hbase-client/src/main/java/org/apache/hadoop/hbase/client/replication/ReplicationAdmin.java


> Enable table replication command will fail when parent znode is not default 
> in peer cluster
> ---
>
> Key: HBASE-15393
> URL: https://issues.apache.org/jira/browse/HBASE-15393
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 1.0.3, 0.98.17
>Reporter: Ashish Singhi
>Assignee: Ashish Singhi
> Fix For: 2.0.0, 1.3.0, 1.2.1, 1.1.4, 1.0.4, 0.98.18
>
> Attachments: HBASE-15393.patch, HBASE-15393.v1.patch, 
> HBASE-15393.v2.patch, HBASE-15393.v3.patch
>
>
> The enable table replication command will fail when the parent znode is not 
> /hbase (the default) in the peer cluster and there is only one peer cluster 
> added in the source cluster.
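To make the failing setup concrete, a hedged illustration of a peer-cluster 
client configuration with a non-default parent znode (the property name is 
real; the values are examples):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

// Hedged illustration: the peer cluster keeps its root znode somewhere other
// than the default /hbase, which is what trips the enable-replication check.
Configuration peerConf = HBaseConfiguration.create();
peerConf.set("hbase.zookeeper.quorum", "peer-zk1,peer-zk2,peer-zk3");
peerConf.set("zookeeper.znode.parent", "/hbase-secure"); // non-default parent
{code}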



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15392) Single Cell Get reads two HFileBlocks

2016-03-05 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15181810#comment-15181810
 ] 

stack commented on HBASE-15392:
---

Let me look at making the change instead where you fellas are fingering the 
issue. All tests pass, but that doesn't mean much, I'd say... I need to add 
more tests around this area. Also need to find out how this was broken. Will 
be back. Again, appreciate the reviews, lads.

> Single Cell Get reads two HFileBlocks
> -
>
> Key: HBASE-15392
> URL: https://issues.apache.org/jira/browse/HBASE-15392
> Project: HBase
>  Issue Type: Sub-task
>  Components: BucketCache
>Reporter: stack
>Assignee: stack
> Attachments: 15392.wip.patch, 15392v2.wip.patch
>
>
> As found by Daniel "SystemTap" Pol, a simple Get results in our reading two 
> HFileBlocks, the one that contains the wanted Cell, and the block that 
> follows.
> Here is a bit of custom logging that logs a stack trace on each HFileBlock 
> read so you can see the call stack responsible:
> {code}
> 2016-03-03 22:20:30,191 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: 
> START LOOP
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: 
> QCODE SEEK_NEXT_COL
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileBlockIndex: 
> STARTED WHILE
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.CombinedBlockCache: 
> OUT OF L2
> 2016-03-03 22:20:30,192 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.BucketCache: Read 
> offset=31409152, len=2103
> 2016-03-03 22:20:30,192 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.FileIOEngine: 
> offset=31409152, length=2103
> 2016-03-03 22:20:30,193 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: 
> From Cache [blockType=DATA, fileOffset=2055421, headerSize=33, 
> onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, 
> prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, 
> getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, 
> buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], 
> dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
> includesTags=false, compressAlgo=NONE, compressTags=false, 
> cryptoContext=[cipher=NONE keyHash=NONE]]]
> 2016-03-03 22:20:30,193 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: 
> Cache hit return [blockType=DATA, fileOffset=2055421, headerSize=33, 
> onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, 
> prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, 
> getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, 
> buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], 
> dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
> includesTags=false, compressAlgo=NONE, compressTags=false, 
> cryptoContext=[cipher=NONE keyHash=NONE]]]
> java.lang.Throwable
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.readBlock(HFileReaderImpl.java:1515)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$CellBasedKeyBlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:324)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.seekTo(HFileReaderImpl.java:831)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.reseekTo(HFileReaderImpl.java:812)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:288)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:198)
> at 
> org.apache.hadoop.hbase.regionserver.NonLazyKeyValueScanner.doRealSeek(NonLazyKeyValueScanner.java:54)
> at 
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:321)
> at 
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.requestSeek(KeyValueHeap.java:279)
> at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:806)
> at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.seekAsDirection(StoreScanner.java:795)
> at 
> 

[jira] [Commented] (HBASE-15392) Single Cell Get reads two HFileBlocks

2016-03-05 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15181808#comment-15181808
 ] 

stack commented on HBASE-15392:
---

bq. and the compare is >= 

Pardon me. I see what you mean now. The above applies if it is a reversed scan.

> Single Cell Get reads two HFileBlocks
> -
>
> Key: HBASE-15392
> URL: https://issues.apache.org/jira/browse/HBASE-15392
> Project: HBase
>  Issue Type: Sub-task
>  Components: BucketCache
>Reporter: stack
>Assignee: stack
> Attachments: 15392.wip.patch, 15392v2.wip.patch
>
>
> As found by Daniel "SystemTap" Pol, a simple Get results in our reading two 
> HFileBlocks, the one that contains the wanted Cell, and the block that 
> follows.
> Here is a bit of custom logging that logs a stack trace on each HFileBlock 
> read so you can see the call stack responsible:
> {code}
> 2016-03-03 22:20:30,191 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: 
> START LOOP
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: 
> QCODE SEEK_NEXT_COL
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileBlockIndex: 
> STARTED WHILE
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.CombinedBlockCache: 
> OUT OF L2
> 2016-03-03 22:20:30,192 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.BucketCache: Read 
> offset=31409152, len=2103
> 2016-03-03 22:20:30,192 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.FileIOEngine: 
> offset=31409152, length=2103
> 2016-03-03 22:20:30,193 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: 
> From Cache [blockType=DATA, fileOffset=2055421, headerSize=33, 
> onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, 
> prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, 
> getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, 
> buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], 
> dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
> includesTags=false, compressAlgo=NONE, compressTags=false, 
> cryptoContext=[cipher=NONE keyHash=NONE]]]
> 2016-03-03 22:20:30,193 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: 
> Cache hit return [blockType=DATA, fileOffset=2055421, headerSize=33, 
> onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, 
> prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, 
> getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, 
> buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], 
> dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
> includesTags=false, compressAlgo=NONE, compressTags=false, 
> cryptoContext=[cipher=NONE keyHash=NONE]]]
> java.lang.Throwable
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.readBlock(HFileReaderImpl.java:1515)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$CellBasedKeyBlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:324)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.seekTo(HFileReaderImpl.java:831)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.reseekTo(HFileReaderImpl.java:812)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:288)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:198)
> at 
> org.apache.hadoop.hbase.regionserver.NonLazyKeyValueScanner.doRealSeek(NonLazyKeyValueScanner.java:54)
> at 
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:321)
> at 
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.requestSeek(KeyValueHeap.java:279)
> at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:806)
> at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.seekAsDirection(StoreScanner.java:795)
> at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:624)
> at 
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:153)
> at 
> 

[jira] [Commented] (HBASE-15392) Single Cell Get reads two HFileBlocks

2016-03-05 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15181797#comment-15181797
 ] 

stack commented on HBASE-15392:
---

I appreciate the review. I don't know this part of the code so your questions 
are really helping.

bq. That is true. But am still seeing that a Get has a stopRow and already the 
check works fine. 

I see, yeah, it has a stopRow, but the stopRow is the current row and the 
compare is >= ... so it will only trip when we go to the next row. We know it's 
a GetScan and we are being called to see if we can short-circuit out. Seems to 
make sense that we exploit the fact that we know it's a GetScan. 

bq. You got an optimized QCODE INCLUDE because the INCLUDE_SEEK_NEXT_ROW got 
changed to INCLUDE. Still your fix works?

Good question. Let me put up my hacked rig again. I'll be back.

> Single Cell Get reads two HFileBlocks
> -
>
> Key: HBASE-15392
> URL: https://issues.apache.org/jira/browse/HBASE-15392
> Project: HBase
>  Issue Type: Sub-task
>  Components: BucketCache
>Reporter: stack
>Assignee: stack
> Attachments: 15392.wip.patch, 15392v2.wip.patch
>
>
> As found by Daniel "SystemTap" Pol, a simple Get results in our reading two 
> HFileBlocks, the one that contains the wanted Cell, and the block that 
> follows.
> Here is a bit of custom logging that logs a stack trace on each HFileBlock 
> read so you can see the call stack responsible:
> {code}
> 2016-03-03 22:20:30,191 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: 
> START LOOP
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] regionserver.StoreScanner: 
> QCODE SEEK_NEXT_COL
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileBlockIndex: 
> STARTED WHILE
> 2016-03-03 22:20:30,192 INFO  
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.CombinedBlockCache: 
> OUT OF L2
> 2016-03-03 22:20:30,192 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.BucketCache: Read 
> offset=31409152, len=2103
> 2016-03-03 22:20:30,192 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] bucket.FileIOEngine: 
> offset=31409152, length=2103
> 2016-03-03 22:20:30,193 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: 
> From Cache [blockType=DATA, fileOffset=2055421, headerSize=33, 
> onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, 
> prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, 
> getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, 
> buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], 
> dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
> includesTags=false, compressAlgo=NONE, compressTags=false, 
> cryptoContext=[cipher=NONE keyHash=NONE]]]
> 2016-03-03 22:20:30,193 TRACE 
> [B.defaultRpcServer.handler=20,queue=2,port=16020] hfile.HFileReaderImpl: 
> Cache hit return [blockType=DATA, fileOffset=2055421, headerSize=33, 
> onDiskSizeWithoutHeader=2024, uncompressedSizeWithoutHeader=2020, 
> prevBlockOffset=2053364, isUseHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, onDiskDataSizeWithHeader=2053, 
> getOnDiskSizeWithHeader=2057, totalChecksumBytes=4, isUnpacked=true, 
> buf=[org.apache.hadoop.hbase.nio.SingleByteBuff@e19fbd54], 
> dataBeginsWith=\x00\x00\x00)\x00\x00\x01`\x00\x16user995139035672819231, 
> fileContext=[usesHBaseChecksum=true, checksumType=CRC32C, 
> bytesPerChecksum=16384, blocksize=65536, encoding=NONE, includesMvcc=true, 
> includesTags=false, compressAlgo=NONE, compressTags=false, 
> cryptoContext=[cipher=NONE keyHash=NONE]]]
> java.lang.Throwable
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl.readBlock(HFileReaderImpl.java:1515)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$CellBasedKeyBlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:324)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.seekTo(HFileReaderImpl.java:831)
> at 
> org.apache.hadoop.hbase.io.hfile.HFileReaderImpl$HFileScannerImpl.reseekTo(HFileReaderImpl.java:812)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:288)
> at 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:198)
> at 
> org.apache.hadoop.hbase.regionserver.NonLazyKeyValueScanner.doRealSeek(NonLazyKeyValueScanner.java:54)
> at 
> 

[jira] [Commented] (HBASE-9556) Provide key range support to bulkload to avoid too many reducers even the data belongs to few regions

2016-03-05 Thread beeshma (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-9556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15181788#comment-15181788
 ] 

beeshma commented on HBASE-9556:


How about the below logic to find the start keys of the regions?

{code}
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;

import org.apache.hadoop.hbase.HRegionInfo;
import org.apache.hadoop.hbase.ServerName;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.util.Bytes;

// con is an org.apache.hadoop.conf.Configuration, e.g. HBaseConfiguration.create()
HTable ht = new HTable(con, "test"); // Table object
// Region -> location map; each HRegionInfo key carries the region's start key
NavigableMap<HRegionInfo, ServerName> np = ht.getRegionLocations();

List<HRegionInfo> lis = new ArrayList<HRegionInfo>(np.keySet());
for (HRegionInfo h : lis) {
  System.out.println(h.getRegionId() + " ---getRegionId");
  // Start keys are arbitrary bytes; toStringBinary is safer than new String()
  String s = Bytes.toStringBinary(h.getStartKey());
  System.out.println(s + " ---start key");
}
{code}

Please suggest if anything is wrong in this.
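As a hedged extension of the snippet above toward what this issue asks for 
(the overlap rule and method name are ours, for illustration; imports as in 
the snippet above): once the region start/end keys are in hand, only the 
regions overlapping the user-supplied import range need reducers.

{code}
// Hedged sketch: count regions overlapping [startKey, endKey); that count is
// the number of reducers actually needed. Empty region start/end keys mean
// "unbounded" on that side, per HBase convention.
int reducersForRange(List<HRegionInfo> regions, byte[] startKey, byte[] endKey) {
  int needed = 0;
  for (HRegionInfo r : regions) {
    boolean startsAtOrPastEnd = endKey.length > 0
        && Bytes.compareTo(r.getStartKey(), endKey) >= 0;
    boolean endsAtOrBeforeStart = r.getEndKey().length > 0
        && Bytes.compareTo(r.getEndKey(), startKey) <= 0;
    if (!startsAtOrPastEnd && !endsAtOrBeforeStart) {
      needed++; // region overlaps the import key range
    }
  }
  return needed;
}
{code}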

> Provide key range support to bulkload to avoid too many reducers even the 
> data belongs to few regions
> -
>
> Key: HBASE-9556
> URL: https://issues.apache.org/jira/browse/HBASE-9556
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Reporter: rajeshbabu
>Assignee: rajeshbabu
>Priority: Minor
>
> Presently the number of reducers in bulk load is equal to the number of 
> regions. Let's suppose a table has 500 regions and the import data belongs to 
> only 10 regions; we still start 500 reducers (equal to the no. of regions) 
> instead of 10, which consumes more time and resources. 
> If the user knows the row key range of the import data, then we can pass the 
> startkey and/or endkey as input, and based on the key range we can define the 
> partitions and the number of reducers (the regions to which the data belongs). 
> This avoids starting too many reducers that do nothing, and also avoids 
> contention in shuffling.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15398) Cells loss or disorder when using family essential filter and partial scanning protocol

2016-03-05 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15181748#comment-15181748
 ] 

Ted Yu commented on HBASE-15398:


This is what I meant:
http://pastebin.com/mdLyUsc1

> Cells loss or disorder when using family essential filter and partial 
> scanning protocol
> ---
>
> Key: HBASE-15398
> URL: https://issues.apache.org/jira/browse/HBASE-15398
> Project: HBase
>  Issue Type: Bug
>  Components: dataloss, Scanners
>Affects Versions: 1.2.0, 1.1.3
>Reporter: Phil Yang
>Assignee: Phil Yang
>Priority: Critical
> Attachments: 15398-test.txt
>
>
> In RegionScannerImpl, we have two heaps, storeHeap and joinedHeap. If we have 
> a filter and it doesn't apply to all cfs, the stores whose families needn't be 
> filtered will be in joinedHeap. We scan storeHeap first, then joinedHeap, 
> merge the results, sort, and return to the client. We need the sort because 
> the order of Cells is rowkey/cf/cq/ts, and a smaller cf may be in the joinedHeap.
> However, after HBASE-11544 we may transfer partial results when we get 
> SIZE_LIMIT_REACHED_MID_ROW or other similar states. We may return a larger cf 
> first because it is in storeHeap, and then a smaller cf because it is in 
> joinedHeap. The server won't hold all the cells of a row, and the client has 
> no sorting logic. The order of cfs in the Result the user sees is wrong.
> And a more critical bug is: if we get a LIMIT_REACHED_MID_ROW on the last 
> cell of a row in storeHeap, we will break scanning in RegionScannerImpl, and 
> in populateResult we will change the state to SIZE_LIMIT_REACHED because the 
> next peeked cell is the next row. But this is only the last cell of one heap, 
> and we have two... And SIZE_LIMIT_REACHED means this Result is not partial (per 
> ScannerContext.partialResultFormed), so the client will see it, merge the 
> results, and return to the user, losing the data of the joinedHeap. On the 
> next scan we will read the next row of storeHeap; the joinedHeap is forgotten 
> and never read...
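A hedged illustration of the client-side symptom (the helper below is ours, 
not an HBase API): a row stitched together from partial Results can carry 
cells out of comparator order, which a simple check exposes.

{code}
import java.util.Comparator;

import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.client.Result;

// Hedged illustration: merged partial Results are not re-sorted client side,
// so a row whose families came from both heaps can surface out of order.
static void checkCellOrder(Result merged, Comparator<Cell> cmp) {
  Cell prev = null;
  for (Cell c : merged.rawCells()) {
    if (prev != null && cmp.compare(prev, c) > 0) {
      throw new IllegalStateException("cells out of order in merged Result");
    }
    prev = c;
  }
}
{code}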



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HBASE-15398) Cells loss or disorder when using family essential filter and partial scanning protocol

2016-03-05 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15181748#comment-15181748
 ] 

Ted Yu edited comment on HBASE-15398 at 3/5/16 3:14 PM:


This is what I meant:
http://pastebin.com/mdLyUsc1

A similar check can be performed server-side.


was (Author: yuzhih...@gmail.com):
This is what I meant:
http://pastebin.com/mdLyUsc1

> Cells loss or disorder when using family essential filter and partial 
> scanning protocol
> ---
>
> Key: HBASE-15398
> URL: https://issues.apache.org/jira/browse/HBASE-15398
> Project: HBase
>  Issue Type: Bug
>  Components: dataloss, Scanners
>Affects Versions: 1.2.0, 1.1.3
>Reporter: Phil Yang
>Assignee: Phil Yang
>Priority: Critical
> Attachments: 15398-test.txt
>
>
> In RegionScannerImpl, we have two heaps, storeHeap and joinedHeap. If we have 
> a filter and it doesn't apply to all cfs, the stores whose families needn't be 
> filtered will be in joinedHeap. We scan storeHeap first, then joinedHeap, 
> merge the results, sort, and return to the client. We need the sort because 
> the order of Cells is rowkey/cf/cq/ts, and a smaller cf may be in the joinedHeap.
> However, after HBASE-11544 we may transfer partial results when we get 
> SIZE_LIMIT_REACHED_MID_ROW or other similar states. We may return a larger cf 
> first because it is in storeHeap, and then a smaller cf because it is in 
> joinedHeap. The server won't hold all the cells of a row, and the client has 
> no sorting logic. The order of cfs in the Result the user sees is wrong.
> And a more critical bug is: if we get a LIMIT_REACHED_MID_ROW on the last 
> cell of a row in storeHeap, we will break scanning in RegionScannerImpl, and 
> in populateResult we will change the state to SIZE_LIMIT_REACHED because the 
> next peeked cell is the next row. But this is only the last cell of one heap, 
> and we have two... And SIZE_LIMIT_REACHED means this Result is not partial (per 
> ScannerContext.partialResultFormed), so the client will see it, merge the 
> results, and return to the user, losing the data of the joinedHeap. On the 
> next scan we will read the next row of storeHeap; the joinedHeap is forgotten 
> and never read...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-15400) Using Multiple Output for Date Tiered Compaction

2016-03-05 Thread Clara Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Clara Xiong updated HBASE-15400:

Summary: Using Multiple Output for Date Tiered Compaction  (was: Date 
Tiered Multiple Output for Date Tiered Compaction)

> Using Multiple Output for Date Tiered Compaction
> 
>
> Key: HBASE-15400
> URL: https://issues.apache.org/jira/browse/HBASE-15400
> Project: HBase
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Clara Xiong
>Assignee: Clara Xiong
> Fix For: 2.0.0
>
>
> When we compact, we can output multiple files along the current window 
> boundaries. There are two use cases:
> 1. Major compaction: We want to output date tiered store files.
> 2. Bulk-load files and the old files generated by major compaction before 
> upgrading to DTCP.
> Pros: 
> 1. Restore locality, process versioning, updates and deletes while 
> maintaining the tiered layout.
> 2. The best way to fix a skewed layout.
>  
> I am starting from a prototype of date tiered file writer from HBASE-15389 
> and will upload a patch soon. I have to call out a few design decisions:
> 1. We only want to output the files along all windows for major compaction. 
> 2. For minor compaction, we don't want to output too many files, which would 
> remain around because of the current restriction of contiguous compaction by 
> seq id. I will only output two files if all the files in the window are being 
> combined: one for the data within the window and the other for the 
> out-of-window tail. If any file in the window is excluded from compaction, 
> only one file will be output from the compaction. When the windows are 
> promoted, the situation of out-of-order data will gradually improve.
> 3. We have to pass the boundaries with the list of store files as a complete 
> time snapshot instead of two separate calls, because the window layout is 
> determined by the time the computation is called. So we will need a new type 
> of compaction request. 
> 4. Since we will assign the same seq id to all output files, we need to sort 
> by maxTimestamp subsequently. Right now every compaction policy gets the files 
> sorted for StoreFileManager, which sorts by seq id and other criteria. I will 
> use this order for DTCP only, to avoid impacting other compaction policies. 
> 5. We need some cleanup of the current design of StoreEngine and 
> CompactionPolicy.
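A hedged sketch of design point 3 above (boundaries computed once, at request 
time, and handed along with the file list); the exponential window scheme and 
names are illustrative, not the HBASE-15389/HBASE-15400 classes:

{code}
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hedged sketch: tiered window boundaries walking back from "now"; each older
// tier uses windows that are windowsPerTier times wider than the tier before.
static List<Long> windowBoundaries(long now, long baseWindowMillis,
    int windowsPerTier, int tiers) {
  List<Long> bounds = new ArrayList<Long>();
  long size = baseWindowMillis;
  long edge = now - (now % baseWindowMillis); // align the newest boundary
  for (int t = 0; t < tiers; t++) {
    for (int w = 0; w < windowsPerTier; w++) {
      bounds.add(edge);
      edge -= size;
    }
    size *= windowsPerTier;
  }
  Collections.reverse(bounds); // oldest boundary first
  return bounds;
}
{code}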



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15393) Enable table replication command will fail when parent znode is not default in peer cluster

2016-03-05 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15181735#comment-15181735
 ] 

Ted Yu commented on HBASE-15393:


The patch doesn't apply cleanly on branch-1.

Mind attaching a patch for branch-1?

> Enable table replication command will fail when parent znode is not default 
> in peer cluster
> ---
>
> Key: HBASE-15393
> URL: https://issues.apache.org/jira/browse/HBASE-15393
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 1.0.3, 0.98.17
>Reporter: Ashish Singhi
>Assignee: Ashish Singhi
> Fix For: 2.0.0, 1.3.0, 1.2.1, 1.1.4, 1.0.4, 0.98.18
>
> Attachments: HBASE-15393.patch, HBASE-15393.v1.patch, 
> HBASE-15393.v2.patch, HBASE-15393.v3.patch
>
>
> The enable table replication command will fail when the parent znode is not 
> /hbase (the default) in the peer cluster and there is only one peer cluster 
> added in the source cluster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-15393) Enable table replication command will fail when parent znode is not default in peer cluster

2016-03-05 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-15393:
---
Summary: Enable table replication command will fail when parent znode is 
not default in peer cluster  (was: Enable table replication command will fail 
when parent znode is not /hbase(default) in peer cluster)

> Enable table replication command will fail when parent znode is not default 
> in peer cluster
> ---
>
> Key: HBASE-15393
> URL: https://issues.apache.org/jira/browse/HBASE-15393
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Affects Versions: 1.0.3, 0.98.17
>Reporter: Ashish Singhi
>Assignee: Ashish Singhi
> Fix For: 2.0.0, 1.3.0, 1.2.1, 1.1.4, 1.0.4, 0.98.18
>
> Attachments: HBASE-15393.patch, HBASE-15393.v1.patch, 
> HBASE-15393.v2.patch, HBASE-15393.v3.patch
>
>
> Enable table replication command will fail when parent znode is not 
> /hbase(default) in peer cluster and there is only one peer cluster added in 
> the source cluster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-15400) Date Tiered Multiple Output for Date Tiered Compaction

2016-03-05 Thread Dave Latham (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dave Latham updated HBASE-15400:

Description: 
When we compact, we can output multiple files along the current window 
boundaries. There are two use cases:

1. Major compaction: we want to output date tiered store files.
2. Bulk-load files and the old files generated by major compaction before 
upgrading to DTCP.

Pros: 
1. Restores locality and processes versioning, updates and deletes while 
maintaining the tiered layout.
2. The best way to fix a skewed layout.
 
I am starting from a prototype of the date tiered file writer from HBASE-15389 
and will upload a patch soon. I have to call out a few design decisions:

1. We only want to output files along all windows for major compaction.

2. For minor compaction, we don't want to output too many files, since they 
will remain around because of the current restriction of contiguous compaction 
by seq id. I will output two files only if all the files in the window are 
being combined: one for the data within the window and the other for the 
out-of-window tail. If any file in the window is excluded from the compaction, 
only one file will be output. As windows are promoted, the out-of-order data 
situation will gradually improve.

3. We have to pass the boundaries together with the list of store files as one 
consistent time snapshot, instead of making two separate calls, because the 
window layout is determined at the time the computation runs. So we will need 
a new type of compaction request.

4. Since we will assign the same seq id to all output files, we subsequently 
need to sort them by maxTimestamp. Right now every compaction policy gets the 
files sorted by StoreFileManager, which sorts by seq id and other criteria. I 
will use this new order for DTCP only, to avoid impacting other compaction 
policies.

5. We need some cleanup of the current design of StoreEngine and 
CompactionPolicy.



  was:
When we compact, we can output multiple files along the current window 
boundaries. There are two use cases:

1. Major compaction: we want to output date tiered store files.
2. Bulk-load files and the old files generated by major compaction before 
upgrading to DTCP.

Pros: 
1. Restores locality and processes versioning, updates and deletes while 
maintaining the tiered layout.
2. The best way to fix a skewed layout.
 
I am starting from a prototype of the date tiered file writer from 
https://issues.apache.org/jira/browse/HBASE-15389 and will upload a patch soon. 
I have to call out a few design decisions:

1. We only want to output files along all windows for major compaction.

2. For minor compaction, we don't want to output too many files, since they 
will remain around because of the current restriction of contiguous compaction 
by seq id. I will output two files only if all the files in the window are 
being combined: one for the data within the window and the other for the 
out-of-window tail. If any file in the window is excluded from the compaction, 
only one file will be output. As windows are promoted, the out-of-order data 
situation will gradually improve.

3. We have to pass the boundaries together with the list of store files as one 
consistent time snapshot, instead of making two separate calls, because the 
window layout is determined at the time the computation runs. So we will need 
a new type of compaction request.

4. Since we will assign the same seq id to all output files, we subsequently 
need to sort them by maxTimestamp. Right now every compaction policy gets the 
files sorted by StoreFileManager, which sorts by seq id and other criteria. I 
will use this new order for DTCP only, to avoid impacting other compaction 
policies.

5. We need some cleanup of the current design of StoreEngine and 
CompactionPolicy.




> Date Tiered Multiple Output for Date Tiered Compaction
> --
>
> Key: HBASE-15400
> URL: https://issues.apache.org/jira/browse/HBASE-15400
> Project: HBase
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Clara Xiong
>Assignee: Clara Xiong
> Fix For: 2.0.0
>
>
> When we compact, we can output multiple files along the current window 
> boundaries. There are two use cases:
> 1. Major compaction: we want to output date tiered store files.
> 2. Bulk-load files and the old files generated by major compaction before 
> upgrading to DTCP.
> Pros: 
> 1. Restores locality and processes versioning, updates and deletes while 
> maintaining the tiered layout.
> 2. The best way to fix a skewed layout.
>  
> I am starting from a prototype of the date tiered file writer from HBASE-15389 
> and will upload a patch soon. I have to call out a few design decisions:
> 1. We only want to output files along all windows for major compaction. 
> 2. For minor compaction, we don't want to 

[jira] [Commented] (HBASE-15213) Fix increment performance regression caused by HBASE-8763 on branch-1.0

2016-03-05 Thread Yu Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15181694#comment-15181694
 ] 

Yu Li commented on HBASE-15213:
---

bq. T1 finishes, removes W1 ~ W10 at once
[~junegunn] Sorry, on second thought: since the WriteEntries of W1~W10 will be 
removed in one go, which then notifies all waiting threads, it seems to me we 
cannot guarantee the exit sequence of the threads handling W2~W10; it all 
depends on the lock-acquisition order. This could make T3 return earlier than 
T2, which would be incorrect. And I guess this is exactly why we remove only 
one element at a time from the queue in the 0.98/master code?
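To illustrate the concern with a toy model (a simplification, not the real 
MultiVersionConsistencyControl code): once the batch is removed and 
notifyAll() fires, nothing orders the wakeups of the waiting handlers:

{code}
// Toy model: batch removal + notifyAll leaves the wakeup (and hence
// exit) order of the waiting handler threads to the scheduler.
public class NotifyAllOrder {
  private final Object lock = new Object();
  private long readPoint = 0; // guarded by lock

  // Handler for write entry 'seq': waits until its entry is visible.
  void awaitVisible(long seq) throws InterruptedException {
    synchronized (lock) {
      while (readPoint < seq) {
        lock.wait();
      }
    }
    // Handlers for W2~W10 race from here; the thread for W3 may
    // respond to its client before W2's thread does.
    System.out.println("handler for W" + seq + " returning");
  }

  // T1 removes W1~W10 in one go: advance past all of them and wake
  // every waiter at once.
  void completeBatch(long upTo) {
    synchronized (lock) {
      readPoint = upTo;
      lock.notifyAll(); // wakeup order is not specified
    }
  }
}
{code}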

> Fix increment performance regression caused by HBASE-8763 on branch-1.0
> ---
>
> Key: HBASE-15213
> URL: https://issues.apache.org/jira/browse/HBASE-15213
> Project: HBase
>  Issue Type: Sub-task
>  Components: Performance
>Reporter: Junegunn Choi
>Assignee: Junegunn Choi
> Fix For: 1.1.4, 1.0.4
>
> Attachments: 15157v3.branch-1.1.patch, HBASE-15213-increment.png, 
> HBASE-15213.branch-1.0.patch, HBASE-15213.v1.branch-1.0.patch
>
>
> This is an attempt to fix the increment performance regression caused by 
> HBASE-8763 on branch-1.0.
> I'm aware that hbase.increment.fast.but.narrow.consistency was added to 
> branch-1.0 (HBASE-15031) to address the issue and a separate work is ongoing 
> on master branch, but anyway, this is my take on the problem.
> I read through HBASE-14460 and HBASE-8763, and while it wasn't clear to me 
> what caused the slowdown, I could indeed reproduce the performance regression.
> Test setup:
> - Server: 4-core Xeon 2.4GHz Linux server running mini cluster (100 handlers, 
> JDK 1.7)
> - Client: Another box of the same spec
> - Increments on random 10k records on a single-region table, recreated every 
> time
> Increment throughput (TPS):
> || Num threads || Before HBASE-8763 (d6cc2fb) || branch-1.0 || branch-1.0 (narrow-consistency) ||
> | 1  | 2661  | 2486  | 2359  |
> | 2  | 5048  | 5064  | 4867  |
> | 4  | 7503  | 8071  | 8690  |
> | 8  | 10471 | 10886 | 13980 |
> | 16 | 15515 | 9418  | 18601 |
> | 32 | 17699 | 5421  | 20540 |
> | 64 | 20601 | 4038  | 25591 |
> | 96 | 19177 | 3891  | 26017 |
> We can clearly observe that the throughput degrades as we increase the number 
> of concurrent requests, which led me to believe that there is severe 
> context-switching overhead; I could indirectly confirm that suspicion with the 
> cs entry in vmstat output. branch-1.0 shows a much higher number of context 
> switches even at much lower throughput.
> Here are the observations:
> - A WriteEntry in the writeQueue can only be removed by the very handler that 
> put it there, and only when it is at the front of the queue and marked 
> complete.
> - Since a WriteEntry is marked complete after the wait-loop, only one entry 
> can be removed at a time.
> - This stringent condition causes O(N^2) context switches, where N is the 
> number of concurrent handlers processing requests.
> So what I tried here is to mark the WriteEntry complete before we go into the 
> wait-loop. With that change, multiple WriteEntries can be shifted at a time 
> without context switches. I changed writeQueue to a LinkedHashSet since a fast 
> containment check is needed, as a WriteEntry can now be removed by any 
> handler; a sketch of the idea follows after the numbers below.
> The numbers look good; they're virtually identical to the pre-HBASE-8763 era.
> || Num threads || branch-1.0 with fix ||
> | 1  | 2459  |
> | 2  | 4976  |
> | 4  | 8033  |
> | 8  | 12292 |
> | 16 | 15234 |
> | 32 | 16601 |
> | 64 | 19994 |
> | 96 | 20052 |
> So what do you think about it? Please let me know if I'm missing anything.
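A rough sketch of the approach described above (a hypothetical simplification, 
not the actual patch): mark the entry complete before the wait loop, and let 
any handler shift the whole completed prefix so that one notifyAll covers many 
entries:

{code}
import java.util.Iterator;
import java.util.LinkedHashSet;

// Hypothetical simplification of the idea: complete-before-wait plus
// batch removal of the completed prefix of the write queue.
public class WriteQueueSketch {
  static final class WriteEntry {
    volatile boolean complete;
  }

  private final LinkedHashSet<WriteEntry> writeQueue = new LinkedHashSet<>();

  // A handler registers its entry when the write begins.
  synchronized WriteEntry begin() {
    WriteEntry e = new WriteEntry();
    writeQueue.add(e);
    return e;
  }

  // Mark complete *before* waiting, then wait until some handler
  // (possibly this one) has shifted the entry out of the queue.
  synchronized void finish(WriteEntry e) throws InterruptedException {
    e.complete = true;
    advanceCompletedPrefix();
    while (writeQueue.contains(e)) { // fast containment via LinkedHashSet
      wait();
    }
  }

  // Shift every completed entry at the head in one pass; a single
  // notifyAll then covers all of them instead of one wakeup per entry.
  private void advanceCompletedPrefix() {
    Iterator<WriteEntry> it = writeQueue.iterator();
    boolean removed = false;
    while (it.hasNext() && it.next().complete) {
      it.remove();
      removed = true;
    }
    if (removed) {
      notifyAll();
    }
  }
}
{code}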



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15392) Single Cell Get reads two HFileBlocks

2016-03-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15181670#comment-15181670
 ] 

Hadoop QA commented on HBASE-15392:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 
0s {color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 
22s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 1s 
{color} | {color:green} master passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 47s 
{color} | {color:green} master passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 5m 
17s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
21s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
42s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 46s 
{color} | {color:green} master passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 51s 
{color} | {color:green} master passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
4s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 58s 
{color} | {color:green} the patch passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 58s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 46s 
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 46s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 5m 
25s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
19s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
27m 40s {color} | {color:green} Patch does not cause any errors with Hadoop 
2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
11s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 29s 
{color} | {color:green} the patch passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 35s 
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 113m 59s 
{color} | {color:green} hbase-server in the patch passed with JDK v1.8.0. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 111m 34s 
{color} | {color:green} hbase-server in the patch passed with JDK v1.7.0_79. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
15s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 281m 45s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12791609/15392v2.wip.patch |
| JIRA Issue | HBASE-15392 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  hadoopcheck  
hbaseanti  checkstyle  compile  |
| uname | Linux asf910.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git 

[jira] [Commented] (HBASE-15340) Partial row result of scan may return data violates the row-level transaction

2016-03-05 Thread Jianwei Cui (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15181663#comment-15181663
 ] 

Jianwei Cui commented on HBASE-15340:
-

{quote}
The solution of having a client aware readPnt will solve even that(?)
{quote}
It seems [HBASE-13099|https://issues.apache.org/jira/browse/HBASE-13099] has 
proposed such a solution: 
https://issues.apache.org/jira/browse/HBASE-13099?focusedCommentId=14337017&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14337017.
However, there are cases the solution can't cover (if I am not wrong). For 
example:
1. The client holds the readPoint when the scanner is created on serverA, and 
the client has read partial row data from serverA.
2. The region is moved to another serverB before the whole row is returned.
3. Before the client creates a new scanner for the row with the readPoint on 
serverB, new mutations are applied to the region, including deletes for the 
row, and a major compaction starts and completes.
The major compaction could delete the cells of the row, because the new server 
can't get a proper smallestReadPoint for the compaction before all ongoing scan 
requests have arrived. Then the client cannot read the remaining cells of the 
row after the compaction, which breaks per-row atomicity for the scan.

> Partial row result of scan may return data violates the row-level transaction 
> --
>
> Key: HBASE-15340
> URL: https://issues.apache.org/jira/browse/HBASE-15340
> Project: HBase
>  Issue Type: Bug
>  Components: Scanners, Transactions/MVCC
>Affects Versions: 2.0.0
>Reporter: Jianwei Cui
>
> There are cases where the region server will return a partial row result, such 
> as when the client sets a batch size for the scan or the configured size limit 
> is reached. In these situations, the client may return data to the application 
> that violates the row-level transaction. The following steps show the problem:
> {code}
> // assume there is a test table 'test_table' with one family 'F' and one 
> // region 'region'.
> // meanwhile there are two region servers 'rsA' and 'rsB'.
> 1. Let 'region' first be located on 'rsA' and put one row with two columns 
> 'c1' and 'c2' as:
> > put 'test_table', 'row', 'F:c1', 'value1', 'F:c2', 'value1'
> 2. Start a client to scan 'test_table' with scan.setBatch(1) and 
> scan.setCaching(1). The client will get one column, {column='F:c1', 
> value='value1'}, in the first rpc call after the scanner is created, and the 
> result will be returned to the application.
> 3. Before the client issues the next request, 'region' is moved to 'rsB', 
> which accepts another mutation of the two columns 'c1' and 'c2' as:
> > put 'test_table', 'row', 'F:c1', 'value2', 'F:c2', 'value2'
> 4. Then the client will receive a RegionMovedException when issuing the next 
> request and will retry opening the scanner on 'rsB'. The newly opened scanner 
> will have a higher mvcc than the old data, so it could read out the column 
> {column='F:c2', value='value2'} and return the result to the application.
> Therefore, the application will get data as:
> 'row'  column='F:c1', value='value1'
> 'row'  column='F:c2', value='value2'
> The returned data is combined from two different mutations and violates 
> the row-level transaction.
> {code}
> The reason is that the newly opened scanner after the region move will get a 
> different mvcc. I am not sure whether this result is by design for scans when 
> partial row results are allowed. However, such a row result, combined from 
> different transactions, may leave the application in an unexpected state.
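For completeness, the client-side setup from the steps above looks roughly 
like this (a sketch that only reproduces the scan settings, not the region 
move):

{code}
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;

public class PartialRowScan {
  public static void main(String[] args) throws Exception {
    try (Connection conn = ConnectionFactory.createConnection();
         Table table = conn.getTable(TableName.valueOf("test_table"))) {
      Scan scan = new Scan();
      scan.setBatch(1);   // at most one column per Result
      scan.setCaching(1); // one Result per RPC
      try (ResultScanner scanner = table.getScanner(scan)) {
        for (Result r : scanner) {
          // Each Result may be a partial row; after a region move the
          // remaining columns can come from a different mvcc snapshot.
          System.out.println(r);
        }
      }
    }
  }
}
{code}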



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15393) Enable table replication command will fail when parent znode is not /hbase(default) in peer cluster

2016-03-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15181661#comment-15181661
 ] 

Hadoop QA commented on HBASE-15393:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 
0s {color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 
29s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 32s 
{color} | {color:green} master passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 56s 
{color} | {color:green} master passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 7m 
7s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
30s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 
23s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 12s 
{color} | {color:green} master passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 57s 
{color} | {color:green} master passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
5s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 30s 
{color} | {color:green} the patch passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 30s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 6s 
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 6s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 6m 
36s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
29s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
26m 28s {color} | {color:green} Patch does not cause any errors with Hadoop 
2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 
47s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 16s 
{color} | {color:green} the patch passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 58s 
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 19s 
{color} | {color:green} hbase-client in the patch passed with JDK v1.8.0. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 118m 46s 
{color} | {color:green} hbase-server in the patch passed with JDK v1.8.0. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 18s 
{color} | {color:green} hbase-client in the patch passed with JDK v1.7.0_79. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 111m 0s {color} 
| {color:red} hbase-server in the patch failed with JDK v1.7.0_79. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
24s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 295m 48s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.7.0_79 Timed out junit tests | 
org.apache.hadoop.hbase.snapshot.TestMobFlushSnapshotFromClient |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12791606/HBASE-15393.v3.patch |
| JIRA Issue | HBASE-15393 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  hadoopcheck  

[jira] [Commented] (HBASE-15354) Use same criteria for clearing meta cache for all operations

2016-03-05 Thread Mikhail Antonov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15181655#comment-15181655
 ] 

Mikhail Antonov commented on HBASE-15354:
-

{noformat}
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-checkstyle-plugin:2.17:checkstyle (default-cli) 
on project hbase-testing-util: An error has occurred in Checkstyle report 
generation. Failed during checkstyle execution: Unable to process suppressions 
file location: hbase/checkstyle-suppressions.xml: Cannot create file-based 
resource: invalid distance too far back -> [Help 1]
{noformat}

Doesn't seem legit.

> Use same criteria for clearing meta cache for all operations
> 
>
> Key: HBASE-15354
> URL: https://issues.apache.org/jira/browse/HBASE-15354
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 2.0.0, 1.2.0, 1.3.0
>Reporter: Ashu Pachauri
>Assignee: Ashu Pachauri
> Attachments: HBASE-15354-V0.patch, HBASE-15354-V1.patch, 
> HBASE-15354-V2.patch, HBASE-15354-V3.patch, HBASE-15354-V4.patch, 
> HBASE-15354-V5.patch
>
>
> Currently we do not clear/update the meta cache for some special exceptions if 
> the operation went through AsyncProcess#submit, as HTable#put calls do. But we 
> clear the meta cache without checking for these special exceptions for other 
> operations like gets, deletes, etc., because they go directly through 
> RpcRetryingCaller#callWithRetries instead of the AsyncProcess.
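As a hypothetical sketch of what a single shared criterion could look like 
(names invented for illustration; the attached patches define the real one), 
every failure path would consult one predicate before clearing cached 
locations:

{code}
import java.io.IOException;
import org.apache.hadoop.hbase.exceptions.RegionMovedException;

// Invented-for-illustration predicate: both the AsyncProcess path and
// the RpcRetryingCaller path would call this before touching the cache.
public final class MetaCacheCriterionSketch {
  private MetaCacheCriterionSketch() {}

  static boolean shouldClearMetaCache(Throwable t) {
    // A RegionMovedException carries the new location, so the cache can
    // be updated in place rather than cleared.
    if (t instanceof RegionMovedException) {
      return false;
    }
    // For other IO failures, drop the cached location and re-look-up.
    return t instanceof IOException;
  }
}
{code}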



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-15402) Add on thrift server that uses cpp client.

2016-03-05 Thread Elliott Clark (JIRA)
Elliott Clark created HBASE-15402:
-

 Summary: Add on thrift server that uses cpp client.
 Key: HBASE-15402
 URL: https://issues.apache.org/jira/browse/HBASE-15402
 Project: HBase
  Issue Type: Sub-task
Reporter: Elliott Clark






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-15396) Enhance mapreduce.TableSplit to add encoded region name

2016-03-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15181629#comment-15181629
 ] 

Hadoop QA commented on HBASE-15396:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 
0s {color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 
57s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 40s 
{color} | {color:green} master passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 31s 
{color} | {color:green} master passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 4m 
27s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
16s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
51s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s 
{color} | {color:green} master passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 33s 
{color} | {color:green} master passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
41s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 50s 
{color} | {color:green} the patch passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 50s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 32s 
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 32s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 4m 
22s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
15s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
24m 37s {color} | {color:green} Patch does not cause any errors with Hadoop 
2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 2s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s 
{color} | {color:green} the patch passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 32s 
{color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 85m 12s {color} 
| {color:red} hbase-server in the patch failed with JDK v1.8.0. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 89m 19s 
{color} | {color:green} hbase-server in the patch passed with JDK v1.7.0_79. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
14s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 221m 5s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0 Timed out junit tests | 
org.apache.hadoop.hbase.mapred.TestTableInputFormat |
|   | org.apache.hadoop.hbase.namespace.TestNamespaceAuditor |
|   | org.apache.hadoop.hbase.TestMultiVersions |
|   | 
org.apache.hadoop.hbase.replication.regionserver.TestRegionReplicaReplicationEndpoint
 |
|   | org.apache.hadoop.hbase.mapred.TestTableMapReduceUtil |
|   | org.apache.hadoop.hbase.TestZooKeeper |
|   | org.apache.hadoop.hbase.mob.compactions.TestPartitionedMobCompactor |
|   | org.apache.hadoop.hbase.ipc.TestRpcClientLeaks |
|   | org.apache.hadoop.hbase.mob.mapreduce.TestMobSweepReducer |
|   | org.apache.hadoop.hbase.io.encoding.TestDataBlockEncoders |
|   | 

[jira] [Created] (HBASE-15401) Add Zookeeper to third party

2016-03-05 Thread Elliott Clark (JIRA)
Elliott Clark created HBASE-15401:
-

 Summary: Add Zookeeper to third party
 Key: HBASE-15401
 URL: https://issues.apache.org/jira/browse/HBASE-15401
 Project: HBase
  Issue Type: Sub-task
Reporter: Elliott Clark






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HBASE-14854) Read meta location from zk

2016-03-05 Thread Elliott Clark (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elliott Clark updated HBASE-14854:
--
Attachment: HBASE-14854.patch

> Read meta location from zk
> --
>
> Key: HBASE-14854
> URL: https://issues.apache.org/jira/browse/HBASE-14854
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Elliott Clark
>Assignee: Elliott Clark
> Attachments: HBASE-14854.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)