[jira] [Commented] (HBASE-16466) HBase snapshots support in VerifyReplication tool to reduce load on live HBase cluster with large tables

2017-07-28 Thread Zheng Hu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-16466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104582#comment-16104582
 ] 

Zheng Hu commented on HBASE-16466:
--

[~sukuna...@gmail.com], I have one question about the test you provided. Did 
you run the MR job with the source HBase cluster, the peer HBase cluster, and 
the YARN cluster all on the same HDFS cluster?

It seems that when the source HBase cluster, the peer HBase cluster, and the 
YARN cluster are located on three different HDFS clusters, there is a problem.

When restoring the snapshot into the tmpdir, we need to create the region with 
the following code (HRegion#createHRegion):
{code}
  public static HRegion createHRegion(final HRegionInfo info, final Path rootDir,
      final Configuration conf, final TableDescriptor hTableDescriptor,
      final WAL wal, final boolean initialize)
      throws IOException {
    LOG.info("creating HRegion " + info.getTable().getNameAsString()
        + " HTD == " + hTableDescriptor + " RootDir = " + rootDir +
        " Table name == " + info.getTable().getNameAsString());
    // <--- Here the code uses the fs.defaultFS configuration to create the region.
    FileSystem fs = FileSystem.get(conf);
    Path tableDir = FSUtils.getTableDir(rootDir, info.getTable());
    HRegionFileSystem.createRegionOnFileSystem(conf, fs, tableDir, info);
    HRegion region = HRegion.newHRegion(tableDir, wal, fs, conf, info,
        hTableDescriptor, null);
    if (initialize) region.initialize(null);
    return region;
  }
{code}

When the source cluster and the peer cluster are located on two different file 
systems, their fs.defaultFS values differ, so at least one cluster will fail 
when restoring the snapshot into the tmpdir. After I added the following fix, 
it works fine for me.

{code}
-  FileSystem fs = FileSystem.get(conf);  
+ FileSystem fs = rootDir.getFileSystem(conf);
{code}
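
The difference between the two calls can be illustrated without Hadoop on the classpath. The sketch below is a simplified model, not Hadoop code: the class FsResolution, its methods, and the host names are hypothetical. It assumes that FileSystem.get(conf) always resolves to the configured fs.defaultFS, while Path#getFileSystem(conf) uses the path's own scheme and authority and falls back to the default only for an unqualified path.

```java
import java.net.URI;

/** Simplified model of how a file system is chosen; not Hadoop code. */
final class FsResolution {
    /** Models FileSystem.get(conf): always the configured default file system. */
    static String defaultFs(URI defaultFsUri) {
        return defaultFsUri.getScheme() + "://" + defaultFsUri.getAuthority();
    }

    /** Models rootDir.getFileSystem(conf): the path's own scheme and authority win. */
    static String fsForPath(URI path, URI defaultFsUri) {
        if (path.getScheme() == null || path.getAuthority() == null) {
            // Unqualified path such as /hbase/tmp: fall back to the default.
            return defaultFs(defaultFsUri);
        }
        return path.getScheme() + "://" + path.getAuthority();
    }

    public static void main(String[] args) {
        // Hypothetical fs.defaultFS seen by the MR job on the YARN cluster.
        URI yarnDefault = URI.create("hdfs://yarn-nn:8020");
        // Hypothetical restore directory that lives on the peer cluster's HDFS.
        URI peerRootDir = URI.create("hdfs://peer-nn:8020/hbase/tmp");

        System.out.println(defaultFs(yarnDefault));              // hdfs://yarn-nn:8020
        System.out.println(fsForPath(peerRootDir, yarnDefault)); // hdfs://peer-nn:8020
    }
}
```

In this model the first call would try to create the region on the YARN cluster's file system, while the second targets the file system that actually holds rootDir, which is the behaviour the one-line fix relies on.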

Looking forward to your reply. Thanks.


> HBase snapshots support in VerifyReplication tool to reduce load on live 
> HBase cluster with large tables
> 
>
> Key: HBASE-16466
> URL: https://issues.apache.org/jira/browse/HBASE-16466
> Project: HBase
>  Issue Type: Improvement
>  Components: hbase
>Affects Versions: 0.98.21
>Reporter: Sukumar Maddineni
>Assignee: Maddineni Sukumar
> Fix For: 2.0.0
>
> Attachments: HBASE-16466.branch-1.3.001.patch, HBASE-16466.v1.patch, 
> HBASE-16466.v2.patch, HBASE-16466.v3.patch, HBASE-16466.v4.patch, 
> HBASE-16466.v5.patch
>
>
> As of now, the VerifyReplication tool runs using normal HBase scanners. If 
> you want to run VerifyReplication multiple times on a live production 
> cluster with large tables, it creates extra load on the HBase layer. If we 
> implement snapshot-based support, then on both source and target we can read 
> data from snapshots, which reduces load on HBase.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (HBASE-18469) Correct RegionServer metric of totalRequestCount

2017-07-28 Thread Shibin Zhang (JIRA)
Shibin Zhang created HBASE-18469:


 Summary: Correct  RegionServer metric of  totalRequestCount
 Key: HBASE-18469
 URL: https://issues.apache.org/jira/browse/HBASE-18469
 Project: HBase
  Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Shibin Zhang
Priority: Minor


When I fetched the metrics, I found that the following three metrics may be 
wrong:

"totalRequestCount" : 17541,
"readRequestCount" : 17483,
"writeRequestCount" : 1633,

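The report does not spell out what is wrong with these values. One possible reading, which is an assumption here and not stated in the issue, is that totalRequestCount is expected to be at least readRequestCount plus writeRequestCount. The sketch below checks the reported numbers against that expectation; RequestCountCheck and excessOverTotal are hypothetical names.

```java
/** Hypothetical helper; the expectation it encodes is an assumption, not from the issue. */
final class RequestCountCheck {
    /** Returns how far read + write exceeds total (positive means total looks too small). */
    static long excessOverTotal(long total, long read, long write) {
        return read + write - total;
    }

    public static void main(String[] args) {
        // Values copied from the metrics quoted in the report.
        long excess = excessOverTotal(17541L, 17483L, 1633L);
        System.out.println("read + write = " + (17483L + 1633L)); // 19116
        System.out.println("excess over total = " + excess);      // 1575
    }
}
```

Under that assumption, the reported total falls 1575 requests short of the sum of reads and writes.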










[jira] [Updated] (HBASE-15134) Add visibility into Flush and Compaction queues

2017-07-28 Thread Abhishek Singh Chouhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Singh Chouhan updated HBASE-15134:
---
Attachment: HBASE-15134.branch-1.001.patch

Fixed trailing whitespace.

> Add visibility into Flush and Compaction queues
> ---
>
> Key: HBASE-15134
> URL: https://issues.apache.org/jira/browse/HBASE-15134
> Project: HBase
>  Issue Type: New Feature
>  Components: Compaction, metrics, regionserver
>Reporter: Elliott Clark
>Assignee: Abhishek Singh Chouhan
> Fix For: 2.0.0
>
> Attachments: HBASE-15134.branch-1.001.patch, 
> HBASE-15134.branch-1.001.patch, HBASE-15134.master.001.patch, 
> HBASE-15134.master.002.patch, HBASE-15134.master.003.patch, 
> HBASE-15134.patch, HBASE-15134.patch
>
>
> On busy spurts we can see regionservers start to see large queues for 
> compaction. It's really hard to tell if the server is queueing a lot of 
> compactions for the same region, lots of compactions for lots of regions, or 
> just falling behind.
> For flushes much the same. There can be flushes in queue that aren't being 
> run because of delayed flushes. There's no way to know from the metrics how 
> many flushes are for each region, how many are delayed. Etc.
> We should add either more metrics around this ( num per region, max per 
> region, min per region ) or add on a UI page that has the list of compactions 
> and flushes.
> Or both.





[jira] [Commented] (HBASE-15134) Add visibility into Flush and Compaction queues

2017-07-28 Thread Abhishek Singh Chouhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104606#comment-16104606
 ] 

Abhishek Singh Chouhan commented on HBASE-15134:


Pushed to branch-1.4+.



[jira] [Commented] (HBASE-15134) Add visibility into Flush and Compaction queues

2017-07-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104608#comment-16104608
 ] 

Hadoop QA commented on HBASE-15134:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  8s{color} | {color:red} HBASE-15134 does not apply to branch-1. Rebase required? Wrong Branch? See https://yetus.apache.org/documentation/0.4.0/precommit-patchnames for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HBASE-15134 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12879310/HBASE-15134.branch-1.001.patch |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/7823/console |
| Powered by | Apache Yetus 0.4.0   http://yetus.apache.org |


This message was automatically generated.





[jira] [Work started] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request

2017-07-28 Thread nihed mbarek (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HBASE-18451 started by nihed mbarek.

> PeriodicMemstoreFlusher should inspect the queue before adding a delayed 
> flush request
> --
>
> Key: HBASE-18451
> URL: https://issues.apache.org/jira/browse/HBASE-18451
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 2.0.0-alpha-1
>Reporter: Jean-Marc Spaggiari
>Assignee: nihed mbarek
>
> If you run a big job every 4 hours, impacting many tables (they have 150 
> regions per server), at the end all the regions might have some data to be 
> flushed, and we want, after one hour, to trigger a periodic flush. That's 
> totally fine.
> Now, to avoid a flush storm, when we detect a region to be flushed, we add a 
> "randomDelay" to the delayed flush, so that the flushes are spread out.
> RANGE_OF_DELAY is 5 minutes. So we spread the flushes over the next 5 
> minutes, which is very good.
> However, because we don't check whether there is already a request in the 
> queue, 10 seconds later we create a new request, with a new randomDelay.
> If you generate a randomDelay every 10 seconds, at some point you will end 
> up with a small one, and the flush will be triggered almost immediately.
> As a result, instead of spreading all the flushes over the next 5 minutes, 
> you end up getting them all much more quickly, for example within the first 
> minute. This not only fills the queue with too many flush requests, but also 
> defeats the purpose of the randomDelay.
> {code}
> @Override
> protected void chore() {
>   final StringBuffer whyFlush = new StringBuffer();
>   for (Region r : this.server.onlineRegions.values()) {
>     if (r == null) continue;
>     if (((HRegion)r).shouldFlush(whyFlush)) {
>       FlushRequester requester = server.getFlushRequester();
>       if (requester != null) {
>         long randomDelay = RandomUtils.nextInt(RANGE_OF_DELAY) + MIN_DELAY_TIME;
>         LOG.info(getName() + " requesting flush of " +
>           r.getRegionInfo().getRegionNameAsString() + " because " +
>           whyFlush.toString() +
>           " after random delay " + randomDelay + "ms");
>         //Throttle the flushes by putting a delay. If we don't throttle, and there
>         //is a balanced write-load on the regions in a table, we might end up
>         //overwhelming the filesystem with too many flushes at once.
>         requester.requestDelayedFlush(r, randomDelay, false);
>       }
>     }
>   }
> }
> {code}
> {code}
> 2017-07-24 18:44:33,338 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 270785ms
> 2017-07-24 18:44:43,328 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 200143ms
> 2017-07-24 18:44:53,954 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 191082ms
> 2017-07-24 18:45:03,528 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 92532ms
> 2017-07-24 18:45:14,201 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 238780ms
> 2017-07-24 18:45:24,195 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 35390ms
> 2017-07-24 18:45:33,362 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. 

[jira] [Updated] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request

2017-07-28 Thread nihed mbarek (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nihed mbarek updated HBASE-18451:
-
Attachment: 0001-HBASE-18451-PeriodicMemstoreFlusher-should-inspect-t.patch

Patch provided, with a code refactoring so that requestDelayedFlush and 
requestFlush return a boolean.
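
The patch itself is not shown in this message. As a rough illustration of the idea only, the sketch below shows one way a boolean return could let the caller see that a region already has a flush queued, so that no new randomDelay replaces the first one. PendingFlushQueue and its methods are hypothetical and are not the HBase FlushRequester API; the region is identified by a plain string for simplicity.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a flush requester that rejects duplicate requests. */
final class PendingFlushQueue {
    // Region name -> delay (ms) of the request that is already queued.
    private final Map<String, Long> pending = new HashMap<>();

    /** Queues a delayed flush; returns false if the region already has one queued. */
    synchronized boolean requestDelayedFlush(String regionName, long delayMs) {
        return pending.putIfAbsent(regionName, delayMs) == null;
    }

    /** Called once the flush has run, so that later requests are accepted again. */
    synchronized void flushCompleted(String regionName) {
        pending.remove(regionName);
    }

    /** Delay of the queued request, or null if none is queued. */
    synchronized Long queuedDelay(String regionName) {
        return pending.get(regionName);
    }

    public static void main(String[] args) {
        PendingFlushQueue queue = new PendingFlushQueue();
        // Delays taken from the log excerpt in the issue description.
        long[] delays = {270785L, 200143L, 191082L, 92532L, 238780L, 35390L};
        for (long delay : delays) {
            boolean accepted = queue.requestDelayedFlush("testflush", delay);
            System.out.println(delay + "ms accepted=" + accepted);
        }
        System.out.println("queued delay: " + queue.queuedDelay("testflush") + "ms");
    }
}
```

With the six delays from the log, only the first request (270785ms) is accepted and the later, shorter delays are rejected, so in this model the flush would stay scheduled at its original delay.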


[jira] [Updated] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request

2017-07-28 Thread nihed mbarek (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nihed mbarek updated HBASE-18451:
-
Status: Patch Available  (was: In Progress)


[jira] [Updated] (HBASE-15134) Add visibility into Flush and Compaction queues

2017-07-28 Thread Abhishek Singh Chouhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Singh Chouhan updated HBASE-15134:
---
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: (was: 2.0.0)
   2.0.0-alpha-2
   1.5.0
   1.4.0
   3.0.0
   Status: Resolved  (was: Patch Available)



[jira] [Comment Edited] (HBASE-15134) Add visibility into Flush and Compaction queues

2017-07-28 Thread Abhishek Singh Chouhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104606#comment-16104606
 ] 

Abhishek Singh Chouhan edited comment on HBASE-15134 at 7/28/17 8:08 AM:
-

Pushed to branch-1.4+. Thanks for the reviews !!


was (Author: abhishek.chouhan):
Pushed to branch-1.4+.



[jira] [Commented] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request

2017-07-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104618#comment-16104618
 ] 

Hadoop QA commented on HBASE-18451:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  4s{color} | {color:red} HBASE-18451 does not apply to master. Rebase required? Wrong Branch? See https://yetus.apache.org/documentation/0.4.0/precommit-patchnames for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HBASE-18451 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12879311/0001-HBASE-18451-PeriodicMemstoreFlusher-should-inspect-t.patch |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/7824/console |
| Powered by | Apache Yetus 0.4.0   http://yetus.apache.org |


This message was automatically generated.




[jira] [Commented] (HBASE-15134) Add visibility into Flush and Compaction queues

2017-07-28 Thread Abhishek Singh Chouhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104623#comment-16104623
 ] 

Abhishek Singh Chouhan commented on HBASE-15134:


Hadoop QA ran after the patch with the whitespace fix had been reattached and 
pushed to the branch, hence it errored out.

> Add visibility into Flush and Compaction queues
> ---
>
> Key: HBASE-15134
> URL: https://issues.apache.org/jira/browse/HBASE-15134
> Project: HBase
>  Issue Type: New Feature
>  Components: Compaction, metrics, regionserver
>Reporter: Elliott Clark
>Assignee: Abhishek Singh Chouhan
> Fix For: 3.0.0, 1.4.0, 1.5.0, 2.0.0-alpha-2
>
> Attachments: HBASE-15134.branch-1.001.patch, 
> HBASE-15134.branch-1.001.patch, HBASE-15134.master.001.patch, 
> HBASE-15134.master.002.patch, HBASE-15134.master.003.patch, 
> HBASE-15134.patch, HBASE-15134.patch
>
>
> During busy spurts, regionservers can build up large compaction queues. 
> It's really hard to tell if the server is queueing a lot of 
> compactions for the same region, lots of compactions for lots of regions, or 
> just falling behind.
> The same goes for flushes. There can be flushes in the queue that aren't being 
> run because they are delayed. There's no way to know from the metrics how 
> many flushes there are for each region, how many are delayed, etc.
> We should add either more metrics around this (num per region, max per 
> region, min per region) or a UI page that lists the compactions 
> and flushes.
> Or both.
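
The per-region numbers asked for above (num per region, max per region, min per region) could be derived roughly as follows. This is only a sketch under assumptions: the class `QueueStatsSketch` and its methods are hypothetical, the queue is modelled as a plain list of region names, and no HBase metrics API is used.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper; models a flush or compaction queue as a list of region names. */
class QueueStatsSketch {

  /** Number of queued requests for each region. */
  static Map<String, Integer> countPerRegion(List<String> queuedRegionNames) {
    Map<String, Integer> counts = new TreeMap<>();
    for (String region : queuedRegionNames) {
      counts.merge(region, 1, Integer::sum);
    }
    return counts;
  }

  /** Largest number of queued requests for a single region, or 0 for an empty queue. */
  static int maxPerRegion(Map<String, Integer> counts) {
    return counts.isEmpty() ? 0 : Collections.max(counts.values());
  }

  /** Smallest number of queued requests among regions present in the queue, or 0 if empty. */
  static int minPerRegion(Map<String, Integer> counts) {
    return counts.isEmpty() ? 0 : Collections.min(counts.values());
  }
}
```

A large max with few distinct regions would point to many requests for the same region, while many regions with a count of one each would point to the server falling behind overall.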



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request

2017-07-28 Thread nihed mbarek (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nihed mbarek updated HBASE-18451:
-
Status: Open  (was: Patch Available)

> PeriodicMemstoreFlusher should inspect the queue before adding a delayed 
> flush request
> --
>
> Key: HBASE-18451
> URL: https://issues.apache.org/jira/browse/HBASE-18451
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 2.0.0-alpha-1
>Reporter: Jean-Marc Spaggiari
>Assignee: nihed mbarek
> Attachments: 
> 0001-HBASE-18451-PeriodicMemstoreFlusher-should-inspect-t.patch
>
>
> If you run a big job every 4 hours, impacting many tables (they have 150 
> regions per server), at the end all the regions might have some data to be 
> flushed, and we want, after one hour, to trigger a periodic flush. That's 
> totally fine.
> Now, to avoid a flush storm, when we detect a region to be flushed, we add a 
> "randomDelay" to the delayed flush; that way we spread them out.
> RANGE_OF_DELAY is 5 minutes. So we spread the flushes over the next 5 minutes, 
> which is very good.
> However, because we don't check if there is already a request in the queue, 
> 10 seconds later we create a new request, with a new randomDelay.
> If you generate a randomDelay every 10 seconds, at some point you will end 
> up having a small one, and the flush will be triggered almost immediately.
> As a result, instead of spreading all the flushes over the next 5 minutes, 
> you end up getting them all much more quickly, like within the first minute. 
> This not only feeds the queue with too many flush requests, but also defeats the 
> purpose of the randomDelay.
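
The effect described above can be reproduced with a small stand-alone model. This is only a sketch under stated assumptions: the class `FlushDelayModel` and its methods are hypothetical, the 10-second chore period and the 5-minute range come from the description, MIN_DELAY_TIME is taken as 0 for simplicity, and no HBase class is involved.

```java
import java.util.Random;

/** Hypothetical stand-alone model of the periodic flush chore; not HBase code. */
class FlushDelayModel {
  static final int CHORE_PERIOD_MS = 10_000;        // chore runs every 10 seconds
  static final int RANGE_OF_DELAY_MS = 5 * 60_000;  // 5 minutes

  /**
   * Time (ms after the first chore run) at which the region is flushed when
   * every chore run adds a new request with a fresh random delay.
   */
  static long flushTimeWithoutCheck(Random rnd) {
    long earliest = Long.MAX_VALUE;
    // A request added at tick t fires at t + delay; the first one to fire wins.
    // Once t reaches the earliest firing time the flush has already happened.
    for (long t = 0; t < earliest; t += CHORE_PERIOD_MS) {
      earliest = Math.min(earliest, t + rnd.nextInt(RANGE_OF_DELAY_MS));
    }
    return earliest;
  }

  /** Same, but later chore runs see the queued request and add nothing. */
  static long flushTimeWithCheck(Random rnd) {
    return rnd.nextInt(RANGE_OF_DELAY_MS);
  }

  /** Average flush time over many simulated regions. */
  static double average(boolean checkQueue, int regions, long seed) {
    Random rnd = new Random(seed);
    long sum = 0;
    for (int i = 0; i < regions; i++) {
      sum += checkQueue ? flushTimeWithCheck(rnd) : flushTimeWithoutCheck(rnd);
    }
    return (double) sum / regions;
  }

  public static void main(String[] args) {
    System.out.printf("avg flush time without queue check: %.0f ms%n",
        average(false, 10_000, 42L));
    System.out.printf("avg flush time with queue check:    %.0f ms%n",
        average(true, 10_000, 42L));
  }
}
```

With the queue check, the flush time is a single uniform draw over the 5-minute range. Without it, the flush time is the minimum over several draws, so it can only be earlier.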
> {code}
> @Override
> protected void chore() {
>   final StringBuffer whyFlush = new StringBuffer();
>   for (Region r : this.server.onlineRegions.values()) {
>     if (r == null) continue;
>     if (((HRegion)r).shouldFlush(whyFlush)) {
>       FlushRequester requester = server.getFlushRequester();
>       if (requester != null) {
>         long randomDelay = RandomUtils.nextInt(RANGE_OF_DELAY) + MIN_DELAY_TIME;
>         LOG.info(getName() + " requesting flush of " +
>           r.getRegionInfo().getRegionNameAsString() + " because " +
>           whyFlush.toString() +
>           " after random delay " + randomDelay + "ms");
>         //Throttle the flushes by putting a delay. If we don't throttle, and there
>         //is a balanced write-load on the regions in a table, we might end up
>         //overwhelming the filesystem with too many flushes at once.
>         requester.requestDelayedFlush(r, randomDelay, false);
>       }
>     }
>   }
> }
> {code}
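
One way to "inspect the queue before adding a delayed flush request", as the issue title puts it, is to remember which regions already have a pending request. The following is a minimal sketch, not the attached patch: `DedupFlushQueue` and its methods are hypothetical names, regions are identified by a plain string, and the real FlushRequester interface is not used.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical queue wrapper; names are illustrative and not the HBase API. */
class DedupFlushQueue {
  // region name -> absolute time (ms) at which the pending flush is due
  private final Map<String, Long> dueTimeByRegion = new HashMap<>();

  /** Queues a delayed flush unless one is already pending; returns true if it was added. */
  synchronized boolean requestDelayedFlush(String regionName, long nowMs, long delayMs) {
    return dueTimeByRegion.putIfAbsent(regionName, nowMs + delayMs) == null;
  }

  /** Due time of the pending request, or null if none is queued. */
  synchronized Long dueTime(String regionName) {
    return dueTimeByRegion.get(regionName);
  }

  /** Called once the flush has run, so that a later periodic check may queue again. */
  synchronized void markFlushed(String regionName) {
    dueTimeByRegion.remove(regionName);
  }
}
```

With such a check, the first randomDelay chosen for a region stays in effect until that region is flushed, and the chore would only need to log when a request is actually added.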
> {code}
> 2017-07-24 18:44:33,338 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 270785ms
> 2017-07-24 18:44:43,328 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 200143ms
> 2017-07-24 18:44:53,954 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 191082ms
> 2017-07-24 18:45:03,528 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 92532ms
> 2017-07-24 18:45:14,201 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 238780ms
> 2017-07-24 18:45:24,195 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 35390ms
> 2017-07-24 18:45:33,362 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,15009

[jira] [Updated] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request

2017-07-28 Thread nihed mbarek (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nihed mbarek updated HBASE-18451:
-
Attachment: 0001-HBASE-15134-Add-visibility-into-Flush-and-Compaction.patch

> PeriodicMemstoreFlusher should inspect the queue before adding a delayed 
> flush request
> --
>
> Key: HBASE-18451
> URL: https://issues.apache.org/jira/browse/HBASE-18451
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 2.0.0-alpha-1
>Reporter: Jean-Marc Spaggiari
>Assignee: nihed mbarek
> Attachments: 
> 0001-HBASE-15134-Add-visibility-into-Flush-and-Compaction.patch, 
> 0001-HBASE-18451-PeriodicMemstoreFlusher-should-inspect-t.patch
>
>

[jira] [Commented] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request

2017-07-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104649#comment-16104649
 ] 

Hadoop QA commented on HBASE-18451:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  5s{color} | {color:red} HBASE-18451 does not apply to master. Rebase required? Wrong Branch? See https://yetus.apache.org/documentation/0.4.0/precommit-patchnames for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HBASE-18451 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12879313/0001-HBASE-15134-Add-visibility-into-Flush-and-Compaction.patch |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/7825/console |
| Powered by | Apache Yetus 0.4.0   http://yetus.apache.org |


This message was automatically generated.



> PeriodicMemstoreFlusher should inspect the queue before adding a delayed 
> flush request
> --
>
> Key: HBASE-18451
> URL: https://issues.apache.org/jira/browse/HBASE-18451
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 2.0.0-alpha-1
>Reporter: Jean-Marc Spaggiari
>Assignee: nihed mbarek
> Attachments: 
> 0001-HBASE-15134-Add-visibility-into-Flush-and-Compaction.patch, 
> 0001-HBASE-18451-PeriodicMemstoreFlusher-should-inspect-t.patch
>
>

[jira] [Updated] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request

2017-07-28 Thread nihed mbarek (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nihed mbarek updated HBASE-18451:
-
Status: Patch Available  (was: Open)

> PeriodicMemstoreFlusher should inspect the queue before adding a delayed 
> flush request
> --
>
> Key: HBASE-18451
> URL: https://issues.apache.org/jira/browse/HBASE-18451
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 2.0.0-alpha-1
>Reporter: Jean-Marc Spaggiari
>Assignee: nihed mbarek
> Attachments: 
> 0001-HBASE-15134-Add-visibility-into-Flush-and-Compaction.patch, 
> 0001-HBASE-18451-PeriodicMemstoreFlusher-should-inspect-t.patch
>
>

[jira] [Updated] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request

2017-07-28 Thread nihed mbarek (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nihed mbarek updated HBASE-18451:
-
Status: In Progress  (was: Patch Available)

> PeriodicMemstoreFlusher should inspect the queue before adding a delayed 
> flush request
> --
>
> Key: HBASE-18451
> URL: https://issues.apache.org/jira/browse/HBASE-18451
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 2.0.0-alpha-1
>Reporter: Jean-Marc Spaggiari
>Assignee: nihed mbarek
>

[jira] [Updated] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request

2017-07-28 Thread nihed mbarek (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nihed mbarek updated HBASE-18451:
-
Attachment: (was: 0001-HBASE-18451-PeriodicMemstoreFlusher-should-inspect-t.patch)

> PeriodicMemstoreFlusher should inspect the queue before adding a delayed 
> flush request
> --
>
> Key: HBASE-18451
> URL: https://issues.apache.org/jira/browse/HBASE-18451
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 2.0.0-alpha-1
>Reporter: Jean-Marc Spaggiari
>Assignee: nihed mbarek
>

[jira] [Updated] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request

2017-07-28 Thread nihed mbarek (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nihed mbarek updated HBASE-18451:
-
Attachment: (was: 0001-HBASE-15134-Add-visibility-into-Flush-and-Compaction.patch)

> PeriodicMemstoreFlusher should inspect the queue before adding a delayed 
> flush request
> --
>
> Key: HBASE-18451
> URL: https://issues.apache.org/jira/browse/HBASE-18451
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 2.0.0-alpha-1
>Reporter: Jean-Marc Spaggiari
>Assignee: nihed mbarek
>
> If you run a big job every 4 hours, impacting many tables (they have 150 
> regions per server), ad the end all the regions might have some data to be 
> flushed, and we want, after one hour, trigger a periodic flush. That's 
> totally fine.
> Now, to avoid a flush storm, when we detect a region to be flushed, we add a 
> "randomDelay" to the delayed flush, that way we spread them away.
> RANGE_OF_DELAY is 5 minutes. So we spread the flush over the next 5 minutes, 
> which is very good.
> However, because we don't check if there is already a request in the queue, 
> 10 seconds after, we create a new request, with a new randomDelay.
> If you generate a randomDelay every 10 seconds, at some point, you will end 
> up having a small one, and the flush will be triggered almost immediatly.
> As a result, instead of spreading all the flush within the next 5 minutes, 
> you end-up getting them all way more quickly. Like within the first minute. 
> Which not only feed the queue to to many flush requests, but also defeats the 
> purpose of the randomDelay.
> {code}
> @Override
> protected void chore() {
>   final StringBuffer whyFlush = new StringBuffer();
>   for (Region r : this.server.onlineRegions.values()) {
>     if (r == null) continue;
>     if (((HRegion)r).shouldFlush(whyFlush)) {
>       FlushRequester requester = server.getFlushRequester();
>       if (requester != null) {
>         long randomDelay = RandomUtils.nextInt(RANGE_OF_DELAY) + MIN_DELAY_TIME;
>         LOG.info(getName() + " requesting flush of " +
>           r.getRegionInfo().getRegionNameAsString() + " because " +
>           whyFlush.toString() +
>           " after random delay " + randomDelay + "ms");
>         //Throttle the flushes by putting a delay. If we don't throttle, and there
>         //is a balanced write-load on the regions in a table, we might end up
>         //overwhelming the filesystem with too many flushes at once.
>         requester.requestDelayedFlush(r, randomDelay, false);
>       }
>     }
>   }
> }
> {code}
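As a rough illustration of what "inspect the queue before adding a delayed flush request" could look like, here is a minimal, self-contained sketch. It is not the attached patch and does not use the real FlushRequester API; the class DedupingFlushScheduler and its methods are hypothetical names invented for this example. It assumes that remembering one pending request per region, and skipping further requests until that flush has run, is enough to keep the first randomDelay in force.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: remembers at most one pending delayed flush per region. */
public class DedupingFlushScheduler {
  // region name -> absolute time (ms) at which the queued flush is due
  private final Map<String, Long> pending = new HashMap<>();

  /**
   * Queues a delayed flush unless one is already pending for the region.
   * Returns true if a new request was queued, false if it was skipped.
   */
  public synchronized boolean requestDelayedFlush(String region, long nowMs, long delayMs) {
    if (pending.containsKey(region)) {
      return false; // keep the original delay instead of re-rolling it
    }
    pending.put(region, nowMs + delayMs);
    return true;
  }

  /** Returns the due time of the pending flush, or null if none is queued. */
  public synchronized Long dueTime(String region) {
    return pending.get(region);
  }

  /** Called once the flush has run, so a later chore pass may queue again. */
  public synchronized void flushCompleted(String region) {
    pending.remove(region);
  }

  public static void main(String[] args) {
    DedupingFlushScheduler s = new DedupingFlushScheduler();
    // Delays taken from the first two log lines of the report, 10 seconds apart.
    System.out.println(s.requestDelayedFlush("testflush", 0L, 270785L));     // prints true
    System.out.println(s.requestDelayedFlush("testflush", 10000L, 200143L)); // prints false
    System.out.println(s.dueTime("testflush"));                              // prints 270785
  }
}
```

With this kind of check, the second chore pass leaves the first request's delay untouched, so the flush stays spread over the RANGE_OF_DELAY window instead of converging on the smallest delay drawn so far.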
> {code}
> 2017-07-24 18:44:33,338 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 270785ms
> 2017-07-24 18:44:43,328 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 200143ms
> 2017-07-24 18:44:53,954 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 191082ms
> 2017-07-24 18:45:03,528 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 92532ms
> 2017-07-24 18:45:14,201 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 238780ms
> 2017-07-24 18:45:24,195 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 35390ms
> 2017-07-24 18:45:33,362 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 

[jira] [Commented] (HBASE-18446) Mark StoreFileScanner as IA.Private

2017-07-28 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104677#comment-16104677
 ] 

Duo Zhang commented on HBASE-18446:
---

OK, got it. It seems we need to introduce an interface for {{StoreFileReader}} 
if we want to hide the implementation details from CP users. A big refactoring...

> Mark StoreFileScanner as IA.Private
> ---
>
> Key: HBASE-18446
> URL: https://issues.apache.org/jira/browse/HBASE-18446
> Project: HBase
>  Issue Type: Sub-task
>  Components: Coprocessors
>Reporter: Duo Zhang
> Fix For: 2.0.0, 3.0.0, 2.0.0-alpha-2
>
>
> I do not see any reason why it is marked as IA.LimitedPrivate. It is not 
> referenced in any CPs.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request

2017-07-28 Thread nihed mbarek (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nihed mbarek updated HBASE-18451:
-
Attachment: ISSUE.patch


[jira] [Updated] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request

2017-07-28 Thread nihed mbarek (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nihed mbarek updated HBASE-18451:
-
Status: Patch Available  (was: In Progress)


[jira] [Commented] (HBASE-17131) Avoid livelock caused by HRegion#processRowsWithLocks

2017-07-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104679#comment-16104679
 ] 

Hudson commented on HBASE-17131:


SUCCESS: Integrated in Jenkins build HBase-1.2-JDK8 #170 (See 
[https://builds.apache.org/job/HBase-1.2-JDK8/170/])
HBASE-17131 Avoid livelock caused by HRegion#processRowsWithLocks (chia7712: 
rev 670e9431d40d35df4802bc0445012271ee904efc)
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide3.java


> Avoid livelock caused by HRegion#processRowsWithLocks
> -
>
> Key: HBASE-17131
> URL: https://issues.apache.org/jira/browse/HBASE-17131
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 2.0.0, 1.4.0, 1.3.1, 1.2.6
>Reporter: Chia-Ping Tsai
>Assignee: Chia-Ping Tsai
> Fix For: 2.0.0, 1.4.0, 1.3.2, 1.2.7
>
> Attachments: HBASE-17131.branch-1.2.v0.patch, 
> HBASE-17131.branch-1.3.v0.patch, HBASE-17131.branch-1.v0.patch, 
> HBASE-17131.v0.patch
>
>
> {code:title=HRegion.java|borderStyle=solid}
> try {
>   // STEP 2. Acquire the row lock(s)
>   acquiredRowLocks = new ArrayList(rowsToLock.size());
>   for (byte[] row : rowsToLock) {
>     // Attempt to lock all involved rows, throw if any lock times out
>     // use a writer lock for mixed reads and writes
>     acquiredRowLocks.add(getRowLockInternal(row, false));
>   }
>   // STEP 3. Region lock
>   lock(this.updatesLock.readLock(), acquiredRowLocks.size() == 0 ? 1 : acquiredRowLocks.size());
>   locked = true;
>   boolean success = false;
>   long now = EnvironmentEdgeManager.currentTime();
>   try {
> {code}
> We should lock all involved rows in the second try-finally. Otherwise, we 
> won’t release the previous locks if any subsequent lock times out.
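The release-on-failure pattern described above can be sketched outside HBase with plain java.util.concurrent locks. This is only an illustration under stated assumptions, not the committed change: the class LockAllOrNone and the method lockAll are hypothetical, and the sketch returns false on a timeout where the HRegion code throws. The point it shows is that the acquisition loop sits inside a try-finally, so locks taken before a timeout are released.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical sketch: acquire all locks, or release the ones already taken. */
public class LockAllOrNone {
  /**
   * Tries to acquire every lock, waiting up to timeoutMs for each one.
   * If any acquisition times out, the finally block releases the locks
   * acquired so far and the method returns false.
   */
  public static boolean lockAll(List<Lock> locks, long timeoutMs, List<Lock> acquired)
      throws InterruptedException {
    boolean success = false;
    try {
      for (Lock l : locks) {
        if (!l.tryLock(timeoutMs, TimeUnit.MILLISECONDS)) {
          return false; // timed out; earlier locks are released in finally
        }
        acquired.add(l);
      }
      success = true;
      return true;
    } finally {
      if (!success) {
        for (Lock l : acquired) {
          l.unlock();
        }
        acquired.clear();
      }
    }
  }

  public static void main(String[] args) throws Exception {
    ReentrantLock a = new ReentrantLock();
    ReentrantLock b = new ReentrantLock();
    // A helper thread takes b and never releases it, so our attempt on b times out.
    Thread holder = new Thread(b::lock);
    holder.start();
    holder.join();

    List<Lock> locks = Arrays.asList(a, b);
    List<Lock> acquired = new ArrayList<>();
    boolean ok = lockAll(locks, 50, acquired);
    // a was acquired first and must have been released again after b timed out.
    System.out.println(ok + " " + a.isLocked()); // prints "false false"
  }
}
```

If the loop were placed before the try block, as in the quoted snippet's structure, a timeout on the second lock would leave the first one held.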





[jira] [Commented] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request

2017-07-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104681#comment-16104681
 ] 

Hadoop QA commented on HBASE-18451:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  7s{color} | {color:red} HBASE-18451 does not apply to master. Rebase required? Wrong Branch? See https://yetus.apache.org/documentation/0.4.0/precommit-patchnames for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HBASE-18451 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12879318/ISSUE.patch |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/7826/console |
| Powered by | Apache Yetus 0.4.0   http://yetus.apache.org |


This message was automatically generated.




[jira] [Updated] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request

2017-07-28 Thread nihed mbarek (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nihed mbarek updated HBASE-18451:
-
Attachment: (was: ISSUE.patch)


[jira] [Updated] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request

2017-07-28 Thread nihed mbarek (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nihed mbarek updated HBASE-18451:
-
Status: Open  (was: Patch Available)


[jira] [Updated] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request

2017-07-28 Thread nihed mbarek (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nihed mbarek updated HBASE-18451:
-
Attachment: ISSUE.patch

> PeriodicMemstoreFlusher should inspect the queue before adding a delayed 
> flush request
> --
>
> Key: HBASE-18451
> URL: https://issues.apache.org/jira/browse/HBASE-18451
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 2.0.0-alpha-1
>Reporter: Jean-Marc Spaggiari
>Assignee: nihed mbarek
> Attachments: ISSUE.patch
>
>
> If you run a big job every 4 hours, impacting many tables (they have 150 
> regions per server), ad the end all the regions might have some data to be 
> flushed, and we want, after one hour, trigger a periodic flush. That's 
> totally fine.
> Now, to avoid a flush storm, when we detect a region to be flushed, we add a 
> "randomDelay" to the delayed flush, that way we spread them away.
> RANGE_OF_DELAY is 5 minutes. So we spread the flush over the next 5 minutes, 
> which is very good.
> However, because we don't check if there is already a request in the queue, 
> 10 seconds after, we create a new request, with a new randomDelay.
> If you generate a randomDelay every 10 seconds, at some point, you will end 
> up having a small one, and the flush will be triggered almost immediatly.
> As a result, instead of spreading all the flush within the next 5 minutes, 
> you end-up getting them all way more quickly. Like within the first minute. 
> Which not only feed the queue to to many flush requests, but also defeats the 
> purpose of the randomDelay.
> {code}
> @Override
> protected void chore() {
>   final StringBuffer whyFlush = new StringBuffer();
>   for (Region r : this.server.onlineRegions.values()) {
>     if (r == null) continue;
>     if (((HRegion)r).shouldFlush(whyFlush)) {
>       FlushRequester requester = server.getFlushRequester();
>       if (requester != null) {
>         long randomDelay = RandomUtils.nextInt(RANGE_OF_DELAY) + MIN_DELAY_TIME;
>         LOG.info(getName() + " requesting flush of " +
>           r.getRegionInfo().getRegionNameAsString() + " because " +
>           whyFlush.toString() +
>           " after random delay " + randomDelay + "ms");
>         //Throttle the flushes by putting a delay. If we don't throttle, and there
>         //is a balanced write-load on the regions in a table, we might end up
>         //overwhelming the filesystem with too many flushes at once.
>         requester.requestDelayedFlush(r, randomDelay, false);
>       }
>     }
>   }
> }
> {code}
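
The issue title suggests inspecting the queue before adding a delayed flush request. The following is a minimal sketch of that idea only. It is not the attached patch and does not use HBase's FlushRequester API: `DedupingFlushScheduler`, `requestDelayedFlush(String)` and `flushCompleted(String)` are hypothetical names, and regions are represented by plain strings. The point is that the random delay is drawn once per region and kept until the flush has run.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical illustration: remember which regions already have a pending
// delayed flush, and skip new requests for them.
final class DedupingFlushScheduler {
    static final int RANGE_OF_DELAY_MS = 5 * 60 * 1000; // 5 minutes, per the description

    private final Set<String> pending = ConcurrentHashMap.newKeySet();

    /**
     * Returns the delay chosen for a newly queued flush, or -1 if the region
     * already has a pending request (in which case nothing is queued).
     */
    long requestDelayedFlush(String regionName) {
        if (!pending.add(regionName)) {
            return -1; // already queued: keep the original delay
        }
        return ThreadLocalRandom.current().nextInt(RANGE_OF_DELAY_MS);
    }

    /** Called once the flush has actually run, so the region can be queued again. */
    void flushCompleted(String regionName) {
        pending.remove(regionName);
    }

    public static void main(String[] args) {
        DedupingFlushScheduler scheduler = new DedupingFlushScheduler();
        System.out.println("first request:  " + scheduler.requestDelayedFlush("testflush"));
        System.out.println("second request: " + scheduler.requestDelayedFlush("testflush"));
        scheduler.flushCompleted("testflush");
        System.out.println("after flush:    " + scheduler.requestDelayedFlush("testflush"));
    }
}
```

With a guard like this, the chore can still run every 10 seconds, but only the first request for a region picks a delay, so the flushes stay spread over the full RANGE_OF_DELAY.
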
> {code}
> 2017-07-24 18:44:33,338 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 270785ms
> 2017-07-24 18:44:43,328 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 200143ms
> 2017-07-24 18:44:53,954 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 191082ms
> 2017-07-24 18:45:03,528 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 92532ms
> 2017-07-24 18:45:14,201 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 238780ms
> 2017-07-24 18:45:24,195 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f has an old edit so flush to free WALs after random delay 35390ms
> 2017-07-24 18:45:33,362 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting flush of testflush,,1500

[jira] [Updated] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request

2017-07-28 Thread nihed mbarek (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nihed mbarek updated HBASE-18451:
-
Status: Patch Available  (was: Open)

> PeriodicMemstoreFlusher should inspect the queue before adding a delayed 
> flush request
> --
>
> Key: HBASE-18451
> URL: https://issues.apache.org/jira/browse/HBASE-18451
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 2.0.0-alpha-1
>Reporter: Jean-Marc Spaggiari
>Assignee: nihed mbarek
> Attachments: ISSUE.patch

[jira] [Commented] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request

2017-07-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104694#comment-16104694
 ] 

Hadoop QA commented on HBASE-18451:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  5s{color} | {color:red} HBASE-18451 does not apply to master. Rebase required? Wrong Branch? See https://yetus.apache.org/documentation/0.4.0/precommit-patchnames for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HBASE-18451 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12879321/ISSUE.patch |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/7827/console |
| Powered by | Apache Yetus 0.4.0   http://yetus.apache.org |


This message was automatically generated.



> PeriodicMemstoreFlusher should inspect the queue before adding a delayed 
> flush request
> --
>
> Key: HBASE-18451
> URL: https://issues.apache.org/jira/browse/HBASE-18451
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 2.0.0-alpha-1
>Reporter: Jean-Marc Spaggiari
>Assignee: nihed mbarek
> Attachments: ISSUE.patch

[jira] [Updated] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request

2017-07-28 Thread nihed mbarek (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nihed mbarek updated HBASE-18451:
-
Attachment: HBASE-18451.master.patch

> PeriodicMemstoreFlusher should inspect the queue before adding a delayed 
> flush request
> --
>
> Key: HBASE-18451
> URL: https://issues.apache.org/jira/browse/HBASE-18451
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 2.0.0-alpha-1
>Reporter: Jean-Marc Spaggiari
>Assignee: nihed mbarek
> Attachments: HBASE-18451.master.patch

[jira] [Updated] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request

2017-07-28 Thread nihed mbarek (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nihed mbarek updated HBASE-18451:
-
Status: Patch Available  (was: Open)

> PeriodicMemstoreFlusher should inspect the queue before adding a delayed 
> flush request
> --
>
> Key: HBASE-18451
> URL: https://issues.apache.org/jira/browse/HBASE-18451
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 2.0.0-alpha-1
>Reporter: Jean-Marc Spaggiari
>Assignee: nihed mbarek
> Attachments: HBASE-18451.master.patch

[jira] [Updated] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request

2017-07-28 Thread nihed mbarek (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nihed mbarek updated HBASE-18451:
-
Status: Open  (was: Patch Available)

> PeriodicMemstoreFlusher should inspect the queue before adding a delayed 
> flush request
> --
>
> Key: HBASE-18451
> URL: https://issues.apache.org/jira/browse/HBASE-18451
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 2.0.0-alpha-1
>Reporter: Jean-Marc Spaggiari
>Assignee: nihed mbarek
> Attachments: HBASE-18451.master.patch

[jira] [Updated] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request

2017-07-28 Thread nihed mbarek (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nihed mbarek updated HBASE-18451:
-
Attachment: (was: ISSUE.patch)

> PeriodicMemstoreFlusher should inspect the queue before adding a delayed 
> flush request
> --
>
> Key: HBASE-18451
> URL: https://issues.apache.org/jira/browse/HBASE-18451
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 2.0.0-alpha-1
>Reporter: Jean-Marc Spaggiari
>Assignee: nihed mbarek
> Attachments: HBASE-18451.master.patch

[jira] [Updated] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request

2017-07-28 Thread nihed mbarek (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nihed mbarek updated HBASE-18451:
-
Status: Patch Available  (was: Open)

> PeriodicMemstoreFlusher should inspect the queue before adding a delayed 
> flush request
> --
>
> Key: HBASE-18451
> URL: https://issues.apache.org/jira/browse/HBASE-18451
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 2.0.0-alpha-1
>Reporter: Jean-Marc Spaggiari
>Assignee: nihed mbarek
> Attachments: HBASE-18451.master.patch
>
>
> If you run a big job every 4 hours, impacting many tables (they have 150
> regions per server), at the end all the regions might have some data to be
> flushed, and after one hour we want to trigger a periodic flush. That's
> totally fine.
> Now, to avoid a flush storm, when we detect a region to be flushed, we add a
> "randomDelay" to the delayed flush, so that the flushes are spread out.
> RANGE_OF_DELAY is 5 minutes, so we spread the flushes over the next 5 minutes,
> which is very good.
> However, because we don't check whether there is already a request in the
> queue, 10 seconds later we create a new request, with a new randomDelay.
> If you generate a randomDelay every 10 seconds, at some point you will end
> up with a small one, and the flush will be triggered almost immediately.
> As a result, instead of spreading all the flushes over the next 5 minutes,
> you end up getting them all much more quickly, for example within the first
> minute. This not only fills the queue with too many flush requests, but also
> defeats the purpose of the randomDelay.
> {code}
> @Override
> protected void chore() {
>   final StringBuffer whyFlush = new StringBuffer();
>   for (Region r : this.server.onlineRegions.values()) {
>     if (r == null) continue;
>     if (((HRegion)r).shouldFlush(whyFlush)) {
>       FlushRequester requester = server.getFlushRequester();
>       if (requester != null) {
>         long randomDelay = RandomUtils.nextInt(RANGE_OF_DELAY) + MIN_DELAY_TIME;
>         LOG.info(getName() + " requesting flush of " +
>           r.getRegionInfo().getRegionNameAsString() + " because " +
>           whyFlush.toString() +
>           " after random delay " + randomDelay + "ms");
>         //Throttle the flushes by putting a delay. If we don't throttle, and there
>         //is a balanced write-load on the regions in a table, we might end up
>         //overwhelming the filesystem with too many flushes at once.
>         requester.requestDelayedFlush(r, randomDelay, false);
>       }
>     }
>   }
> }
> {code}
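
The idea in the issue title, inspecting the queue before adding a delayed flush request, can be illustrated with a minimal standalone sketch. This is not the attached patch and not HBase code: `PendingFlushTracker`, `dueAt`, and `flushed` are hypothetical names, and regions are identified by a plain string. The sketch only assumes that a region with a pending delayed flush should keep its first random delay rather than receive a new one on every chore run.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; HBase's real flush queue is more involved.
final class PendingFlushTracker {
  // Region name -> absolute time (ms) at which its pending delayed flush is due.
  private final Map<String, Long> dueAtMillis = new HashMap<>();

  /**
   * Registers a delayed flush for the region unless one is already pending.
   * Returns true if a new request was recorded, false if an earlier request
   * (and therefore its original random delay) was kept.
   */
  synchronized boolean requestDelayedFlush(String regionName, long nowMillis, long delayMillis) {
    return dueAtMillis.putIfAbsent(regionName, nowMillis + delayMillis) == null;
  }

  /** Due time of the pending request, or null if none is pending. */
  synchronized Long dueAt(String regionName) {
    return dueAtMillis.get(regionName);
  }

  /** Called once the region has actually been flushed, so it can be scheduled again. */
  synchronized void flushed(String regionName) {
    dueAtMillis.remove(regionName);
  }
}
```

With the delays from the log in this description, such a check would keep the first request (270785ms) and ignore the later 35390ms one, so the flush would not be pulled forward by repeated chore runs.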
> {code}
> 2017-07-24 18:44:33,338 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 270785ms
> 2017-07-24 18:44:43,328 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 200143ms
> 2017-07-24 18:44:53,954 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 191082ms
> 2017-07-24 18:45:03,528 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 92532ms
> 2017-07-24 18:45:14,201 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 238780ms
> 2017-07-24 18:45:24,195 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 35390ms
> 2017-07-24 18:45:33,362 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 

[jira] [Updated] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request

2017-07-28 Thread nihed mbarek (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nihed mbarek updated HBASE-18451:
-
Status: Open  (was: Patch Available)


[jira] [Commented] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request

2017-07-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104706#comment-16104706
 ] 

Hadoop QA commented on HBASE-18451:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  5s{color} | {color:red} HBASE-18451 does not apply to master. Rebase required? Wrong Branch? See https://yetus.apache.org/documentation/0.4.0/precommit-patchnames for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HBASE-18451 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12879322/HBASE-18451.master.patch |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/7828/console |
| Powered by | Apache Yetus 0.4.0   http://yetus.apache.org |


This message was automatically generated.




[jira] [Created] (HBASE-18470) `RetriesExhaustedWithDetailsException#getDesc` describe is not right

2017-07-28 Thread Benedict Jin (JIRA)
Benedict Jin created HBASE-18470:


 Summary: `RetriesExhaustedWithDetailsException#getDesc` describe 
is not right
 Key: HBASE-18470
 URL: https://issues.apache.org/jira/browse/HBASE-18470
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 2.0.0-alpha-1
Reporter: Benedict Jin


The description returned by `RetriesExhaustedWithDetailsException#getDesc` is `
org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: Failed 3 
actions: FailedServerException: 3 times, `; the trailing ', ' is unnecessary.
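
For illustration, a summary of this shape can be built without a trailing separator by joining the per-exception parts instead of appending ", " after each one. The sketch below is a standalone assumption, not the actual `getDesc` implementation: `ExceptionSummarySketch` and `describe` are hypothetical names, and the input is assumed to be a map from exception class name to occurrence count.

```java
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical illustration only; not the HBase client code.
final class ExceptionSummarySketch {
  /** Joins "Name: N times" parts with ", " so no separator is left at the end. */
  static String describe(Map<String, Integer> countsByException) {
    StringJoiner joiner = new StringJoiner(", ");
    for (Map.Entry<String, Integer> e : countsByException.entrySet()) {
      joiner.add(e.getKey() + ": " + e.getValue() + " times");
    }
    return joiner.toString();
  }
}
```

`StringJoiner` places the separator only between elements, so a single entry produces no ", " at all.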



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request

2017-07-28 Thread nihed mbarek (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nihed mbarek updated HBASE-18451:
-
Attachment: (was: HBASE-18451.master.patch)


[jira] [Updated] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request

2017-07-28 Thread nihed mbarek (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nihed mbarek updated HBASE-18451:
-
Attachment: HBASE-18451.master.patch


[jira] [Updated] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request

2017-07-28 Thread nihed mbarek (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nihed mbarek updated HBASE-18451:
-
Status: Patch Available  (was: Open)


[jira] [Updated] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request

2017-07-28 Thread nihed mbarek (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nihed mbarek updated HBASE-18451:
-
Status: Open  (was: Patch Available)


[jira] [Updated] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request

2017-07-28 Thread nihed mbarek (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nihed mbarek updated HBASE-18451:
-
Status: Patch Available  (was: Open)

> PeriodicMemstoreFlusher should inspect the queue before adding a delayed 
> flush request
> --
>
> Key: HBASE-18451
> URL: https://issues.apache.org/jira/browse/HBASE-18451
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 2.0.0-alpha-1
>Reporter: Jean-Marc Spaggiari
>Assignee: nihed mbarek
> Attachments: HBASE-18451.master.patch
>
>
> If you run a big job every 4 hours, impacting many tables (they have 150
> regions per server), at the end all the regions might have some data to be
> flushed, and we want, after one hour, to trigger a periodic flush. That's
> totally fine.
> Now, to avoid a flush storm, when we detect a region to be flushed, we add a
> "randomDelay" to the delayed flush, so that the flushes are spread out.
> RANGE_OF_DELAY is 5 minutes. So we spread the flushes over the next 5 minutes,
> which is very good.
> However, because we don't check if there is already a request in the queue,
> 10 seconds later we create a new request, with a new randomDelay.
> If you generate a randomDelay every 10 seconds, at some point you will end
> up with a small one, and the flush will be triggered almost immediately.
> As a result, instead of spreading all the flushes over the next 5 minutes,
> you end up getting them all much sooner, for example within the first minute.
> This not only fills the queue with too many flush requests, but also defeats
> the purpose of the randomDelay.
> {code}
> @Override
> protected void chore() {
>   final StringBuffer whyFlush = new StringBuffer();
>   for (Region r : this.server.onlineRegions.values()) {
>     if (r == null) continue;
>     if (((HRegion)r).shouldFlush(whyFlush)) {
>       FlushRequester requester = server.getFlushRequester();
>       if (requester != null) {
>         long randomDelay = RandomUtils.nextInt(RANGE_OF_DELAY) + MIN_DELAY_TIME;
>         LOG.info(getName() + " requesting flush of " +
>           r.getRegionInfo().getRegionNameAsString() + " because " +
>           whyFlush.toString() +
>           " after random delay " + randomDelay + "ms");
>         //Throttle the flushes by putting a delay. If we don't throttle, and there
>         //is a balanced write-load on the regions in a table, we might end up
>         //overwhelming the filesystem with too many flushes at once.
>         requester.requestDelayedFlush(r, randomDelay, false);
>       }
>     }
>   }
> }
> {code}
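
A minimal sketch of the check named in the issue title, assuming a hypothetical `DelayedFlushQueueSketch` class that is not part of HBase. The chore only draws a random delay when the region has no pending delayed flush, so later chore runs cannot replace the original delay with a smaller one. The class, its method names, and the `MIN_DELAY_MS` value are illustrative assumptions; the attached patch may take a different approach.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

/** Toy model, not HBase code: tracks at most one pending delayed flush per region. */
class DelayedFlushQueueSketch {
    static final int RANGE_OF_DELAY_MS = 5 * 60 * 1000; // 5 minutes, per the description
    static final int MIN_DELAY_MS = 0;                  // assumed value for this sketch

    // Region name -> delay in ms chosen when the request was first queued.
    private final Map<String, Long> pending = new HashMap<>();
    private final Random random;

    DelayedFlushQueueSketch(Random random) {
        this.random = random;
    }

    /** Queues a delayed flush unless one is already pending; returns true if queued. */
    boolean requestDelayedFlushIfAbsent(String regionName) {
        if (pending.containsKey(regionName)) {
            return false; // keep the delay that was drawn the first time
        }
        long delay = (long) random.nextInt(RANGE_OF_DELAY_MS) + MIN_DELAY_MS;
        pending.put(regionName, delay);
        return true;
    }

    /** Delay of the pending request, or null when nothing is queued for the region. */
    Long pendingDelay(String regionName) {
        return pending.get(regionName);
    }

    /** To be called once the flush has actually run. */
    void flushCompleted(String regionName) {
        pending.remove(regionName);
    }

    public static void main(String[] args) {
        DelayedFlushQueueSketch queue = new DelayedFlushQueueSketch(new Random());
        // Simulate the chore firing six times for the same region.
        for (int i = 0; i < 6; i++) {
            boolean queued = queue.requestDelayedFlushIfAbsent("testflush");
            System.out.println("chore run " + i + ": queued=" + queued
                + ", delay=" + queue.pendingDelay("testflush") + "ms");
        }
    }
}
```

With this check only the first chore run queues a request; the following runs leave the existing delay untouched until `flushCompleted` is called.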
> {code}
> 2017-07-24 18:44:33,338 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 270785ms
> 2017-07-24 18:44:43,328 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 200143ms
> 2017-07-24 18:44:53,954 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 191082ms
> 2017-07-24 18:45:03,528 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 92532ms
> 2017-07-24 18:45:14,201 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 238780ms
> 2017-07-24 18:45:24,195 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 35390ms
> 2017-07-24 18:45:33,362 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 

[jira] [Updated] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request

2017-07-28 Thread nihed mbarek (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nihed mbarek updated HBASE-18451:
-
Status: Open  (was: Patch Available)

> PeriodicMemstoreFlusher should inspect the queue before adding a delayed 
> flush request
> --
>
> Key: HBASE-18451
> URL: https://issues.apache.org/jira/browse/HBASE-18451
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 2.0.0-alpha-1
>Reporter: Jean-Marc Spaggiari
>Assignee: nihed mbarek
> Attachments: HBASE-18451.master.patch
>
>
> If you run a big job every 4 hours, impacting many tables (they have 150
> regions per server), at the end all the regions might have some data to be
> flushed, and we want, after one hour, to trigger a periodic flush. That's
> totally fine.
> Now, to avoid a flush storm, when we detect a region to be flushed, we add a
> "randomDelay" to the delayed flush, so that the flushes are spread out.
> RANGE_OF_DELAY is 5 minutes. So we spread the flushes over the next 5 minutes,
> which is very good.
> However, because we don't check if there is already a request in the queue,
> 10 seconds later we create a new request, with a new randomDelay.
> If you generate a randomDelay every 10 seconds, at some point you will end
> up with a small one, and the flush will be triggered almost immediately.
> As a result, instead of spreading all the flushes over the next 5 minutes,
> you end up getting them all much sooner, for example within the first minute.
> This not only fills the queue with too many flush requests, but also defeats
> the purpose of the randomDelay.
> {code}
> @Override
> protected void chore() {
>   final StringBuffer whyFlush = new StringBuffer();
>   for (Region r : this.server.onlineRegions.values()) {
>     if (r == null) continue;
>     if (((HRegion)r).shouldFlush(whyFlush)) {
>       FlushRequester requester = server.getFlushRequester();
>       if (requester != null) {
>         long randomDelay = RandomUtils.nextInt(RANGE_OF_DELAY) + MIN_DELAY_TIME;
>         LOG.info(getName() + " requesting flush of " +
>           r.getRegionInfo().getRegionNameAsString() + " because " +
>           whyFlush.toString() +
>           " after random delay " + randomDelay + "ms");
>         //Throttle the flushes by putting a delay. If we don't throttle, and there
>         //is a balanced write-load on the regions in a table, we might end up
>         //overwhelming the filesystem with too many flushes at once.
>         requester.requestDelayedFlush(r, randomDelay, false);
>       }
>     }
>   }
> }
> {code}
> {code}
> 2017-07-24 18:44:33,338 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 270785ms
> 2017-07-24 18:44:43,328 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 200143ms
> 2017-07-24 18:44:53,954 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 191082ms
> 2017-07-24 18:45:03,528 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 92532ms
> 2017-07-24 18:45:14,201 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 238780ms
> 2017-07-24 18:45:24,195 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 35390ms
> 2017-07-24 18:45:33,362 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 

[jira] [Commented] (HBASE-17131) Avoid livelock caused by HRegion#processRowsWithLocks

2017-07-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104723#comment-16104723
 ] 

Hudson commented on HBASE-17131:


SUCCESS: Integrated in Jenkins build HBase-1.3-JDK8 #224 (See 
[https://builds.apache.org/job/HBase-1.3-JDK8/224/])
HBASE-17131 Avoid livelock caused by HRegion#processRowsWithLocks (chia7712: 
rev f18f916f050cf4dc106543d3dc7c6d2f78077661)
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide3.java


> Avoid livelock caused by HRegion#processRowsWithLocks
> -
>
> Key: HBASE-17131
> URL: https://issues.apache.org/jira/browse/HBASE-17131
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 2.0.0, 1.4.0, 1.3.1, 1.2.6
>Reporter: Chia-Ping Tsai
>Assignee: Chia-Ping Tsai
> Fix For: 2.0.0, 1.4.0, 1.3.2, 1.2.7
>
> Attachments: HBASE-17131.branch-1.2.v0.patch, 
> HBASE-17131.branch-1.3.v0.patch, HBASE-17131.branch-1.v0.patch, 
> HBASE-17131.v0.patch
>
>
> {code:title=HRegion.java|borderStyle=solid}
> try {
>   // STEP 2. Acquire the row lock(s)
>   acquiredRowLocks = new ArrayList(rowsToLock.size());
>   for (byte[] row : rowsToLock) {
> // Attempt to lock all involved rows, throw if any lock times out
> // use a writer lock for mixed reads and writes
> acquiredRowLocks.add(getRowLockInternal(row, false));
>   }
>   // STEP 3. Region lock
>   lock(this.updatesLock.readLock(), acquiredRowLocks.size() == 0 ? 1 : 
> acquiredRowLocks.size());
>   locked = true;
>   boolean success = false;
>   long now = EnvironmentEdgeManager.currentTime();
>   try {
> {code}
> We should lock all involved rows in the second try-finally. Otherwise, we 
> won’t release the previous locks if any subsequent lock times out.
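
The point above can be illustrated with a small, self-contained sketch that uses `java.util.concurrent` locks rather than HRegion's own row-lock types. The class `MultiLockSketch` and the method `runWithLocks` are hypothetical names for illustration. Every lock is acquired inside the `try` whose `finally` releases whatever has been acquired so far, so a timeout on a later lock cannot leak the earlier ones.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

/** Illustrative only; HRegion uses its own row-lock implementation. */
class MultiLockSketch {

    /**
     * Runs the action while holding all locks. Returns false if any lock times
     * out (or the thread is interrupted); locks taken so far are then released.
     */
    static boolean runWithLocks(List<? extends Lock> locks, long timeoutMs, Runnable action) {
        List<Lock> acquired = new ArrayList<>(locks.size());
        try {
            // Acquisition happens INSIDE the try, so the finally below covers it.
            for (Lock lock : locks) {
                if (!lock.tryLock(timeoutMs, TimeUnit.MILLISECONDS)) {
                    return false; // earlier locks are released in finally
                }
                acquired.add(lock);
            }
            action.run();
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        } finally {
            for (Lock lock : acquired) {
                lock.unlock();
            }
        }
    }

    public static void main(String[] args) {
        ReentrantLock rowA = new ReentrantLock();
        ReentrantLock rowB = new ReentrantLock();
        boolean ok = runWithLocks(Arrays.asList(rowA, rowB), 100,
            () -> System.out.println("both rows locked"));
        System.out.println("completed=" + ok + ", rowA still locked=" + rowA.isLocked());
    }
}
```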



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-15134) Add visibility into Flush and Compaction queues

2017-07-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104744#comment-16104744
 ] 

Hudson commented on HBASE-15134:


FAILURE: Integrated in Jenkins build HBase-1.4 #826 (See 
[https://builds.apache.org/job/HBase-1.4/826/])
HBASE-15134 Add visibility into Flush and Compaction queues (achouhan: rev 
92780371080a341d0b6f98307a0ea176db327c5a)
* (edit) 
hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionSource.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/MetricsRegionWrapperStub.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java
* (edit) 
hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionWrapper.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplitThread.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestMetricsRegion.java
* (edit) 
hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionSourceImpl.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionWrapperImpl.java
* (edit) 
hbase-hadoop2-compat/src/test/java/org/apache/hadoop/hbase/regionserver/TestMetricsRegionSourceImpl.java


> Add visibility into Flush and Compaction queues
> ---
>
> Key: HBASE-15134
> URL: https://issues.apache.org/jira/browse/HBASE-15134
> Project: HBase
>  Issue Type: New Feature
>  Components: Compaction, metrics, regionserver
>Reporter: Elliott Clark
>Assignee: Abhishek Singh Chouhan
> Fix For: 3.0.0, 1.4.0, 1.5.0, 2.0.0-alpha-2
>
> Attachments: HBASE-15134.branch-1.001.patch, 
> HBASE-15134.branch-1.001.patch, HBASE-15134.master.001.patch, 
> HBASE-15134.master.002.patch, HBASE-15134.master.003.patch, 
> HBASE-15134.patch, HBASE-15134.patch
>
>
> On busy spurts we can see regionservers start to see large queues for 
> compaction. It's really hard to tell if the server is queueing a lot of 
> compactions for the same region, lots of compactions for lots of regions, or 
> just falling behind.
> For flushes much the same. There can be flushes in queue that aren't being 
> run because of delayed flushes. There's no way to know from the metrics how 
> many flushes are for each region, how many are delayed. Etc.
> We should add either more metrics around this ( num per region, max per 
> region, min per region ) or add on a UI page that has the list of compactions 
> and flushes.
> Or both.





[jira] [Commented] (HBASE-18437) Revoke access permissions of a user from a table does not work as expected

2017-07-28 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104826#comment-16104826
 ] 

Anoop Sam John commented on HBASE-18437:


bq. if (Bytes.toString(perm.getUser()).equals(Bytes.toString(userPerm.getUser()))) {
The permsList is already obtained for this user, so why check the user again? 
Sorry, I don't follow. Or do you need to check the table details?
bq. perm.setActions(leftActions.toArray(new Permission.Action[leftActions.size()]));
Should we create a new UserPermission instance rather than adding this setter? 
It seems that, by design, the actions are meant to be final (even though the 
field is not marked so).
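
A rough sketch of the "new instance instead of a setter" alternative, using a hypothetical `UserPermSketch` class and a two-value `Action` enum rather than HBase's real `UserPermission` and `Permission.Action`. Revoking returns a fresh object holding the remaining actions, or an empty `Optional` when nothing is left, and the original object is never mutated.

```java
import java.util.Collections;
import java.util.EnumSet;
import java.util.Optional;
import java.util.Set;

/** Hypothetical stand-in for a user permission; not the HBase class. */
class UserPermSketch {
    enum Action { READ, WRITE }

    private final String user;
    private final Set<Action> actions; // fixed at construction time

    UserPermSketch(String user, EnumSet<Action> actions) {
        this.user = user;
        // Defensive copy wrapped as read-only, so the actions are effectively final.
        this.actions = Collections.unmodifiableSet(EnumSet.copyOf(actions));
    }

    String getUser() {
        return user;
    }

    Set<Action> getActions() {
        return actions;
    }

    /** Returns a NEW permission without the revoked actions, or empty if none remain. */
    Optional<UserPermSketch> revoke(EnumSet<Action> revoked) {
        EnumSet<Action> left = EnumSet.noneOf(Action.class);
        left.addAll(actions);
        left.removeAll(revoked);
        return left.isEmpty()
            ? Optional.empty()
            : Optional.of(new UserPermSketch(user, left));
    }

    public static void main(String[] args) {
        UserPermSketch rw = new UserPermSketch("user_ro", EnumSet.of(Action.READ, Action.WRITE));
        Optional<UserPermSketch> afterRevoke = rw.revoke(EnumSet.of(Action.WRITE));
        System.out.println("remaining: "
            + afterRevoke.map(UserPermSketch::getActions).orElse(Collections.emptySet()));
    }
}
```

The caller would then replace the stored entry with the returned instance, or drop the user's entry only when the result is empty.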

> Revoke access permissions of a user from a table does not work as expected
> --
>
> Key: HBASE-18437
> URL: https://issues.apache.org/jira/browse/HBASE-18437
> Project: HBase
>  Issue Type: Bug
>  Components: security
>Affects Versions: 1.1.12
>Reporter: Ashish Singhi
>Assignee: Ashish Singhi
> Attachments: HBASE-18437.patch
>
>
> A user was granted 'RW' permission on a table. Now when we want to revoke 
> only its 'W' permission, the code removes the user itself from that table's 
> permissions.
> Below is the test code which reproduces the issue.
> {noformat}
> @Test(timeout = 18)
>   public void testRevokeOnlySomePerms() throws Throwable {
> TableName name = TableName.valueOf("testAgain");
> HTableDescriptor htd = new HTableDescriptor(name);
> HColumnDescriptor hcd = new HColumnDescriptor("cf");
> htd.addFamily(hcd);
> createTable(TEST_UTIL, htd);
> TEST_UTIL.waitUntilAllRegionsAssigned(name);
> try (Connection conn = ConnectionFactory.createConnection(conf)) {
>   AccessControlClient.grant(conn, name, USER_RO.getShortName(), null, 
> null, Action.READ, Action.WRITE);
>   ListMultimap tablePermissions = 
> AccessControlLists.getTablePermissions(conf, name);
>   // hbase user and USER_RO has permis
>   assertEquals(2, tablePermissions.size());
>   AccessControlClient.revoke(conn, name, USER_RO.getShortName(), null, 
> null, Action.WRITE);
>   tablePermissions = AccessControlLists.getTablePermissions(conf, name);
>   List userPerm = 
> tablePermissions.get(USER_RO.getShortName());
>   assertEquals(1, userPerm.size());
> } finally {
>   deleteTable(TEST_UTIL, name);
> }
>   }
> {noformat}





[jira] [Created] (HBASE-18471) Deleted qualifier re-appearing after multiple puts.

2017-07-28 Thread Thomas Martens (JIRA)
Thomas Martens created HBASE-18471:
--

 Summary: Deleted qualifier re-appearing after multiple puts.
 Key: HBASE-18471
 URL: https://issues.apache.org/jira/browse/HBASE-18471
 Project: HBase
  Issue Type: Bug
  Components: Deletes, hbase, scan
Affects Versions: 1.3.0
Reporter: Thomas Martens


The qualifier of a deleted row (with keep deleted cells true) re-appears after 
re-inserting the same row multiple times (with different timestamp) with an 
empty qualifier.

Scenario:
# Put row with family and qualifier (timestamp 1).
# Delete entire row (timestamp 2).
# Put same row again with family without qualifier (timestamp 3).
A scan (latest version) returns the row with family without qualifier, version 
3 (which is correct).
# Put the same row again with family without qualifier (timestamp 4).
A scan (latest version) returns multiple rows:
* the row with family without qualifier, version 4 (which is correct).
* the row with family with qualifier, version 1 (which is wrong).
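
To make the expected behaviour concrete, here is a toy in-memory model of a single row and family. It is not HBase code, and the class `DeleteMarkerModel` and its methods are hypothetical. It assumes that a row delete at timestamp T masks every cell with timestamp <= T in an ordinary (non-raw) scan, even when deleted cells are kept on disk.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model of one row in one family; illustrates expected scan results only. */
class DeleteMarkerModel {
    private static final class Cell {
        final String qualifier;
        final long ts;

        Cell(String qualifier, long ts) {
            this.qualifier = qualifier;
            this.ts = ts;
        }
    }

    private final List<Cell> cells = new ArrayList<>();
    private final List<Long> rowDeleteTimestamps = new ArrayList<>();

    void put(String qualifier, long ts) {
        cells.add(new Cell(qualifier, ts));
    }

    /** Records a delete marker; cells are kept, mimicking "keep deleted cells". */
    void deleteRow(long ts) {
        rowDeleteTimestamps.add(ts);
    }

    private boolean masked(Cell cell) {
        for (long deleteTs : rowDeleteTimestamps) {
            if (cell.ts <= deleteTs) {
                return true;
            }
        }
        return false;
    }

    /** Latest visible timestamp per qualifier, as a normal scan should see it. */
    Map<String, Long> scanLatest() {
        Map<String, Long> latest = new TreeMap<>();
        for (Cell cell : cells) {
            if (!masked(cell)) {
                latest.merge(cell.qualifier, cell.ts, Long::max);
            }
        }
        return latest;
    }

    public static void main(String[] args) {
        DeleteMarkerModel row = new DeleteMarkerModel();
        row.put("myQualifier", 1); // step 1
        row.deleteRow(2);          // step 2
        row.put("", 3);            // step 3: empty qualifier
        row.put("", 4);            // step 4: empty qualifier again
        System.out.println(row.scanLatest());
    }
}
```

Under that assumption the final scan contains only the empty-qualifier cell at timestamp 4; the cell written at timestamp 1 stays hidden behind the delete at timestamp 2.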






[jira] [Commented] (HBASE-18142) Deletion of a cell deletes the previous versions too

2017-07-28 Thread Sahil Aggarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104838#comment-16104838
 ] 

Sahil Aggarwal commented on HBASE-18142:


Noob question:

Looking at _deleteall_internal in table.rb: if the row name is a Hash, we call 
_deleterows_internal, but there we don't honor the timestamp provided in the 
row hash and use the latest timestamp instead. Could this be the reason we are 
deleting all the versions of all the cells in that row?

> Deletion of a cell deletes the previous versions too
> 
>
> Key: HBASE-18142
> URL: https://issues.apache.org/jira/browse/HBASE-18142
> Project: HBase
>  Issue Type: Bug
>  Components: API
>Reporter: Karthick
>  Labels: beginner
>
> When I tried to delete a cell using its timestamp in the HBase shell, the 
> previous versions of the same cell also got deleted. But when I tried the 
> same using the Java API, the previous versions were not deleted and I could 
> retrieve the previous values.
> https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/Delete.java
> See this file to fix the issue. This method (public Delete addColumns(final 
> byte [] family, final byte [] qualifier, final long timestamp)) only deletes 
> the current version of the cell. The previous versions are not deleted.





[jira] [Updated] (HBASE-18471) Deleted qualifier re-appearing after multiple puts.

2017-07-28 Thread Thomas Martens (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Martens updated HBASE-18471:
---
Attachment: HBaseDmlTest.java

> Deleted qualifier re-appearing after multiple puts.
> ---
>
> Key: HBASE-18471
> URL: https://issues.apache.org/jira/browse/HBASE-18471
> Project: HBase
>  Issue Type: Bug
>  Components: Deletes, hbase, scan
>Affects Versions: 1.3.0
>Reporter: Thomas Martens
> Attachments: HBaseDmlTest.java
>
>
> The qualifier of a deleted row (with keep deleted cells true) re-appears 
> after re-inserting the same row multiple times (with different timestamp) 
> with an empty qualifier.
> Scenario:
> # Put row with family and qualifier (timestamp 1).
> # Delete entire row (timestamp 2).
> # Put same row again with family without qualifier (timestamp 3).
> A scan (latest version) returns the row with family without qualifier, 
> version 3 (which is correct).
> # Put the same row again with family without qualifier (timestamp 4).
> A scan (latest version) returns multiple rows:
> * the row with family without qualifier, version 4 (which is correct).
> * the row with family with qualifier, version 1 (which is wrong).





[jira] [Commented] (HBASE-18446) Mark StoreFileScanner as IA.Private

2017-07-28 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104860#comment-16104860
 ] 

Anoop Sam John commented on HBASE-18446:


Yes, then StoreFileReader is what should be exposed. As Duo said, not the impl 
class but an interface.
Then why expose StoreFileScanner and StoreFile?

> Mark StoreFileScanner as IA.Private
> ---
>
> Key: HBASE-18446
> URL: https://issues.apache.org/jira/browse/HBASE-18446
> Project: HBase
>  Issue Type: Sub-task
>  Components: Coprocessors
>Reporter: Duo Zhang
> Fix For: 2.0.0, 3.0.0, 2.0.0-alpha-2
>
>
> Do not see any reason why it is marked as IA.LimitedPrivate. It is not 
> referenced in any CPs.





[jira] [Updated] (HBASE-18471) Deleted qualifier re-appearing after multiple puts.

2017-07-28 Thread Thomas Martens (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Martens updated HBASE-18471:
---
Description: 
The qualifier of a deleted row (with keep deleted cells true) re-appears after 
re-inserting the same row multiple times (with different timestamp) with an 
empty qualifier.

Scenario:
# Put row with family and qualifier (timestamp 1).
# Delete entire row (timestamp 2).
# Put same row again with family without qualifier (timestamp 3).
A scan (latest version) returns the row with family without qualifier, version 
3 (which is correct).
# Put the same row again with family without qualifier (timestamp 4).
A scan (latest version) returns multiple rows:
* the row with family without qualifier, version 4 (which is correct).
* the row with family with qualifier, version 1 (which is wrong).

There is a test scenario attached.
output:
 13:42:53,952 [main] client.HBaseAdmin - Started disable of test_dml
 13:42:55,801 [main] client.HBaseAdmin - Disabled test_dml
 13:42:57,256 [main] client.HBaseAdmin - Deleted test_dml
 13:42:58,592 [main] client.HBaseAdmin - Created test_dml
Put row: 'myRow' with family: 'myFamily' with qualifier: 'myQualifier' with 
timestamp: '1'
Scan printout =>
  Row: 'myRow', Timestamp: '1', Family: 'myFamily', Qualifier: 'myQualifier', 
Value: 'myValue'
Delete row: 'myRow'
Scan printout =>
Put row: 'myRow' with family: 'myFamily' with qualifier: 'null' with timestamp: 
'3'
Scan printout =>
  Row: 'myRow', Timestamp: '3', Family: 'myFamily', Qualifier: '', Value: 
'myValue'
Put row: 'myRow' with family: 'myFamily' with qualifier: 'null' with timestamp: 
'4'
Scan printout =>
  Row: 'myRow', Timestamp: '4', Family: 'myFamily', Qualifier: '', Value: 
'myValue'
  Row: 'myRow', Timestamp: '1', Family: 'myFamily', Qualifier: 'myQualifier', 
Value: 'myValue'


  was:
The qualifier of a deleted row (with keep deleted cells true) re-appears after 
re-inserting the same row multiple times (with different timestamp) with an 
empty qualifier.

Scenario:
# Put row with family and qualifier (timestamp 1).
# Delete entire row (timestamp 2).
# Put same row again with family without qualifier (timestamp 3).
A scan (latest version) returns the row with family without qualifier, version 
3 (which is correct).
# Put the same row again with family without qualifier (timestamp 4).
A scan (latest version) returns multiple rows:
* the row with family without qualifier, version 4 (which is correct).
* the row with family with qualifier, version 1 (which is wrong).



> Deleted qualifier re-appearing after multiple puts.
> ---
>
> Key: HBASE-18471
> URL: https://issues.apache.org/jira/browse/HBASE-18471
> Project: HBase
>  Issue Type: Bug
>  Components: Deletes, hbase, scan
>Affects Versions: 1.3.0
>Reporter: Thomas Martens
> Attachments: HBaseDmlTest.java
>
>
> The qualifier of a deleted row (with keep deleted cells true) re-appears 
> after re-inserting the same row multiple times (with different timestamp) 
> with an empty qualifier.
> Scenario:
> # Put row with family and qualifier (timestamp 1).
> # Delete entire row (timestamp 2).
> # Put same row again with family without qualifier (timestamp 3).
> A scan (latest version) returns the row with family without qualifier, 
> version 3 (which is correct).
> # Put the same row again with family without qualifier (timestamp 4).
> A scan (latest version) returns multiple rows:
> * the row with family without qualifier, version 4 (which is correct).
> * the row with family with qualifier, version 1 (which is wrong).
> There is a test scenario attached.
> output:
>  13:42:53,952 [main] client.HBaseAdmin - Started disable of test_dml
>  13:42:55,801 [main] client.HBaseAdmin - Disabled test_dml
>  13:42:57,256 [main] client.HBaseAdmin - Deleted test_dml
>  13:42:58,592 [main] client.HBaseAdmin - Created test_dml
> Put row: 'myRow' with family: 'myFamily' with qualifier: 'myQualifier' with 
> timestamp: '1'
> Scan printout =>
>   Row: 'myRow', Timestamp: '1', Family: 'myFamily', Qualifier: 'myQualifier', 
> Value: 'myValue'
> Delete row: 'myRow'
> Scan printout =>
> Put row: 'myRow' with family: 'myFamily' with qualifier: 'null' with 
> timestamp: '3'
> Scan printout =>
>   Row: 'myRow', Timestamp: '3', Family: 'myFamily', Qualifier: '', Value: 
> 'myValue'
> Put row: 'myRow' with family: 'myFamily' with qualifier: 'null' with 
> timestamp: '4'
> Scan printout =>
>   Row: 'myRow', Timestamp: '4', Family: 'myFamily', Qualifier: '', Value: 
> 'myValue'
>   Row: 'myRow', Timestamp: '1', Family: 'myFamily', Qualifier: 'myQualifier', 
> Value: 'myValue'





[jira] [Updated] (HBASE-18471) Deleted qualifier re-appearing after multiple puts.

2017-07-28 Thread Thomas Martens (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Martens updated HBASE-18471:
---
Description: 
The qualifier of a deleted row (with keep deleted cells true) re-appears after 
re-inserting the same row multiple times (with different timestamp) with an 
empty qualifier.

Scenario:
# Put row with family and qualifier (timestamp 1).
# Delete entire row (timestamp 2).
# Put same row again with family without qualifier (timestamp 3).
A scan (latest version) returns the row with family without qualifier, version 
3 (which is correct).
# Put the same row again with family without qualifier (timestamp 4).
A scan (latest version) returns multiple rows:
* the row with family without qualifier, version 4 (which is correct).
* the row with family with qualifier, version 1 (which is wrong).

There is a test scenario attached.
output:
 13:42:53,952 [main] client.HBaseAdmin - Started disable of test_dml
 13:42:55,801 [main] client.HBaseAdmin - Disabled test_dml
 13:42:57,256 [main] client.HBaseAdmin - Deleted test_dml
 13:42:58,592 [main] client.HBaseAdmin - Created test_dml
Put row: 'myRow' with family: 'myFamily' with qualifier: 'myQualifier' with 
timestamp: '1'
Scan printout =>
  Row: 'myRow', Timestamp: '1', Family: 'myFamily', Qualifier: 'myQualifier', 
Value: 'myValue'
Delete row: 'myRow'
Scan printout =>
Put row: 'myRow' with family: 'myFamily' with qualifier: 'null' with timestamp: 
'3'
Scan printout =>
  Row: 'myRow', Timestamp: '3', Family: 'myFamily', Qualifier: '', Value: 
'myValue'
Put row: 'myRow' with family: 'myFamily' with qualifier: 'null' with timestamp: 
'4'
Scan printout =>
  Row: 'myRow', Timestamp: '4', Family: 'myFamily', Qualifier: '', Value: 
'myValue'
  {color:red}Row: 'myRow', Timestamp: '1', Family: 'myFamily', Qualifier: 
'myQualifier', Value: 'myValue'{color}


  was:
The qualifier of a deleted row (with keep deleted cells true) re-appears after 
re-inserting the same row multiple times (with different timestamp) with an 
empty qualifier.

Scenario:
# Put row with family and qualifier (timestamp 1).
# Delete entire row (timestamp 2).
# Put same row again with family without qualifier (timestamp 3).
A scan (latest version) returns the row with family without qualifier, version 
3 (which is correct).
# Put the same row again with family without qualifier (timestamp 4).
A scan (latest version) returns multiple rows:
* the row with family without qualifier, version 4 (which is correct).
* the row with family with qualifier, version 1 (which is wrong).

There is a test scenario attached.
output:
 13:42:53,952 [main] client.HBaseAdmin - Started disable of test_dml
 13:42:55,801 [main] client.HBaseAdmin - Disabled test_dml
 13:42:57,256 [main] client.HBaseAdmin - Deleted test_dml
 13:42:58,592 [main] client.HBaseAdmin - Created test_dml
Put row: 'myRow' with family: 'myFamily' with qualifier: 'myQualifier' with 
timestamp: '1'
Scan printout =>
  Row: 'myRow', Timestamp: '1', Family: 'myFamily', Qualifier: 'myQualifier', 
Value: 'myValue'
Delete row: 'myRow'
Scan printout =>
Put row: 'myRow' with family: 'myFamily' with qualifier: 'null' with timestamp: 
'3'
Scan printout =>
  Row: 'myRow', Timestamp: '3', Family: 'myFamily', Qualifier: '', Value: 
'myValue'
Put row: 'myRow' with family: 'myFamily' with qualifier: 'null' with timestamp: 
'4'
Scan printout =>
  Row: 'myRow', Timestamp: '4', Family: 'myFamily', Qualifier: '', Value: 
'myValue'
  Row: 'myRow', Timestamp: '1', Family: 'myFamily', Qualifier: 'myQualifier', 
Value: 'myValue'



> Deleted qualifier re-appearing after multiple puts.
> ---
>
> Key: HBASE-18471
> URL: https://issues.apache.org/jira/browse/HBASE-18471
> Project: HBase
>  Issue Type: Bug
>  Components: Deletes, hbase, scan
>Affects Versions: 1.3.0
>Reporter: Thomas Martens
> Attachments: HBaseDmlTest.java
>
>
> The qualifier of a deleted row (with keep deleted cells true) re-appears 
> after re-inserting the same row multiple times (with different timestamp) 
> with an empty qualifier.
> Scenario:
> # Put row with family and qualifier (timestamp 1).
> # Delete entire row (timestamp 2).
> # Put same row again with family without qualifier (timestamp 3).
> A scan (latest version) returns the row with family without qualifier, 
> version 3 (which is correct).
> # Put the same row again with family without qualifier (timestamp 4).
> A scan (latest version) returns multiple rows:
> * the row with family without qualifier, version 4 (which is correct).
> * the row with family with qualifier, version 1 (which is wrong).
> There is a test scenario attached.
> output:
>  13:42:53,952 [main] client.HBaseAdmin - Started disable of test_dml
>  13:42:55,801 [main] client.HBaseAdmin - Disabled test_dml
>  13:42:57,256 [main] client.HBaseAd

[jira] [Commented] (HBASE-15134) Add visibility into Flush and Compaction queues

2017-07-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104868#comment-16104868
 ] 

Hudson commented on HBASE-15134:


FAILURE: Integrated in Jenkins build HBase-2.0 #248 (See 
[https://builds.apache.org/job/HBase-2.0/248/])
HBASE-15134 Add visibility into Flush and Compaction queues (achouhan: rev 
12b9a151e6338297b253ca2e005eda22b1f2da4e)
* (edit) 
hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionWrapper.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestMetricsRegion.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/MetricsRegionWrapperStub.java
* (edit) 
hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionSource.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplit.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionWrapperImpl.java
* (edit) 
hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionSourceImpl.java
* (edit) 
hbase-hadoop2-compat/src/test/java/org/apache/hadoop/hbase/regionserver/TestMetricsRegionSourceImpl.java


> Add visibility into Flush and Compaction queues
> ---
>
> Key: HBASE-15134
> URL: https://issues.apache.org/jira/browse/HBASE-15134
> Project: HBase
>  Issue Type: New Feature
>  Components: Compaction, metrics, regionserver
>Reporter: Elliott Clark
>Assignee: Abhishek Singh Chouhan
> Fix For: 3.0.0, 1.4.0, 1.5.0, 2.0.0-alpha-2
>
> Attachments: HBASE-15134.branch-1.001.patch, 
> HBASE-15134.branch-1.001.patch, HBASE-15134.master.001.patch, 
> HBASE-15134.master.002.patch, HBASE-15134.master.003.patch, 
> HBASE-15134.patch, HBASE-15134.patch
>
>
> On busy spurts we can see regionservers start to see large queues for 
> compaction. It's really hard to tell if the server is queueing a lot of 
> compactions for the same region, lots of compactions for lots of regions, or 
> just falling behind.
> For flushes much the same. There can be flushes in queue that aren't being 
> run because of delayed flushes. There's no way to know from the metrics how 
> many flushes are for each region, how many are delayed. Etc.
> We should add either more metrics around this ( num per region, max per 
> region, min per region ) or add on a UI page that has the list of compactions 
> and flushes.
> Or both.
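
As a rough illustration of the per-region numbers suggested above (num, max and min per region), the following Java sketch derives them from a list of queued region names. The class QueueVisibilitySketch, its method names and the idea of representing a queue as a plain list of region names are assumptions made for this example; they are not taken from the committed patch.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper that summarises a flush or compaction queue per region. */
final class QueueVisibilitySketch {
  /** Counts how many queued requests belong to each region. */
  static Map<String, Integer> countPerRegion(List<String> queuedRegionNames) {
    Map<String, Integer> counts = new HashMap<>();
    for (String region : queuedRegionNames) {
      counts.merge(region, 1, Integer::sum);
    }
    return counts;
  }

  /** Largest number of queued requests for any single region, or 0 for an empty queue. */
  static int maxPerRegion(Map<String, Integer> counts) {
    return counts.isEmpty() ? 0 : Collections.max(counts.values());
  }

  /** Smallest number of queued requests among regions present in the queue, or 0. */
  static int minPerRegion(Map<String, Integer> counts) {
    return counts.isEmpty() ? 0 : Collections.min(counts.values());
  }

  public static void main(String[] args) {
    List<String> queue = Arrays.asList("regionA", "regionA", "regionA", "regionB");
    Map<String, Integer> counts = countPerRegion(queue);
    System.out.println(counts + " max=" + maxPerRegion(counts) + " min=" + minPerRegion(counts));
  }
}
```

A high maximum with few distinct regions would point to many requests for the same region, while many regions with small counts would point to the server falling behind in general.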



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18446) Mark StoreFileScanner as IA.Private

2017-07-28 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104871#comment-16104871
 ] 

Duo Zhang commented on HBASE-18446:
---

{quote}
but with local index we need to do a full HFile scan and should be able to 
determine whether a Cell belongs to the child region based on the actual data 
row key
{quote}

What happens if the file is compacted?

> Mark StoreFileScanner as IA.Private
> ---
>
> Key: HBASE-18446
> URL: https://issues.apache.org/jira/browse/HBASE-18446
> Project: HBase
>  Issue Type: Sub-task
>  Components: Coprocessors
>Reporter: Duo Zhang
> Fix For: 2.0.0, 3.0.0, 2.0.0-alpha-2
>
>
> Do not see any reason why it is marked as IA.LimitedPrivate. It is not 
> referenced in any CPs.





[jira] [Commented] (HBASE-18446) Mark StoreFileScanner as IA.Private

2017-07-28 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104876#comment-16104876
 ] 

Anoop Sam John commented on HBASE-18446:


For compaction as well, each daughter region's compaction would have 
selected the corresponding half file and its reader to do the compaction 
work (the scan for compaction). As of the 2.0 design, we won't archive the 
compacted-away files immediately after the compaction; old scans will continue 
using them. See CompactedHFilesDischarger.
The new compacted files will have the proper data for each split daughter 
region.

> Mark StoreFileScanner as IA.Private
> ---
>
> Key: HBASE-18446
> URL: https://issues.apache.org/jira/browse/HBASE-18446
> Project: HBase
>  Issue Type: Sub-task
>  Components: Coprocessors
>Reporter: Duo Zhang
> Fix For: 2.0.0, 3.0.0, 2.0.0-alpha-2
>
>
> Do not see any reason why it is marked as IA.LimitedPrivate. It is not 
> referenced in any CPs.





[jira] [Commented] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request

2017-07-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104880#comment-16104880
 ] 

Hadoop QA commented on HBASE-18451:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
17s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
39s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
50s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
57s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
20s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
11s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
28s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
31m  9s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha4. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}119m 
11s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
17s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}167m 51s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=1.11.2 Server=1.11.2 Image:yetus/hbase:bdc94b1 |
| JIRA Issue | HBASE-18451 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12879325/HBASE-18451.master.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  hadoopcheck  
hbaseanti  checkstyle  compile  |
| uname | Linux 963f4e251072 3.13.0-116-generic #163-Ubuntu SMP Fri Mar 31 
14:13:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / 2d06a06 |
| Default Java | 1.8.0_131 |
| findbugs | v3.1.0-RC3 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/7829/testReport/ |
| modules | C: hbase-server U: hbase-server |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/7829/console |
| Powered by | Apache Yetus 0.4.0   http://yetus.apache.org |


This message was automatically generated.



> PeriodicMemstoreFlusher should inspect the queue before adding a delayed 
> flush request
> --
>
> Key: HBASE-18451
> URL: https://issues.apache.org/jira/browse/HBASE-18451
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 2.0.0-alpha-1
>Reporter: Jean-Marc Spaggiari
>

[jira] [Commented] (HBASE-18446) Mark StoreFileScanner as IA.Private

2017-07-28 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104881#comment-16104881
 ] 

Duo Zhang commented on HBASE-18446:
---

Oh, seems no problem. The compaction will also use the replaced StoreFileReader 
so it can read the index and write them to the new StoreFile.

> Mark StoreFileScanner as IA.Private
> ---
>
> Key: HBASE-18446
> URL: https://issues.apache.org/jira/browse/HBASE-18446
> Project: HBase
>  Issue Type: Sub-task
>  Components: Coprocessors
>Reporter: Duo Zhang
> Fix For: 2.0.0, 3.0.0, 2.0.0-alpha-2
>
>
> Do not see any reason why it is marked as IA.LimitedPrivate. It is not 
> referenced in any CPs.





[jira] [Commented] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request

2017-07-28 Thread Jean-Marc Spaggiari (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104884#comment-16104884
 ] 

Jean-Marc Spaggiari commented on HBASE-18451:
-

You got it! ;)

LGTM.

> PeriodicMemstoreFlusher should inspect the queue before adding a delayed 
> flush request
> --
>
> Key: HBASE-18451
> URL: https://issues.apache.org/jira/browse/HBASE-18451
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 2.0.0-alpha-1
>Reporter: Jean-Marc Spaggiari
>Assignee: nihed mbarek
> Attachments: HBASE-18451.master.patch
>
>
> If you run a big job every 4 hours, impacting many tables (they have 150 
> regions per server), at the end all the regions might have some data to be 
> flushed, and we want, after one hour, to trigger a periodic flush. That's 
> totally fine.
> Now, to avoid a flush storm, when we detect a region to be flushed, we add a 
> "randomDelay" to the delayed flush, so that the flushes are spread out.
> RANGE_OF_DELAY is 5 minutes. So we spread the flushes over the next 5 
> minutes, which is very good.
> However, because we don't check whether there is already a request in the 
> queue, 10 seconds later we create a new request, with a new randomDelay.
> If you generate a randomDelay every 10 seconds, at some point you will end 
> up with a small one, and the flush will be triggered almost immediately.
> As a result, instead of spreading all the flushes over the next 5 minutes, 
> you end up getting them all much sooner, for example within the first minute. 
> This not only fills the queue with too many flush requests, but also defeats 
> the purpose of the randomDelay.
> {code}
> @Override
> protected void chore() {
>   final StringBuffer whyFlush = new StringBuffer();
>   for (Region r : this.server.onlineRegions.values()) {
>     if (r == null) continue;
>     if (((HRegion)r).shouldFlush(whyFlush)) {
>       FlushRequester requester = server.getFlushRequester();
>       if (requester != null) {
>         long randomDelay = RandomUtils.nextInt(RANGE_OF_DELAY) + MIN_DELAY_TIME;
>         LOG.info(getName() + " requesting flush of " +
>           r.getRegionInfo().getRegionNameAsString() + " because " +
>           whyFlush.toString() +
>           " after random delay " + randomDelay + "ms");
>         //Throttle the flushes by putting a delay. If we don't throttle, and there
>         //is a balanced write-load on the regions in a table, we might end up
>         //overwhelming the filesystem with too many flushes at once.
>         requester.requestDelayedFlush(r, randomDelay, false);
>       }
>     }
>   }
> }
> {code}
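
One way to picture the check proposed in the issue title is sketched below in Java. It is a simplified model and not the attached patch: the class DelayedFlushQueueSketch, its map of pending regions and its helper methods are hypothetical, MIN_DELAY_TIME is left out because its value is not given here, and the 10-second chore period is taken from the description.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

/** Hypothetical model of a delayed-flush queue; this is not HBase's MemStoreFlusher. */
final class DelayedFlushQueueSketch {
  static final int RANGE_OF_DELAY_MS = 5 * 60 * 1000; // 5 minutes, as in the description
  static final int CHORE_PERIOD_MS = 10 * 1000;       // assumed chore period of 10 seconds

  // Region name -> absolute time (ms) at which its delayed flush is due.
  private final Map<String, Long> pending = new HashMap<>();

  /** Queues a delayed flush unless the region already has one; returns true if queued. */
  boolean requestDelayedFlush(String region, long nowMs, long delayMs) {
    if (pending.containsKey(region)) {
      return false; // a request is already queued, so keep its original deadline
    }
    pending.put(region, nowMs + delayMs);
    return true;
  }

  /** Due time of the pending flush for the region, or null if none is queued. */
  Long dueTime(String region) {
    return pending.get(region);
  }

  /** Forgets the pending request once the flush has run, so a later one can be queued. */
  void flushCompleted(String region) {
    pending.remove(region);
  }

  /** Models the reported behaviour: every chore run adds a request and the earliest wins. */
  static long flushTimeWithoutCheck(Random rnd) {
    long earliest = Long.MAX_VALUE;
    for (long now = 0; now < earliest; now += CHORE_PERIOD_MS) {
      earliest = Math.min(earliest, now + rnd.nextInt(RANGE_OF_DELAY_MS));
    }
    return earliest;
  }

  public static void main(String[] args) {
    DelayedFlushQueueSketch queue = new DelayedFlushQueueSketch();
    System.out.println(queue.requestDelayedFlush("testflush", 0L, 270785L));
    System.out.println(queue.requestDelayedFlush("testflush", 10_000L, 200143L));
    System.out.println("due at " + queue.dueTime("testflush") + "ms");
    System.out.println("without check: " + flushTimeWithoutCheck(new Random()) + "ms");
  }
}
```

With the check in place, the second request for the same region is ignored and the first random delay stands. Without it, the flush time can only move earlier with each chore run, which matches the shrinking delays in the log excerpt.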
> {code}
> 2017-07-24 18:44:33,338 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 270785ms
> 2017-07-24 18:44:43,328 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 200143ms
> 2017-07-24 18:44:53,954 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 191082ms
> 2017-07-24 18:45:03,528 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 92532ms
> 2017-07-24 18:45:14,201 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 238780ms
> 2017-07-24 18:45:24,195 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 35390ms
> 2017-07-24 18:45:33,362 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,

[jira] [Commented] (HBASE-18451) PeriodicMemstoreFlusher should inspect the queue before adding a delayed flush request

2017-07-28 Thread Jean-Marc Spaggiari (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104888#comment-16104888
 ] 

Jean-Marc Spaggiari commented on HBASE-18451:
-

Thanks Anoop.

> PeriodicMemstoreFlusher should inspect the queue before adding a delayed 
> flush request
> --
>
> Key: HBASE-18451
> URL: https://issues.apache.org/jira/browse/HBASE-18451
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 2.0.0-alpha-1
>Reporter: Jean-Marc Spaggiari
>Assignee: nihed mbarek
> Attachments: HBASE-18451.master.patch
>
>
> If you run a big job every 4 hours, impacting many tables (they have 150 
> regions per server), at the end all the regions might have some data to be 
> flushed, and we want, after one hour, to trigger a periodic flush. That's 
> totally fine.
> Now, to avoid a flush storm, when we detect a region to be flushed, we add a 
> "randomDelay" to the delayed flush, so that the flushes are spread out.
> RANGE_OF_DELAY is 5 minutes. So we spread the flushes over the next 5 
> minutes, which is very good.
> However, because we don't check whether there is already a request in the 
> queue, 10 seconds later we create a new request, with a new randomDelay.
> If you generate a randomDelay every 10 seconds, at some point you will end 
> up with a small one, and the flush will be triggered almost immediately.
> As a result, instead of spreading all the flushes over the next 5 minutes, 
> you end up getting them all much sooner, for example within the first minute. 
> This not only fills the queue with too many flush requests, but also defeats 
> the purpose of the randomDelay.
> {code}
> @Override
> protected void chore() {
>   final StringBuffer whyFlush = new StringBuffer();
>   for (Region r : this.server.onlineRegions.values()) {
>     if (r == null) continue;
>     if (((HRegion)r).shouldFlush(whyFlush)) {
>       FlushRequester requester = server.getFlushRequester();
>       if (requester != null) {
>         long randomDelay = RandomUtils.nextInt(RANGE_OF_DELAY) + MIN_DELAY_TIME;
>         LOG.info(getName() + " requesting flush of " +
>           r.getRegionInfo().getRegionNameAsString() + " because " +
>           whyFlush.toString() +
>           " after random delay " + randomDelay + "ms");
>         //Throttle the flushes by putting a delay. If we don't throttle, and there
>         //is a balanced write-load on the regions in a table, we might end up
>         //overwhelming the filesystem with too many flushes at once.
>         requester.requestDelayedFlush(r, randomDelay, false);
>       }
>     }
>   }
> }
> {code}
> {code}
> 2017-07-24 18:44:33,338 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 270785ms
> 2017-07-24 18:44:43,328 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 200143ms
> 2017-07-24 18:44:53,954 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 191082ms
> 2017-07-24 18:45:03,528 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 92532ms
> 2017-07-24 18:45:14,201 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 238780ms
> 2017-07-24 18:45:24,195 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,1500916375517-MemstoreFlusherChore requesting 
> flush of testflush,,1500932649126.578c27d2eb7ef0ad437bf2ff38c053ae. because f 
> has an old edit so flush to free WALs after random delay 35390ms
> 2017-07-24 18:45:33,362 INFO 
> org.apache.hadoop.hbase.regionserver.HRegionServer: 
> hbasetest2.domainname.com,60020,15

[jira] [Commented] (HBASE-15134) Add visibility into Flush and Compaction queues

2017-07-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-15134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104915#comment-16104915
 ] 

Hudson commented on HBASE-15134:


FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #3449 (See 
[https://builds.apache.org/job/HBase-Trunk_matrix/3449/])
HBASE-15134 Add visibility into Flush and Compaction queues (achouhan: rev 
2d06a06ba4bbd2f64e28be5973eb1d447114bedc)
* (edit) 
hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionSourceImpl.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplit.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestMetricsRegion.java
* (edit) 
hbase-hadoop2-compat/src/test/java/org/apache/hadoop/hbase/regionserver/TestMetricsRegionSourceImpl.java
* (edit) 
hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/MetricsRegionWrapperStub.java
* (edit) 
hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionSource.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreFlusher.java
* (edit) 
hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionWrapper.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* (edit) 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MetricsRegionWrapperImpl.java


> Add visibility into Flush and Compaction queues
> ---
>
> Key: HBASE-15134
> URL: https://issues.apache.org/jira/browse/HBASE-15134
> Project: HBase
>  Issue Type: New Feature
>  Components: Compaction, metrics, regionserver
>Reporter: Elliott Clark
>Assignee: Abhishek Singh Chouhan
> Fix For: 3.0.0, 1.4.0, 1.5.0, 2.0.0-alpha-2
>
> Attachments: HBASE-15134.branch-1.001.patch, 
> HBASE-15134.branch-1.001.patch, HBASE-15134.master.001.patch, 
> HBASE-15134.master.002.patch, HBASE-15134.master.003.patch, 
> HBASE-15134.patch, HBASE-15134.patch
>
>
> On busy spurts we can see regionservers start to see large queues for 
> compaction. It's really hard to tell if the server is queueing a lot of 
> compactions for the same region, lots of compactions for lots of regions, or 
> just falling behind.
> For flushes much the same. There can be flushes in queue that aren't being 
> run because of delayed flushes. There's no way to know from the metrics how 
> many flushes are for each region, how many are delayed. Etc.
> We should add either more metrics around this ( num per region, max per 
> region, min per region ) or add on a UI page that has the list of compactions 
> and flushes.
> Or both.





[jira] [Updated] (HBASE-18304) Start enforcing upperbounds on dependencies

2017-07-28 Thread Tamas Penzes (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tamas Penzes updated HBASE-18304:
-
Attachment: HBASE-18304.master.001.patch

> Start enforcing upperbounds on dependencies
> ---
>
> Key: HBASE-18304
> URL: https://issues.apache.org/jira/browse/HBASE-18304
> Project: HBase
>  Issue Type: Task
>  Components: build, dependencies
>Affects Versions: 2.0.0
>Reporter: Sean Busbey
>Assignee: Tamas Penzes
>  Labels: beginner
> Fix For: 2.0.0
>
> Attachments: HBASE-18304.master.001.patch
>
>
> would be nice to get this going before our next major version.
> http://maven.apache.org/enforcer/enforcer-rules/requireUpperBoundDeps.html





[jira] [Updated] (HBASE-18374) RegionServer Metrics improvements

2017-07-28 Thread Abhishek Singh Chouhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Singh Chouhan updated HBASE-18374:
---
Attachment: HBASE-18374.master.005.patch

Added putbatch metrics.

> RegionServer Metrics improvements
> -
>
> Key: HBASE-18374
> URL: https://issues.apache.org/jira/browse/HBASE-18374
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0
>Reporter: Abhishek Singh Chouhan
>Assignee: Abhishek Singh Chouhan
> Fix For: 3.0.0
>
> Attachments: HBASE-18374.branch-1.001.patch, 
> HBASE-18374.branch-1.001.patch, HBASE-18374.branch-1.002.patch, 
> HBASE-18374.master.001.patch, HBASE-18374.master.002.patch, 
> HBASE-18374.master.003.patch, HBASE-18374.master.004.patch, 
> HBASE-18374.master.005.patch
>
>
> At the RS level we have latency metrics for mutate/puts and deletes that are 
> updated per batch (i.e. at the end of the entire batch op, if it contains a 
> put/delete, the respective metric is updated), in contrast with the 
> append/increment/get metrics, which are updated per op. This is a bit 
> ambiguous, since the delete and put metrics are updated for multi-row 
> mutations that happen to contain a put/delete. We should rename the metrics 
> (e.g. delete_batch) or add a better description. We should also add metrics 
> for single delete client operations that come through the RSRpcServer.mutate 
> path, as well as metrics for checkAndPut and checkAndDelete.





[jira] [Updated] (HBASE-18304) Start enforcing upperbounds on dependencies

2017-07-28 Thread Tamas Penzes (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tamas Penzes updated HBASE-18304:
-
Status: Patch Available  (was: In Progress)

> Start enforcing upperbounds on dependencies
> ---
>
> Key: HBASE-18304
> URL: https://issues.apache.org/jira/browse/HBASE-18304
> Project: HBase
>  Issue Type: Task
>  Components: build, dependencies
>Affects Versions: 2.0.0
>Reporter: Sean Busbey
>Assignee: Tamas Penzes
>  Labels: beginner
> Fix For: 2.0.0
>
> Attachments: HBASE-18304.master.001.patch
>
>
> would be nice to get this going before our next major version.
> http://maven.apache.org/enforcer/enforcer-rules/requireUpperBoundDeps.html





[jira] [Updated] (HBASE-18471) Deleted qualifier re-appearing after multiple puts.

2017-07-28 Thread Thomas Martens (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Martens updated HBASE-18471:
---
Affects Version/s: 1.3.1

> Deleted qualifier re-appearing after multiple puts.
> ---
>
> Key: HBASE-18471
> URL: https://issues.apache.org/jira/browse/HBASE-18471
> Project: HBase
>  Issue Type: Bug
>  Components: Deletes, hbase, scan
>Affects Versions: 1.3.0, 1.3.1
>Reporter: Thomas Martens
> Attachments: HBaseDmlTest.java
>
>
> The qualifier of a deleted row (with keep deleted cells true) re-appears 
> after re-inserting the same row multiple times (with different timestamp) 
> with an empty qualifier.
> Scenario:
> # Put row with family and qualifier (timestamp 1).
> # Delete entire row (timestamp 2).
> # Put same row again with family without qualifier (timestamp 3).
> A scan (latest version) returns the row with family without qualifier, 
> version 3 (which is correct).
> # Put the same row again with family without qualifier (timestamp 4).
> A scan (latest version) returns multiple rows:
> * the row with family without qualifier, version 4 (which is correct).
> * the row with family with qualifier, version 1 (which is wrong).
> There is a test scenario attached.
> output:
>  13:42:53,952 [main] client.HBaseAdmin - Started disable of test_dml
>  13:42:55,801 [main] client.HBaseAdmin - Disabled test_dml
>  13:42:57,256 [main] client.HBaseAdmin - Deleted test_dml
>  13:42:58,592 [main] client.HBaseAdmin - Created test_dml
> Put row: 'myRow' with family: 'myFamily' with qualifier: 'myQualifier' with 
> timestamp: '1'
> Scan printout =>
>   Row: 'myRow', Timestamp: '1', Family: 'myFamily', Qualifier: 'myQualifier', 
> Value: 'myValue'
> Delete row: 'myRow'
> Scan printout =>
> Put row: 'myRow' with family: 'myFamily' with qualifier: 'null' with 
> timestamp: '3'
> Scan printout =>
>   Row: 'myRow', Timestamp: '3', Family: 'myFamily', Qualifier: '', Value: 
> 'myValue'
> Put row: 'myRow' with family: 'myFamily' with qualifier: 'null' with 
> timestamp: '4'
> Scan printout =>
>   Row: 'myRow', Timestamp: '4', Family: 'myFamily', Qualifier: '', Value: 
> 'myValue'
>   {color:red}Row: 'myRow', Timestamp: '1', Family: 'myFamily', Qualifier: 
> 'myQualifier', Value: 'myValue'{color}





[jira] [Commented] (HBASE-18304) Start enforcing upperbounds on dependencies

2017-07-28 Thread Mike Drob (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16104988#comment-16104988
 ] 

Mike Drob commented on HBASE-18304:
---

Hi [~tamaas], when I said to exclude the protobuf dep I meant to exclude it 
from the configuration, not the actual dependency tree. I think we can use the 
mechanism in MENFORCER-273 to do this.

> Start enforcing upperbounds on dependencies
> ---
>
> Key: HBASE-18304
> URL: https://issues.apache.org/jira/browse/HBASE-18304
> Project: HBase
>  Issue Type: Task
>  Components: build, dependencies
>Affects Versions: 2.0.0
>Reporter: Sean Busbey
>Assignee: Tamas Penzes
>  Labels: beginner
> Fix For: 2.0.0
>
> Attachments: HBASE-18304.master.001.patch
>
>
> would be nice to get this going before our next major version.
> http://maven.apache.org/enforcer/enforcer-rules/requireUpperBoundDeps.html





[jira] [Commented] (HBASE-18142) Deletion of a cell deletes the previous versions too

2017-07-28 Thread Chia-Ping Tsai (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105160#comment-16105160
 ] 

Chia-Ping Tsai commented on HBASE-18142:


bq.  could this be the reason we are deleting all the versions of all the cells 
in that row?
Not exactly. _deleterows_internal calls _createdelete_internal to get 
the Delete object, and _createdelete_internal creates that object through 
[Delete#addColumns|https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/Delete.java#L258].
 The purpose of Delete#addColumns is to *delete all versions of the 
specified column with a timestamp less than or equal to the specified 
timestamp*.
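
The difference between deleting one version and deleting all versions up to a timestamp can be shown with a small in-memory model. This is a sketch only: ColumnVersionsModel is a hypothetical class, its addColumn and addColumns methods merely mirror the names of the Delete methods, and real HBase writes delete markers rather than removing entries from a map.

```java
import java.util.TreeMap;

/** Hypothetical model of the versions of one column; this is not the HBase client. */
final class ColumnVersionsModel {
  // Timestamp -> value, ordered by timestamp.
  private final TreeMap<Long, String> versions = new TreeMap<>();

  void put(long timestamp, String value) {
    versions.put(timestamp, value);
  }

  /** Models Delete#addColumn(family, qualifier, ts): only the version at ts goes away. */
  void addColumn(long timestamp) {
    versions.remove(timestamp);
  }

  /** Models Delete#addColumns(family, qualifier, ts): every version with timestamp <= ts goes away. */
  void addColumns(long timestamp) {
    versions.headMap(timestamp, true).clear();
  }

  /** Newest remaining value, or null if no version is left. */
  String latest() {
    return versions.isEmpty() ? null : versions.lastEntry().getValue();
  }

  int versionCount() {
    return versions.size();
  }

  public static void main(String[] args) {
    ColumnVersionsModel single = new ColumnVersionsModel();
    ColumnVersionsModel all = new ColumnVersionsModel();
    for (long ts = 1; ts <= 3; ts++) {
      single.put(ts, "v" + ts);
      all.put(ts, "v" + ts);
    }
    single.addColumn(3L); // older versions stay readable
    all.addColumns(3L);   // older versions are gone as well
    System.out.println(single.latest() + " / " + all.latest());
  }
}
```

In this model the shell behaviour described in the issue corresponds to addColumns, while a client that wants to keep older versions would use the single-version form.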

> Deletion of a cell deletes the previous versions too
> 
>
> Key: HBASE-18142
> URL: https://issues.apache.org/jira/browse/HBASE-18142
> Project: HBase
>  Issue Type: Bug
>  Components: API
>Reporter: Karthick
>  Labels: beginner
>
> When I tried to delete a cell using its timestamp in the HBase Shell, the 
> previous versions of the same cell also got deleted. But when I tried the 
> same using the Java API, the previous versions were not deleted and I could 
> retrieve the previous values.
> https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/Delete.java
> See this file to fix the issue. This method (public Delete addColumns(final 
> byte [] family, final byte [] qualifier, final long timestamp)) only deletes 
> the current version of the cell. The previous versions are not deleted.





[jira] [Commented] (HBASE-12387) committer guidelines should include patch signoff

2017-07-28 Thread Mike Drob (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105247#comment-16105247
 ] 

Mike Drob commented on HBASE-12387:
---

Did the proposed DISCUSS thread ever happen?

> committer guidelines should include patch signoff
> -
>
> Key: HBASE-12387
> URL: https://issues.apache.org/jira/browse/HBASE-12387
> Project: HBase
>  Issue Type: Task
>  Components: documentation
>Reporter: Sean Busbey
>Assignee: Sean Busbey
>
> Right now our guide for committers applying patches has them use {{git am}} 
> without a signoff flag. This works okay, but it misses adding the 
> "signed-off-by" blurb in the commit message.
> Those messages make it easier to see at a glance with e.g. {{git log}} which 
> committer applied the patch.
> this section:
> {quote}
> The directive to use git format-patch rather than git diff, and not to use 
> --no-prefix, is a new one. See the second example for how to apply a patch 
> created with git diff, and educate the person who created the patch.
> {code}
> $ git checkout -b HBASE-
> $ git am ~/Downloads/HBASE--v2.patch
> $ git checkout master
> $ git pull --rebase
> $ git cherry-pick 
> # Resolve conflicts if necessary or ask the submitter to do it
> $ git pull --rebase  # Better safe than sorry
> $ git push origin master
> $ git checkout branch-1
> $ git pull --rebase
> $ git cherry-pick 
> # Resolve conflicts if necessary
> $ git pull --rebase  # Better safe than sorry
> $ git push origin branch-1
> $ git branch -D HBASE-
> {code}
> {quote}
> Should be
> {quote}
> The directive to use git format-patch rather than git diff, and not to use 
> --no-prefix, is a new one. See the second example for how to apply a patch 
> created with git diff, and educate the person who created the patch.
> Note that the {{--signoff}} flag to {{git am}} will insert a line in the 
> commit message that the patch was checked by your author string. This 
> addition to your inclusion as the commit's committer makes your participation 
> more prominent to users browsing {{git log}}.
> {code}
> $ git checkout -b HBASE-
> $ git am --signoff ~/Downloads/HBASE--v2.patch
> $ git checkout master
> $ git pull --rebase
> $ git cherry-pick 
> # Resolve conflicts if necessary or ask the submitter to do it
> $ git pull --rebase  # Better safe than sorry
> $ git push origin master
> $ git checkout branch-1
> $ git pull --rebase
> $ git cherry-pick 
> # Resolve conflicts if necessary
> $ git pull --rebase  # Better safe than sorry
> $ git push origin branch-1
> $ git branch -D HBASE-
> {code}
> {quote}





[jira] [Created] (HBASE-18472) Add guava license and update supplemental-models.xml

2017-07-28 Thread Yi Liang (JIRA)
Yi Liang created HBASE-18472:


 Summary: Add guava license and update supplemental-models.xml
 Key: HBASE-18472
 URL: https://issues.apache.org/jira/browse/HBASE-18472
 Project: HBase
  Issue Type: Bug
Reporter: Yi Liang
Assignee: Yi Liang
Priority: Blocker


When I run mvn clean install -DskipTests on my local machine, it always shows 
the error below:
{quote}
WARNING] Rule 0: org.apache.maven.plugins.enforcer.EvaluateBeanshell failed 
with message:
License errors detected, for more detail find ERROR in 
hbase-assembly/target/maven-shared-archive-resources/META-INF/LICENSE

Failed to execute goal 
org.apache.maven.plugins:maven-enforcer-plugin:1.4.1:enforce 
(check-aggregate-license) on project hbase-assembly: Some Enforcer rules have 
failed. Look above for specific messages explaining why the rule failed.
{quote}





[jira] [Commented] (HBASE-18472) Add guava license and update supplemental-models.xml

2017-07-28 Thread Mike Drob (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105253#comment-16105253
 ] 

Mike Drob commented on HBASE-18472:
---

Hi [~easyliangjob] - is this on a specific branch? Possibly related to 
HBASE-17908 but I haven't seen this failure happen locally.

> Add guava license and update supplemental-models.xml
> 
>
> Key: HBASE-18472
> URL: https://issues.apache.org/jira/browse/HBASE-18472
> Project: HBase
>  Issue Type: Bug
>Reporter: Yi Liang
>Assignee: Yi Liang
>Priority: Blocker
>
> When I run mvn clean install -DskipTests on my local machine, it always shows 
> the error below:
> {quote}
> WARNING] Rule 0: org.apache.maven.plugins.enforcer.EvaluateBeanshell failed 
> with message:
> License errors detected, for more detail find ERROR in 
> hbase-assembly/target/maven-shared-archive-resources/META-INF/LICENSE
> Failed to execute goal 
> org.apache.maven.plugins:maven-enforcer-plugin:1.4.1:enforce 
> (check-aggregate-license) on project hbase-assembly: Some Enforcer rules have 
> failed. Look above for specific messages explaining why the rule failed.
> {quote}





[jira] [Commented] (HBASE-18472) Add guava license and update supplemental-models.xml

2017-07-28 Thread Yi Liang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105254#comment-16105254
 ] 

Yi Liang commented on HBASE-18472:
--

The error on my machine was caused by HBASE-16351; it seems we did not add the 
guava license to supplemental-models.xml in HBASE-17908.  

Hi [~mdrob],
   Have you seen this error when you run mvn install? I think this error has 
been here for a while, and it is strange that no one has reported it. I just 
want to check whether it only happens on my machine. 

> Add guava license and update supplemental-models.xml
> 
>
> Key: HBASE-18472
> URL: https://issues.apache.org/jira/browse/HBASE-18472
> Project: HBase
>  Issue Type: Bug
>Reporter: Yi Liang
>Assignee: Yi Liang
>Priority: Blocker
>
> When I run mvn clean install -DskipTests on my local machine, it always shows 
> the error below:
> {quote}
> WARNING] Rule 0: org.apache.maven.plugins.enforcer.EvaluateBeanshell failed 
> with message:
> License errors detected, for more detail find ERROR in 
> hbase-assembly/target/maven-shared-archive-resources/META-INF/LICENSE
> Failed to execute goal 
> org.apache.maven.plugins:maven-enforcer-plugin:1.4.1:enforce 
> (check-aggregate-license) on project hbase-assembly: Some Enforcer rules have 
> failed. Look above for specific messages explaining why the rule failed.
> {quote}





[jira] [Updated] (HBASE-18472) Add guava license and update supplemental-models.xml

2017-07-28 Thread Yi Liang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Liang updated HBASE-18472:
-
Attachment: HBASE-18472-master-v1.patch

> Add guava license and update supplemental-models.xml
> 
>
> Key: HBASE-18472
> URL: https://issues.apache.org/jira/browse/HBASE-18472
> Project: HBase
>  Issue Type: Bug
>Reporter: Yi Liang
>Assignee: Yi Liang
>Priority: Blocker
> Attachments: HBASE-18472-master-v1.patch
>
>
> When I run mvn clean install -DskipTests on my local machine, it always shows 
> the error below:
> {quote}
> WARNING] Rule 0: org.apache.maven.plugins.enforcer.EvaluateBeanshell failed 
> with message:
> License errors detected, for more detail find ERROR in 
> hbase-assembly/target/maven-shared-archive-resources/META-INF/LICENSE
> Failed to execute goal 
> org.apache.maven.plugins:maven-enforcer-plugin:1.4.1:enforce 
> (check-aggregate-license) on project hbase-assembly: Some Enforcer rules have 
> failed. Look above for specific messages explaining why the rule failed.
> {quote}





[jira] [Updated] (HBASE-18472) Add guava license and update supplemental-models.xml

2017-07-28 Thread Yi Liang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Liang updated HBASE-18472:
-
Fix Version/s: 3.0.0
   2.0.0
Affects Version/s: 3.0.0
   2.0.0-alpha-1
   Status: Patch Available  (was: Open)

> Add guava license and update supplemental-models.xml
> 
>
> Key: HBASE-18472
> URL: https://issues.apache.org/jira/browse/HBASE-18472
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0-alpha-1, 3.0.0
>Reporter: Yi Liang
>Assignee: Yi Liang
>Priority: Blocker
> Fix For: 2.0.0, 3.0.0
>
> Attachments: HBASE-18472-master-v1.patch
>
>
> When I run mvn clean install -DskipTests on my local machine, it always shows 
> the error below:
> {quote}
> WARNING] Rule 0: org.apache.maven.plugins.enforcer.EvaluateBeanshell failed 
> with message:
> License errors detected, for more detail find ERROR in 
> hbase-assembly/target/maven-shared-archive-resources/META-INF/LICENSE
> Failed to execute goal 
> org.apache.maven.plugins:maven-enforcer-plugin:1.4.1:enforce 
> (check-aggregate-license) on project hbase-assembly: Some Enforcer rules have 
> failed. Look above for specific messages explaining why the rule failed.
> {quote}





[jira] [Commented] (HBASE-18472) Add guava license and update supplemental-models.xml

2017-07-28 Thread Yi Liang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105260#comment-16105260
 ] 

Yi Liang commented on HBASE-18472:
--

I tried it on the master branch.

> Add guava license and update supplemental-models.xml
> 
>
> Key: HBASE-18472
> URL: https://issues.apache.org/jira/browse/HBASE-18472
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.0.0-alpha-1
>Reporter: Yi Liang
>Assignee: Yi Liang
>Priority: Blocker
> Fix For: 2.0.0, 3.0.0
>
> Attachments: HBASE-18472-master-v1.patch
>
>
> When I run mvn clean install -DskipTests on my local machine, it always shows 
> the error below:
> {quote}
> WARNING] Rule 0: org.apache.maven.plugins.enforcer.EvaluateBeanshell failed 
> with message:
> License errors detected, for more detail find ERROR in 
> hbase-assembly/target/maven-shared-archive-resources/META-INF/LICENSE
> Failed to execute goal 
> org.apache.maven.plugins:maven-enforcer-plugin:1.4.1:enforce 
> (check-aggregate-license) on project hbase-assembly: Some Enforcer rules have 
> failed. Look above for specific messages explaining why the rule failed.
> {quote}





[jira] [Commented] (HBASE-18374) RegionServer Metrics improvements

2017-07-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105289#comment-16105289
 ] 

Hadoop QA commented on HBASE-18374:
---

| (/) *+1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 17s | Docker mode activated. |
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 3 new or modified test files. |
| 0 | mvndep | 0m 21s | Maven dependency ordering for branch |
| +1 | mvninstall | 3m 24s | master passed |
| +1 | compile | 1m 5s | master passed |
| +1 | checkstyle | 0m 27s | master passed |
| +1 | mvneclipse | 0m 35s | master passed |
| +1 | findbugs | 3m 38s | master passed |
| +1 | javadoc | 0m 51s | master passed |
| 0 | mvndep | 0m 16s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 11s | the patch passed |
| +1 | compile | 1m 2s | the patch passed |
| +1 | javac | 1m 2s | the patch passed |
| +1 | checkstyle | 0m 28s | the patch passed |
| +1 | mvneclipse | 0m 35s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | hadoopcheck | 30m 11s | Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha4. |
| +1 | findbugs | 3m 57s | the patch passed |
| +1 | javadoc | 0m 50s | the patch passed |
| +1 | unit | 0m 23s | hbase-hadoop-compat in the patch passed. |
| +1 | unit | 0m 26s | hbase-hadoop2-compat in the patch passed. |
| +1 | unit | 111m 44s | hbase-server in the patch passed. |
| +1 | asflicense | 0m 44s | The patch does not generate ASF License warnings. |
| | | 163m 10s | |

|| Subsystem || Report/Notes ||
| Docker | Client=1.12.3 Server=1.12.3 Image:yetus/hbase:bdc94b1 |
| JIRA Issue | HBASE-18374 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12879350/HBASE-18374.master.005.patch |
| Optional Tests | asflicense javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 2134b41544f1 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 12:18:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / 2d06a06 |
| Default Java | 1.8.0_131 |
| findbugs | v3.1.0-RC3 |
| Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/7830/testReport/ |
| modules | C: hbase-hadoop-compat hbase-hadoop2-compat hbase-server U: . |
| Console output | 
https://b

[jira] [Commented] (HBASE-18472) Add guava license and update supplemental-models.xml

2017-07-28 Thread Mike Drob (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105295#comment-16105295
 ] 

Mike Drob commented on HBASE-18472:
---

HBASE-16351 made the error reporting friendlier, it shouldn't have exposed any 
new errors.

Can you reproduce this with a clean workspace and m2 repository?

Guava 11.0.2 inherits its license from guava-parent 
https://repo1.maven.org/maven2/com/google/guava/guava-parent/11.0.2/guava-parent-11.0.2.pom
 which has 

{code}
<license>
  <name>The Apache Software License, Version 2.0</name>
  <url>http://www.apache.org/licenses/LICENSE-2.0.txt</url>
  <distribution>repo</distribution>
</license>
{code}

and that looks 100% correct.

Can you post more of the ERROR message you see inside the 
generated LICENSE file?

> Add guava license and update supplemental-models.xml
> 
>
> Key: HBASE-18472
> URL: https://issues.apache.org/jira/browse/HBASE-18472
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.0.0-alpha-1
>Reporter: Yi Liang
>Assignee: Yi Liang
>Priority: Blocker
> Fix For: 2.0.0, 3.0.0
>
> Attachments: HBASE-18472-master-v1.patch
>
>
> When I run mvn clean install -DskipTests on my local machine, it always shows 
> the error below:
> {quote}
> WARNING] Rule 0: org.apache.maven.plugins.enforcer.EvaluateBeanshell failed 
> with message:
> License errors detected, for more detail find ERROR in 
> hbase-assembly/target/maven-shared-archive-resources/META-INF/LICENSE
> Failed to execute goal 
> org.apache.maven.plugins:maven-enforcer-plugin:1.4.1:enforce 
> (check-aggregate-license) on project hbase-assembly: Some Enforcer rules have 
> failed. Look above for specific messages explaining why the rule failed.
> {quote}





[jira] [Commented] (HBASE-18472) Add guava license and update supplemental-models.xml

2017-07-28 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105296#comment-16105296
 ] 

Sean Busbey commented on HBASE-18472:
-

This shouldn't be needed. guava 11 has a parent of guava-parent 11, which lists 
a license of "Apache Software License, Version 2.0". Guava was expressly 
removed as a part of HBASE-18202 since that license name got handled fine.

> Add guava license and update supplemental-models.xml
> 
>
> Key: HBASE-18472
> URL: https://issues.apache.org/jira/browse/HBASE-18472
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.0.0-alpha-1
>Reporter: Yi Liang
>Assignee: Yi Liang
>Priority: Blocker
> Fix For: 2.0.0, 3.0.0
>
> Attachments: HBASE-18472-master-v1.patch
>
>
> When I run mvn clean install -DskipTests on my local machine, it always shows 
> the error below:
> {quote}
> WARNING] Rule 0: org.apache.maven.plugins.enforcer.EvaluateBeanshell failed 
> with message:
> License errors detected, for more detail find ERROR in 
> hbase-assembly/target/maven-shared-archive-resources/META-INF/LICENSE
> Failed to execute goal 
> org.apache.maven.plugins:maven-enforcer-plugin:1.4.1:enforce 
> (check-aggregate-license) on project hbase-assembly: Some Enforcer rules have 
> failed. Look above for specific messages explaining why the rule failed.
> {quote}





[jira] [Commented] (HBASE-18472) Add guava license and update supplemental-models.xml

2017-07-28 Thread Yi Liang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105299#comment-16105299
 ] 

Yi Liang commented on HBASE-18472:
--

OK, let me try on a clean workspace and m2 repository

> Add guava license and update supplemental-models.xml
> 
>
> Key: HBASE-18472
> URL: https://issues.apache.org/jira/browse/HBASE-18472
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.0.0-alpha-1
>Reporter: Yi Liang
>Assignee: Yi Liang
>Priority: Blocker
> Fix For: 2.0.0, 3.0.0
>
> Attachments: HBASE-18472-master-v1.patch
>
>
> When I run mvn clean install -DskipTests on my local machine, it always shows 
> the error below:
> {quote}
> WARNING] Rule 0: org.apache.maven.plugins.enforcer.EvaluateBeanshell failed 
> with message:
> License errors detected, for more detail find ERROR in 
> hbase-assembly/target/maven-shared-archive-resources/META-INF/LICENSE
> Failed to execute goal 
> org.apache.maven.plugins:maven-enforcer-plugin:1.4.1:enforce 
> (check-aggregate-license) on project hbase-assembly: Some Enforcer rules have 
> failed. Look above for specific messages explaining why the rule failed.
> {quote}





[jira] [Commented] (HBASE-18472) Add guava license and update supplemental-models.xml

2017-07-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105317#comment-16105317
 ] 

Hadoop QA commented on HBASE-18472:
---

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 20s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| +1 | mvninstall | 3m 43s | master passed |
| +1 | mvneclipse | 0m 8s | master passed |
| +1 | javadoc | 0m 7s | master passed |
| +1 | mvninstall | 0m 8s | the patch passed |
| +1 | mvneclipse | 0m 7s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | xml | 0m 1s | The patch has no ill-formed XML file. |
| +1 | hadoopcheck | 31m 15s | Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha4. |
| +1 | javadoc | 0m 7s | the patch passed |
| +1 | unit | 0m 8s | hbase-resource-bundle in the patch passed. |
| +1 | asflicense | 0m 8s | The patch does not generate ASF License warnings. |
| | | 36m 25s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.03.0-ce Server=17.03.0-ce Image:yetus/hbase:bdc94b1 |
| JIRA Issue | HBASE-18472 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12879373/HBASE-18472-master-v1.patch |
| Optional Tests | asflicense javac javadoc unit xml |
| uname | Linux 362f41f6c671 3.13.0-116-generic #163-Ubuntu SMP Fri Mar 31 14:13:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / 2d06a06 |
| Default Java | 1.8.0_131 |
| Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/7832/testReport/ |
| modules | C: hbase-resource-bundle U: hbase-resource-bundle |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/7832/console |
| Powered by | Apache Yetus 0.4.0 http://yetus.apache.org |


This message was automatically generated.



> Add guava license and update supplemental-models.xml
> 
>
> Key: HBASE-18472
> URL: https://issues.apache.org/jira/browse/HBASE-18472
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.0.0-alpha-1
>Reporter: Yi Liang
>Assignee: Yi Liang
>Priority: Blocker
> Fix For: 2.0.0, 3.0.0
>
> Attachments: HBASE-18472-master-v1.patch
>
>
> When I run mvn clean install -DskipTests on my local machine, it always shows 
> the error below:
> {quote}
> WARNING] Rule 0: org.apache.maven.plugins.enforcer.EvaluateBeanshell failed 
> with message:
> License errors detected, for more detail find ERROR in 
> hbase-assembly/target/maven-shared-archive-resources/META-INF/LICENSE
> Failed to execute goal 
> org.apache.maven.plugins:maven-enforcer-plugin:1.4.1:enforce 
> (check-aggregate-license) on project hbase-assembly: Some Enforcer rules have 
> failed. Look above for specific messages explaining why the rule failed.
> {quote}





[jira] [Commented] (HBASE-12387) committer guidelines should include patch signoff

2017-07-28 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105323#comment-16105323
 ] 

Sean Busbey commented on HBASE-12387:
-

It didn't. Since the ref guide includes a reference to a DISCUSS thread about 
the old attribution approach of "HBASE-121334 foo bar thing (contributor)", we 
should have one before we push on this.

Mind starting the thread [~mdrob]? If you don't think you have enough context, 
let me know and I'll do it.

> committer guidelines should include patch signoff
> -
>
> Key: HBASE-12387
> URL: https://issues.apache.org/jira/browse/HBASE-12387
> Project: HBase
>  Issue Type: Task
>  Components: documentation
>Reporter: Sean Busbey
>Assignee: Sean Busbey
>
> Right now our guide for committers applying patches has them use {{git am}} 
> without a signoff flag. This works okay, but it misses adding the 
> "signed-off-by" blurb in the commit message.
> Those messages make it easier to see at a glance with e.g. {{git log}} which 
> committer applied the patch.
> this section:
> {quote}
> The directive to use git format-patch rather than git diff, and not to use 
> --no-prefix, is a new one. See the second example for how to apply a patch 
> created with git diff, and educate the person who created the patch.
> {code}
> $ git checkout -b HBASE-
> $ git am ~/Downloads/HBASE--v2.patch
> $ git checkout master
> $ git pull --rebase
> $ git cherry-pick 
> # Resolve conflicts if necessary or ask the submitter to do it
> $ git pull --rebase  # Better safe than sorry
> $ git push origin master
> $ git checkout branch-1
> $ git pull --rebase
> $ git cherry-pick 
> # Resolve conflicts if necessary
> $ git pull --rebase  # Better safe than sorry
> $ git push origin branch-1
> $ git branch -D HBASE-
> {code}
> {quote}
> Should be
> {quote}
> The directive to use git format-patch rather than git diff, and not to use 
> --no-prefix, is a new one. See the second example for how to apply a patch 
> created with git diff, and educate the person who created the patch.
> Note that the {{--signoff}} flag to {{git am}} will insert a line in the 
> commit message that the patch was checked by your author string. This 
> addition to your inclusion as the commit's committer makes your participation 
> more prominent to users browsing {{git log}}.
> {code}
> $ git checkout -b HBASE-
> $ git am --signoff ~/Downloads/HBASE--v2.patch
> $ git checkout master
> $ git pull --rebase
> $ git cherry-pick 
> # Resolve conflicts if necessary or ask the submitter to do it
> $ git pull --rebase  # Better safe than sorry
> $ git push origin master
> $ git checkout branch-1
> $ git pull --rebase
> $ git cherry-pick 
> # Resolve conflicts if necessary
> $ git pull --rebase  # Better safe than sorry
> $ git push origin branch-1
> $ git branch -D HBASE-
> {code}
> {quote}





[jira] [Updated] (HBASE-18472) Add guava license and update supplemental-models.xml

2017-07-28 Thread Yi Liang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Liang updated HBASE-18472:
-
Resolution: Invalid
Status: Resolved  (was: Patch Available)

> Add guava license and update supplemental-models.xml
> 
>
> Key: HBASE-18472
> URL: https://issues.apache.org/jira/browse/HBASE-18472
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.0.0-alpha-1
>Reporter: Yi Liang
>Assignee: Yi Liang
>Priority: Blocker
> Fix For: 2.0.0, 3.0.0
>
> Attachments: HBASE-18472-master-v1.patch
>
>
> When I run mvn clean install -DskipTests on my local machine, it always shows 
> the error below:
> {quote}
> WARNING] Rule 0: org.apache.maven.plugins.enforcer.EvaluateBeanshell failed 
> with message:
> License errors detected, for more detail find ERROR in 
> hbase-assembly/target/maven-shared-archive-resources/META-INF/LICENSE
> Failed to execute goal 
> org.apache.maven.plugins:maven-enforcer-plugin:1.4.1:enforce 
> (check-aggregate-license) on project hbase-assembly: Some Enforcer rules have 
> failed. Look above for specific messages explaining why the rule failed.
> {quote}





[jira] [Commented] (HBASE-18472) Add guava license and update supplemental-models.xml

2017-07-28 Thread Yi Liang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105390#comment-16105390
 ] 

Yi Liang commented on HBASE-18472:
--

It seems something was messed up in my local m2 repository; it works fine now. 
Sorry for taking up your time.

> Add guava license and update supplemental-models.xml
> 
>
> Key: HBASE-18472
> URL: https://issues.apache.org/jira/browse/HBASE-18472
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.0.0-alpha-1
>Reporter: Yi Liang
>Assignee: Yi Liang
>Priority: Blocker
> Fix For: 2.0.0, 3.0.0
>
> Attachments: HBASE-18472-master-v1.patch
>
>
> When I run mvn clean install -DskipTests on my local machine, it always shows 
> the error below:
> {quote}
> WARNING] Rule 0: org.apache.maven.plugins.enforcer.EvaluateBeanshell failed 
> with message:
> License errors detected, for more detail find ERROR in 
> hbase-assembly/target/maven-shared-archive-resources/META-INF/LICENSE
> Failed to execute goal 
> org.apache.maven.plugins:maven-enforcer-plugin:1.4.1:enforce 
> (check-aggregate-license) on project hbase-assembly: Some Enforcer rules have 
> failed. Look above for specific messages explaining why the rule failed.
> {quote}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HBASE-18469) Correct RegionServer metric of totalRequestCount

2017-07-28 Thread Josh Elser (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105407#comment-16105407
 ] 

Josh Elser commented on HBASE-18469:


[~zhangshibin], do you plan on inspecting this further yourself?

I don't think there's much one of us could do with the information you provided.

> Correct RegionServer metric of totalRequestCount
> --
>
> Key: HBASE-18469
> URL: https://issues.apache.org/jira/browse/HBASE-18469
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.2.0
>Reporter: Shibin Zhang
>Priority: Minor
>
> When I fetched the metrics, I found that these three metrics may be in error, 
> as follows:
> "totalRequestCount" : 17541,
> "readRequestCount" : 17483,
> "writeRequestCount" : 1633,





[jira] [Updated] (HBASE-18409) Migrate Client Metrics from codahale to hbase-metrics

2017-07-28 Thread Ronald Macmaster (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ronald Macmaster updated HBASE-18409:
-
Fix Version/s: 3.0.0
Affects Version/s: (was: 2.0.0-alpha-1)
   3.0.0
   Status: Patch Available  (was: Open)

The patch refactors the original MetricsConnection in the hbase-client module 
to report metrics via the Hadoop metrics2 system. 

Originally, metrics were reported privately through a codahale JMXReporter in 
the MetricsConnection class. The MetricsConnection class also recorded metrics 
using the codahale metrics classes rather than the hbase-metrics classes. The 
codahale classes are too inflexible for the extensibility and customization that 
hbase-client needs. 

Now, the MetricsConnection delegates updates to metrics to the metrics2 system. 
It does this through the addition of two new classes, MetricsClientSource and 
MetricsClientSourceImpl in the hbase-hadoop-compat and hbase-hadoop2-compat 
modules respectively. The new model closely resembles the architecture for 
collecting and reporting metrics from the ZooKeeper client, master, and region 
server daemons. 

The patch unifies the concept of metrics reporting behind a single API.
Once the native infrastructure for metrics reporting via hbase-metrics is 
completed, metrics2 sources and sinks can be phased out accordingly. 
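
The delegation described above can be sketched roughly as follows. This is only an illustration of the pattern, not the patch itself: {{ClientSource}}, {{InMemoryClientSource}}, {{ConnectionMetrics}}, and the counter names are hypothetical stand-ins, since the actual methods of MetricsClientSource and MetricsClientSourceImpl are not shown in this message.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

public class ClientMetricsDelegationSketch {

  /** Hypothetical stand-in for MetricsClientSource; the real interface's methods are not shown here. */
  interface ClientSource {
    void incrementCounter(String name, long delta);

    long counterValue(String name);
  }

  /** Hypothetical in-memory implementation, standing in for a metrics2-backed source. */
  static final class InMemoryClientSource implements ClientSource {
    private final Map<String, LongAdder> counters = new ConcurrentHashMap<>();

    @Override
    public void incrementCounter(String name, long delta) {
      // Create the counter on first use, then add to it.
      counters.computeIfAbsent(name, key -> new LongAdder()).add(delta);
    }

    @Override
    public long counterValue(String name) {
      LongAdder adder = counters.get(name);
      return adder == null ? 0L : adder.sum();
    }
  }

  /** Connection-level facade: it keeps no metric objects of its own and only delegates. */
  static final class ConnectionMetrics {
    private final ClientSource source;

    ConnectionMetrics(ClientSource source) {
      this.source = source;
    }

    void recordRpcCall() {
      source.incrementCounter("rpcCalls", 1L); // hypothetical metric name
    }

    void recordRpcFailure() {
      source.incrementCounter("rpcFailures", 1L); // hypothetical metric name
    }
  }

  public static void main(String[] args) {
    ClientSource source = new InMemoryClientSource();
    ConnectionMetrics metrics = new ConnectionMetrics(source);
    metrics.recordRpcCall();
    metrics.recordRpcCall();
    metrics.recordRpcFailure();
    System.out.println("rpcCalls=" + source.counterValue("rpcCalls"));
    System.out.println("rpcFailures=" + source.counterValue("rpcFailures"));
  }
}
```

Because the facade depends only on the source interface, swapping the reporting backend would mean supplying a different implementation rather than changing the callers.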

> Migrate Client Metrics from codahale to hbase-metrics
> -
>
> Key: HBASE-18409
> URL: https://issues.apache.org/jira/browse/HBASE-18409
> Project: HBase
>  Issue Type: Improvement
>  Components: Client, java, metrics
>Affects Versions: 3.0.0
>Reporter: Ronald Macmaster
>  Labels: newbie
> Fix For: 3.0.0
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> Currently, the metrics for hbase-client are tailored for reporting via a 
> client-side JMX server.
> The MetricsConnection handles the metrics management and reporting via the 
> metrics platform from codahale. 
> This approach worked well for hbase-1.3.1 when the metrics platform was still 
> relatively young, but it could be improved by using the new 
> hbase-metrics-api. 
> Now that we have an actual hbase-metrics-api that master, regionserver, 
> zookeeper, and other daemons use, it would be good to also allow the client 
> to leverage the metrics-api. 
> Then, the client could also report its metrics via Hadoop's metrics2 if 
> desired or through another platform that utilizes the hbase-metrics-api. 
> If left alone, client metrics will continue to be only barely visible through 
> a client-side JMX server.
> The migration to the new metrics-api could be done by simply changing the 
> Metrics data types from codahale types to hbase-metrics types without 
> changing the metrics signatures of MetricsConnection unless completely 
> necessary. 
> The codahale MetricsRegistry would also have to be exchanged for a 
> hbase-metrics MetricsRegistry. 
> I found this to be a necessary change after attempting to implement my own 
> Reporter to use within the MetricsConnection class.
> I was attempting to create a HadoopMetrics2Reporter that extends the codahale 
> ScheduledReporter and reports the MetricsConnection metrics to Hadoop's 
> metrics2 system. 
> The already existing infrastructure in the hbase-metrics and 
> hbase-metrics-api projects could be easily leveraged for a cleaner solution.
> If completed successfully, users could instead access their client-side 
> metrics through the hbase-metrics-api. 





[jira] [Updated] (HBASE-18409) Migrate Client Metrics from codahale to hbase-metrics

2017-07-28 Thread Ronald Macmaster (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ronald Macmaster updated HBASE-18409:
-
Attachment: 
0001-HBASE-18409-MetricsConnection-client-metrics-migration.patch

> Migrate Client Metrics from codahale to hbase-metrics
> -
>
> Key: HBASE-18409
> URL: https://issues.apache.org/jira/browse/HBASE-18409
> Project: HBase
>  Issue Type: Improvement
>  Components: Client, java, metrics
>Affects Versions: 3.0.0
>Reporter: Ronald Macmaster
>  Labels: newbie
> Fix For: 3.0.0
>
> Attachments: 
> 0001-HBASE-18409-MetricsConnection-client-metrics-migration.patch
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> Currently, the metrics for hbase-client are tailored for reporting via a 
> client-side JMX server.
> The MetricsConnection handles the metrics management and reporting via the 
> metrics platform from codahale. 
> This approach worked well for hbase-1.3.1 when the metrics platform was still 
> relatively young, but it could be improved by using the new 
> hbase-metrics-api. 
> Now that we have an actual hbase-metrics-api that master, regionserver, 
> zookeeper, and other daemons use, it would be good to also allow the client 
> to leverage the metrics-api. 
> Then, the client could also report its metrics via Hadoop's metrics2 if 
> desired or through another platform that utilizes the hbase-metrics-api. 
> If left alone, client metrics will continue to be only barely visible through 
> a client-side JMX server.
> The migration to the new metrics-api could be done by simply changing the 
> Metrics data types from codahale types to hbase-metrics types without 
> changing the metrics signatures of MetricsConnection unless completely 
> necessary. 
> The codahale MetricsRegistry would also have to be exchanged for a 
> hbase-metrics MetricsRegistry. 
> I found this to be a necessary change after attempting to implement my own 
> Reporter to use within the MetricsConnection class.
> I was attempting to create a HadoopMetrics2Reporter that extends the codahale 
> ScheduledReporter and reports the MetricsConnection metrics to Hadoop's 
> metrics2 system. 
> The already existing infrastructure in the hbase-metrics and 
> hbase-metrics-api projects could be easily leveraged for a cleaner solution.
> If completed successfully, users could instead access their client-side 
> metrics through the hbase-metrics-api. 





[jira] [Created] (HBASE-18473) VC.listLabels() erroneously closes any connection

2017-07-28 Thread Lars George (JIRA)
Lars George created HBASE-18473:
---

 Summary: VC.listLabels() erroneously closes any connection
 Key: HBASE-18473
 URL: https://issues.apache.org/jira/browse/HBASE-18473
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 1.1.11, 1.2.6, 1.3.1
Reporter: Lars George


In HBASE-13358 the {{VisibilityClient.listLabels()}} was amended to take in a 
connection from the caller, which totally makes sense. But the patch forgot to 
remove the unconditional call to {{connection.close()}} in the {{finally}} 
block:

{code}
finally {
  if (table != null) {
    table.close();
  }
  if (connection != null) {
    connection.close();
  }
}
{code}

Remove the second {{if}} completely.
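
As a rough sketch of the intended ownership rule, the method below closes only the resource it opened itself and leaves the caller's connection alone. {{FakeConnection}}, {{FakeTable}}, and their methods are hypothetical stand-ins so the example is self-contained; they are not the HBase client classes.

```java
import java.util.Arrays;
import java.util.List;

public class CallerOwnedConnectionSketch {

  /** Hypothetical stand-in for a table handle; records whether close() was called. */
  static final class FakeTable implements AutoCloseable {
    boolean closed = false;

    List<String> readLabels() {
      return Arrays.asList("secret", "public");
    }

    @Override
    public void close() {
      closed = true;
    }
  }

  /** Hypothetical stand-in for a connection owned by the caller. */
  static final class FakeConnection implements AutoCloseable {
    boolean closed = false;

    FakeTable openLabelsTable() {
      return new FakeTable();
    }

    @Override
    public void close() {
      closed = true;
    }
  }

  /** Closes the table it opened, but never the connection that was passed in. */
  static List<String> listLabels(FakeConnection connection) {
    try (FakeTable table = connection.openLabelsTable()) {
      return table.readLabels();
    }
    // Deliberately no connection.close(): the caller created it, so the caller closes it.
  }

  public static void main(String[] args) {
    try (FakeConnection connection = new FakeConnection()) {
      System.out.println(listLabels(connection));
      // The same connection is still usable for a second call.
      System.out.println(listLabels(connection));
      System.out.println("closed after calls: " + connection.closed);
    }
  }
}
```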





[jira] [Commented] (HBASE-17908) Upgrade guava

2017-07-28 Thread Yi Liang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105463#comment-16105463
 ] 

Yi Liang commented on HBASE-17908:
--

Never mind my comments above; something in my mvn repository was messed up. It 
works fine now.

> Upgrade guava
> -
>
> Key: HBASE-17908
> URL: https://issues.apache.org/jira/browse/HBASE-17908
> Project: HBase
>  Issue Type: Sub-task
>  Components: dependencies
>Reporter: Balazs Meszaros
>Assignee: stack
>Priority: Critical
> Fix For: 2.0.0
>
> Attachments: 0001-HBASE-17908-Upgrade-guava.022.patch, 
> HBASE-17908.master.001.patch, HBASE-17908.master.002.patch, 
> HBASE-17908.master.003.patch, HBASE-17908.master.004.patch, 
> HBASE-17908.master.005.patch, HBASE-17908.master.006.patch, 
> HBASE-17908.master.007.patch, HBASE-17908.master.008.patch, 
> HBASE-17908.master.009.patch, HBASE-17908.master.010.patch, 
> HBASE-17908.master.011.patch, HBASE-17908.master.012.patch, 
> HBASE-17908.master.013.patch, HBASE-17908.master.013.patch, 
> HBASE-17908.master.014.patch, HBASE-17908.master.015.patch, 
> HBASE-17908.master.015.patch, HBASE-17908.master.016.patch, 
> HBASE-17908.master.017.patch, HBASE-17908.master.018.patch, 
> HBASE-17908.master.019.patch, HBASE-17908.master.020.patch, 
> HBASE-17908.master.021.patch, HBASE-17908.master.021.patch, 
> HBASE-17908.master.022.patch, HBASE-17908.master.023.patch, 
> HBASE-17908.master.024.patch, HBASE-17908.master.025.patch, 
> HBASE-17908.master.026.patch, HBASE-17908.master.027.patch, 
> HBASE-17908.master.028.patch
>
>
> Currently we are using guava 12.0.1, but the latest version is 21.0. 
> Upgrading guava is always a hassle because it is not always backward 
> compatible with itself.
> Currently I think there are two approaches:
> 1. Upgrade guava to the newest version (21.0) and shade it.
> 2. Upgrade guava to a version which does not break our builds (15.0).
> If we can update it, some dependencies should be removed: 
> commons-collections, commons-codec, ...





[jira] [Commented] (HBASE-18304) Start enforcing upperbounds on dependencies

2017-07-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105482#comment-16105482
 ] 

Hadoop QA commented on HBASE-18304:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 16s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 23s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 4s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 2m 58s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 40s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 56s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 3m 18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 8s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:red}-1{color} | {color:red} hadoopcheck {color} | {color:red} 0m 21s{color} | {color:red} The patch causes 10 errors with Hadoop v2.6.1. {color} |
| {color:red}-1{color} | {color:red} hadoopcheck {color} | {color:red} 0m 34s{color} | {color:red} The patch causes 10 errors with Hadoop v2.6.2. {color} |
| {color:red}-1{color} | {color:red} hadoopcheck {color} | {color:red} 0m 47s{color} | {color:red} The patch causes 10 errors with Hadoop v2.6.3. {color} |
| {color:red}-1{color} | {color:red} hadoopcheck {color} | {color:red} 1m 3s{color} | {color:red} The patch causes 10 errors with Hadoop v2.6.4. {color} |
| {color:red}-1{color} | {color:red} hadoopcheck {color} | {color:red} 1m 16s{color} | {color:red} The patch causes 10 errors with Hadoop v2.6.5. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 10s{color} | {color:green} hbase-procedure in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 47s{color} | {color:green} hbase-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 73m 27s{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 5m 4s{color} | {color:green} hbase-spark in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 36s{color} | {color:green} hbase-spark-it in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}139m 51s{color} | {color:green} root in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 1m 29s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}281m 46s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Timed out junit tests | org.apache.hadoop.hbase.client.TestScanWithoutFetchingData |
|   | org.apache.hadoop.hbase.mapreduce.TestWALPlayer |
|   | org.apache.hadoop.hbase.coprocessor.TestHTableWrapper |
|   | org.apache.hadoop.hbase.client.TestSnapshotCloneIndependence |
|   | org.apache.hadoop.hbase.mapreduce.TestTa

[jira] [Commented] (HBASE-18024) HRegion#initializeRegionInternals should not re-create .hregioninfo file when the region directory no longer exists

2017-07-28 Thread Josh Elser (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105532#comment-16105532
 ] 

Josh Elser commented on HBASE-18024:


Any update on those test failures, [~esteban]? Anything I can do to help out?

> HRegion#initializeRegionInternals should not re-create .hregioninfo file when 
> the region directory no longer exists
> ---
>
> Key: HBASE-18024
> URL: https://issues.apache.org/jira/browse/HBASE-18024
> Project: HBase
>  Issue Type: Bug
>  Components: Region Assignment, regionserver
>Affects Versions: 2.0.0, 1.4.0, 1.3.1, 1.2.5
>Reporter: Esteban Gutierrez
>Assignee: Esteban Gutierrez
> Attachments: HBASE-18024.001.patch
>
>
> When a RegionServer attempts to open a region, during initialization the RS 
> tries to open the {{/data///.hregioninfo}} 
> file. However, if the {{.hregioninfo}} file doesn't exist, the RegionServer 
> will create a new one in {{HRegionFileSystem#checkRegionInfoOnFilesystem}}. A 
> side effect is that tools like hbck will incorrectly assume an inconsistency 
> due to the presence of this new {{.hregioninfo}} file.





[jira] [Commented] (HBASE-18025) CatalogJanitor should collect outdated RegionStates from the AM

2017-07-28 Thread Josh Elser (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105540#comment-16105540
 ] 

Josh Elser commented on HBASE-18025:


Took a glance at v3 and the problem you described makes sense. I'm a bit scared 
to +1 because I know how tricky this state management is :)

bq. A problem that we have observed is when region replicas are being used and 
there is a split, the region replica from parent doesn't get collected from the 
region states and when the balancer tries to assign the old parent region 
replica, this will cause the RegionServer to create a new HRI with the details 
of the parent, causing an inconsistency

Is this something that reliably happens and is possible to capture in a test?

> CatalogJanitor should collect outdated RegionStates from the AM
> ---
>
> Key: HBASE-18025
> URL: https://issues.apache.org/jira/browse/HBASE-18025
> Project: HBase
>  Issue Type: Bug
>Reporter: Esteban Gutierrez
>Assignee: Esteban Gutierrez
> Attachments: HBASE-18025.001.patch, HBASE-18025.002.patch, 
> HBASE-18025.003.patch
>
>
> I don't think this will matter in the long run for HBase 2, but at least in 
> branch-1 and the current master we keep copies of the region states in 
> multiple places in the master, and these copies include information like the HRI. 
> A problem that we have observed is that when region replicas are being used and 
> there is a split, the region replica from the parent doesn't get collected from 
> the region states. When the balancer then tries to assign the old parent region 
> replica, the RegionServer creates a new HRI with the 
> details of the parent, causing an inconsistency (see HBASE-18024).





[jira] [Commented] (HBASE-18024) HRegion#initializeRegionInternals should not re-create .hregioninfo file when the region directory no longer exists

2017-07-28 Thread Esteban Gutierrez (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105547#comment-16105547
 ] 

Esteban Gutierrez commented on HBASE-18024:
---

Got a fix for the test in TestWALMonotonicallyIncreasingSeqId, which is fine. 
The interesting case is TestStoreFileRefresherChore, which is what was 
uncovered while troubleshooting the issue that prompted this JIRA. After my 
change, HRegion.initialize() will not attempt to re-create the regioninfo file, 
but throwing an exception will cause the test to fail since the region replica 
cannot be instantiated. One option I've been thinking about is to modify 
HRegion#initialize() to take an optional argument to initialize the region on 
the filesystem in order to skip writeRegionInfoOnFilesystem.

> HRegion#initializeRegionInternals should not re-create .hregioninfo file when 
> the region directory no longer exists
> ---
>
> Key: HBASE-18024
> URL: https://issues.apache.org/jira/browse/HBASE-18024
> Project: HBase
>  Issue Type: Bug
>  Components: Region Assignment, regionserver
>Affects Versions: 2.0.0, 1.4.0, 1.3.1, 1.2.5
>Reporter: Esteban Gutierrez
>Assignee: Esteban Gutierrez
> Attachments: HBASE-18024.001.patch
>
>
> When a RegionServer attempts to open a region, during initialization the RS 
> tries to open the {{/data///.hregioninfo}} 
> file. However, if the {{.hregioninfo}} file doesn't exist, the RegionServer 
> will create a new one in {{HRegionFileSystem#checkRegionInfoOnFilesystem}}. A 
> side effect is that tools like hbck will incorrectly assume an inconsistency 
> due to the presence of this new {{.hregioninfo}} file.





[jira] [Commented] (HBASE-18025) CatalogJanitor should collect outdated RegionStates from the AM

2017-07-28 Thread Esteban Gutierrez (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105557#comment-16105557
 ] 

Esteban Gutierrez commented on HBASE-18025:
---

Yeah, it happens reliably. In TestCatalogJanitor the behavior cannot be 
captured, but I created another test for CatalogJanitor where I can monitor 
that the RegionStates are not cleared up after a split. I was a little 
skeptical about ServerManager#removeRegion, but I think it makes sense to clean 
up storeFlushedSequenceIdsByRegion and flushedSequenceIdByRegion after a split 
or a merge.

> CatalogJanitor should collect outdated RegionStates from the AM
> ---
>
> Key: HBASE-18025
> URL: https://issues.apache.org/jira/browse/HBASE-18025
> Project: HBase
>  Issue Type: Bug
>Reporter: Esteban Gutierrez
>Assignee: Esteban Gutierrez
> Attachments: HBASE-18025.001.patch, HBASE-18025.002.patch, 
> HBASE-18025.003.patch
>
>
> I don't think this will matter in the long run for HBase 2, but at least in 
> branch-1 and the current master we keep copies of the region states in 
> multiple places in the master, and these copies include information like the HRI. 
> A problem that we have observed is that when region replicas are being used and 
> there is a split, the region replica from the parent doesn't get collected from 
> the region states. When the balancer then tries to assign the old parent region 
> replica, the RegionServer creates a new HRI with the 
> details of the parent, causing an inconsistency (see HBASE-18024).





[jira] [Commented] (HBASE-17908) Upgrade guava

2017-07-28 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-17908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105571#comment-16105571
 ] 

stack commented on HBASE-17908:
---

Thanks for reporting and taking a look [~easyliangjob]

> Upgrade guava
> -
>
> Key: HBASE-17908
> URL: https://issues.apache.org/jira/browse/HBASE-17908
> Project: HBase
>  Issue Type: Sub-task
>  Components: dependencies
>Reporter: Balazs Meszaros
>Assignee: stack
>Priority: Critical
> Fix For: 2.0.0
>
> Attachments: 0001-HBASE-17908-Upgrade-guava.022.patch, 
> HBASE-17908.master.001.patch, HBASE-17908.master.002.patch, 
> HBASE-17908.master.003.patch, HBASE-17908.master.004.patch, 
> HBASE-17908.master.005.patch, HBASE-17908.master.006.patch, 
> HBASE-17908.master.007.patch, HBASE-17908.master.008.patch, 
> HBASE-17908.master.009.patch, HBASE-17908.master.010.patch, 
> HBASE-17908.master.011.patch, HBASE-17908.master.012.patch, 
> HBASE-17908.master.013.patch, HBASE-17908.master.013.patch, 
> HBASE-17908.master.014.patch, HBASE-17908.master.015.patch, 
> HBASE-17908.master.015.patch, HBASE-17908.master.016.patch, 
> HBASE-17908.master.017.patch, HBASE-17908.master.018.patch, 
> HBASE-17908.master.019.patch, HBASE-17908.master.020.patch, 
> HBASE-17908.master.021.patch, HBASE-17908.master.021.patch, 
> HBASE-17908.master.022.patch, HBASE-17908.master.023.patch, 
> HBASE-17908.master.024.patch, HBASE-17908.master.025.patch, 
> HBASE-17908.master.026.patch, HBASE-17908.master.027.patch, 
> HBASE-17908.master.028.patch
>
>
> Currently we are using guava 12.0.1, but the latest version is 21.0. 
> Upgrading guava is always a hassle because it is not always backward 
> compatible with itself.
> Currently I think there are two approaches:
> 1. Upgrade guava to the newest version (21.0) and shade it.
> 2. Upgrade guava to a version which does not break our builds (15.0).
> If we can update it, some dependencies should be removed: 
> commons-collections, commons-codec, ...





[jira] [Created] (HBASE-18474) HRegion#doMiniBatchMutation is acquiring read row locks

2017-07-28 Thread Andrew Purtell (JIRA)
Andrew Purtell created HBASE-18474:
--

 Summary: HRegion#doMiniBatchMutation is acquiring read row locks
 Key: HBASE-18474
 URL: https://issues.apache.org/jira/browse/HBASE-18474
 Project: HBase
  Issue Type: Bug
Reporter: Andrew Purtell








[jira] [Updated] (HBASE-18474) HRegion#doMiniBatchMutation is acquiring read row locks

2017-07-28 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-18474:
---
Description: 
Looking at 1.3, HRegion#doMiniBatchMutation is acquiring read row locks in step 
1. 
{code}
// If we haven't got any rows in our batch, we should block to
// get the next one.
RowLock rowLock = null;
try {
  rowLock = getRowLockInternal(mutation.getRow(), true);
} catch (TimeoutIOException e) {
  // We will retry when other exceptions, but we should stop if we timeout.
  throw e;
} catch (IOException ioe) {
  LOG.warn("Failed getting lock in batch put, row="
      + Bytes.toStringBinary(mutation.getRow()), ioe);
}
if (rowLock == null) {
  // We failed to grab another lock
  break; // stop acquiring more rows for this batch
} else {
  acquiredRowLocks.add(rowLock);
}
{code}


Other code paths that apply mutations are acquiring write locks.

In HRegion#append
{code}
try {
  rowLock = getRowLockInternal(row, false);
  assert rowLock != null;
...
{code}

In HRegion#doIn
{code}
try {
  rowLock = getRowLockInternal(increment.getRow(), false);
...
{code}

In HRegion#checkAndMutate
{code}
  // Lock row - note that doBatchMutate will relock this row if called
  RowLock rowLock = getRowLockInternal(get.getRow(), false);
  // wait for all previous transactions to complete (with lock held)
  mvcc.await();
{code}

What doMiniBatchMutation is doing looks wrong. 
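
To illustrate why the lock mode matters, here is a minimal sketch using the JDK's {{ReentrantReadWriteLock}} in place of HBase's row lock. It assumes that the boolean passed to {{getRowLockInternal}} selects the shared (read) side when true, as the issue title suggests; {{RowLockModeSketch}} and its methods are hypothetical and are not HBase code.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class RowLockModeSketch {

  /** Returns true if a second thread can acquire {@code wanted} while this thread holds {@code held}. */
  static boolean otherThreadCanAcquire(Lock held, Lock wanted) {
    ExecutorService other = Executors.newSingleThreadExecutor();
    held.lock();
    try {
      return other.submit(() -> {
        boolean acquired = wanted.tryLock(); // non-blocking attempt from the second thread
        if (acquired) {
          wanted.unlock();
        }
        return acquired;
      }).get();
    } catch (InterruptedException | ExecutionException e) {
      throw new IllegalStateException(e);
    } finally {
      held.unlock();
      other.shutdown();
    }
  }

  /** Two threads both taking the read side of the same row's lock. */
  static boolean readersShareRow() {
    ReentrantReadWriteLock rowLock = new ReentrantReadWriteLock();
    return otherThreadCanAcquire(rowLock.readLock(), rowLock.readLock());
  }

  /** Two threads both taking the write side of the same row's lock. */
  static boolean writersShareRow() {
    ReentrantReadWriteLock rowLock = new ReentrantReadWriteLock();
    return otherThreadCanAcquire(rowLock.writeLock(), rowLock.writeLock());
  }

  public static void main(String[] args) {
    System.out.println("two read-lock holders at once:  " + readersShareRow());
    System.out.println("two write-lock holders at once: " + writersShareRow());
  }
}
```

With the read side, two threads can hold the lock on the same row concurrently; with the write side, the second attempt is refused until the first holder releases it.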

> HRegion#doMiniBatchMutation is acquiring read row locks
> ---
>
> Key: HBASE-18474
> URL: https://issues.apache.org/jira/browse/HBASE-18474
> Project: HBase
>  Issue Type: Bug
>Reporter: Andrew Purtell
>





[jira] [Updated] (HBASE-18474) HRegion#doMiniBatchMutation is acquiring read row locks

2017-07-28 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-18474:
---
Description: 
Looking at 1.3, HRegion#doMiniBatchMutation is acquiring read row locks in step 
1. 
{code}
// If we haven't got any rows in our batch, we should block to
// get the next one.
RowLock rowLock = null;
try {
  rowLock = getRowLockInternal(mutation.getRow(), true);
} catch (TimeoutIOException e) {
  // We will retry when other exceptions, but we should stop if we timeout.
  throw e;
} catch (IOException ioe) {
  LOG.warn("Failed getting lock in batch put, row="
      + Bytes.toStringBinary(mutation.getRow()), ioe);
}
if (rowLock == null) {
  // We failed to grab another lock
  break; // stop acquiring more rows for this batch
} else {
  acquiredRowLocks.add(rowLock);
}
{code}


Other code paths that apply mutations are acquiring write locks.

In HRegion#append
{code}
try {
  rowLock = getRowLockInternal(row, false);
  assert rowLock != null;
...
{code}

In HRegion#doIn
{code}
try {
  rowLock = getRowLockInternal(increment.getRow(), false);
...
{code}

In HRegion#checkAndMutate
{code}
  // Lock row - note that doBatchMutate will relock this row if called
  RowLock rowLock = getRowLockInternal(get.getRow(), false);
  // wait for all previous transactions to complete (with lock held)
  mvcc.await();
{code}

In HRegion#processRowsWithLocks
{code}
  // 2. Acquire the row lock(s)
  acquiredRowLocks = new ArrayList(rowsToLock.size());
  for (byte[] row : rowsToLock) {
    // Attempt to lock all involved rows, throw if any lock times out
    // use a writer lock for mixed reads and writes
    acquiredRowLocks.add(getRowLockInternal(row, false));
  }
{code}

and so on.

What doMiniBatchMutation is doing looks wrong. 



> HRegion#doMiniBatchMu

[jira] [Commented] (HBASE-12387) committer guidelines should include patch signoff

2017-07-28 Thread Mike Drob (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-12387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105609#comment-16105609
 ] 

Mike Drob commented on HBASE-12387:
---

It looks like this got included almost verbatim already at some point (see 
example 60) in https://hbase.apache.org/book.html#committing.patches

> committer guidelines should include patch signoff
> -
>
> Key: HBASE-12387
> URL: https://issues.apache.org/jira/browse/HBASE-12387
> Project: HBase
>  Issue Type: Task
>  Components: documentation
>Reporter: Sean Busbey
>Assignee: Sean Busbey
>
> Right now our guide for committers applying patches has them use {{git am}} 
> without a signoff flag. This works okay, but it misses adding the 
> "signed-off-by" blurb in the commit message.
> Those messages make it easier to see at a glance with e.g. {{git log}} which 
> committer applied the patch.
> This section:
> {quote}
> The directive to use git format-patch rather than git diff, and not to use 
> --no-prefix, is a new one. See the second example for how to apply a patch 
> created with git diff, and educate the person who created the patch.
> {code}
> $ git checkout -b HBASE-
> $ git am ~/Downloads/HBASE--v2.patch
> $ git checkout master
> $ git pull --rebase
> $ git cherry-pick 
> # Resolve conflicts if necessary or ask the submitter to do it
> $ git pull --rebase  # Better safe than sorry
> $ git push origin master
> $ git checkout branch-1
> $ git pull --rebase
> $ git cherry-pick 
> # Resolve conflicts if necessary
> $ git pull --rebase  # Better safe than sorry
> $ git push origin branch-1
> $ git branch -D HBASE-
> {code}
> {quote}
> Should be
> {quote}
> The directive to use git format-patch rather than git diff, and not to use 
> --no-prefix, is a new one. See the second example for how to apply a patch 
> created with git diff, and educate the person who created the patch.
> Note that the {{--signoff}} flag to {{git am}} will insert a line in the 
> commit message that the patch was checked by your author string. This 
> addition to your inclusion as the commit's committer makes your participation 
> more prominent to users browsing {{git log}}.
> {code}
> $ git checkout -b HBASE-
> $ git am --signoff ~/Downloads/HBASE--v2.patch
> $ git checkout master
> $ git pull --rebase
> $ git cherry-pick 
> # Resolve conflicts if necessary or ask the submitter to do it
> $ git pull --rebase  # Better safe than sorry
> $ git push origin master
> $ git checkout branch-1
> $ git pull --rebase
> $ git cherry-pick 
> # Resolve conflicts if necessary
> $ git pull --rebase  # Better safe than sorry
> $ git push origin branch-1
> $ git branch -D HBASE-
> {code}
> {quote}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HBASE-18466) [C++] Support handling exception in RpcTestServer

2017-07-28 Thread Xiaobing Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaobing Zhou updated HBASE-18466:
--
Attachment: HBASE-18466.001.patch

> [C++] Support handling exception in RpcTestServer
> -
>
> Key: HBASE-18466
> URL: https://issues.apache.org/jira/browse/HBASE-18466
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Attachments: HBASE-18466.000.patch, HBASE-18466.001.patch
>
>
> In order to simulate various errors from servers, exceptions should be
> handled properly. The idea is to zip the exception into hbase::Response in
> RpcTestService, serialize the response to folly::IOBuf, and write it down the
> pipeline.





[jira] [Updated] (HBASE-18466) [C++] Support handling exception in RpcTestServer

2017-07-28 Thread Xiaobing Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaobing Zhou updated HBASE-18466:
--
Status: Patch Available  (was: Open)

> [C++] Support handling exception in RpcTestServer
> -
>
> Key: HBASE-18466
> URL: https://issues.apache.org/jira/browse/HBASE-18466
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Attachments: HBASE-18466.000.patch, HBASE-18466.001.patch
>
>
> In order to simulate various errors from servers, exceptions should be
> handled properly. The idea is to zip the exception into hbase::Response in
> RpcTestService, serialize the response to folly::IOBuf, and write it down the
> pipeline.





[jira] [Commented] (HBASE-18466) [C++] Support handling exception in RpcTestServer

2017-07-28 Thread Xiaobing Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105643#comment-16105643
 ] 

Xiaobing Zhou commented on HBASE-18466:
---

Posted v1:
# Fixed the broken promise issue that occurs when 
ResponseHeader::set_allocated_exception(pb::ExceptionResponse) is called 
with a pb::ExceptionResponse that was not allocated with new.

> [C++] Support handling exception in RpcTestServer
> -
>
> Key: HBASE-18466
> URL: https://issues.apache.org/jira/browse/HBASE-18466
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Xiaobing Zhou
>Assignee: Xiaobing Zhou
> Attachments: HBASE-18466.000.patch, HBASE-18466.001.patch
>
>
> In order to simulate various errors from servers, exceptions should be
> handled properly. The idea is to zip the exception into hbase::Response in
> RpcTestService, serialize the response to folly::IOBuf, and write it down the
> pipeline.





[jira] [Commented] (HBASE-18474) HRegion#doMiniBatchMutation is acquiring read row locks

2017-07-28 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-18474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16105647#comment-16105647
 ] 

stack commented on HBASE-18474:
---

Our row locks have been read/write locks since 1.2. Whenever we modify a row,
we take a read lock, so concurrent threads are allowed to update the same row.
mvcc ensures ongoing reads always get a consistent row 'view'. The exceptions
are read/modify/write operations such as increment/append/checkAndPut. These
need to 'read' a value and then update the value they just read. To ensure the
row 'view' doesn't change between the read and the write (mvcc is column family
scope only), r/m/w ops take a write lock. [~apurtell] FYI sir.
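
The locking scheme described above can be illustrated with a plain `java.util.concurrent` read/write lock. This is a minimal sketch under stated assumptions, not HBase code: `RowLockSketch`, `mutationLock`, `readModifyWriteLock` and `otherThreadCanAcquire` are hypothetical names, and a single `ReentrantReadWriteLock` stands in for one row's lock. Plain mutations take the shared side, so they do not exclude each other; read/modify/write operations take the exclusive side, so they exclude everything else on that row.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public final class RowLockSketch {
  // Hypothetical stand-in for one row's lock; not HBase's RowLock class.
  private final ReentrantReadWriteLock rowLock = new ReentrantReadWriteLock();

  /** Shared side: what a plain put or delete takes in this model. */
  public Lock mutationLock() {
    return rowLock.readLock();
  }

  /** Exclusive side: what increment/append/checkAndPut take in this model. */
  public Lock readModifyWriteLock() {
    return rowLock.writeLock();
  }

  /**
   * Attempts the lock from a fresh thread, so that the calling thread's own
   * holds (the lock is reentrant) do not mask contention.
   */
  public static boolean otherThreadCanAcquire(Lock lock) {
    AtomicBoolean acquired = new AtomicBoolean(false);
    Thread t = new Thread(() -> {
      if (lock.tryLock()) {
        acquired.set(true);
        lock.unlock();
      }
    });
    t.start();
    try {
      t.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return acquired.get();
  }

  public static void main(String[] args) {
    RowLockSketch row = new RowLockSketch();

    row.mutationLock().lock(); // a put is in flight on this row
    boolean secondPut = otherThreadCanAcquire(row.mutationLock());
    boolean increment = otherThreadCanAcquire(row.readModifyWriteLock());
    row.mutationLock().unlock();

    row.readModifyWriteLock().lock(); // an increment is in flight on this row
    boolean putDuringIncrement = otherThreadCanAcquire(row.mutationLock());
    row.readModifyWriteLock().unlock();

    System.out.println("second put while a put is in flight: " + secondPut);
    System.out.println("increment while a put is in flight: " + increment);
    System.out.println("put while an increment is in flight: " + putDuringIncrement);
  }
}
```

The sketch says nothing about mvcc; it only shows why taking the read side for ordinary mutations allows concurrent updates to a row while r/m/w operations stay exclusive.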

> HRegion#doMiniBatchMutation is acquiring read row locks
> ---
>
> Key: HBASE-18474
> URL: https://issues.apache.org/jira/browse/HBASE-18474
> Project: HBase
>  Issue Type: Bug
>Reporter: Andrew Purtell
>
> Looking at 1.3, HRegion#doMiniBatchMutation is acquiring read row locks in 
> step 1. 
> {code}
> // If we haven't got any rows in our batch, we should block to
> // get the next one.
> RowLock rowLock = null;
> try {
>   rowLock = getRowLockInternal(mutation.getRow(), true);
> } catch (TimeoutIOException e) {
>   // We will retry when other exceptions, but we should stop if we timeout.
>   throw e;
> } catch (IOException ioe) {
>   LOG.warn("Failed getting lock in batch put, row="
> + Bytes.toStringBinary(mutation.getRow()), ioe);
> }
> if (rowLock == null) {
>   // We failed to grab another lock
>   break; // stop acquiring more rows for this batch
> } else {
>   acquiredRowLocks.add(rowLock);
> }
> {code}
> Other code paths that apply mutations are acquiring write locks.
> In HRegion#append
> {code}
> try {
>   rowLock = getRowLockInternal(row, false);
>   assert rowLock != null;
> ...
> {code}
> In HRegion#doIn
> {code}
> try {
>   rowLock = getRowLockInternal(increment.getRow(), false);
> ...
> {code}
> In HRegion#checkAndMutate
> {code}
>   // Lock row - note that doBatchMutate will relock this row if called
>   RowLock rowLock = getRowLockInternal(get.getRow(), false);
>   // wait for all previous transactions to complete (with lock held)
>   mvcc.await();
> {code}
> In HRegion#processRowsWithLocks
> {code}
>   // 2. Acquire the row lock(s)
>   acquiredRowLocks = new ArrayList(rowsToLock.size());
>   for (byte[] row : rowsToLock) {
>     // Attempt to lock all involved rows, throw if any lock times out
>     // use a writer lock for mixed reads and writes
>     acquiredRowLocks.add(getRowLockInternal(row, false));
>   }
>   }
> {code}
> and so on.
> What doMiniBatchMutation is doing looks wrong. 




