[GitHub] [hbase] Apache9 commented on issue #594: HBASE-8458 Support for batch version of checkAndPut() and checkAndDelete()

2019-10-26 Thread GitBox
Apache9 commented on issue #594: HBASE-8458 Support for batch version of 
checkAndPut() and checkAndDelete()
URL: https://github.com/apache/hbase/pull/594#issuecomment-546612086
 
 
   Could you please upload a patch against master?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [hbase] Apache-HBase commented on issue #761: HBASE-23213 : Reopen regions with very high Store Ref Counts(backport…

2019-10-26 Thread GitBox
Apache-HBase commented on issue #761: HBASE-23213 : Reopen regions with very 
high Store Ref Counts(backport…
URL: https://github.com/apache/hbase/pull/761#issuecomment-546626087
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:---:|---:|:---|:---|
   | :blue_heart: |  reexec  |   1m 19s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | :green_heart: |  dupname  |   0m  0s |  No case conflicting files found.  |
   | :blue_heart: |  prototool  |   0m  0s |  prototool was not available.  |
   | :green_heart: |  hbaseanti  |   0m  0s |  Patch does not have any 
anti-patterns.  |
   | :green_heart: |  @author  |   0m  0s |  The patch does not contain any 
@author tags.  |
   | :green_heart: |  test4tests  |   0m  0s |  The patch appears to include 2 
new or modified test files.  |
   ||| _ branch-1 Compile Tests _ |
   | :blue_heart: |  mvndep  |   1m 23s |  Maven dependency ordering for branch 
 |
   | :green_heart: |  mvninstall  |   7m 29s |  branch-1 passed  |
   | :green_heart: |  compile  |   1m 39s |  branch-1 passed with JDK 
v1.8.0_232  |
   | :green_heart: |  compile  |   1m 46s |  branch-1 passed with JDK 
v1.7.0_242  |
   | :green_heart: |  checkstyle  |  10m 59s |  branch-1 passed  |
   | :blue_heart: |  refguide  |   3m 24s |  branch has no errors when building 
the reference guide. See footer for rendered docs, which you should manually 
inspect.  |
   | :green_heart: |  shadedjars  |   2m 50s |  branch has no errors when 
building our shaded downstream artifacts.  |
   | :green_heart: |  javadoc  |   3m 45s |  branch-1 passed with JDK 
v1.8.0_232  |
   | :green_heart: |  javadoc  |   5m 48s |  branch-1 passed with JDK 
v1.7.0_242  |
   | :blue_heart: |  spotbugs  |   2m 28s |  Used deprecated FindBugs config; 
considering switching to SpotBugs.  |
   | :green_heart: |  findbugs  |  18m  2s |  branch-1 passed  |
   ||| _ Patch Compile Tests _ |
   | :blue_heart: |  mvndep  |   0m 17s |  Maven dependency ordering for patch  
|
   | :green_heart: |  mvninstall  |   1m 58s |  the patch passed  |
   | :green_heart: |  compile  |   1m 42s |  the patch passed with JDK 
v1.8.0_232  |
   | :green_heart: |  cc  |   1m 42s |  the patch passed  |
   | :green_heart: |  javac  |   1m 42s |  the patch passed  |
   | :green_heart: |  compile  |   1m 46s |  the patch passed with JDK 
v1.7.0_242  |
   | :green_heart: |  cc  |   1m 46s |  the patch passed  |
   | :green_heart: |  javac  |   1m 46s |  the patch passed  |
   | :green_heart: |  checkstyle  |  10m 54s |  the patch passed  |
   | :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace 
issues.  |
   | :broken_heart: |  xml  |   0m  0s |  The patch has 1 ill-formed XML 
file(s).  |
   | :blue_heart: |  refguide  |   3m  0s |  patch has no errors when building 
the reference guide. See footer for rendered docs, which you should manually 
inspect.  |
   | :green_heart: |  shadedjars  |   2m 44s |  patch has no errors when 
building our shaded downstream artifacts.  |
   | :green_heart: |  hadoopcheck  |   4m 59s |  Patch does not cause any 
errors with Hadoop 2.8.5 2.9.2.  |
   | :green_heart: |  hbaseprotoc  |   4m 24s |  the patch passed  |
   | :green_heart: |  javadoc  |   3m 52s |  the patch passed with JDK 
v1.8.0_232  |
   | :green_heart: |  javadoc  |   5m 55s |  the patch passed with JDK 
v1.7.0_242  |
   | :broken_heart: |  findbugs  |   2m 55s |  hbase-server generated 1 new + 0 
unchanged - 0 fixed = 1 total (was 0)  |
   | :broken_heart: |  findbugs  |  10m 29s |  root generated 1 new + 0 
unchanged - 0 fixed = 1 total (was 0)  |
   ||| _ Other Tests _ |
   | :broken_heart: |  unit  | 166m 43s |  root in the patch failed.  |
   | :green_heart: |  asflicense  |   2m 50s |  The patch does not generate ASF 
License warnings.  |
   |  |   | 295m  4s |   |
   
   
   | Reason | Tests |
   |---:|:--|
   | XML | Parsing Error(s): |
   |   | hbase-common/src/main/resources/hbase-default.xml |
   | FindBugs | module:hbase-server |
   |  |  org.apache.hadoop.hbase.master.RegionsRecoveryChore.chore() makes 
inefficient use of keySet iterator instead of entrySet iterator  At 
RegionsRecoveryChore.java:keySet iterator instead of entrySet iterator  At 
RegionsRecoveryChore.java:[line 104] |
   | FindBugs | module:root |
   |  |  org.apache.hadoop.hbase.master.RegionsRecoveryChore.chore() makes 
inefficient use of keySet iterator instead of entrySet iterator  At 
RegionsRecoveryChore.java:keySet iterator instead of entrySet iterator  At 
RegionsRecoveryChore.java:[line 104] |
   | Failed junit tests | hadoop.hbase.client.TestAdmin1 |
   |   | 
hadoop.hbase.client.replication.TestReplicationAdminWithTwoDifferentZKClusters |
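   
   For reference, the keySet-versus-entrySet pattern FindBugs flags here looks like the following minimal sketch (a toy map, not the actual RegionsRecoveryChore code): iterating `keySet()` and calling `get()` per key does a second lookup for every entry, while `entrySet()` yields key and value in one pass.
   
   ```
   import java.util.HashMap;
   import java.util.Map;
   
   public class KeySetVsEntrySet {
     public static void main(String[] args) {
       Map<String, Integer> storeRefCounts = new HashMap<>();
       storeRefCounts.put("region-a", 3);
       storeRefCounts.put("region-b", 7);
   
       // Flagged pattern: keySet() iteration plus a get() per key,
       // i.e. an extra map lookup for every entry.
       for (String region : storeRefCounts.keySet()) {
         System.out.println(region + " -> " + storeRefCounts.get(region));
       }
   
       // Preferred pattern: entrySet() hands back key and value together.
       for (Map.Entry<String, Integer> e : storeRefCounts.entrySet()) {
         System.out.println(e.getKey() + " -> " + e.getValue());
       }
     }
   }
   ```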
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | Client=19.03.4 Server=19.03.4 base: 
https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-761/1/artifact/out/Dockerfile
 |
   | 

[GitHub] [hbase] Apache9 commented on issue #742: HBASE-23181 Blocked WAL archive: "LogRoller: Failed to schedule flush of XXXX, because it is not online on us"

2019-10-26 Thread GitBox
Apache9 commented on issue #742: HBASE-23181 Blocked WAL archive: "LogRoller: 
Failed to schedule flush of XXXX, because it is not online on us"
URL: https://github.com/apache/hbase/pull/742#issuecomment-546611398
 
 
   We went with #753 in the end.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [hbase] Apache9 closed pull request #742: HBASE-23181 Blocked WAL archive: "LogRoller: Failed to schedule flush of XXXX, because it is not online on us"

2019-10-26 Thread GitBox
Apache9 closed pull request #742: HBASE-23181 Blocked WAL archive: "LogRoller: 
Failed to schedule flush of XXXX, because it is not online on us"
URL: https://github.com/apache/hbase/pull/742
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [hbase] Apache9 commented on a change in pull request #753: HBASE-23181 Blocked WAL archive: "LogRoller: Failed to schedule flush…

2019-10-26 Thread GitBox
Apache9 commented on a change in pull request #753: HBASE-23181 Blocked WAL 
archive: "LogRoller: Failed to schedule flush…
URL: https://github.com/apache/hbase/pull/753#discussion_r339287780
 
 

 ##
 File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/AbstractFSWAL.java
 ##
 @@ -1001,8 +1001,13 @@ protected final boolean append(W writer, FSWALEntry entry) throws IOException {
     doAppend(writer, entry);
     assert highestUnsyncedTxid < entry.getTxid();
     highestUnsyncedTxid = entry.getTxid();
-    sequenceIdAccounting.update(encodedRegionName, entry.getFamilyNames(), regionSequenceId,
-      entry.isInMemStore());
+    if (entry.isCloseRegion()) {
+      // let's clean all the records of this region
+      sequenceIdAccounting.onRegionClose(encodedRegionName);
+    } else {
+      sequenceIdAccounting.update(encodedRegionName, entry.getFamilyNames(), regionSequenceId,
+        entry.isInMemStore());
 
 Review comment:
   This does not make sense... You do not want the closeRegion flag but still 
want the inMemstore flag? With appendData and appendMarker, we could at least 
remove one of the parameters in the methods of the WAL interface...


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [hbase] saintstack commented on a change in pull request #753: HBASE-23181 Blocked WAL archive: "LogRoller: Failed to schedule flush…

2019-10-26 Thread GitBox
saintstack commented on a change in pull request #753: HBASE-23181 Blocked WAL 
archive: "LogRoller: Failed to schedule flush…
URL: https://github.com/apache/hbase/pull/753#discussion_r339284636
 
 

 ##
 File path: hbase-server/src/main/java/org/apache/hadoop/hbase/wal/WAL.java
 ##
 @@ -97,44 +97,59 @@
   void close() throws IOException;
 
   /**
-   * Append a set of edits to the WAL. The WAL is not flushed/sync'd after this transaction
-   * completes BUT on return this edit must have its region edit/sequence id assigned
-   * else it messes up our unification of mvcc and sequenceid.  On return key will
-   * have the region edit/sequence id filled in.
+   * Append a set of data edits to the WAL. 'Data' here means that the content in the edits will
+   * also be added to memstore.
+   * <p/>
+   * The WAL is not flushed/sync'd after this transaction completes BUT on return this edit must
+   * have its region edit/sequence id assigned else it messes up our unification of mvcc and
+   * sequenceid. On return key will have the region edit/sequence id filled in.
    * @param info the regioninfo associated with append
    * @param key Modified by this call; we add to it this edits region edit/sequence id.
    * @param edits Edits to append. MAY CONTAIN NO EDITS for case where we want to get an edit
-   * sequence id that is after all currently appended edits.
-   * @param inMemstore Always true except for case where we are writing a compaction completion
-   * record into the WAL; in this case the entry is just so we can finish an unfinished compaction
-   * -- it is not an edit for memstore.
+   *  sequence id that is after all currently appended edits.
    * @return Returns a 'transaction id' and key will have the region edit/sequence id
-   * in it.
+   * in it.
+   * @see #appendMarker(RegionInfo, WALKeyImpl, WALEdit, boolean)
    */
-  long append(RegionInfo info, WALKeyImpl key, WALEdit edits, boolean inMemstore) throws IOException;
+  long appendData(RegionInfo info, WALKeyImpl key, WALEdit edits) throws IOException;
+
+  /**
+   * Append a marker edit to the WAL. A marker could be a FlushDescriptor, a compaction marker, or
+   * region event marker. The difference here is that, a marker will not be added to memstore.
+   * <p/>
+   * The WAL is not flushed/sync'd after this transaction completes BUT on return this edit must
+   * have its region edit/sequence id assigned else it messes up our unification of mvcc and
+   * sequenceid. On return key will have the region edit/sequence id filled in.
+   * @param info the regioninfo associated with append
+   * @param key Modified by this call; we add to it this edits region edit/sequence id.
+   * @param edits Edits to append. MAY CONTAIN NO EDITS for case where we want to get an edit
+   *  sequence id that is after all currently appended edits.
+   * @param closeRegion Whether this is a region close marker, i.e, the last wal edit for this
 
 Review comment:
   I messed w/ the patch. I see how the appendData and appendMarker don't take 
us far enough down... down to AbstractFSWAL#appendEntry where we could ask if a 
close marker. But looking at patch, what would be wrong w/ doing something like 
this?
   
   ```
   diff --git a/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/AbstractFSWAL.java b/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/AbstractFSWAL.java
   index 2eb7c7436a..bc31204500 100644
   --- a/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/AbstractFSWAL.java
   +++ b/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/AbstractFSWAL.java
   @@ -1001,7 +1001,7 @@ public abstract class AbstractFSWAL<W extends WriterBase> implements WAL {
        doAppend(writer, entry);
        assert highestUnsyncedTxid < entry.getTxid();
        highestUnsyncedTxid = entry.getTxid();
   -    if (entry.isCloseRegion()) {
   +    if (!entry.isInMemStore() && entry.isCloseMarker()) {
          // let's clean all the records of this region
          sequenceIdAccounting.onRegionClose(encodedRegionName);
        } else {
   ```
   
   ... where entry.isCloseMarker would do something like WALEdit.isMetaEdit...
   
   ```
   public boolean isMetaEdit() {
     for (Cell cell : cells) {
       if (!isMetaEditFamily(cell)) {
         return false;
       }
     }
     return true;
   }
   ```
   
   ... only instead we'd look for METAFAMILY:HBASE::CLOSE, a new define added 
on WALEdit?
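   
   In a minimal sketch, with CLOSE and isRegionCloseMarker as the hypothetical new additions on WALEdit (cells, METAFAMILY, and the CellUtil helpers are existing API):
   
   ```
   // Hypothetical new define on WALEdit, per the suggestion above.
   public static final byte[] CLOSE = Bytes.toBytes("HBASE::CLOSE");
   
   // Sketch of the proposed check, shaped like isMetaEdit: true only when
   // every cell in the edit is a METAFAMILY:HBASE::CLOSE marker.
   public boolean isRegionCloseMarker() {
     for (Cell cell : cells) {
       if (!CellUtil.matchingFamily(cell, METAFAMILY)
           || !CellUtil.matchingQualifier(cell, CLOSE)) {
         return false;
       }
     }
     return !cells.isEmpty();
   }
   ```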
   
   Thanks for taking a look.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [hbase] saintstack commented on a change in pull request #753: HBASE-23181 Blocked WAL archive: "LogRoller: Failed to schedule flush…

2019-10-26 Thread GitBox
saintstack commented on a change in pull request #753: HBASE-23181 Blocked WAL 
archive: "LogRoller: Failed to schedule flush…
URL: https://github.com/apache/hbase/pull/753#discussion_r339290979
 
 

 ##
 File path: 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/AbstractFSWAL.java
 ##
 @@ -1001,8 +1001,13 @@ protected final boolean append(W writer, FSWALEntry entry) throws IOException {
     doAppend(writer, entry);
     assert highestUnsyncedTxid < entry.getTxid();
     highestUnsyncedTxid = entry.getTxid();
-    sequenceIdAccounting.update(encodedRegionName, entry.getFamilyNames(), regionSequenceId,
-      entry.isInMemStore());
+    if (entry.isCloseRegion()) {
+      // let's clean all the records of this region
+      sequenceIdAccounting.onRegionClose(encodedRegionName);
+    } else {
+      sequenceIdAccounting.update(encodedRegionName, entry.getFamilyNames(), regionSequenceId,
+        entry.isInMemStore());
 
 Review comment:
   The isInMemStore flag is there currently? No? I'm not advocating adding 
anything not already present.
   
   Looking at this patch, the new methods appendData and appendMarker do not 
seem to buy us much. They do not go deep enough down into AbstractFSWAL.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [hbase] saintstack commented on a change in pull request #753: HBASE-23181 Blocked WAL archive: "LogRoller: Failed to schedule flush…

2019-10-26 Thread GitBox
saintstack commented on a change in pull request #753: HBASE-23181 Blocked WAL 
archive: "LogRoller: Failed to schedule flush…
URL: https://github.com/apache/hbase/pull/753#discussion_r339291059
 
 

 ##
 File path: hbase-server/src/main/java/org/apache/hadoop/hbase/wal/WAL.java
 ##
 @@ -97,44 +97,59 @@
   void close() throws IOException;
 
   /**
-   * Append a set of edits to the WAL. The WAL is not flushed/sync'd after this transaction
-   * completes BUT on return this edit must have its region edit/sequence id assigned
-   * else it messes up our unification of mvcc and sequenceid.  On return key will
-   * have the region edit/sequence id filled in.
+   * Append a set of data edits to the WAL. 'Data' here means that the content in the edits will
+   * also be added to memstore.
+   * <p/>
+   * The WAL is not flushed/sync'd after this transaction completes BUT on return this edit must
+   * have its region edit/sequence id assigned else it messes up our unification of mvcc and
+   * sequenceid. On return key will have the region edit/sequence id filled in.
    * @param info the regioninfo associated with append
    * @param key Modified by this call; we add to it this edits region edit/sequence id.
    * @param edits Edits to append. MAY CONTAIN NO EDITS for case where we want to get an edit
-   * sequence id that is after all currently appended edits.
-   * @param inMemstore Always true except for case where we are writing a compaction completion
-   * record into the WAL; in this case the entry is just so we can finish an unfinished compaction
-   * -- it is not an edit for memstore.
+   *  sequence id that is after all currently appended edits.
    * @return Returns a 'transaction id' and key will have the region edit/sequence id
-   * in it.
+   * in it.
+   * @see #appendMarker(RegionInfo, WALKeyImpl, WALEdit, boolean)
    */
-  long append(RegionInfo info, WALKeyImpl key, WALEdit edits, boolean inMemstore) throws IOException;
+  long appendData(RegionInfo info, WALKeyImpl key, WALEdit edits) throws IOException;
+
+  /**
+   * Append a marker edit to the WAL. A marker could be a FlushDescriptor, a compaction marker, or
+   * region event marker. The difference here is that, a marker will not be added to memstore.
+   * <p/>
+   * The WAL is not flushed/sync'd after this transaction completes BUT on return this edit must
+   * have its region edit/sequence id assigned else it messes up our unification of mvcc and
+   * sequenceid. On return key will have the region edit/sequence id filled in.
+   * @param info the regioninfo associated with append
+   * @param key Modified by this call; we add to it this edits region edit/sequence id.
+   * @param edits Edits to append. MAY CONTAIN NO EDITS for case where we want to get an edit
+   *  sequence id that is after all currently appended edits.
+   * @param closeRegion Whether this is a region close marker, i.e, the last wal edit for this
 
 Review comment:
   Sure. We'd continue testing a boolean. Rarely, we'd check the type looking 
for a close and we'd find a flush or compaction marker instead. Not end of the 
world and we could improve in follow-up.
   
   On your point that this is private and we can improve in a follow-on, even 
removing parameters added by this patch, I agree.
   
   So let me +1 this patch as is. I'll have a go at trying to do an improvement 
in a follow-on. Thanks.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [hbase] Apache-HBase commented on issue #753: HBASE-23181 Blocked WAL archive: "LogRoller: Failed to schedule flush…

2019-10-26 Thread GitBox
Apache-HBase commented on issue #753: HBASE-23181 Blocked WAL archive: 
"LogRoller: Failed to schedule flush…
URL: https://github.com/apache/hbase/pull/753#issuecomment-546589729
 
 
   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:---:|---:|:---|:---|
   | :blue_heart: |  reexec  |   0m 35s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | :green_heart: |  dupname  |   0m  1s |  No case conflicting files found.  |
   | :green_heart: |  hbaseanti  |   0m  0s |  Patch does not have any 
anti-patterns.  |
   | :green_heart: |  @author  |   0m  0s |  The patch does not contain any 
@author tags.  |
   | :green_heart: |  test4tests  |   0m  0s |  The patch appears to include 24 
new or modified test files.  |
   ||| _ master Compile Tests _ |
   | :blue_heart: |  mvndep  |   0m 35s |  Maven dependency ordering for branch 
 |
   | :green_heart: |  mvninstall  |   5m 10s |  master passed  |
   | :green_heart: |  compile  |   1m 45s |  master passed  |
   | :green_heart: |  checkstyle  |   2m 15s |  master passed  |
   | :green_heart: |  shadedjars  |   4m 34s |  branch has no errors when 
building our shaded downstream artifacts.  |
   | :green_heart: |  javadoc  |   1m 17s |  master passed  |
   | :blue_heart: |  spotbugs  |   4m 12s |  Used deprecated FindBugs config; 
considering switching to SpotBugs.  |
   | :green_heart: |  findbugs  |   5m 44s |  master passed  |
   ||| _ Patch Compile Tests _ |
   | :blue_heart: |  mvndep  |   0m 15s |  Maven dependency ordering for patch  
|
   | :green_heart: |  mvninstall  |   4m 53s |  the patch passed  |
   | :green_heart: |  compile  |   1m 47s |  the patch passed  |
   | :green_heart: |  javac  |   1m 47s |  the patch passed  |
   | :green_heart: |  checkstyle  |   0m 26s |  The patch passed checkstyle in 
hbase-common  |
   | :green_heart: |  checkstyle  |   1m 28s |  hbase-server: The patch 
generated 0 new + 372 unchanged - 18 fixed = 372 total (was 390)  |
   | :green_heart: |  checkstyle  |   0m 19s |  The patch passed checkstyle in 
hbase-mapreduce  |
   | :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace 
issues.  |
   | :green_heart: |  shadedjars  |   4m 36s |  patch has no errors when 
building our shaded downstream artifacts.  |
   | :green_heart: |  hadoopcheck  |  15m 39s |  Patch does not cause any 
errors with Hadoop 2.8.5 2.9.2 or 3.1.2.  |
   | :green_heart: |  javadoc  |   1m 14s |  the patch passed  |
   | :green_heart: |  findbugs  |   6m  3s |  the patch passed  |
   ||| _ Other Tests _ |
   | :green_heart: |  unit  |   3m  7s |  hbase-common in the patch passed.  |
   | :green_heart: |  unit  | 161m 12s |  hbase-server in the patch passed.  |
   | :green_heart: |  unit  |  18m 27s |  hbase-mapreduce in the patch passed.  
|
   | :green_heart: |  asflicense  |   1m 40s |  The patch does not generate ASF 
License warnings.  |
   |  |   | 249m 53s |   |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | Client=19.03.4 Server=19.03.4 base: 
https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-753/4/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hbase/pull/753 |
   | Optional Tests | dupname asflicense javac javadoc unit spotbugs findbugs 
shadedjars hadoopcheck hbaseanti checkstyle compile |
   | uname | Linux 9d5d616c38d4 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 
11:12:41 UTC 2019 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | 
/home/jenkins/jenkins-slave/workspace/HBase-PreCommit-GitHub-PR_PR-753/out/precommit/personality/provided.sh
 |
   | git revision | master / 50dc288875 |
   | Default Java | 1.8.0_181 |
   |  Test Results | 
https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-753/4/testReport/
 |
   | Max. process+thread count | 5642 (vs. ulimit of 1) |
   | modules | C: hbase-common hbase-server hbase-mapreduce U: . |
   | Console output | 
https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-753/4/console |
   | versions | git=2.11.0 maven=2018-06-17T18:33:14Z) findbugs=3.1.11 |
   | Powered by | Apache Yetus 0.11.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Commented] (HBASE-22514) Move rsgroup feature into core of HBase

2019-10-26 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-22514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16960318#comment-16960318
 ] 

Hudson commented on HBASE-22514:


Results for branch HBASE-22514
[build #160 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-22514/160/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-22514/160//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-22514/160//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-22514/160//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(x) {color:red}-1 client integration test{color}
--Failed when running client tests on top of Hadoop 2. [see log for 
details|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-22514/160//artifact/output-integration/hadoop-2.log].
 (note that this means we didn't run on Hadoop 3)


> Move rsgroup feature into core of HBase
> ---
>
> Key: HBASE-22514
> URL: https://issues.apache.org/jira/browse/HBASE-22514
> Project: HBase
>  Issue Type: Umbrella
>  Components: Admin, Client, rsgroup
>Reporter: Yechao Chen
>Assignee: Duo Zhang
>Priority: Major
> Attachments: HBASE-22514.master.001.patch, 
> image-2019-05-31-18-25-38-217.png
>
>
> The class RSGroupAdminClient is not public. 
> We need to use the java api RSGroupAdminClient to manage RSGroups, 
> so RSGroupAdminClient should be public.
>  
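
A minimal sketch of the usage the reporter is after, assuming the rsgroup module API of the time (the constructor and method names follow the internal RSGroupAdminClient; the group name and server address are illustrative):

{code}
import java.util.Collections;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.net.Address;
import org.apache.hadoop.hbase.rsgroup.RSGroupAdminClient;

public class RSGroupExample {
  public static void main(String[] args) throws Exception {
    try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create())) {
      // While the class is not public, application code cannot do this cleanly.
      RSGroupAdminClient rsGroupAdmin = new RSGroupAdminClient(conn);
      rsGroupAdmin.addRSGroup("analytics");
      rsGroupAdmin.moveServers(
          Collections.singleton(Address.fromParts("rs1.example.com", 16020)),
          "analytics");
    }
  }
}
{code}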



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-22991) Release 1.4.11

2019-10-26 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-22991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16960346#comment-16960346
 ] 

Hudson commented on HBASE-22991:


Results for branch master
[build #1517 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/1517/]: (x) 
*{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/1517//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/1517//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/1517//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Release 1.4.11
> --
>
> Key: HBASE-22991
> URL: https://issues.apache.org/jira/browse/HBASE-22991
> Project: HBase
>  Issue Type: Task
>  Components: community
>Reporter: Sean Busbey
>Assignee: Sean Busbey
>Priority: Major
> Fix For: 1.4.11
>
> Attachments: Flaky_20Test_20Report.zip
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HBASE-23181) Blocked WAL archive: "LogRoller: Failed to schedule flush of XXXX, because it is not online on us"

2019-10-26 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang reassigned HBASE-23181:
-

Assignee: Duo Zhang

> Blocked WAL archive: "LogRoller: Failed to schedule flush of XXXX, because it 
> is not online on us"
> --
>
> Key: HBASE-23181
> URL: https://issues.apache.org/jira/browse/HBASE-23181
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.2.1
>Reporter: Michael Stack
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.3.0, 2.1.8, 2.2.3
>
>
> On a heavily loaded cluster, WAL count keeps rising and we can get into a 
> state where we are not rolling the logs off fast enough. In particular, there 
> is this interesting state at the extreme where we pick a region to flush 
> because 'Too many WALs' but the region is actually not online. As the WAL 
> count rises, we keep picking a region-to-flush that is no longer on the 
> server. This condition blocks our being able to clear WALs; eventually WALs 
> climb into the hundreds and the RS goes zombie with a full Call queue that 
> starts throwing CallQueueTooLargeExceptions (bad if this server is the one 
> carrying hbase:meta): i.e. clients fail to access the RegionServer.
> One symptom is a fast spike in WAL count for the RS. A restart of the RS will 
> break the bind.
> Here is how it looks in the log:
> {code}
> # Here is region closing
> 2019-10-16 23:10:55,897 INFO 
> org.apache.hadoop.hbase.regionserver.handler.UnassignRegionHandler: Closed 
> 8ee433ad59526778c53cc85ed3762d0b
> 
> # Then soon after ...
> 2019-10-16 23:11:44,041 WARN org.apache.hadoop.hbase.regionserver.LogRoller: 
> Failed to schedule flush of 8ee433ad59526778c53cc85ed3762d0b, because it is 
> not online on us
> 2019-10-16 23:11:45,006 INFO 
> org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL: Too many WALs; 
> count=45, max=32; forcing flush of 1 regions(s): 
> 8ee433ad59526778c53cc85ed3762d0b
> ...
> # Later...
> 2019-10-16 23:20:25,427 INFO 
> org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL: Too many WALs; 
> count=542, max=32; forcing flush of 1 regions(s): 
> 8ee433ad59526778c53cc85ed3762d0b
> 2019-10-16 23:20:25,427 WARN org.apache.hadoop.hbase.regionserver.LogRoller: 
> Failed to schedule flush of 8ee433ad59526778c53cc85ed3762d0b, because it is 
> not online on us
> {code}
> I've seen these runaway WALs in 2.2.1. I've seen runaway WALs regularly in a 
> 1.2.x version that had the HBASE-16721 fix in it, but can't say yet if it was 
> for the same reason as above.
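
A toy model of the loop described above (all names and numbers are illustrative, not HBase source): the sequence-id accounting still lists the closed region, so the 'Too many WALs' path keeps nominating it for a flush that can never be scheduled, and the WAL count only grows.

{code}
import java.util.Collections;
import java.util.Map;

public class StuckRollerDemo {
  public static void main(String[] args) {
    // Accounting still pins the oldest WAL to a region that has been closed.
    Map<String, Long> oldestUnflushedSeqIds =
        Collections.singletonMap("8ee433ad59526778c53cc85ed3762d0b", 42L);
    boolean regionOnline = false; // closed, but accounting was never cleared

    int walCount = 45;
    final int maxWals = 32;
    for (int attempt = 0; attempt < 3 && walCount > maxWals; attempt++) {
      String region = oldestUnflushedSeqIds.keySet().iterator().next();
      System.out.println("Too many WALs; count=" + walCount + ", max=" + maxWals
          + "; forcing flush of 1 regions(s): " + region);
      if (!regionOnline) {
        System.out.println("Failed to schedule flush of " + region
            + ", because it is not online on us");
        walCount++; // nothing flushes, so WALs only accumulate
      }
    }
  }
}
{code}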



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-23216) Add 2.2.2 to download page

2019-10-26 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-23216:
--
Component/s: website

> Add 2.2.2 to download page
> --
>
> Key: HBASE-23216
> URL: https://issues.apache.org/jira/browse/HBASE-23216
> Project: HBase
>  Issue Type: Sub-task
>  Components: website
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HBASE-23181) Blocked WAL archive: "LogRoller: Failed to schedule flush of XXXX, because it is not online on us"

2019-10-26 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-23181.
---
Hadoop Flags: Reviewed
  Resolution: Fixed

Pushed to branch-2.1+.

Thanks [~stack] and [~binlijin] for reviewing.

Will open follow on issues to address the remaining problems.

> Blocked WAL archive: "LogRoller: Failed to schedule flush of XXXX, because it 
> is not online on us"
> --
>
> Key: HBASE-23181
> URL: https://issues.apache.org/jira/browse/HBASE-23181
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver, wal
>Affects Versions: 2.2.1
>Reporter: Michael Stack
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.3.0, 2.1.8, 2.2.3
>
>
> On a heavily loaded cluster, WAL count keeps rising and we can get into a 
> state where we are not rolling the logs off fast enough. In particular, there 
> is this interesting state at the extreme where we pick a region to flush 
> because 'Too many WALs' but the region is actually not online. As the WAL 
> count rises, we keep picking a region-to-flush that is no longer on the 
> server. This condition blocks our being able to clear WALs; eventually WALs 
> climb into the hundreds and the RS goes zombie with a full Call queue that 
> starts throwing CallQueueTooLargeExceptions (bad if this server is the one 
> carrying hbase:meta): i.e. clients fail to access the RegionServer.
> One symptom is a fast spike in WAL count for the RS. A restart of the RS will 
> break the bind.
> Here is how it looks in the log:
> {code}
> # Here is region closing
> 2019-10-16 23:10:55,897 INFO 
> org.apache.hadoop.hbase.regionserver.handler.UnassignRegionHandler: Closed 
> 8ee433ad59526778c53cc85ed3762d0b
> 
> # Then soon after ...
> 2019-10-16 23:11:44,041 WARN org.apache.hadoop.hbase.regionserver.LogRoller: 
> Failed to schedule flush of 8ee433ad59526778c53cc85ed3762d0b, because it is 
> not online on us
> 2019-10-16 23:11:45,006 INFO 
> org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL: Too many WALs; 
> count=45, max=32; forcing flush of 1 regions(s): 
> 8ee433ad59526778c53cc85ed3762d0b
> ...
> # Later...
> 2019-10-16 23:20:25,427 INFO 
> org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL: Too many WALs; 
> count=542, max=32; forcing flush of 1 regions(s): 
> 8ee433ad59526778c53cc85ed3762d0b
> 2019-10-16 23:20:25,427 WARN org.apache.hadoop.hbase.regionserver.LogRoller: 
> Failed to schedule flush of 8ee433ad59526778c53cc85ed3762d0b, because it is 
> not online on us
> {code}
> I've seen these runaway WALs in 2.2.1. I've seen runaway WALs regularly in a 
> 1.2.x version that had the HBASE-16721 fix in it, but can't say yet if it was 
> for the same reason as above.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-23181) Blocked WAL archive: "LogRoller: Failed to schedule flush of XXXX, because it is not online on us"

2019-10-26 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-23181:
--
Component/s: wal
 regionserver

> Blocked WAL archive: "LogRoller: Failed to schedule flush of XXXX, because it 
> is not online on us"
> --
>
> Key: HBASE-23181
> URL: https://issues.apache.org/jira/browse/HBASE-23181
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver, wal
>Affects Versions: 2.2.1
>Reporter: Michael Stack
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.3.0, 2.1.8, 2.2.3
>
>
> On a heavily loaded cluster, WAL count keeps rising and we can get into a 
> state where we are not rolling the logs off fast enough. In particular, there 
> is this interesting state at the extreme where we pick a region to flush 
> because 'Too many WALs' but the region is actually not online. As the WAL 
> count rises, we keep picking a region-to-flush that is no longer on the 
> server. This condition blocks our being able to clear WALs; eventually WALs 
> climb into the hundreds and the RS goes zombie with a full Call queue that 
> starts throwing CallQueueTooLargeExceptions (bad if this server is the one 
> carrying hbase:meta): i.e. clients fail to access the RegionServer.
> One symptom is a fast spike in WAL count for the RS. A restart of the RS will 
> break the bind.
> Here is how it looks in the log:
> {code}
> # Here is region closing
> 2019-10-16 23:10:55,897 INFO 
> org.apache.hadoop.hbase.regionserver.handler.UnassignRegionHandler: Closed 
> 8ee433ad59526778c53cc85ed3762d0b
> 
> # Then soon after ...
> 2019-10-16 23:11:44,041 WARN org.apache.hadoop.hbase.regionserver.LogRoller: 
> Failed to schedule flush of 8ee433ad59526778c53cc85ed3762d0b, because it is 
> not online on us
> 2019-10-16 23:11:45,006 INFO 
> org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL: Too many WALs; 
> count=45, max=32; forcing flush of 1 regions(s): 
> 8ee433ad59526778c53cc85ed3762d0b
> ...
> # Later...
> 2019-10-16 23:20:25,427 INFO 
> org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL: Too many WALs; 
> count=542, max=32; forcing flush of 1 regions(s): 
> 8ee433ad59526778c53cc85ed3762d0b
> 2019-10-16 23:20:25,427 WARN org.apache.hadoop.hbase.regionserver.LogRoller: 
> Failed to schedule flush of 8ee433ad59526778c53cc85ed3762d0b, because it is 
> not online on us
> {code}
> I've seen these runaway WALs in 2.2.1. I've seen runaway WALs regularly in a 
> 1.2.x version that had the HBASE-16721 fix in it, but can't say yet if it was 
> for the same reason as above.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-23055) Alter hbase:meta

2019-10-26 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16960311#comment-16960311
 ] 

Hudson commented on HBASE-23055:


Results for branch HBASE-23055
[build #26 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-23055/26/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-23055/26//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-23055/26//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-23055/26//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Alter hbase:meta
> 
>
> Key: HBASE-23055
> URL: https://issues.apache.org/jira/browse/HBASE-23055
> Project: HBase
>  Issue Type: Task
>Reporter: Michael Stack
>Assignee: Michael Stack
>Priority: Major
> Fix For: 3.0.0
>
>
> hbase:meta is currently hardcoded. Its schema cannot be changed.
> This issue is about allowing edits to hbase:meta schema. It will allow us to 
> set encodings such as the block-with-indexes, which will help quell CPU usage 
> on the host carrying hbase:meta. A dynamic hbase:meta is a first step on the 
> road to being able to split meta.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-23185) High cpu usage because getTable()#put() gets config value every time

2019-10-26 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16960320#comment-16960320
 ] 

Hudson commented on HBASE-23185:


Results for branch branch-1
[build #1118 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/1118/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/1118//General_Nightly_Build_Report/]


(x) {color:red}-1 jdk7 checks{color}
-- For more information [see jdk7 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/1118//JDK7_Nightly_Build_Report/]


(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/1118//JDK8_Nightly_Build_Report_(Hadoop2)/]




(/) {color:green}+1 source release artifact{color}
-- See build output for details.


> High cpu usage because getTable()#put() gets config value every time
> 
>
> Key: HBASE-23185
> URL: https://issues.apache.org/jira/browse/HBASE-23185
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 1.5.0, 1.4.10, 1.2.12, 1.3.5
>Reporter: Shinya Yoshida
>Assignee: Shinya Yoshida
>Priority: Major
>  Labels: performance
> Fix For: 1.6.0
>
> Attachments: Screenshot from 2019-10-18 12-38-14.png, Screenshot from 
> 2019-10-18 13-03-24.png
>
>
> When we analyzed the performance of our hbase application with many puts, we 
> found that Configuration methods use many CPU resources:
> !Screenshot from 2019-10-18 12-38-14.png|width=460,height=205!
> As you can see, getTable().put() is calling Configuration methods, which cause 
> regex evaluation or synchronization on Hashtable.
> This should not happen since 0.99.2, because 
> https://issues.apache.org/jira/browse/HBASE-12128 addressed such an issue.
>  However, it has crept back in through bugs or leakages over the many code 
> evolutions between 0.9x and 1.x.
>  # 
> [https://github.com/apache/hbase/blob/dd9eadb00f9dcd071a246482a11dfc7d63845f00/hbase-client/src/main/java/org/apache/hadoop/hbase/client/HTable.java#L369-L374]
>  ** finishSetup is called on every new HTable(), e.g. every con.getTable()
>  ** So getInt is called every time, and it does regex
>  # 
> [https://github.com/apache/hbase/blob/dd9eadb00f9dcd071a246482a11dfc7d63845f00/hbase-client/src/main/java/org/apache/hadoop/hbase/client/BufferedMutatorImpl.java#L115]
>  ** BufferedMutatorImpl is created on the first put for each HTable, e.g. 
> con.getTable().put()
>  ** ConnectionConf is created every time in the BufferedMutatorImpl constructor
>  ** ConnectionConf gets config values in its constructor
>  # 
> [https://github.com/apache/hbase/blob/dd9eadb00f9dcd071a246482a11dfc7d63845f00/hbase-client/src/main/java/org/apache/hadoop/hbase/client/AsyncProcess.java#L326]
>  ** AsyncProcess is created in the BufferedMutatorImpl constructor, so a new 
> AsyncProcess is created by con.getTable().put()
>  ** AsyncProcess parses many configuration values
> So, con.getTable().put() is a CPU-heavy operation because of these config 
> value lookups; see the sketch below.
>  
> With an in-house patch for this issue, we observed about a 10% improvement in 
> max throughput (i.e. CPU usage) client-side:
> !Screenshot from 2019-10-18 13-03-24.png|width=508,height=223!
>  
> branch-2 seems unaffected because the client implementation has changed 
> dramatically.
>   
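
The sketch referenced above: pending the fix, an application can pay the configuration-parsing cost once by reusing one Table instance instead of calling con.getTable() per put (table, family, and row names are illustrative):

{code}
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class ReuseTableExample {
  public static void main(String[] args) throws Exception {
    try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
        // One Table instance: HTable/BufferedMutatorImpl/AsyncProcess read
        // their config values once here, not on every put.
        Table table = conn.getTable(TableName.valueOf("t1"))) {
      for (int i = 0; i < 1000; i++) {
        Put put = new Put(Bytes.toBytes("row-" + i));
        put.addColumn(Bytes.toBytes("f"), Bytes.toBytes("q"), Bytes.toBytes(i));
        table.put(put);
      }
    }
  }
}
{code}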



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HBASE-23221) Polish the WAL interface after HBASE-23181

2019-10-26 Thread Duo Zhang (Jira)
Duo Zhang created HBASE-23221:
-

 Summary: Polish the WAL interface after HBASE-23181
 Key: HBASE-23221
 URL: https://issues.apache.org/jira/browse/HBASE-23221
 Project: HBase
  Issue Type: Improvement
Reporter: Duo Zhang


We have a closeRegion flag which seems to be redundant with the marker WALEdit.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [hbase] Apache9 merged pull request #753: HBASE-23181 Blocked WAL archive: "LogRoller: Failed to schedule flush…

2019-10-26 Thread GitBox
Apache9 merged pull request #753: HBASE-23181 Blocked WAL archive: "LogRoller: 
Failed to schedule flush…
URL: https://github.com/apache/hbase/pull/753
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [hbase] virajjasani opened a new pull request #761: HBASE-23213 : Reopen regions with very high Store Ref Counts(backport…

2019-10-26 Thread GitBox
virajjasani opened a new pull request #761: HBASE-23213 : Reopen regions with 
very high Store Ref Counts(backport…
URL: https://github.com/apache/hbase/pull/761
 
 
   … HBASE-22460)


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (HBASE-23213) Backport HBASE-22460 to branch-1

2019-10-26 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani updated HBASE-23213:
-
Fix Version/s: 1.5.1
   Status: Patch Available  (was: In Progress)

> Backport HBASE-22460 to branch-1
> 
>
> Key: HBASE-23213
> URL: https://issues.apache.org/jira/browse/HBASE-23213
> Project: HBase
>  Issue Type: Task
>Affects Versions: 1.6.0
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Minor
> Fix For: 1.5.1
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work started] (HBASE-23213) Backport HBASE-22460 to branch-1

2019-10-26 Thread Viraj Jasani (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HBASE-23213 started by Viraj Jasani.

> Backport HBASE-22460 to branch-1
> 
>
> Key: HBASE-23213
> URL: https://issues.apache.org/jira/browse/HBASE-23213
> Project: HBase
>  Issue Type: Task
>Affects Versions: 1.6.0
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Minor
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HBASE-23216) Add 2.2.2 to download page

2019-10-26 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang reassigned HBASE-23216:
-

Assignee: Duo Zhang

> Add 2.2.2 to download page
> --
>
> Key: HBASE-23216
> URL: https://issues.apache.org/jira/browse/HBASE-23216
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HBASE-23216) Add 2.2.2 to download page

2019-10-26 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-23216.
---
Fix Version/s: 3.0.0
 Hadoop Flags: Reviewed
   Resolution: Fixed

Pushed to master.

Thanks [~zghao] for reviewing.

> Add 2.2.2 to download page
> --
>
> Key: HBASE-23216
> URL: https://issues.apache.org/jira/browse/HBASE-23216
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [hbase] Apache9 merged pull request #758: HBASE-23216 Add 2.2.2 to download page

2019-10-26 Thread GitBox
Apache9 merged pull request #758: HBASE-23216 Add 2.2.2 to download page
URL: https://github.com/apache/hbase/pull/758
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Updated] (HBASE-23193) ConnectionImplementation.isTableAvailable can not deal with meta table on branch-2.x

2019-10-26 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-23193:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Seems to have worked. Resolving.

> ConnectionImplementation.isTableAvailable can not deal with meta table on 
> branch-2.x
> 
>
> Key: HBASE-23193
> URL: https://issues.apache.org/jira/browse/HBASE-23193
> Project: HBase
>  Issue Type: Bug
>  Components: rsgroup, test
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 2.3.0, 2.1.8, 2.2.3
>
> Attachments: HBASE-23193-branch-2.patch
>
>
> TestRSGroupKillRS is broken by HBASE-22767, as on master the client library 
> has been reimplemented so Admin.isTableAvailable can be used to test meta 
> table, but on branch-2 and branch-2.2, we will get this
> {noformat}
> java.lang.RuntimeException: java.io.IOException: This method can't be used to 
> locate meta regions; use MetaTableLocator instead
>   at org.apache.hadoop.hbase.Waiter.waitFor(Waiter.java:219)
>   at org.apache.hadoop.hbase.Waiter.waitFor(Waiter.java:143)
>   at 
> org.apache.hadoop.hbase.HBaseCommonTestingUtility.waitFor(HBaseCommonTestingUtility.java:242)
>   at 
> org.apache.hadoop.hbase.HBaseTestingUtility.waitTableAvailable(HBaseTestingUtility.java:3268)
>   at 
> org.apache.hadoop.hbase.rsgroup.TestRSGroupsKillRS.testLowerMetaGroupVersion(TestRSGroupsKillRS.java:245)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.IOException: This method can't be used to locate meta 
> regions; use MetaTableLocator instead
>   at 
> org.apache.hadoop.hbase.MetaTableAccessor.getTableRegionsAndLocations(MetaTableAccessor.java:615)
>   at 
> org.apache.hadoop.hbase.client.ConnectionImplementation.isTableAvailable(ConnectionImplementation.java:643)
>   at 
> org.apache.hadoop.hbase.client.HBaseAdmin.isTableAvailable(HBaseAdmin.java:971)
>   at 
> org.apache.hadoop.hbase.HBaseTestingUtility$9.evaluate(HBaseTestingUtility.java:4269)
>   at org.apache.hadoop.hbase.Waiter.waitFor(Waiter.java:191)
>   ... 30 more
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-23200) incorrect description in SortedCompactionPolicy.getNextMajorCompactTime

2019-10-26 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-23200:
--
Fix Version/s: (was: 2.2.2)
   2.2.3

> incorrect description in SortedCompactionPolicy.getNextMajorCompactTime
> ---
>
> Key: HBASE-23200
> URL: https://issues.apache.org/jira/browse/HBASE-23200
> Project: HBase
>  Issue Type: Bug
>  Components: Compaction
>Affects Versions: master
>Reporter: jackylau
>Assignee: jackylau
>Priority: Major
> Fix For: 2.2.3
>
>
> // default = 24hrs
> long ret = comConf.getMajorCompactionPeriod();
> but the default value is 7 days in CompactionConfiguration.java
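
So the fix is to the comment, not the getter: getMajorCompactionPeriod() already returns the configured hbase.hregion.majorcompaction, whose default in CompactionConfiguration is 7 days. A sketch of the corrected comment:

{code}
// default = 7 days (hbase.hregion.majorcompaction)
long ret = comConf.getMajorCompactionPeriod();
{code}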



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-23181) Blocked WAL archive: "LogRoller: Failed to schedule flush of XXXX, because it is not online on us"

2019-10-26 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16960479#comment-16960479
 ] 

Hudson commented on HBASE-23181:


Results for branch branch-2
[build #2335 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/2335/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/2335//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/2335//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/2335//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Blocked WAL archive: "LogRoller: Failed to schedule flush of XXXX, because it 
> is not online on us"
> --
>
> Key: HBASE-23181
> URL: https://issues.apache.org/jira/browse/HBASE-23181
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver, wal
>Affects Versions: 2.2.1
>Reporter: Michael Stack
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.3.0, 2.1.8, 2.2.3
>
>
> On a heavily loaded cluster, WAL count keeps rising and we can get into a 
> state where we are not rolling the logs off fast enough. In particular, there 
> is this interesting state at the extreme where we pick a region to flush 
> because 'Too many WALs' but the region is actually not online. As the WAL 
> count rises, we keep picking a region-to-flush that is no longer on the 
> server. This condition blocks our being able to clear WALs; eventually WALs 
> climb into the hundreds and the RS goes zombie with a full Call queue that 
> starts throwing CallQueueTooLargeExceptions (bad if this server is the one 
> carrying hbase:meta): i.e. clients fail to access the RegionServer.
> One symptom is a fast spike in WAL count for the RS. A restart of the RS will 
> break the bind.
> Here is how it looks in the log:
> {code}
> # Here is region closing
> 2019-10-16 23:10:55,897 INFO 
> org.apache.hadoop.hbase.regionserver.handler.UnassignRegionHandler: Closed 
> 8ee433ad59526778c53cc85ed3762d0b
> 
> # Then soon after ...
> 2019-10-16 23:11:44,041 WARN org.apache.hadoop.hbase.regionserver.LogRoller: 
> Failed to schedule flush of 8ee433ad59526778c53cc85ed3762d0b, because it is 
> not online on us
> 2019-10-16 23:11:45,006 INFO 
> org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL: Too many WALs; 
> count=45, max=32; forcing flush of 1 regions(s): 
> 8ee433ad59526778c53cc85ed3762d0b
> ...
> # Later...
> 2019-10-16 23:20:25,427 INFO 
> org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL: Too many WALs; 
> count=542, max=32; forcing flush of 1 regions(s): 
> 8ee433ad59526778c53cc85ed3762d0b
> 2019-10-16 23:20:25,427 WARN org.apache.hadoop.hbase.regionserver.LogRoller: 
> Failed to schedule flush of 8ee433ad59526778c53cc85ed3762d0b, because it is 
> not online on us
> {code}
> I've seen these runaway WALs in 2.2.1. I've seen runaway WALs regularly in a 
> 1.2.x version that had the HBASE-16721 fix in it, but can't say yet if it was 
> for the same reason as above.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [hbase] Apache-HBase commented on issue #761: HBASE-23213 : Reopen regions with very high Store Ref Counts(backport…

2019-10-26 Thread GitBox
Apache-HBase commented on issue #761: HBASE-23213 : Reopen regions with very 
high Store Ref Counts(backport…
URL: https://github.com/apache/hbase/pull/761#issuecomment-546652651
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:---:|---:|:---|:---|
   | :blue_heart: |  reexec  |   1m 24s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | :green_heart: |  dupname  |   0m  1s |  No case conflicting files found.  |
   | :blue_heart: |  prototool  |   0m  0s |  prototool was not available.  |
   | :green_heart: |  hbaseanti  |   0m  0s |  Patch does not have any 
anti-patterns.  |
   | :green_heart: |  @author  |   0m  0s |  The patch does not contain any 
@author tags.  |
   | :green_heart: |  test4tests  |   0m  0s |  The patch appears to include 2 
new or modified test files.  |
   ||| _ branch-1 Compile Tests _ |
   | :blue_heart: |  mvndep  |   1m 21s |  Maven dependency ordering for branch 
 |
   | :green_heart: |  mvninstall  |   7m 39s |  branch-1 passed  |
   | :green_heart: |  compile  |   1m 46s |  branch-1 passed with JDK 
v1.8.0_232  |
   | :green_heart: |  compile  |   1m 54s |  branch-1 passed with JDK 
v1.7.0_242  |
   | :green_heart: |  checkstyle  |  11m 40s |  branch-1 passed  |
   | :blue_heart: |  refguide  |   3m 37s |  branch has no errors when building 
the reference guide. See footer for rendered docs, which you should manually 
inspect.  |
   | :green_heart: |  shadedjars  |   2m 54s |  branch has no errors when 
building our shaded downstream artifacts.  |
   | :green_heart: |  javadoc  |   3m 56s |  branch-1 passed with JDK 
v1.8.0_232  |
   | :green_heart: |  javadoc  |   6m 13s |  branch-1 passed with JDK 
v1.7.0_242  |
   | :blue_heart: |  spotbugs  |   2m 40s |  Used deprecated FindBugs config; 
considering switching to SpotBugs.  |
   | :green_heart: |  findbugs  |  19m 24s |  branch-1 passed  |
   ||| _ Patch Compile Tests _ |
   | :blue_heart: |  mvndep  |   0m 17s |  Maven dependency ordering for patch  
|
   | :green_heart: |  mvninstall  |   2m  5s |  the patch passed  |
   | :green_heart: |  compile  |   1m 46s |  the patch passed with JDK 
v1.8.0_232  |
   | :green_heart: |  cc  |   1m 46s |  the patch passed  |
   | :green_heart: |  javac  |   1m 46s |  the patch passed  |
   | :green_heart: |  compile  |   1m 50s |  the patch passed with JDK 
v1.7.0_242  |
   | :green_heart: |  cc  |   1m 50s |  the patch passed  |
   | :green_heart: |  javac  |   1m 50s |  the patch passed  |
   | :green_heart: |  checkstyle  |  12m 36s |  the patch passed  |
   | :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace 
issues.  |
   | :broken_heart: |  xml  |   0m  0s |  The patch has 1 ill-formed XML 
file(s).  |
   | :blue_heart: |  refguide  |   3m 53s |  patch has no errors when building 
the reference guide. See footer for rendered docs, which you should manually 
inspect.  |
   | :green_heart: |  shadedjars  |   3m 21s |  patch has no errors when 
building our shaded downstream artifacts.  |
   | :green_heart: |  hadoopcheck  |   5m 57s |  Patch does not cause any 
errors with Hadoop 2.8.5 2.9.2.  |
   | :green_heart: |  hbaseprotoc  |   4m 58s |  the patch passed  |
   | :green_heart: |  javadoc  |   4m 14s |  the patch passed with JDK 
v1.8.0_232  |
   | :green_heart: |  javadoc  |   6m 54s |  the patch passed with JDK 
v1.7.0_242  |
   | :green_heart: |  findbugs  |  23m 11s |  the patch passed  |
   ||| _ Other Tests _ |
   | :broken_heart: |  unit  | 224m 34s |  root in the patch failed.  |
   | :green_heart: |  asflicense  |   3m  0s |  The patch does not generate ASF 
License warnings.  |
   |  |   | 366m 55s |   |
   
   
   | Reason | Tests |
   |---:|:--|
   | XML | Parsing Error(s): |
   |   | hbase-common/src/main/resources/hbase-default.xml |
   | Failed junit tests | 
hadoop.hbase.master.normalizer.TestSimpleRegionNormalizerOnCluster |
   |   | hadoop.hbase.client.TestAdmin1 |
   |   | hadoop.hbase.client.TestReplicaWithCluster |
   |   | 
hadoop.hbase.client.replication.TestReplicationAdminWithTwoDifferentZKClusters |
   |   | hadoop.hbase.master.TestMasterBalanceThrottling |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | Client=19.03.4 Server=19.03.4 base: 
https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-761/2/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hbase/pull/761 |
   | Optional Tests | dupname asflicense javac javadoc unit spotbugs findbugs 
shadedjars hadoopcheck hbaseanti checkstyle compile refguide xml cc hbaseprotoc 
prototool |
   | uname | Linux ab9410c5fcf9 4.15.0-65-generic #74-Ubuntu SMP Tue Sep 17 
17:06:04 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | 
/home/jenkins/jenkins-slave/workspace/HBase-PreCommit-GitHub-PR_PR-761/out/precommit/personality/provided.sh
 |
   | git revision | branch-1 / db2ce23 |
   | Default Java 

[jira] [Commented] (HBASE-23181) Blocked WAL archive: "LogRoller: Failed to schedule flush of XXXX, because it is not online on us"

2019-10-26 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16960487#comment-16960487
 ] 

Hudson commented on HBASE-23181:


Results for branch branch-2.2
[build #674 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.2/674/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.2/674//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.2/674//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.2/674//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}





--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-23181) Blocked WAL archive: "LogRoller: Failed to schedule flush of XXXX, because it is not online on us"

2019-10-26 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-23181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16960504#comment-16960504
 ] 

Hudson commented on HBASE-23181:


Results for branch branch-2.1
[build #1691 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/1691/]: 
(/) *{color:green}+1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/1691//General_Nightly_Build_Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/1691//JDK8_Nightly_Build_Report_(Hadoop2)/]


(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/1691//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}





--
This message was sent by Atlassian Jira
(v8.3.4#803005)