[GitHub] [hbase] Apache9 commented on issue #594: HBASE-8458 Support for batch version of checkAndPut() and checkAndDelete()
Apache9 commented on issue #594: HBASE-8458 Support for batch version of checkAndPut() and checkAndDelete() URL: https://github.com/apache/hbase/pull/594#issuecomment-546612086 Could you please upload a patch against master? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [hbase] Apache-HBase commented on issue #761: HBASE-23213 : Reopen regions with very high Store Ref Counts (backport…
Apache-HBase commented on issue #761: HBASE-23213 : Reopen regions with very high Store Ref Counts (backport… URL: https://github.com/apache/hbase/pull/761#issuecomment-546626087

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|:--------|:--------|
| :blue_heart: | reexec | 1m 19s | Docker mode activated. |
| | | | _ Prechecks _ |
| :green_heart: | dupname | 0m 0s | No case conflicting files found. |
| :blue_heart: | prototool | 0m 0s | prototool was not available. |
| :green_heart: | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. |
| :green_heart: | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
| | | | _ branch-1 Compile Tests _ |
| :blue_heart: | mvndep | 1m 23s | Maven dependency ordering for branch |
| :green_heart: | mvninstall | 7m 29s | branch-1 passed |
| :green_heart: | compile | 1m 39s | branch-1 passed with JDK v1.8.0_232 |
| :green_heart: | compile | 1m 46s | branch-1 passed with JDK v1.7.0_242 |
| :green_heart: | checkstyle | 10m 59s | branch-1 passed |
| :blue_heart: | refguide | 3m 24s | branch has no errors when building the reference guide. See footer for rendered docs, which you should manually inspect. |
| :green_heart: | shadedjars | 2m 50s | branch has no errors when building our shaded downstream artifacts. |
| :green_heart: | javadoc | 3m 45s | branch-1 passed with JDK v1.8.0_232 |
| :green_heart: | javadoc | 5m 48s | branch-1 passed with JDK v1.7.0_242 |
| :blue_heart: | spotbugs | 2m 28s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| :green_heart: | findbugs | 18m 2s | branch-1 passed |
| | | | _ Patch Compile Tests _ |
| :blue_heart: | mvndep | 0m 17s | Maven dependency ordering for patch |
| :green_heart: | mvninstall | 1m 58s | the patch passed |
| :green_heart: | compile | 1m 42s | the patch passed with JDK v1.8.0_232 |
| :green_heart: | cc | 1m 42s | the patch passed |
| :green_heart: | javac | 1m 42s | the patch passed |
| :green_heart: | compile | 1m 46s | the patch passed with JDK v1.7.0_242 |
| :green_heart: | cc | 1m 46s | the patch passed |
| :green_heart: | javac | 1m 46s | the patch passed |
| :green_heart: | checkstyle | 10m 54s | the patch passed |
| :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. |
| :broken_heart: | xml | 0m 0s | The patch has 1 ill-formed XML file(s). |
| :blue_heart: | refguide | 3m 0s | patch has no errors when building the reference guide. See footer for rendered docs, which you should manually inspect. |
| :green_heart: | shadedjars | 2m 44s | patch has no errors when building our shaded downstream artifacts. |
| :green_heart: | hadoopcheck | 4m 59s | Patch does not cause any errors with Hadoop 2.8.5 or 2.9.2. |
| :green_heart: | hbaseprotoc | 4m 24s | the patch passed |
| :green_heart: | javadoc | 3m 52s | the patch passed with JDK v1.8.0_232 |
| :green_heart: | javadoc | 5m 55s | the patch passed with JDK v1.7.0_242 |
| :broken_heart: | findbugs | 2m 55s | hbase-server generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
| :broken_heart: | findbugs | 10m 29s | root generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
| | | | _ Other Tests _ |
| :broken_heart: | unit | 166m 43s | root in the patch failed. |
| :green_heart: | asflicense | 2m 50s | The patch does not generate ASF License warnings. |
| | | 295m 4s | |

| Reason | Tests |
|---:|:--|
| XML | Parsing Error(s): |
| | hbase-common/src/main/resources/hbase-default.xml |
| FindBugs | module:hbase-server |
| | org.apache.hadoop.hbase.master.RegionsRecoveryChore.chore() makes inefficient use of keySet iterator instead of entrySet iterator At RegionsRecoveryChore.java:[line 104] |
| FindBugs | module:root |
| | org.apache.hadoop.hbase.master.RegionsRecoveryChore.chore() makes inefficient use of keySet iterator instead of entrySet iterator At RegionsRecoveryChore.java:[line 104] |
| Failed junit tests | hadoop.hbase.client.TestAdmin1 |
| | hadoop.hbase.client.replication.TestReplicationAdminWithTwoDifferentZKClusters |

| Subsystem | Report/Notes |
|--:|:-|
| Docker | Client=19.03.4 Server=19.03.4 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-761/1/artifact/out/Dockerfile |
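The FindBugs hit in the report above is the standard keySet-vs-entrySet inefficiency: RegionsRecoveryChore.chore() walks keySet() and then calls get() for every key, paying an extra hash lookup per entry. A minimal, self-contained sketch of the flagged pattern and the usual fix; the map contents and method names here are illustrative, not the actual RegionsRecoveryChore code:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class EntrySetDemo {
  // Flagged pattern: iterate keySet() then call get() per key — an extra hash
  // lookup for every entry, which is exactly what FindBugs warns about.
  static int sumViaKeySet(Map<String, Integer> m) {
    int total = 0;
    for (String k : m.keySet()) {
      total += m.get(k);
    }
    return total;
  }

  // Preferred pattern: entrySet() hands back key and value together in one pass.
  static int sumViaEntrySet(Map<String, Integer> m) {
    int total = 0;
    for (Map.Entry<String, Integer> e : m.entrySet()) {
      total += e.getValue();
    }
    return total;
  }

  public static void main(String[] args) {
    Map<String, Integer> refCounts = new LinkedHashMap<>();
    refCounts.put("region-a", 3);
    refCounts.put("region-b", 7);
    System.out.println(sumViaKeySet(refCounts));   // 10
    System.out.println(sumViaEntrySet(refCounts)); // 10
  }
}
```

Both loops compute the same result; the entrySet() form simply avoids the per-key lookup, which is why the warning is trivial to fix without changing behavior.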
[GitHub] [hbase] Apache9 commented on issue #742: HBASE-23181 Blocked WAL archive: "LogRoller: Failed to schedule flush of XXXX, because it is not online on us"
Apache9 commented on issue #742: HBASE-23181 Blocked WAL archive: "LogRoller: Failed to schedule flush of XXXX, because it is not online on us" URL: https://github.com/apache/hbase/pull/742#issuecomment-546611398 We ended up using #753 instead.
[GitHub] [hbase] Apache9 closed pull request #742: HBASE-23181 Blocked WAL archive: "LogRoller: Failed to schedule flush of XXXX, because it is not online on us"
Apache9 closed pull request #742: HBASE-23181 Blocked WAL archive: "LogRoller: Failed to schedule flush of XXXX, because it is not online on us" URL: https://github.com/apache/hbase/pull/742
[GitHub] [hbase] Apache9 commented on a change in pull request #753: HBASE-23181 Blocked WAL archive: "LogRoller: Failed to schedule flush…
Apache9 commented on a change in pull request #753: HBASE-23181 Blocked WAL archive: "LogRoller: Failed to schedule flush… URL: https://github.com/apache/hbase/pull/753#discussion_r339287780

File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/AbstractFSWAL.java

```
@@ -1001,8 +1001,13 @@ protected final boolean append(W writer, FSWALEntry entry) throws IOException {
     doAppend(writer, entry);
     assert highestUnsyncedTxid < entry.getTxid();
     highestUnsyncedTxid = entry.getTxid();
-    sequenceIdAccounting.update(encodedRegionName, entry.getFamilyNames(), regionSequenceId,
-      entry.isInMemStore());
+    if (entry.isCloseRegion()) {
+      // let's clean all the records of this region
+      sequenceIdAccounting.onRegionClose(encodedRegionName);
+    } else {
+      sequenceIdAccounting.update(encodedRegionName, entry.getFamilyNames(), regionSequenceId,
+        entry.isInMemStore());
+    }
```

Review comment: This does not make sense... You do not want the closeRegion flag but you still want the inMemstore flag? At least with appendData and appendMarker, we could remove at least one of the parameters in the methods of the WAL interface...
[GitHub] [hbase] saintstack commented on a change in pull request #753: HBASE-23181 Blocked WAL archive: "LogRoller: Failed to schedule flush…
saintstack commented on a change in pull request #753: HBASE-23181 Blocked WAL archive: "LogRoller: Failed to schedule flush… URL: https://github.com/apache/hbase/pull/753#discussion_r339284636

File path: hbase-server/src/main/java/org/apache/hadoop/hbase/wal/WAL.java

```
@@ -97,44 +97,59 @@
   void close() throws IOException;

   /**
-   * Append a set of edits to the WAL. The WAL is not flushed/sync'd after this transaction
-   * completes BUT on return this edit must have its region edit/sequence id assigned
-   * else it messes up our unification of mvcc and sequenceid. On return key will
-   * have the region edit/sequence id filled in.
+   * Append a set of data edits to the WAL. 'Data' here means that the content in the edits will
+   * also be added to memstore.
+   *
+   * The WAL is not flushed/sync'd after this transaction completes BUT on return this edit must
+   * have its region edit/sequence id assigned else it messes up our unification of mvcc and
+   * sequenceid. On return key will have the region edit/sequence id filled in.
    * @param info the regioninfo associated with append
    * @param key Modified by this call; we add to it this edits region edit/sequence id.
    * @param edits Edits to append. MAY CONTAIN NO EDITS for case where we want to get an edit
-   *          sequence id that is after all currently appended edits.
-   * @param inMemstore Always true except for case where we are writing a compaction completion
-   *          record into the WAL; in this case the entry is just so we can finish an unfinished
-   *          compaction -- it is not an edit for memstore.
+   *          sequence id that is after all currently appended edits.
    * @return Returns a 'transaction id' and key will have the region edit/sequence id
-   *         in it.
+   *         in it.
+   * @see #appendMarker(RegionInfo, WALKeyImpl, WALEdit, boolean)
    */
-  long append(RegionInfo info, WALKeyImpl key, WALEdit edits, boolean inMemstore) throws IOException;
+  long appendData(RegionInfo info, WALKeyImpl key, WALEdit edits) throws IOException;
+
+  /**
+   * Append a marker edit to the WAL. A marker could be a FlushDescriptor, a compaction marker, or
+   * region event marker. The difference here is that, a marker will not be added to memstore.
+   *
+   * The WAL is not flushed/sync'd after this transaction completes BUT on return this edit must
+   * have its region edit/sequence id assigned else it messes up our unification of mvcc and
+   * sequenceid. On return key will have the region edit/sequence id filled in.
+   * @param info the regioninfo associated with append
+   * @param key Modified by this call; we add to it this edits region edit/sequence id.
+   * @param edits Edits to append. MAY CONTAIN NO EDITS for case where we want to get an edit
+   *          sequence id that is after all currently appended edits.
+   * @param closeRegion Whether this is a region close marker, i.e, the last wal edit for this
```

Review comment: I messed w/ the patch. I see how appendData and appendMarker don't take us far enough down... down to AbstractFSWAL#appendEntry where we could ask if a close marker. But looking at the patch, what would be wrong w/ doing something like this?

```
diff --git a/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/AbstractFSWAL.java b/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/AbstractFSWAL.java
index 2eb7c7436a..bc31204500 100644
--- a/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/AbstractFSWAL.java
+++ b/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/AbstractFSWAL.java
@@ -1001,7 +1001,7 @@ public abstract class AbstractFSWAL implements WAL {
     doAppend(writer, entry);
     assert highestUnsyncedTxid < entry.getTxid();
     highestUnsyncedTxid = entry.getTxid();
-    if (entry.isCloseRegion()) {
+    if (!entry.isInMemStore() && entry.isCloseMarker()) {
       // let's clean all the records of this region
       sequenceIdAccounting.onRegionClose(encodedRegionName);
     } else {
```

... where entry.isCloseMarker would do something like WALEdit.isMetaEdit...

```
public boolean isMetaEdit() {
  for (Cell cell: cells) {
    if (!isMetaEditFamily(cell)) {
      return false;
    }
  }
  return true;
}
```

... only instead we'd look for METAFAMILY:HBASE::CLOSE, a new define added on WALEdit? Thanks for taking a look.
[GitHub] [hbase] saintstack commented on a change in pull request #753: HBASE-23181 Blocked WAL archive: "LogRoller: Failed to schedule flush…
saintstack commented on a change in pull request #753: HBASE-23181 Blocked WAL archive: "LogRoller: Failed to schedule flush… URL: https://github.com/apache/hbase/pull/753#discussion_r339290979

File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/AbstractFSWAL.java

```
@@ -1001,8 +1001,13 @@ protected final boolean append(W writer, FSWALEntry entry) throws IOException {
     doAppend(writer, entry);
     assert highestUnsyncedTxid < entry.getTxid();
     highestUnsyncedTxid = entry.getTxid();
-    sequenceIdAccounting.update(encodedRegionName, entry.getFamilyNames(), regionSequenceId,
-      entry.isInMemStore());
+    if (entry.isCloseRegion()) {
+      // let's clean all the records of this region
+      sequenceIdAccounting.onRegionClose(encodedRegionName);
+    } else {
+      sequenceIdAccounting.update(encodedRegionName, entry.getFamilyNames(), regionSequenceId,
+        entry.isInMemStore());
+    }
```

Review comment: The isInMemStore flag is there currently, no? I'm not advocating adding anything not already present. Looking at this patch, the new methods appendData and appendMarker do not seem to buy us much. They do not go deep enough down into AbstractFSWAL.
[GitHub] [hbase] saintstack commented on a change in pull request #753: HBASE-23181 Blocked WAL archive: "LogRoller: Failed to schedule flush…
saintstack commented on a change in pull request #753: HBASE-23181 Blocked WAL archive: "LogRoller: Failed to schedule flush… URL: https://github.com/apache/hbase/pull/753#discussion_r339291059

File path: hbase-server/src/main/java/org/apache/hadoop/hbase/wal/WAL.java

```
@@ -97,44 +97,59 @@
   void close() throws IOException;

   /**
-   * Append a set of edits to the WAL. The WAL is not flushed/sync'd after this transaction
-   * completes BUT on return this edit must have its region edit/sequence id assigned
-   * else it messes up our unification of mvcc and sequenceid. On return key will
-   * have the region edit/sequence id filled in.
+   * Append a set of data edits to the WAL. 'Data' here means that the content in the edits will
+   * also be added to memstore.
+   *
+   * The WAL is not flushed/sync'd after this transaction completes BUT on return this edit must
+   * have its region edit/sequence id assigned else it messes up our unification of mvcc and
+   * sequenceid. On return key will have the region edit/sequence id filled in.
    * @param info the regioninfo associated with append
    * @param key Modified by this call; we add to it this edits region edit/sequence id.
    * @param edits Edits to append. MAY CONTAIN NO EDITS for case where we want to get an edit
-   *          sequence id that is after all currently appended edits.
-   * @param inMemstore Always true except for case where we are writing a compaction completion
-   *          record into the WAL; in this case the entry is just so we can finish an unfinished
-   *          compaction -- it is not an edit for memstore.
+   *          sequence id that is after all currently appended edits.
    * @return Returns a 'transaction id' and key will have the region edit/sequence id
-   *         in it.
+   *         in it.
+   * @see #appendMarker(RegionInfo, WALKeyImpl, WALEdit, boolean)
    */
-  long append(RegionInfo info, WALKeyImpl key, WALEdit edits, boolean inMemstore) throws IOException;
+  long appendData(RegionInfo info, WALKeyImpl key, WALEdit edits) throws IOException;
+
+  /**
+   * Append a marker edit to the WAL. A marker could be a FlushDescriptor, a compaction marker, or
+   * region event marker. The difference here is that, a marker will not be added to memstore.
+   *
+   * The WAL is not flushed/sync'd after this transaction completes BUT on return this edit must
+   * have its region edit/sequence id assigned else it messes up our unification of mvcc and
+   * sequenceid. On return key will have the region edit/sequence id filled in.
+   * @param info the regioninfo associated with append
+   * @param key Modified by this call; we add to it this edits region edit/sequence id.
+   * @param edits Edits to append. MAY CONTAIN NO EDITS for case where we want to get an edit
+   *          sequence id that is after all currently appended edits.
+   * @param closeRegion Whether this is a region close marker, i.e, the last wal edit for this
```

Review comment: Sure. We'd continue testing a boolean. Rarely, we'd check the type looking for a close and find a flush or compaction marker instead. Not the end of the world, and we could improve in a follow-up. On your point that this is private and we can improve in a follow-on, even removing parameters added by this patch, I agree. So let me +1 this patch as is. I'll have a go at an improvement in a follow-on. Thanks.
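The alternative saintstack floats in this thread (recognizing a close by a METAFAMILY:HBASE::CLOSE cell, the way WALEdit.isMetaEdit() recognizes meta edits, rather than threading a closeRegion boolean through the API) can be illustrated without the HBase classes. A hedged sketch: FakeCell and the two string constants below are hypothetical stand-ins for HBase's Cell and WALEdit internals, not actual HBase API.

```java
import java.util.List;

public class CloseMarkerSketch {
  // Stand-ins for HBase's cell family/qualifier bytes. The real WALEdit has a
  // METAFAMILY; HBASE::CLOSE would be the new qualifier proposed in the review.
  static final String META_FAMILY = "METAFAMILY";
  static final String CLOSE_QUALIFIER = "HBASE::CLOSE";

  // Simplified cell: just a family and a qualifier.
  record FakeCell(String family, String qualifier) {}

  // Mirrors the shape of WALEdit.isMetaEdit(): every cell in the meta family.
  static boolean isMetaEdit(List<FakeCell> cells) {
    for (FakeCell c : cells) {
      if (!META_FAMILY.equals(c.family())) {
        return false;
      }
    }
    return true;
  }

  // The proposed check: a meta edit whose cells all carry the close qualifier,
  // so no extra boolean needs to travel down to the accounting code.
  static boolean isCloseMarker(List<FakeCell> cells) {
    if (cells.isEmpty() || !isMetaEdit(cells)) {
      return false;
    }
    for (FakeCell c : cells) {
      if (!CLOSE_QUALIFIER.equals(c.qualifier())) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    List<FakeCell> close = List.of(new FakeCell(META_FAMILY, CLOSE_QUALIFIER));
    List<FakeCell> flush = List.of(new FakeCell(META_FAMILY, "HBASE::FLUSH"));
    System.out.println(isCloseMarker(close)); // true
    System.out.println(isCloseMarker(flush)); // false
  }
}
```

The trade-off discussed above falls out directly: checking the cell type costs a scan of the (tiny) marker edit, whereas the boolean is a constant-time flag, but the type check keeps the WAL interface narrower.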
[GitHub] [hbase] Apache-HBase commented on issue #753: HBASE-23181 Blocked WAL archive: "LogRoller: Failed to schedule flush…
Apache-HBase commented on issue #753: HBASE-23181 Blocked WAL archive: "LogRoller: Failed to schedule flush… URL: https://github.com/apache/hbase/pull/753#issuecomment-546589729

:confetti_ball: **+1 overall**

| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|:--------|:--------|
| :blue_heart: | reexec | 0m 35s | Docker mode activated. |
| | | | _ Prechecks _ |
| :green_heart: | dupname | 0m 1s | No case conflicting files found. |
| :green_heart: | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. |
| :green_heart: | test4tests | 0m 0s | The patch appears to include 24 new or modified test files. |
| | | | _ master Compile Tests _ |
| :blue_heart: | mvndep | 0m 35s | Maven dependency ordering for branch |
| :green_heart: | mvninstall | 5m 10s | master passed |
| :green_heart: | compile | 1m 45s | master passed |
| :green_heart: | checkstyle | 2m 15s | master passed |
| :green_heart: | shadedjars | 4m 34s | branch has no errors when building our shaded downstream artifacts. |
| :green_heart: | javadoc | 1m 17s | master passed |
| :blue_heart: | spotbugs | 4m 12s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| :green_heart: | findbugs | 5m 44s | master passed |
| | | | _ Patch Compile Tests _ |
| :blue_heart: | mvndep | 0m 15s | Maven dependency ordering for patch |
| :green_heart: | mvninstall | 4m 53s | the patch passed |
| :green_heart: | compile | 1m 47s | the patch passed |
| :green_heart: | javac | 1m 47s | the patch passed |
| :green_heart: | checkstyle | 0m 26s | The patch passed checkstyle in hbase-common |
| :green_heart: | checkstyle | 1m 28s | hbase-server: The patch generated 0 new + 372 unchanged - 18 fixed = 372 total (was 390) |
| :green_heart: | checkstyle | 0m 19s | The patch passed checkstyle in hbase-mapreduce |
| :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. |
| :green_heart: | shadedjars | 4m 36s | patch has no errors when building our shaded downstream artifacts. |
| :green_heart: | hadoopcheck | 15m 39s | Patch does not cause any errors with Hadoop 2.8.5 2.9.2 or 3.1.2. |
| :green_heart: | javadoc | 1m 14s | the patch passed |
| :green_heart: | findbugs | 6m 3s | the patch passed |
| | | | _ Other Tests _ |
| :green_heart: | unit | 3m 7s | hbase-common in the patch passed. |
| :green_heart: | unit | 161m 12s | hbase-server in the patch passed. |
| :green_heart: | unit | 18m 27s | hbase-mapreduce in the patch passed. |
| :green_heart: | asflicense | 1m 40s | The patch does not generate ASF License warnings. |
| | | 249m 53s | |

| Subsystem | Report/Notes |
|--:|:-|
| Docker | Client=19.03.4 Server=19.03.4 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-753/4/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hbase/pull/753 |
| Optional Tests | dupname asflicense javac javadoc unit spotbugs findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 9d5d616c38d4 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/HBase-PreCommit-GitHub-PR_PR-753/out/precommit/personality/provided.sh |
| git revision | master / 50dc288875 |
| Default Java | 1.8.0_181 |
| Test Results | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-753/4/testReport/ |
| Max. process+thread count | 5642 (vs. ulimit of 1) |
| modules | C: hbase-common hbase-server hbase-mapreduce U: . |
| Console output | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-753/4/console |
| versions | git=2.11.0 maven=2018-06-17T18:33:14Z) findbugs=3.1.11 |
| Powered by | Apache Yetus 0.11.0 https://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HBASE-22514) Move rsgroup feature into core of HBase
[ https://issues.apache.org/jira/browse/HBASE-22514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16960318#comment-16960318 ] Hudson commented on HBASE-22514: Results for branch HBASE-22514 [build #160 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-22514/160/]: (x) *{color:red}-1 overall{color}* details (if available): (x) {color:red}-1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-22514/160//General_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-22514/160//JDK8_Nightly_Build_Report_(Hadoop2)/] (x) {color:red}-1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-22514/160//JDK8_Nightly_Build_Report_(Hadoop3)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. (x) {color:red}-1 client integration test{color} -- Failed when running client tests on top of Hadoop 2. [see log for details|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-22514/160//artifact/output-integration/hadoop-2.log]. (note that this means we didn't run on Hadoop 3) > Move rsgroup feature into core of HBase > --- > > Key: HBASE-22514 > URL: https://issues.apache.org/jira/browse/HBASE-22514 > Project: HBase > Issue Type: Umbrella > Components: Admin, Client, rsgroup > Reporter: Yechao Chen > Assignee: Duo Zhang > Priority: Major > Attachments: HBASE-22514.master.001.patch, > image-2019-05-31-18-25-38-217.png > > > The class RSGroupAdminClient is not public > we need to use the java api RSGroupAdminClient to manage RSGs > so RSGroupAdminClient should be public > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-22991) Release 1.4.11
[ https://issues.apache.org/jira/browse/HBASE-22991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16960346#comment-16960346 ] Hudson commented on HBASE-22991: Results for branch master [build #1517 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/1517/]: (x) *{color:red}-1 overall{color}* details (if available): (x) {color:red}-1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/master/1517//General_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/master/1517//JDK8_Nightly_Build_Report_(Hadoop2)/] (x) {color:red}-1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/master/1517//JDK8_Nightly_Build_Report_(Hadoop3)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. (/) {color:green}+1 client integration test{color} > Release 1.4.11 > -- > > Key: HBASE-22991 > URL: https://issues.apache.org/jira/browse/HBASE-22991 > Project: HBase > Issue Type: Task > Components: community > Reporter: Sean Busbey > Assignee: Sean Busbey > Priority: Major > Fix For: 1.4.11 > > Attachments: Flaky_20Test_20Report.zip > >
[jira] [Assigned] (HBASE-23181) Blocked WAL archive: "LogRoller: Failed to schedule flush of XXXX, because it is not online on us"
[ https://issues.apache.org/jira/browse/HBASE-23181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Duo Zhang reassigned HBASE-23181: - Assignee: Duo Zhang > Blocked WAL archive: "LogRoller: Failed to schedule flush of XXXX, because it > is not online on us" > -- > > Key: HBASE-23181 > URL: https://issues.apache.org/jira/browse/HBASE-23181 > Project: HBase > Issue Type: Bug > Affects Versions: 2.2.1 > Reporter: Michael Stack > Assignee: Duo Zhang > Priority: Major > Fix For: 3.0.0, 2.3.0, 2.1.8, 2.2.3 > > > On a heavily loaded cluster, WAL count keeps rising and we can get into a > state where we are not rolling the logs off fast enough. In particular, there > is this interesting state at the extreme where we pick a region to flush > because 'Too many WALs' but the region is actually not online. As the WAL > count rises, we keep picking a region-to-flush that is no longer on the > server. This condition blocks our being able to clear WALs; eventually WALs > climb into the hundreds and the RS goes zombie with a full Call queue that > starts throwing CallQueueTooLargeExceptions (bad if this server is the one > carrying hbase:meta): i.e. clients fail to access the RegionServer. > One symptom is a fast spike in WAL count for the RS. A restart of the RS will > break the bind. > Here is how it looks in the log: > {code} > # Here is region closing > 2019-10-16 23:10:55,897 INFO > org.apache.hadoop.hbase.regionserver.handler.UnassignRegionHandler: Closed > 8ee433ad59526778c53cc85ed3762d0b > > # Then soon after ... > 2019-10-16 23:11:44,041 WARN org.apache.hadoop.hbase.regionserver.LogRoller: > Failed to schedule flush of 8ee433ad59526778c53cc85ed3762d0b, because it is > not online on us > 2019-10-16 23:11:45,006 INFO > org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL: Too many WALs; > count=45, max=32; forcing flush of 1 regions(s): > 8ee433ad59526778c53cc85ed3762d0b > ... > # Later... > 2019-10-16 23:20:25,427 INFO > org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL: Too many WALs; > count=542, max=32; forcing flush of 1 regions(s): > 8ee433ad59526778c53cc85ed3762d0b > 2019-10-16 23:20:25,427 WARN org.apache.hadoop.hbase.regionserver.LogRoller: > Failed to schedule flush of 8ee433ad59526778c53cc85ed3762d0b, because it is > not online on us > {code} > I've seen these runaway WALs in 2.2.1. I've seen runaway WALs in a 1.2.x > version regularly that had the HBASE-16721 fix in it, but can't say yet if it > was for the same reason as above.
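The runaway described in this issue comes down to stale per-region records in the WAL's sequence-id bookkeeping: once the region closes, nothing removes its entry, so the 'Too many WALs' path keeps electing a region the server can no longer flush. A toy model of that bookkeeping and the cleanup-on-close fix; the class and method names below are illustrative only, not the real SequenceIdAccounting API:

```java
import java.util.HashMap;
import java.util.Map;

public class WalAccountingSketch {
  // region name -> lowest unflushed sequence id still pinning old WAL files
  private final Map<String, Long> lowestUnflushed = new HashMap<>();

  void append(String region, long seqId) {
    // Only the first (lowest) unflushed seq id per region matters here.
    lowestUnflushed.putIfAbsent(region, seqId);
  }

  // The fix: clear the region's record when it closes. Without this, the
  // "Too many WALs" path below keeps returning a region that is no longer
  // online, the flush never happens, and WAL files pile up unboundedly.
  void onRegionClose(String region) {
    lowestUnflushed.remove(region);
  }

  // "Too many WALs" path: pick the region pinning the oldest edits.
  String findRegionToFlush() {
    return lowestUnflushed.entrySet().stream()
        .min(Map.Entry.comparingByValue())
        .map(Map.Entry::getKey)
        .orElse(null);
  }

  public static void main(String[] args) {
    WalAccountingSketch acct = new WalAccountingSketch();
    acct.append("8ee433ad59526778c53cc85ed3762d0b", 100L);
    acct.append("another-region", 200L);
    // Region closes; its record is removed, so the roller never elects it.
    acct.onRegionClose("8ee433ad59526778c53cc85ed3762d0b");
    System.out.println(acct.findRegionToFlush()); // another-region
  }
}
```

Without the onRegionClose() call, findRegionToFlush() would return the closed region forever, which is exactly the bind the log excerpt above shows (count=45 climbing to count=542 against the same region).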
[jira] [Updated] (HBASE-23216) Add 2.2.2 to download page
[ https://issues.apache.org/jira/browse/HBASE-23216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Duo Zhang updated HBASE-23216: -- Component/s: website > Add 2.2.2 to download page > -- > > Key: HBASE-23216 > URL: https://issues.apache.org/jira/browse/HBASE-23216 > Project: HBase > Issue Type: Sub-task > Components: website > Reporter: Duo Zhang > Assignee: Duo Zhang > Priority: Major > Fix For: 3.0.0 > >
[jira] [Resolved] (HBASE-23181) Blocked WAL archive: "LogRoller: Failed to schedule flush of XXXX, because it is not online on us"
[ https://issues.apache.org/jira/browse/HBASE-23181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Duo Zhang resolved HBASE-23181. --- Hadoop Flags: Reviewed Resolution: Fixed Pushed to branch-2.1+. Thanks [~stack] and [~binlijin] for reviewing. Will open follow-on issues to address the remaining problems. > Blocked WAL archive: "LogRoller: Failed to schedule flush of XXXX, because it > is not online on us" > -- > > Key: HBASE-23181 > URL: https://issues.apache.org/jira/browse/HBASE-23181 > Project: HBase > Issue Type: Bug > Components: regionserver, wal > Affects Versions: 2.2.1 > Reporter: Michael Stack > Assignee: Duo Zhang > Priority: Major > Fix For: 3.0.0, 2.3.0, 2.1.8, 2.2.3
[jira] [Updated] (HBASE-23181) Blocked WAL archive: "LogRoller: Failed to schedule flush of XXXX, because it is not online on us"
[ https://issues.apache.org/jira/browse/HBASE-23181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Duo Zhang updated HBASE-23181:
------------------------------
    Component/s: wal
                 regionserver

> Blocked WAL archive: "LogRoller: Failed to schedule flush of XXXX, because it is not online on us"
> --------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-23181
>                 URL: https://issues.apache.org/jira/browse/HBASE-23181
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver, wal
>    Affects Versions: 2.2.1
>            Reporter: Michael Stack
>            Assignee: Duo Zhang
>            Priority: Major
>             Fix For: 3.0.0, 2.3.0, 2.1.8, 2.2.3
>
> On a heavily loaded cluster, the WAL count keeps rising and we can get into a
> state where we are not rolling the logs off fast enough. In particular, there
> is an interesting state at the extreme where we pick a region to flush
> because of 'Too many WALs' but the region is actually not online. As the WAL
> count rises, we keep picking a region-to-flush that is no longer on the
> server. This condition blocks our ability to clear WALs; eventually WALs
> climb into the hundreds and the RS goes zombie with a full Call queue that
> starts throwing CallQueueTooLargeExceptions (bad if this server is the one
> carrying hbase:meta), i.e. clients fail to access the RegionServer.
> One symptom is a fast spike in WAL count for the RS. A restart of the RS will
> break the bind.
> Here is how it looks in the log:
> {code}
> # Here is the region closing
> 2019-10-16 23:10:55,897 INFO org.apache.hadoop.hbase.regionserver.handler.UnassignRegionHandler: Closed 8ee433ad59526778c53cc85ed3762d0b
>
> # Then soon after ...
> 2019-10-16 23:11:44,041 WARN org.apache.hadoop.hbase.regionserver.LogRoller: Failed to schedule flush of 8ee433ad59526778c53cc85ed3762d0b, because it is not online on us
> 2019-10-16 23:11:45,006 INFO org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL: Too many WALs; count=45, max=32; forcing flush of 1 regions(s): 8ee433ad59526778c53cc85ed3762d0b
> ...
> # Later...
> 2019-10-16 23:20:25,427 INFO org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL: Too many WALs; count=542, max=32; forcing flush of 1 regions(s): 8ee433ad59526778c53cc85ed3762d0b
> 2019-10-16 23:20:25,427 WARN org.apache.hadoop.hbase.regionserver.LogRoller: Failed to schedule flush of 8ee433ad59526778c53cc85ed3762d0b, because it is not online on us
> {code}
> I've seen these runaway WALs in 2.2.1. I've regularly seen runaway WALs in a
> 1.2.x version that had the HBASE-16721 fix in it, but can't say yet whether it
> was for the same reason as above.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
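The loop described in the issue can be reduced to a small accounting sketch: the WAL's oldest-unflushed bookkeeping keeps naming a region that has already closed, so the "Too many WALs" check elects it forever and no WAL can be archived. All class and method names below are invented for illustration; this is not the actual AbstractFSWAL/LogRoller code nor the HBASE-23181 patch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the stuck-WAL accounting described in HBASE-23181.
public class WalAccountingSketch {
    // region encoded name -> oldest unflushed sequence id still pinning WAL files
    private final Map<String, Long> oldestUnflushedSeqId = new LinkedHashMap<>();

    /** Record an edit for a region; pins WAL files until the region flushes. */
    public void append(String region, long seqId) {
        oldestUnflushedSeqId.putIfAbsent(region, seqId);
    }

    /** A completed flush releases the region's pin; here we simply drop it. */
    public void completeFlush(String region) {
        oldestUnflushedSeqId.remove(region);
    }

    /**
     * The bug shape: if closing a region does NOT clear its entry, "Too many
     * WALs" keeps electing it, the flush request is rejected ("not online on
     * us"), and the WAL count climbs forever. Clearing the stale entry on
     * close breaks the loop.
     */
    public void closeRegion(String region) {
        oldestUnflushedSeqId.remove(region);
    }

    /** Region elected by "Too many WALs", or null when archiving can proceed. */
    public String regionBlockingArchive() {
        return oldestUnflushedSeqId.isEmpty()
                ? null
                : oldestUnflushedSeqId.keySet().iterator().next();
    }
}
```

With the cleanup in closeRegion(), a closed region can no longer be elected for flushing, which mirrors the effect the fix needs to achieve.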
[jira] [Commented] (HBASE-23055) Alter hbase:meta
[ https://issues.apache.org/jira/browse/HBASE-23055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16960311#comment-16960311 ]

Hudson commented on HBASE-23055:
--------------------------------

Results for branch HBASE-23055
	[build #26 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-23055/26/]: (x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-23055/26//General_Nightly_Build_Report/]

(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-23055/26//JDK8_Nightly_Build_Report_(Hadoop2)/]

(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-23055/26//JDK8_Nightly_Build_Report_(Hadoop3)/]

(/) {color:green}+1 source release artifact{color}
-- See build output for details.

(/) {color:green}+1 client integration test{color}

> Alter hbase:meta
> ----------------
>
>                 Key: HBASE-23055
>                 URL: https://issues.apache.org/jira/browse/HBASE-23055
>             Project: HBase
>          Issue Type: Task
>            Reporter: Michael Stack
>            Assignee: Michael Stack
>            Priority: Major
>             Fix For: 3.0.0
>
> hbase:meta is currently hardcoded. Its schema cannot be changed.
> This issue is about allowing edits to the hbase:meta schema. It will allow us
> to set encodings such as block-with-indexes, which will help quell CPU usage
> on the host carrying hbase:meta. A dynamic hbase:meta is the first step on
> the road to being able to split meta.
[jira] [Commented] (HBASE-23185) High cpu usage because getTable()#put() gets config value every time
[ https://issues.apache.org/jira/browse/HBASE-23185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16960320#comment-16960320 ]

Hudson commented on HBASE-23185:
--------------------------------

Results for branch branch-1
	[build #1118 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/1118/]: (x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/1118//General_Nightly_Build_Report/]

(x) {color:red}-1 jdk7 checks{color}
-- For more information [see jdk7 report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/1118//JDK7_Nightly_Build_Report/]

(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/1118//JDK8_Nightly_Build_Report_(Hadoop2)/]

(/) {color:green}+1 source release artifact{color}
-- See build output for details.

> High cpu usage because getTable()#put() gets config value every time
> ---------------------------------------------------------------------
>
>                 Key: HBASE-23185
>                 URL: https://issues.apache.org/jira/browse/HBASE-23185
>             Project: HBase
>          Issue Type: Bug
>          Components: Client
>    Affects Versions: 1.5.0, 1.4.10, 1.2.12, 1.3.5
>            Reporter: Shinya Yoshida
>            Assignee: Shinya Yoshida
>            Priority: Major
>              Labels: performance
>             Fix For: 1.6.0
>
>         Attachments: Screenshot from 2019-10-18 12-38-14.png, Screenshot from 2019-10-18 13-03-24.png
>
> When we analyzed the performance of our hbase application with many puts, we
> found that Configuration methods use many CPU resources:
> !Screenshot from 2019-10-18 12-38-14.png|width=460,height=205!
> As you can see, getTable().put() is calling Configuration methods, which
> cause regex matching or synchronization on Hashtable.
> This should not happen in 0.99.2 because
> https://issues.apache.org/jira/browse/HBASE-12128 addressed such an issue.
> However, it has resurfaced due to bugs or leakage introduced by the code
> evolution between 0.9x and 1.x:
> # [https://github.com/apache/hbase/blob/dd9eadb00f9dcd071a246482a11dfc7d63845f00/hbase-client/src/main/java/org/apache/hadoop/hbase/client/HTable.java#L369-L374]
> ** finishSetup is called for every new HTable(), e.g. every con.getTable()
> ** So getInt is called every time, and it does a regex match
> # [https://github.com/apache/hbase/blob/dd9eadb00f9dcd071a246482a11dfc7d63845f00/hbase-client/src/main/java/org/apache/hadoop/hbase/client/BufferedMutatorImpl.java#L115]
> ** BufferedMutatorImpl is created on the first put for each HTable, e.g. con.getTable().put()
> ** A ConnectionConf is created every time in the BufferedMutatorImpl constructor
> ** ConnectionConf reads config values in its constructor
> # [https://github.com/apache/hbase/blob/dd9eadb00f9dcd071a246482a11dfc7d63845f00/hbase-client/src/main/java/org/apache/hadoop/hbase/client/AsyncProcess.java#L326]
> ** AsyncProcess is created in the BufferedMutatorImpl constructor, so a new AsyncProcess is created by con.getTable().put()
> ** AsyncProcess parses many configurations
> So con.getTable().put() is a CPU-heavy operation because of the config value
> lookups.
>
> With an in-house patch for this issue, we observed about a 10% improvement in
> max throughput (i.e. CPU usage) on the client side:
> !Screenshot from 2019-10-18 13-03-24.png|width=508,height=223!
>
> branch-2 appears unaffected because the client implementation has changed
> dramatically.
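The cost pattern in the analysis above — parsing a configuration value on every put() versus once per client object — can be sketched as follows. The class and property names are invented for illustration; this is not the HBase client API or the actual patch.

```java
import java.util.Properties;

// Illustrative sketch (invented names, not the HBase client API) of the fix
// direction: parse configuration once in the constructor instead of on every
// operation, so the hot path avoids regex parsing and Hashtable locking.
public class CachedConfigWriter {
    // Parsed once, cached for the lifetime of the writer.
    private final int maxKeyValueSize;

    public CachedConfigWriter(Properties conf) {
        // The expensive lookup and parse happen exactly once here, not per put().
        this.maxKeyValueSize =
                Integer.parseInt(conf.getProperty("writer.maxkeyvaluesize", "10485760"));
    }

    /** Hot path: a plain field read, no configuration access at all. */
    public boolean accepts(byte[] value) {
        return value.length <= maxKeyValueSize;
    }
}
```

The 10% client-side CPU win reported above comes from exactly this move: the per-operation cost collapses to a field comparison.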
[jira] [Created] (HBASE-23221) Polish the WAL interface after HBASE-23181
Duo Zhang created HBASE-23221:
---------------------------------

             Summary: Polish the WAL interface after HBASE-23181
                 Key: HBASE-23221
                 URL: https://issues.apache.org/jira/browse/HBASE-23221
             Project: HBase
          Issue Type: Improvement
            Reporter: Duo Zhang

We have a closeRegion flag which seems to be redundant with the marker WALEdit.
[GitHub] [hbase] Apache9 merged pull request #753: HBASE-23181 Blocked WAL archive: "LogRoller: Failed to schedule flush…
Apache9 merged pull request #753: HBASE-23181 Blocked WAL archive: "LogRoller: Failed to schedule flush…
URL: https://github.com/apache/hbase/pull/753

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org

With regards,
Apache Git Services
[GitHub] [hbase] virajjasani opened a new pull request #761: HBASE-23213 : Reopen regions with very high Store Ref Counts(backport…
virajjasani opened a new pull request #761: HBASE-23213 : Reopen regions with very high Store Ref Counts (backport HBASE-22460)
URL: https://github.com/apache/hbase/pull/761
[jira] [Updated] (HBASE-23213) Backport HBASE-22460 to branch-1
[ https://issues.apache.org/jira/browse/HBASE-23213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Viraj Jasani updated HBASE-23213:
---------------------------------
    Fix Version/s: 1.5.1
           Status: Patch Available  (was: In Progress)

> Backport HBASE-22460 to branch-1
> --------------------------------
>
>                 Key: HBASE-23213
>                 URL: https://issues.apache.org/jira/browse/HBASE-23213
>             Project: HBase
>          Issue Type: Task
>    Affects Versions: 1.6.0
>            Reporter: Viraj Jasani
>            Assignee: Viraj Jasani
>            Priority: Minor
>             Fix For: 1.5.1
[jira] [Work started] (HBASE-23213) Backport HBASE-22460 to branch-1
[ https://issues.apache.org/jira/browse/HBASE-23213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on HBASE-23213 started by Viraj Jasani.
--------------------------------------------

> Backport HBASE-22460 to branch-1
> --------------------------------
>
>                 Key: HBASE-23213
>                 URL: https://issues.apache.org/jira/browse/HBASE-23213
>             Project: HBase
>          Issue Type: Task
>    Affects Versions: 1.6.0
>            Reporter: Viraj Jasani
>            Assignee: Viraj Jasani
>            Priority: Minor
[jira] [Assigned] (HBASE-23216) Add 2.2.2 to download page
[ https://issues.apache.org/jira/browse/HBASE-23216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Duo Zhang reassigned HBASE-23216:
---------------------------------
    Assignee: Duo Zhang

> Add 2.2.2 to download page
> --------------------------
>
>                 Key: HBASE-23216
>                 URL: https://issues.apache.org/jira/browse/HBASE-23216
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Duo Zhang
>            Assignee: Duo Zhang
>            Priority: Major
[jira] [Resolved] (HBASE-23216) Add 2.2.2 to download page
[ https://issues.apache.org/jira/browse/HBASE-23216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Duo Zhang resolved HBASE-23216.
-------------------------------
    Fix Version/s: 3.0.0
     Hadoop Flags: Reviewed
       Resolution: Fixed

Pushed to master. Thanks [~zghao] for reviewing.

> Add 2.2.2 to download page
> --------------------------
>
>                 Key: HBASE-23216
>                 URL: https://issues.apache.org/jira/browse/HBASE-23216
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Duo Zhang
>            Assignee: Duo Zhang
>            Priority: Major
>             Fix For: 3.0.0
[GitHub] [hbase] Apache9 merged pull request #758: HBASE-23216 Add 2.2.2 to download page
Apache9 merged pull request #758: HBASE-23216 Add 2.2.2 to download page
URL: https://github.com/apache/hbase/pull/758
[jira] [Updated] (HBASE-23193) ConnectionImplementation.isTableAvailable can not deal with meta table on branch-2.x
[ https://issues.apache.org/jira/browse/HBASE-23193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Duo Zhang updated HBASE-23193:
------------------------------
    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

Seems to have worked. Resolving.

> ConnectionImplementation.isTableAvailable can not deal with meta table on branch-2.x
> ------------------------------------------------------------------------------------
>
>                 Key: HBASE-23193
>                 URL: https://issues.apache.org/jira/browse/HBASE-23193
>             Project: HBase
>          Issue Type: Bug
>          Components: rsgroup, test
>            Reporter: Duo Zhang
>            Assignee: Duo Zhang
>            Priority: Major
>             Fix For: 2.3.0, 2.1.8, 2.2.3
>
>         Attachments: HBASE-23193-branch-2.patch
>
> TestRSGroupKillRS is broken by HBASE-22767: on master the client library has
> been reimplemented so Admin.isTableAvailable can be used to test the meta
> table, but on branch-2 and branch-2.2 we will get this:
> {noformat}
> java.lang.RuntimeException: java.io.IOException: This method can't be used to locate meta regions; use MetaTableLocator instead
> 	at org.apache.hadoop.hbase.Waiter.waitFor(Waiter.java:219)
> 	at org.apache.hadoop.hbase.Waiter.waitFor(Waiter.java:143)
> 	at org.apache.hadoop.hbase.HBaseCommonTestingUtility.waitFor(HBaseCommonTestingUtility.java:242)
> 	at org.apache.hadoop.hbase.HBaseTestingUtility.waitTableAvailable(HBaseTestingUtility.java:3268)
> 	at org.apache.hadoop.hbase.rsgroup.TestRSGroupsKillRS.testLowerMetaGroupVersion(TestRSGroupsKillRS.java:245)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:498)
> 	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
> 	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> 	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
> 	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> 	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> 	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> 	at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
> 	at org.junit.rules.RunRules.evaluate(RunRules.java:20)
> 	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
> 	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
> 	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
> 	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
> 	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
> 	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
> 	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
> 	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
> 	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> 	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> 	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
> 	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 	at java.lang.Thread.run(Thread.java:748)
> Caused by: java.io.IOException: This method can't be used to locate meta regions; use MetaTableLocator instead
> 	at org.apache.hadoop.hbase.MetaTableAccessor.getTableRegionsAndLocations(MetaTableAccessor.java:615)
> 	at org.apache.hadoop.hbase.client.ConnectionImplementation.isTableAvailable(ConnectionImplementation.java:643)
> 	at org.apache.hadoop.hbase.client.HBaseAdmin.isTableAvailable(HBaseAdmin.java:971)
> 	at org.apache.hadoop.hbase.HBaseTestingUtility$9.evaluate(HBaseTestingUtility.java:4269)
> 	at org.apache.hadoop.hbase.Waiter.waitFor(Waiter.java:191)
> 	... 30 more
> {noformat}
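The root cause above is a dispatch problem: on branch-2 the generic catalog lookup refuses to answer for hbase:meta (it cannot locate meta by scanning meta itself), so an availability check must route meta to its dedicated locator. A minimal sketch of that routing, with invented interface names (this is not the actual branch-2 patch):

```java
// Invented sketch of the routing idea: hbase:meta cannot be located by
// scanning meta itself, so availability checks for it must ask a dedicated
// locator (in HBase, MetaTableLocator backed by ZooKeeper) instead of the
// catalog-scanning path that throws the IOException above.
public class TableAvailabilityCheck {
    public interface MetaLocator { boolean metaRegionsAssigned(); }
    public interface CatalogScanner { boolean allRegionsOnline(String table); }

    private final MetaLocator metaLocator;
    private final CatalogScanner catalog;

    public TableAvailabilityCheck(MetaLocator metaLocator, CatalogScanner catalog) {
        this.metaLocator = metaLocator;
        this.catalog = catalog;
    }

    public boolean isTableAvailable(String table) {
        if ("hbase:meta".equals(table)) {
            // Asking the catalog about itself would throw; use the locator.
            return metaLocator.metaRegionsAssigned();
        }
        return catalog.allRegionsOnline(table);
    }
}
```

On master the reimplemented client already dispatches this way internally, which is why Admin.isTableAvailable works there for meta.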
[jira] [Updated] (HBASE-23200) incorrect description in SortedCompactionPolicy.getNextMajorCompactTime
[ https://issues.apache.org/jira/browse/HBASE-23200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Duo Zhang updated HBASE-23200:
------------------------------
    Fix Version/s: (was: 2.2.2)
                   2.2.3

> incorrect description in SortedCompactionPolicy.getNextMajorCompactTime
> -----------------------------------------------------------------------
>
>                 Key: HBASE-23200
>                 URL: https://issues.apache.org/jira/browse/HBASE-23200
>             Project: HBase
>          Issue Type: Bug
>          Components: Compaction
>    Affects Versions: master
>            Reporter: jackylau
>            Assignee: jackylau
>            Priority: Major
>             Fix For: 2.2.3
>
> {code}
> // default = 24hrs
> long ret = comConf.getMajorCompactionPeriod();
> {code}
> but the default value is 7 days in CompactionConfiguration.java
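One way to keep such a comment from drifting out of sync with the code (a general pattern, with invented names — not the actual HBASE-23200 patch) is to document the default through a single named constant, so the javadoc and the code cannot disagree:

```java
// Generic illustration (invented names): derive the documented default from
// one constant, avoiding the "// default = 24hrs" vs. 7-days mismatch that
// HBASE-23200 reports in SortedCompactionPolicy.getNextMajorCompactTime.
public class CompactionDefaults {
    /** Default major compaction period: 7 days, i.e. {@value} ms. */
    public static final long DEFAULT_MAJOR_COMPACTION_PERIOD_MS =
            7L * 24 * 60 * 60 * 1000;

    /** Returns the configured period, falling back to the 7-day default. */
    public static long majorCompactionPeriod(long configuredMs) {
        return configuredMs > 0 ? configuredMs : DEFAULT_MAJOR_COMPACTION_PERIOD_MS;
    }
}
```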
[jira] [Commented] (HBASE-23181) Blocked WAL archive: "LogRoller: Failed to schedule flush of XXXX, because it is not online on us"
[ https://issues.apache.org/jira/browse/HBASE-23181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16960479#comment-16960479 ]

Hudson commented on HBASE-23181:
--------------------------------

Results for branch branch-2
	[build #2335 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/2335/]: (x) *{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/2335//General_Nightly_Build_Report/]

(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/2335//JDK8_Nightly_Build_Report_(Hadoop2)/]

(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/2335//JDK8_Nightly_Build_Report_(Hadoop3)/]

(/) {color:green}+1 source release artifact{color}
-- See build output for details.

(/) {color:green}+1 client integration test{color}

> Blocked WAL archive: "LogRoller: Failed to schedule flush of XXXX, because it is not online on us"
> --------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-23181
>                 URL: https://issues.apache.org/jira/browse/HBASE-23181
>             Fix For: 3.0.0, 2.3.0, 2.1.8, 2.2.3
[GitHub] [hbase] Apache-HBase commented on issue #761: HBASE-23213 : Reopen regions with very high Store Ref Counts(backport…
Apache-HBase commented on issue #761: HBASE-23213 : Reopen regions with very high Store Ref Counts(backport…
URL: https://github.com/apache/hbase/pull/761#issuecomment-546652651

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|:--------|:--------|
| :blue_heart: | reexec | 1m 24s | Docker mode activated. |
||| _ Prechecks _ |
| :green_heart: | dupname | 0m 1s | No case conflicting files found. |
| :blue_heart: | prototool | 0m 0s | prototool was not available. |
| :green_heart: | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. |
| :green_heart: | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
||| _ branch-1 Compile Tests _ |
| :blue_heart: | mvndep | 1m 21s | Maven dependency ordering for branch |
| :green_heart: | mvninstall | 7m 39s | branch-1 passed |
| :green_heart: | compile | 1m 46s | branch-1 passed with JDK v1.8.0_232 |
| :green_heart: | compile | 1m 54s | branch-1 passed with JDK v1.7.0_242 |
| :green_heart: | checkstyle | 11m 40s | branch-1 passed |
| :blue_heart: | refguide | 3m 37s | branch has no errors when building the reference guide. See footer for rendered docs, which you should manually inspect. |
| :green_heart: | shadedjars | 2m 54s | branch has no errors when building our shaded downstream artifacts. |
| :green_heart: | javadoc | 3m 56s | branch-1 passed with JDK v1.8.0_232 |
| :green_heart: | javadoc | 6m 13s | branch-1 passed with JDK v1.7.0_242 |
| :blue_heart: | spotbugs | 2m 40s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| :green_heart: | findbugs | 19m 24s | branch-1 passed |
||| _ Patch Compile Tests _ |
| :blue_heart: | mvndep | 0m 17s | Maven dependency ordering for patch |
| :green_heart: | mvninstall | 2m 5s | the patch passed |
| :green_heart: | compile | 1m 46s | the patch passed with JDK v1.8.0_232 |
| :green_heart: | cc | 1m 46s | the patch passed |
| :green_heart: | javac | 1m 46s | the patch passed |
| :green_heart: | compile | 1m 50s | the patch passed with JDK v1.7.0_242 |
| :green_heart: | cc | 1m 50s | the patch passed |
| :green_heart: | javac | 1m 50s | the patch passed |
| :green_heart: | checkstyle | 12m 36s | the patch passed |
| :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. |
| :broken_heart: | xml | 0m 0s | The patch has 1 ill-formed XML file(s). |
| :blue_heart: | refguide | 3m 53s | patch has no errors when building the reference guide. See footer for rendered docs, which you should manually inspect. |
| :green_heart: | shadedjars | 3m 21s | patch has no errors when building our shaded downstream artifacts. |
| :green_heart: | hadoopcheck | 5m 57s | Patch does not cause any errors with Hadoop 2.8.5 2.9.2. |
| :green_heart: | hbaseprotoc | 4m 58s | the patch passed |
| :green_heart: | javadoc | 4m 14s | the patch passed with JDK v1.8.0_232 |
| :green_heart: | javadoc | 6m 54s | the patch passed with JDK v1.7.0_242 |
| :green_heart: | findbugs | 23m 11s | the patch passed |
||| _ Other Tests _ |
| :broken_heart: | unit | 224m 34s | root in the patch failed. |
| :green_heart: | asflicense | 3m 0s | The patch does not generate ASF License warnings. |
| | | 366m 55s | |

| Reason | Tests |
|-------:|:------|
| XML | Parsing Error(s): |
| | hbase-common/src/main/resources/hbase-default.xml |
| Failed junit tests | hadoop.hbase.master.normalizer.TestSimpleRegionNormalizerOnCluster |
| | hadoop.hbase.client.TestAdmin1 |
| | hadoop.hbase.client.TestReplicaWithCluster |
| | hadoop.hbase.client.replication.TestReplicationAdminWithTwoDifferentZKClusters |
| | hadoop.hbase.master.TestMasterBalanceThrottling |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | Client=19.03.4 Server=19.03.4 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-761/2/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hbase/pull/761 |
| Optional Tests | dupname asflicense javac javadoc unit spotbugs findbugs shadedjars hadoopcheck hbaseanti checkstyle compile refguide xml cc hbaseprotoc prototool |
| uname | Linux ab9410c5fcf9 4.15.0-65-generic #74-Ubuntu SMP Tue Sep 17 17:06:04 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/HBase-PreCommit-GitHub-PR_PR-761/out/precommit/personality/provided.sh |
| git revision | branch-1 / db2ce23 |
| Default Java
[jira] [Commented] (HBASE-23181) Blocked WAL archive: "LogRoller: Failed to schedule flush of XXXX, because it is not online on us"
[ https://issues.apache.org/jira/browse/HBASE-23181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16960487#comment-16960487 ]

Hudson commented on HBASE-23181:
--------------------------------

Results for branch branch-2.2
	[build #674 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.2/674/]: (x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.2/674//General_Nightly_Build_Report/]

(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.2/674//JDK8_Nightly_Build_Report_(Hadoop2)/]

(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.2/674//JDK8_Nightly_Build_Report_(Hadoop3)/]

(/) {color:green}+1 source release artifact{color}
-- See build output for details.

(/) {color:green}+1 client integration test{color}

> Blocked WAL archive: "LogRoller: Failed to schedule flush of XXXX, because it is not online on us"
> --------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-23181
>                 URL: https://issues.apache.org/jira/browse/HBASE-23181
>             Fix For: 3.0.0, 2.3.0, 2.1.8, 2.2.3
[jira] [Commented] (HBASE-23181) Blocked WAL archive: "LogRoller: Failed to schedule flush of XXXX, because it is not online on us"
[ https://issues.apache.org/jira/browse/HBASE-23181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16960504#comment-16960504 ]

Hudson commented on HBASE-23181:
--------------------------------

Results for branch branch-2.1
	[build #1691 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/1691/]: (/) *{color:green}+1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/1691//General_Nightly_Build_Report/]

(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/1691//JDK8_Nightly_Build_Report_(Hadoop2)/]

(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/1691//JDK8_Nightly_Build_Report_(Hadoop3)/]

(/) {color:green}+1 source release artifact{color}
-- See build output for details.

(/) {color:green}+1 client integration test{color}

> Blocked WAL archive: "LogRoller: Failed to schedule flush of XXXX, because it is not online on us"
> --------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-23181
>                 URL: https://issues.apache.org/jira/browse/HBASE-23181
>             Fix For: 3.0.0, 2.3.0, 2.1.8, 2.2.3