[jira] [Commented] (HBASE-14247) Separate the old WALs into different regionserver directories
[ https://issues.apache.org/jira/browse/HBASE-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16207856#comment-16207856 ] Hudson commented on HBASE-14247: FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #3904 (See [https://builds.apache.org/job/HBase-Trunk_matrix/3904/]) HBASE-14247 Separate the old WALs into different regionserver (zghao: rev 6db1ce1e914f42dfefd7292ca3f2aa7bc60fd319) * (edit) hbase-replication/src/main/java/org/apache/hadoop/hbase/replication/ReplicationQueueInfo.java * (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/replication/regionserver/TestReplicationSourceManagerZkImpl.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/wal/AsyncFSWALProvider.java * (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/replication/TestReplicationBase.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/WALEntryStream.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/cleaner/CleanerChore.java * (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/replication/regionserver/TestWALEntryStream.java * (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/replication/regionserver/TestReplicationSourceManager.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/replication/master/ReplicationLogCleaner.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceWALReader.java * (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/replication/ReplicationSourceDummy.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/RecoveredReplicationSource.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceInterface.java * (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/replication/TestReplicationKillSlaveRS.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/DumpReplicationQueues.java * (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/master/cleaner/TestLogsCleaner.java * (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/replication/TestReplicationKillMasterRS.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/wal/AbstractFSWALProvider.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/wal/FSHLogProvider.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/cleaner/FileCleanerDelegate.java > Separate the old WALs into different regionserver directories > - > > Key: HBASE-14247 > URL: https://issues.apache.org/jira/browse/HBASE-14247 > Project: HBase > Issue Type: Improvement > Components: wal >Reporter: Liu Shaohui >Assignee: Guanghao Zhang >Priority: Critical > Fix For: 3.0.0, 2.0.0-alpha-4 > > Attachments: HBASE-14247-v001.diff, HBASE-14247-v002.diff, > HBASE-14247-v003.diff, HBASE-14247.master.001.patch, > HBASE-14247.master.002.patch, HBASE-14247.master.003.patch, > HBASE-14247.master.004.patch, HBASE-14247.master.005.patch > > > Currently all old WALs of regionservers are achieved into the single > directory of oldWALs. In big clusters, because of long TTL of WAL or disabled > replications, the number of files under oldWALs may reach the > max-directory-items limit of HDFS, which will make the hbase cluster crashed. > {quote} > Caused by: > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemsExceededException): > The directory item limit of /hbase/lgprc-xiaomi/.oldlogs is exceeded: > limit=1048576 items=1048576 > {quote} > A simple solution is to separate the old WALs into different directories > according to the server name of the WAL. > Suggestions are welcomed~ Thanks -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-14247) Separate the old WALs into different regionserver directories
[ https://issues.apache.org/jira/browse/HBASE-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16207742#comment-16207742 ] Hudson commented on HBASE-14247: FAILURE: Integrated in Jenkins build HBase-2.0 #703 (See [https://builds.apache.org/job/HBase-2.0/703/]) HBASE-14247 Separate the old WALs into different regionserver (zghao: rev 7f1cd12e8c557d23a9b93c704433435bda8094d3) * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/cleaner/FileCleanerDelegate.java * (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/replication/regionserver/TestWALEntryStream.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/DumpReplicationQueues.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceWALReader.java * (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/replication/regionserver/TestReplicationSourceManager.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java * (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/master/cleaner/TestLogsCleaner.java * (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/replication/regionserver/TestReplicationSourceManagerZkImpl.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceInterface.java * (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/replication/TestReplicationKillSlaveRS.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/replication/master/ReplicationLogCleaner.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/wal/AsyncFSWALProvider.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/wal/AbstractFSWALProvider.java * (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/replication/TestReplicationKillMasterRS.java * (edit) hbase-replication/src/main/java/org/apache/hadoop/hbase/replication/ReplicationQueueInfo.java * (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/replication/TestReplicationBase.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/wal/FSHLogProvider.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/RecoveredReplicationSource.java * (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/replication/ReplicationSourceDummy.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/cleaner/CleanerChore.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/WALEntryStream.java > Separate the old WALs into different regionserver directories > - > > Key: HBASE-14247 > URL: https://issues.apache.org/jira/browse/HBASE-14247 > Project: HBase > Issue Type: Improvement > Components: wal >Reporter: Liu Shaohui >Assignee: Guanghao Zhang >Priority: Critical > Fix For: 3.0.0, 2.0.0-alpha-4 > > Attachments: HBASE-14247-v001.diff, HBASE-14247-v002.diff, > HBASE-14247-v003.diff, HBASE-14247.master.001.patch, > HBASE-14247.master.002.patch, HBASE-14247.master.003.patch, > HBASE-14247.master.004.patch, HBASE-14247.master.005.patch > > > Currently all old WALs of regionservers are achieved into the single > directory of oldWALs. In big clusters, because of long TTL of WAL or disabled > replications, the number of files under oldWALs may reach the > max-directory-items limit of HDFS, which will make the hbase cluster crashed. > {quote} > Caused by: > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemsExceededException): > The directory item limit of /hbase/lgprc-xiaomi/.oldlogs is exceeded: > limit=1048576 items=1048576 > {quote} > A simple solution is to separate the old WALs into different directories > according to the server name of the WAL. > Suggestions are welcomed~ Thanks -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-14247) Separate the old WALs into different regionserver directories
[ https://issues.apache.org/jira/browse/HBASE-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16205424#comment-16205424 ] Dave Latham commented on HBASE-14247: - No, thank you. > Separate the old WALs into different regionserver directories > - > > Key: HBASE-14247 > URL: https://issues.apache.org/jira/browse/HBASE-14247 > Project: HBase > Issue Type: Improvement > Components: wal >Reporter: Liu Shaohui >Assignee: Guanghao Zhang >Priority: Critical > Fix For: 2.0.0 > > Attachments: HBASE-14247-v001.diff, HBASE-14247-v002.diff, > HBASE-14247-v003.diff, HBASE-14247.master.001.patch, > HBASE-14247.master.002.patch, HBASE-14247.master.003.patch, > HBASE-14247.master.004.patch, HBASE-14247.master.005.patch > > > Currently all old WALs of regionservers are achieved into the single > directory of oldWALs. In big clusters, because of long TTL of WAL or disabled > replications, the number of files under oldWALs may reach the > max-directory-items limit of HDFS, which will make the hbase cluster crashed. > {quote} > Caused by: > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemsExceededException): > The directory item limit of /hbase/lgprc-xiaomi/.oldlogs is exceeded: > limit=1048576 items=1048576 > {quote} > A simple solution is to separate the old WALs into different directories > according to the server name of the WAL. > Suggestions are welcomed~ Thanks -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-14247) Separate the old WALs into different regionserver directories
[ https://issues.apache.org/jira/browse/HBASE-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16205383#comment-16205383 ] Guanghao Zhang commented on HBASE-14247: Thanks [~Apache9] for reviewing. [~davelatham] Do you have more concerns? If not, I will commit it tomorrow. Thanks. > Separate the old WALs into different regionserver directories > - > > Key: HBASE-14247 > URL: https://issues.apache.org/jira/browse/HBASE-14247 > Project: HBase > Issue Type: Improvement > Components: wal >Reporter: Liu Shaohui >Assignee: Guanghao Zhang >Priority: Critical > Fix For: 2.0.0 > > Attachments: HBASE-14247-v001.diff, HBASE-14247-v002.diff, > HBASE-14247-v003.diff, HBASE-14247.master.001.patch, > HBASE-14247.master.002.patch, HBASE-14247.master.003.patch, > HBASE-14247.master.004.patch, HBASE-14247.master.005.patch > > > Currently all old WALs of regionservers are achieved into the single > directory of oldWALs. In big clusters, because of long TTL of WAL or disabled > replications, the number of files under oldWALs may reach the > max-directory-items limit of HDFS, which will make the hbase cluster crashed. > {quote} > Caused by: > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemsExceededException): > The directory item limit of /hbase/lgprc-xiaomi/.oldlogs is exceeded: > limit=1048576 items=1048576 > {quote} > A simple solution is to separate the old WALs into different directories > according to the server name of the WAL. > Suggestions are welcomed~ Thanks -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-14247) Separate the old WALs into different regionserver directories
[ https://issues.apache.org/jira/browse/HBASE-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16203606#comment-16203606 ] Hadoop QA commented on HBASE-14247: --- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 8 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 19s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 43s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 50s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 19s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 27s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 49s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 4s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 44s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 20s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 56s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 21s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 28s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 5s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 37m 56s{color} | {color:green} Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha4. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 57s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 10s{color} | {color:green} hbase-replication in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 91m 40s{color} | {color:green} hbase-server in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 28s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}151m 18s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:5d60123 | | JIRA Issue | HBASE-14247 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12892041/HBASE-14247.master.005.patch | | Optional Tests | asflicense shadedjars javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile | | uname | Linux 06b3448157a5 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 12:18:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh | | git revision | master / 883c358 | | Default Java | 1.8.0_144 |
[jira] [Commented] (HBASE-14247) Separate the old WALs into different regionserver directories
[ https://issues.apache.org/jira/browse/HBASE-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16203505#comment-16203505 ] Duo Zhang commented on HBASE-14247: --- +1. > Separate the old WALs into different regionserver directories > - > > Key: HBASE-14247 > URL: https://issues.apache.org/jira/browse/HBASE-14247 > Project: HBase > Issue Type: Improvement > Components: wal >Reporter: Liu Shaohui >Assignee: Guanghao Zhang >Priority: Critical > Fix For: 2.0.0 > > Attachments: HBASE-14247-v001.diff, HBASE-14247-v002.diff, > HBASE-14247-v003.diff, HBASE-14247.master.001.patch, > HBASE-14247.master.002.patch, HBASE-14247.master.003.patch, > HBASE-14247.master.004.patch, HBASE-14247.master.005.patch > > > Currently all old WALs of regionservers are achieved into the single > directory of oldWALs. In big clusters, because of long TTL of WAL or disabled > replications, the number of files under oldWALs may reach the > max-directory-items limit of HDFS, which will make the hbase cluster crashed. > {quote} > Caused by: > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemsExceededException): > The directory item limit of /hbase/lgprc-xiaomi/.oldlogs is exceeded: > limit=1048576 items=1048576 > {quote} > A simple solution is to separate the old WALs into different directories > according to the server name of the WAL. > Suggestions are welcomed~ Thanks -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-14247) Separate the old WALs into different regionserver directories
[ https://issues.apache.org/jira/browse/HBASE-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16185829#comment-16185829 ] Guanghao Zhang commented on HBASE-14247: Pint [~davelatham] [~stack] [~Apache9] [~anoop.hbase] for reviewing. Thanks. > Separate the old WALs into different regionserver directories > - > > Key: HBASE-14247 > URL: https://issues.apache.org/jira/browse/HBASE-14247 > Project: HBase > Issue Type: Improvement > Components: wal >Reporter: Liu Shaohui >Assignee: Guanghao Zhang >Priority: Critical > Fix For: 2.0.0 > > Attachments: HBASE-14247.master.001.patch, > HBASE-14247.master.002.patch, HBASE-14247.master.003.patch, > HBASE-14247.master.004.patch, HBASE-14247-v001.diff, HBASE-14247-v002.diff, > HBASE-14247-v003.diff > > > Currently all old WALs of regionservers are achieved into the single > directory of oldWALs. In big clusters, because of long TTL of WAL or disabled > replications, the number of files under oldWALs may reach the > max-directory-items limit of HDFS, which will make the hbase cluster crashed. > {quote} > Caused by: > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemsExceededException): > The directory item limit of /hbase/lgprc-xiaomi/.oldlogs is exceeded: > limit=1048576 items=1048576 > {quote} > A simple solution is to separate the old WALs into different directories > according to the server name of the WAL. > Suggestions are welcomed~ Thanks -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-14247) Separate the old WALs into different regionserver directories
[ https://issues.apache.org/jira/browse/HBASE-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16182105#comment-16182105 ] Hadoop QA commented on HBASE-14247: --- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 24s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 8 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 37s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 53s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 21s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 15s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 42s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 7m 4s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 29s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 19s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 24s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 47s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 21s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 21s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 13s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 5m 3s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 46m 37s{color} | {color:green} Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha4. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 1s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 28s{color} | {color:green} hbase-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 11s{color} | {color:green} hbase-replication in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green}117m 26s{color} | {color:green} hbase-server in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 46s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}199m 36s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:5d60123 | | JIRA Issue | HBASE-14247 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12889213/HBASE-14247.master.004.patch | | Optional Tests | asflicense shadedjars javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile | | uname | Linux e314c35f9f01 3.13.0-116-generic #163-Ubuntu SMP Fri Mar 31 14:13:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality |
[jira] [Commented] (HBASE-14247) Separate the old WALs into different regionserver directories
[ https://issues.apache.org/jira/browse/HBASE-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16181154#comment-16181154 ] Dave Latham commented on HBASE-14247: - Thanks, [~zghaobac]. It looks like the latest patch would address the concern. There's an implicit dependency between the clocks assigning the file modification time and checking time checking the ZK queues. But the TimeToLiveCleaner would seem to provide plenty of a safety margin for unsynced clocks. I didn't review the rest, aside from happening to notice a misspelling of SEPERATE vs SEPARATE (just in case you care). > Separate the old WALs into different regionserver directories > - > > Key: HBASE-14247 > URL: https://issues.apache.org/jira/browse/HBASE-14247 > Project: HBase > Issue Type: Improvement > Components: wal >Reporter: Liu Shaohui >Assignee: Guanghao Zhang >Priority: Critical > Fix For: 2.0.0 > > Attachments: HBASE-14247.master.001.patch, > HBASE-14247.master.002.patch, HBASE-14247.master.003.patch, > HBASE-14247-v001.diff, HBASE-14247-v002.diff, HBASE-14247-v003.diff > > > Currently all old WALs of regionservers are achieved into the single > directory of oldWALs. In big clusters, because of long TTL of WAL or disabled > replications, the number of files under oldWALs may reach the > max-directory-items limit of HDFS, which will make the hbase cluster crashed. > {quote} > Caused by: > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemsExceededException): > The directory item limit of /hbase/lgprc-xiaomi/.oldlogs is exceeded: > limit=1048576 items=1048576 > {quote} > A simple solution is to separate the old WALs into different directories > according to the server name of the WAL. > Suggestions are welcomed~ Thanks -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-14247) Separate the old WALs into different regionserver directories
[ https://issues.apache.org/jira/browse/HBASE-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16180695#comment-16180695 ] Hadoop QA commented on HBASE-14247: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 20s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 7 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 17s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 2s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 14s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 8s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 39s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 5m 56s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 35s{color} | {color:red} hbase-server in master has 1 extant Findbugs warnings. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 56s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 18s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 14s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 14s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 14s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 9s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 41s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 14s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 40m 9s{color} | {color:green} Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha4. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 7s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 59s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 25s{color} | {color:green} hbase-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 10s{color} | {color:green} hbase-replication in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 84m 32s{color} | {color:red} hbase-server in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 1m 9s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}155m 38s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hbase.master.cleaner.TestLogsCleaner | | Timed out junit tests | org.apache.hadoop.hbase.master.procedure.TestDisableTableProcedure | | | org.apache.hadoop.hbase.master.procedure.TestEnableTableProcedure | | | org.apache.hadoop.hbase.master.procedure.TestServerCrashProcedure | | | org.apache.hadoop.hbase.client.TestSnapshotCloneIndependence | | | org.apache.hadoop.hbase.master.TestGetLastFlushedSequenceId | | |
[jira] [Commented] (HBASE-14247) Separate the old WALs into different regionserver directories
[ https://issues.apache.org/jira/browse/HBASE-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16177408#comment-16177408 ] Dave Latham commented on HBASE-14247: - That's exactly right. Checking ZK once for the batch of all sub region server directories is exactly what I've been suggesting. HBASE-9208 happened 4 years ago. I don't have records of how many logs were necessary before it became a problem. We did have hundreds of region servers. > Separate the old WALs into different regionserver directories > - > > Key: HBASE-14247 > URL: https://issues.apache.org/jira/browse/HBASE-14247 > Project: HBase > Issue Type: Improvement > Components: wal >Reporter: Liu Shaohui >Assignee: Guanghao Zhang >Priority: Critical > Fix For: 2.0.0 > > Attachments: HBASE-14247.master.001.patch, > HBASE-14247.master.002.patch, HBASE-14247-v001.diff, HBASE-14247-v002.diff, > HBASE-14247-v003.diff > > > Currently all old WALs of regionservers are achieved into the single > directory of oldWALs. In big clusters, because of long TTL of WAL or disabled > replications, the number of files under oldWALs may reach the > max-directory-items limit of HDFS, which will make the hbase cluster crashed. > {quote} > Caused by: > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemsExceededException): > The directory item limit of /hbase/lgprc-xiaomi/.oldlogs is exceeded: > limit=1048576 items=1048576 > {quote} > A simple solution is to separate the old WALs into different directories > according to the server name of the WAL. > Suggestions are welcomed~ Thanks -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-14247) Separate the old WALs into different regionserver directories
[ https://issues.apache.org/jira/browse/HBASE-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16177395#comment-16177395 ] Guanghao Zhang commented on HBASE-14247: [~davelatham] I checked HBASE-9208. If I am not wrong, the key point is ReplicationLogCleaner will check zk for every old log. So before HBASE-9208, the cleaner will check zk O(old log number) per chore. Then HBASE-9208 changed this to O(1). And this issue will increased it to O(region server number), you thought O(region server number) will have performance problem, right? If this is a problem, can we share the replication queue result (check zk once) for all sub region server directory? Meanwhile, can you give some data when you found the problem in HBASE-9208, like how many old logs and how many region servers? > Separate the old WALs into different regionserver directories > - > > Key: HBASE-14247 > URL: https://issues.apache.org/jira/browse/HBASE-14247 > Project: HBase > Issue Type: Improvement > Components: wal >Reporter: Liu Shaohui >Assignee: Guanghao Zhang >Priority: Critical > Fix For: 2.0.0 > > Attachments: HBASE-14247.master.001.patch, > HBASE-14247.master.002.patch, HBASE-14247-v001.diff, HBASE-14247-v002.diff, > HBASE-14247-v003.diff > > > Currently all old WALs of regionservers are achieved into the single > directory of oldWALs. In big clusters, because of long TTL of WAL or disabled > replications, the number of files under oldWALs may reach the > max-directory-items limit of HDFS, which will make the hbase cluster crashed. > {quote} > Caused by: > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemsExceededException): > The directory item limit of /hbase/lgprc-xiaomi/.oldlogs is exceeded: > limit=1048576 items=1048576 > {quote} > A simple solution is to separate the old WALs into different directories > according to the server name of the WAL. > Suggestions are welcomed~ Thanks -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-14247) Separate the old WALs into different regionserver directories
[ https://issues.apache.org/jira/browse/HBASE-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16177017#comment-16177017 ] Dave Latham commented on HBASE-14247: - Yes, that would address my concern, at least for our own clusters. I don't think it's a great way to address the issue, though. People with large clusters that use it would be the ones most likely to be hit by HBASE-9208. It also doesn't seem great for HBase to need to support two different directory layouts for WAL files. I still think it would be better to update the log cleaner to handle multiple directories well, and then turn it on for everyone. > Separate the old WALs into different regionserver directories > - > > Key: HBASE-14247 > URL: https://issues.apache.org/jira/browse/HBASE-14247 > Project: HBase > Issue Type: Improvement > Components: wal >Reporter: Liu Shaohui >Assignee: Guanghao Zhang >Priority: Critical > Fix For: 2.0.0 > > Attachments: HBASE-14247.master.001.patch, > HBASE-14247.master.002.patch, HBASE-14247-v001.diff, HBASE-14247-v002.diff, > HBASE-14247-v003.diff > > > Currently all old WALs of regionservers are achieved into the single > directory of oldWALs. In big clusters, because of long TTL of WAL or disabled > replications, the number of files under oldWALs may reach the > max-directory-items limit of HDFS, which will make the hbase cluster crashed. > {quote} > Caused by: > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemsExceededException): > The directory item limit of /hbase/lgprc-xiaomi/.oldlogs is exceeded: > limit=1048576 items=1048576 > {quote} > A simple solution is to separate the old WALs into different directories > according to the server name of the WAL. > Suggestions are welcomed~ Thanks -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-14247) Separate the old WALs into different regionserver directories
[ https://issues.apache.org/jira/browse/HBASE-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16176830#comment-16176830 ] Anoop Sam John commented on HBASE-14247: In latest patch , there is a config to turn this behave on or off and default is off , means the existing way. This will make ur concern addressed? > Separate the old WALs into different regionserver directories > - > > Key: HBASE-14247 > URL: https://issues.apache.org/jira/browse/HBASE-14247 > Project: HBase > Issue Type: Improvement > Components: wal >Reporter: Liu Shaohui >Assignee: Guanghao Zhang >Priority: Critical > Fix For: 2.0.0 > > Attachments: HBASE-14247.master.001.patch, > HBASE-14247.master.002.patch, HBASE-14247-v001.diff, HBASE-14247-v002.diff, > HBASE-14247-v003.diff > > > Currently all old WALs of regionservers are achieved into the single > directory of oldWALs. In big clusters, because of long TTL of WAL or disabled > replications, the number of files under oldWALs may reach the > max-directory-items limit of HDFS, which will make the hbase cluster crashed. > {quote} > Caused by: > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemsExceededException): > The directory item limit of /hbase/lgprc-xiaomi/.oldlogs is exceeded: > limit=1048576 items=1048576 > {quote} > A simple solution is to separate the old WALs into different directories > according to the server name of the WAL. > Suggestions are welcomed~ Thanks -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-14247) Separate the old WALs into different regionserver directories
[ https://issues.apache.org/jira/browse/HBASE-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16176577#comment-16176577 ] Dave Latham commented on HBASE-14247: - I still believe that committing this would have a significant risk of introducing a regression of HBASE-9208. I would be (non-binding) -1 on doing so until first improving the log cleaner or providing convincing evidence that it would not happen. > Separate the old WALs into different regionserver directories > - > > Key: HBASE-14247 > URL: https://issues.apache.org/jira/browse/HBASE-14247 > Project: HBase > Issue Type: Improvement > Components: wal >Reporter: Liu Shaohui >Assignee: Guanghao Zhang >Priority: Critical > Fix For: 2.0.0 > > Attachments: HBASE-14247.master.001.patch, > HBASE-14247.master.002.patch, HBASE-14247-v001.diff, HBASE-14247-v002.diff, > HBASE-14247-v003.diff > > > Currently all old WALs of regionservers are achieved into the single > directory of oldWALs. In big clusters, because of long TTL of WAL or disabled > replications, the number of files under oldWALs may reach the > max-directory-items limit of HDFS, which will make the hbase cluster crashed. > {quote} > Caused by: > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemsExceededException): > The directory item limit of /hbase/lgprc-xiaomi/.oldlogs is exceeded: > limit=1048576 items=1048576 > {quote} > A simple solution is to separate the old WALs into different directories > according to the server name of the WAL. > Suggestions are welcomed~ Thanks -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-14247) Separate the old WALs into different regionserver directories
[ https://issues.apache.org/jira/browse/HBASE-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16175824#comment-16175824 ] Guanghao Zhang commented on HBASE-14247: [~stack] ping for reviewing. Do you plan merge this improvement to hbase 2.0 release? Thanks. > Separate the old WALs into different regionserver directories > - > > Key: HBASE-14247 > URL: https://issues.apache.org/jira/browse/HBASE-14247 > Project: HBase > Issue Type: Improvement > Components: wal >Reporter: Liu Shaohui >Assignee: Guanghao Zhang >Priority: Critical > Fix For: 2.0.0 > > Attachments: HBASE-14247.master.001.patch, > HBASE-14247.master.002.patch, HBASE-14247-v001.diff, HBASE-14247-v002.diff, > HBASE-14247-v003.diff > > > Currently all old WALs of regionservers are achieved into the single > directory of oldWALs. In big clusters, because of long TTL of WAL or disabled > replications, the number of files under oldWALs may reach the > max-directory-items limit of HDFS, which will make the hbase cluster crashed. > {quote} > Caused by: > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemsExceededException): > The directory item limit of /hbase/lgprc-xiaomi/.oldlogs is exceeded: > limit=1048576 items=1048576 > {quote} > A simple solution is to separate the old WALs into different directories > according to the server name of the WAL. > Suggestions are welcomed~ Thanks -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-14247) Separate the old WALs into different regionserver directories
[ https://issues.apache.org/jira/browse/HBASE-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16171136#comment-16171136 ] Hadoop QA commented on HBASE-14247: --- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 1s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 7 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 33s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 30s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 57s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 53s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 33s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 5m 19s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 9s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 52s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 19s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 13s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 13s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 0s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 3m 54s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 37m 3s{color} | {color:green} Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha4. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 26s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 53s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 30s{color} | {color:green} hbase-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 9s{color} | {color:green} hbase-replication in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 90m 58s{color} | {color:green} hbase-server in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 49s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}155m 37s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=1.13.1 Server=1.13.1 Image:yetus/hbase:5d60123 | | JIRA Issue | HBASE-14247 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12887812/HBASE-14247.master.002.patch | | Optional Tests | asflicense shadedjars javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile | | uname | Linux 44eb49bf645b 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality |
[jira] [Commented] (HBASE-14247) Separate the old WALs into different regionserver directories
[ https://issues.apache.org/jira/browse/HBASE-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16170010#comment-16170010 ] Hadoop QA commented on HBASE-14247: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 28s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 8 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 35s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 53s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 2s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 0s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 34s{color} | {color:green} master passed {color} | | {color:red}-1{color} | {color:red} shadedjars {color} | {color:red} 5m 35s{color} | {color:red} branch has 12 errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 22s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 55s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 17s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 12s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 4s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 4s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 0s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 1s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:red}-1{color} | {color:red} shadedjars {color} | {color:red} 4m 24s{color} | {color:red} patch has 12 errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 43m 45s{color} | {color:green} Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha4. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 5s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 26s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 8s{color} | {color:green} hbase-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 20s{color} | {color:green} hbase-replication in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red}121m 14s{color} | {color:red} hbase-server in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 48s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}196m 47s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hbase.replication.TestReplicationKillRS | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.03.0-ce Server=17.03.0-ce Image:yetus/hbase:5d60123 | | JIRA Issue | HBASE-14247 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12887640/HBASE-14247.master.001.patch | | Optional Tests | asflicense shadedjars javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile | | uname | Linux 6335f46c5185 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 12:18:55 UTC 2017 x86_64 x86_64 x86_64
[jira] [Commented] (HBASE-14247) Separate the old WALs into different regionserver directories
[ https://issues.apache.org/jira/browse/HBASE-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16112051#comment-16112051 ] Dave Latham commented on HBASE-14247: - I'd be concerned about first introducing a regression with this issue, and only then addressing it if there's anyone paying enough attention to do so. I think it would be better to first ensure the log cleaner will handle multiple directories in a single batch, then change the log layout to be multiple directories. > Separate the old WALs into different regionserver directories > - > > Key: HBASE-14247 > URL: https://issues.apache.org/jira/browse/HBASE-14247 > Project: HBase > Issue Type: Improvement > Components: wal >Reporter: Liu Shaohui >Assignee: Guanghao Zhang >Priority: Critical > Fix For: 2.0.0 > > Attachments: HBASE-14247-v001.diff, HBASE-14247-v002.diff, > HBASE-14247-v003.diff > > > Currently all old WALs of regionservers are achieved into the single > directory of oldWALs. In big clusters, because of long TTL of WAL or disabled > replications, the number of files under oldWALs may reach the > max-directory-items limit of HDFS, which will make the hbase cluster crashed. > {quote} > Caused by: > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemsExceededException): > The directory item limit of /hbase/lgprc-xiaomi/.oldlogs is exceeded: > limit=1048576 items=1048576 > {quote} > A simple solution is to separate the old WALs into different directories > according to the server name of the WAL. > Suggestions are welcomed~ Thanks -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-14247) Separate the old WALs into different regionserver directories
[ https://issues.apache.org/jira/browse/HBASE-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16112028#comment-16112028 ] Guanghao Zhang commented on HBASE-14247: [~davelatham] Because the max-directory-items limit of HDFS, I thought the first thing is make the old WAL directory scalable. Then if the cleaner has performance problem, we can try to make it fast. Your concern can be addressed after we resolve this issue? Thanks. > Separate the old WALs into different regionserver directories > - > > Key: HBASE-14247 > URL: https://issues.apache.org/jira/browse/HBASE-14247 > Project: HBase > Issue Type: Improvement > Components: wal >Reporter: Liu Shaohui >Assignee: Guanghao Zhang >Priority: Critical > Fix For: 2.0.0 > > Attachments: HBASE-14247-v001.diff, HBASE-14247-v002.diff, > HBASE-14247-v003.diff > > > Currently all old WALs of regionservers are achieved into the single > directory of oldWALs. In big clusters, because of long TTL of WAL or disabled > replications, the number of files under oldWALs may reach the > max-directory-items limit of HDFS, which will make the hbase cluster crashed. > {quote} > Caused by: > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemsExceededException): > The directory item limit of /hbase/lgprc-xiaomi/.oldlogs is exceeded: > limit=1048576 items=1048576 > {quote} > A simple solution is to separate the old WALs into different directories > according to the server name of the WAL. > Suggestions are welcomed~ Thanks -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-14247) Separate the old WALs into different regionserver directories
[ https://issues.apache.org/jira/browse/HBASE-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16111893#comment-16111893 ] Dave Latham commented on HBASE-14247: - I don't think that more threads is a great solution. It still means that, in a cluster using replication, for every region server's log directory, a thread will need to read all the replication queues (size also proportional to the number of region servers) in ZK, instead of just happening once per chore. This gives O(N^2) performance for N region servers, and is not just theoretical but has actually caused problems in the past on large clusters (>1000 nodes). With multiple threads, it likely still won't be fast enough and may end up hammering ZK if the number of threads is ramped up too high to compensate. > Separate the old WALs into different regionserver directories > - > > Key: HBASE-14247 > URL: https://issues.apache.org/jira/browse/HBASE-14247 > Project: HBase > Issue Type: Improvement > Components: wal >Reporter: Liu Shaohui >Assignee: Guanghao Zhang >Priority: Critical > Fix For: 2.0.0 > > Attachments: HBASE-14247-v001.diff, HBASE-14247-v002.diff, > HBASE-14247-v003.diff > > > Currently all old WALs of regionservers are achieved into the single > directory of oldWALs. In big clusters, because of long TTL of WAL or disabled > replications, the number of files under oldWALs may reach the > max-directory-items limit of HDFS, which will make the hbase cluster crashed. > {quote} > Caused by: > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemsExceededException): > The directory item limit of /hbase/lgprc-xiaomi/.oldlogs is exceeded: > limit=1048576 items=1048576 > {quote} > A simple solution is to separate the old WALs into different directories > according to the server name of the WAL. > Suggestions are welcomed~ Thanks -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-14247) Separate the old WALs into different regionserver directories
[ https://issues.apache.org/jira/browse/HBASE-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16111875#comment-16111875 ] Guanghao Zhang commented on HBASE-14247: [~stack] Sorry for late to reply. It need some code rebase work.. For the performance concern, I thought we can test it when patch is ready. Even it has performance problem, it may be addressed by issue like HBASE-18309. Thanks. > Separate the old WALs into different regionserver directories > - > > Key: HBASE-14247 > URL: https://issues.apache.org/jira/browse/HBASE-14247 > Project: HBase > Issue Type: Improvement > Components: wal >Reporter: Liu Shaohui >Assignee: Guanghao Zhang >Priority: Critical > Fix For: 2.0.0 > > Attachments: HBASE-14247-v001.diff, HBASE-14247-v002.diff, > HBASE-14247-v003.diff > > > Currently all old WALs of regionservers are achieved into the single > directory of oldWALs. In big clusters, because of long TTL of WAL or disabled > replications, the number of files under oldWALs may reach the > max-directory-items limit of HDFS, which will make the hbase cluster crashed. > {quote} > Caused by: > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemsExceededException): > The directory item limit of /hbase/lgprc-xiaomi/.oldlogs is exceeded: > limit=1048576 items=1048576 > {quote} > A simple solution is to separate the old WALs into different directories > according to the server name of the WAL. > Suggestions are welcomed~ Thanks -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-14247) Separate the old WALs into different regionserver directories
[ https://issues.apache.org/jira/browse/HBASE-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16111803#comment-16111803 ] stack commented on HBASE-14247: --- [~zghaobac] You have something to post sir? (You have an answer for the [~davelatham] question?) Thanks. > Separate the old WALs into different regionserver directories > - > > Key: HBASE-14247 > URL: https://issues.apache.org/jira/browse/HBASE-14247 > Project: HBase > Issue Type: Improvement > Components: wal >Reporter: Liu Shaohui >Assignee: Guanghao Zhang >Priority: Critical > Fix For: 2.0.0 > > Attachments: HBASE-14247-v001.diff, HBASE-14247-v002.diff, > HBASE-14247-v003.diff > > > Currently all old WALs of regionservers are achieved into the single > directory of oldWALs. In big clusters, because of long TTL of WAL or disabled > replications, the number of files under oldWALs may reach the > max-directory-items limit of HDFS, which will make the hbase cluster crashed. > {quote} > Caused by: > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemsExceededException): > The directory item limit of /hbase/lgprc-xiaomi/.oldlogs is exceeded: > limit=1048576 items=1048576 > {quote} > A simple solution is to separate the old WALs into different directories > according to the server name of the WAL. > Suggestions are welcomed~ Thanks -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HBASE-14247) Separate the old WALs into different regionserver directories
[ https://issues.apache.org/jira/browse/HBASE-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16015878#comment-16015878 ] Dave Latham commented on HBASE-14247: - Does the patch address the log cleaner performance concern? > Separate the old WALs into different regionserver directories > - > > Key: HBASE-14247 > URL: https://issues.apache.org/jira/browse/HBASE-14247 > Project: HBase > Issue Type: Improvement > Components: wal >Reporter: Liu Shaohui >Assignee: Guanghao Zhang >Priority: Critical > Fix For: 2.0.0 > > Attachments: HBASE-14247-v001.diff, HBASE-14247-v002.diff, > HBASE-14247-v003.diff > > > Currently all old WALs of regionservers are achieved into the single > directory of oldWALs. In big clusters, because of long TTL of WAL or disabled > replications, the number of files under oldWALs may reach the > max-directory-items limit of HDFS, which will make the hbase cluster crashed. > {quote} > Caused by: > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemsExceededException): > The directory item limit of /hbase/lgprc-xiaomi/.oldlogs is exceeded: > limit=1048576 items=1048576 > {quote} > A simple solution is to separate the old WALs into different directories > according to the server name of the WAL. > Suggestions are welcomed~ Thanks -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-14247) Separate the old WALs into different regionserver directories
[ https://issues.apache.org/jira/browse/HBASE-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16015036#comment-16015036 ] Guanghao Zhang commented on HBASE-14247: [~aertoria] I already have a patch for master and our 0.98 branch. But it need some code rebase work. I will upload the patch later. Thanks. > Separate the old WALs into different regionserver directories > - > > Key: HBASE-14247 > URL: https://issues.apache.org/jira/browse/HBASE-14247 > Project: HBase > Issue Type: Improvement > Components: wal >Reporter: Liu Shaohui >Assignee: Guanghao Zhang >Priority: Critical > Fix For: 2.0.0 > > Attachments: HBASE-14247-v001.diff, HBASE-14247-v002.diff, > HBASE-14247-v003.diff > > > Currently all old WALs of regionservers are achieved into the single > directory of oldWALs. In big clusters, because of long TTL of WAL or disabled > replications, the number of files under oldWALs may reach the > max-directory-items limit of HDFS, which will make the hbase cluster crashed. > {quote} > Caused by: > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemsExceededException): > The directory item limit of /hbase/lgprc-xiaomi/.oldlogs is exceeded: > limit=1048576 items=1048576 > {quote} > A simple solution is to separate the old WALs into different directories > according to the server name of the WAL. > Suggestions are welcomed~ Thanks -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-14247) Separate the old WALs into different regionserver directories
[ https://issues.apache.org/jira/browse/HBASE-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16014560#comment-16014560 ] Ethan Wang commented on HBASE-14247: [~zghaobac] Are you planning working on this change? > Separate the old WALs into different regionserver directories > - > > Key: HBASE-14247 > URL: https://issues.apache.org/jira/browse/HBASE-14247 > Project: HBase > Issue Type: Improvement > Components: wal >Reporter: Liu Shaohui >Assignee: Liu Shaohui >Priority: Critical > Fix For: 2.0.0 > > Attachments: HBASE-14247-v001.diff, HBASE-14247-v002.diff, > HBASE-14247-v003.diff > > > Currently all old WALs of regionservers are achieved into the single > directory of oldWALs. In big clusters, because of long TTL of WAL or disabled > replications, the number of files under oldWALs may reach the > max-directory-items limit of HDFS, which will make the hbase cluster crashed. > {quote} > Caused by: > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemsExceededException): > The directory item limit of /hbase/lgprc-xiaomi/.oldlogs is exceeded: > limit=1048576 items=1048576 > {quote} > A simple solution is to separate the old WALs into different directories > according to the server name of the WAL. > Suggestions are welcomed~ Thanks -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-14247) Separate the old WALs into different regionserver directories
[ https://issues.apache.org/jira/browse/HBASE-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16011643#comment-16011643 ] Enis Soztutar commented on HBASE-14247: --- Please feel free to go ahead. > Separate the old WALs into different regionserver directories > - > > Key: HBASE-14247 > URL: https://issues.apache.org/jira/browse/HBASE-14247 > Project: HBase > Issue Type: Improvement > Components: wal >Reporter: Liu Shaohui >Assignee: Liu Shaohui >Priority: Critical > Fix For: 2.0.0 > > Attachments: HBASE-14247-v001.diff, HBASE-14247-v002.diff, > HBASE-14247-v003.diff > > > Currently all old WALs of regionservers are achieved into the single > directory of oldWALs. In big clusters, because of long TTL of WAL or disabled > replications, the number of files under oldWALs may reach the > max-directory-items limit of HDFS, which will make the hbase cluster crashed. > {quote} > Caused by: > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemsExceededException): > The directory item limit of /hbase/lgprc-xiaomi/.oldlogs is exceeded: > limit=1048576 items=1048576 > {quote} > A simple solution is to separate the old WALs into different directories > according to the server name of the WAL. > Suggestions are welcomed~ Thanks -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-14247) Separate the old WALs into different regionserver directories
[ https://issues.apache.org/jira/browse/HBASE-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16011613#comment-16011613 ] Guanghao Zhang commented on HBASE-14247: [~enis] Do you work on this issue? I have a patch for our 0.98 branch. If you not work on this, I can help to prepare a patch for master branch. Thanks. > Separate the old WALs into different regionserver directories > - > > Key: HBASE-14247 > URL: https://issues.apache.org/jira/browse/HBASE-14247 > Project: HBase > Issue Type: Improvement > Components: wal >Reporter: Liu Shaohui >Assignee: Liu Shaohui >Priority: Critical > Fix For: 2.0.0 > > Attachments: HBASE-14247-v001.diff, HBASE-14247-v002.diff, > HBASE-14247-v003.diff > > > Currently all old WALs of regionservers are achieved into the single > directory of oldWALs. In big clusters, because of long TTL of WAL or disabled > replications, the number of files under oldWALs may reach the > max-directory-items limit of HDFS, which will make the hbase cluster crashed. > {quote} > Caused by: > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemsExceededException): > The directory item limit of /hbase/lgprc-xiaomi/.oldlogs is exceeded: > limit=1048576 items=1048576 > {quote} > A simple solution is to separate the old WALs into different directories > according to the server name of the WAL. > Suggestions are welcomed~ Thanks -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-14247) Separate the old WALs into different regionserver directories
[ https://issues.apache.org/jira/browse/HBASE-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16011139#comment-16011139 ] Enis Soztutar commented on HBASE-14247: --- Marking this as critical for 2.0. We have indeed seen this causing problems on larger clusters. > Separate the old WALs into different regionserver directories > - > > Key: HBASE-14247 > URL: https://issues.apache.org/jira/browse/HBASE-14247 > Project: HBase > Issue Type: Improvement > Components: wal >Reporter: Liu Shaohui >Assignee: Liu Shaohui >Priority: Critical > Fix For: 2.0.0 > > Attachments: HBASE-14247-v001.diff, HBASE-14247-v002.diff, > HBASE-14247-v003.diff > > > Currently all old WALs of regionservers are achieved into the single > directory of oldWALs. In big clusters, because of long TTL of WAL or disabled > replications, the number of files under oldWALs may reach the > max-directory-items limit of HDFS, which will make the hbase cluster crashed. > {quote} > Caused by: > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemsExceededException): > The directory item limit of /hbase/lgprc-xiaomi/.oldlogs is exceeded: > limit=1048576 items=1048576 > {quote} > A simple solution is to separate the old WALs into different directories > according to the server name of the WAL. > Suggestions are welcomed~ Thanks -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-14247) Separate the old WALs into different regionserver directories
[ https://issues.apache.org/jira/browse/HBASE-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16011120#comment-16011120 ] Hadoop QA commented on HBASE-14247: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:blue}0{color} | {color:blue} patch {color} | {color:blue} 0m 3s {color} | {color:blue} The patch file was not named according to hbase's naming conventions. Please see https://yetus.apache.org/documentation/0.3.0/precommit-patchnames for instructions. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 4s {color} | {color:red} HBASE-14247 does not apply to master. Rebase required? Wrong Branch? See https://yetus.apache.org/documentation/0.3.0/precommit-patchnames for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12751989/HBASE-14247-v003.diff | | JIRA Issue | HBASE-14247 | | Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/6792/console | | Powered by | Apache Yetus 0.3.0 http://yetus.apache.org | This message was automatically generated. > Separate the old WALs into different regionserver directories > - > > Key: HBASE-14247 > URL: https://issues.apache.org/jira/browse/HBASE-14247 > Project: HBase > Issue Type: Improvement > Components: wal >Reporter: Liu Shaohui >Assignee: Liu Shaohui >Priority: Minor > Fix For: 2.0.0 > > Attachments: HBASE-14247-v001.diff, HBASE-14247-v002.diff, > HBASE-14247-v003.diff > > > Currently all old WALs of regionservers are achieved into the single > directory of oldWALs. In big clusters, because of long TTL of WAL or disabled > replications, the number of files under oldWALs may reach the > max-directory-items limit of HDFS, which will make the hbase cluster crashed. > {quote} > Caused by: > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemsExceededException): > The directory item limit of /hbase/lgprc-xiaomi/.oldlogs is exceeded: > limit=1048576 items=1048576 > {quote} > A simple solution is to separate the old WALs into different directories > according to the server name of the WAL. > Suggestions are welcomed~ Thanks -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-14247) Separate the old WALs into different regionserver directories
[ https://issues.apache.org/jira/browse/HBASE-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1600#comment-1600 ] Ethan Wang commented on HBASE-14247: I have a naive idea. Enhancement like this sounds like to me should implemented from HDFS layer, since HDFS should provide storage as a service? > Separate the old WALs into different regionserver directories > - > > Key: HBASE-14247 > URL: https://issues.apache.org/jira/browse/HBASE-14247 > Project: HBase > Issue Type: Improvement > Components: wal >Reporter: Liu Shaohui >Assignee: Liu Shaohui >Priority: Minor > Fix For: 2.0.0 > > Attachments: HBASE-14247-v001.diff, HBASE-14247-v002.diff, > HBASE-14247-v003.diff > > > Currently all old WALs of regionservers are achieved into the single > directory of oldWALs. In big clusters, because of long TTL of WAL or disabled > replications, the number of files under oldWALs may reach the > max-directory-items limit of HDFS, which will make the hbase cluster crashed. > {quote} > Caused by: > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemsExceededException): > The directory item limit of /hbase/lgprc-xiaomi/.oldlogs is exceeded: > limit=1048576 items=1048576 > {quote} > A simple solution is to separate the old WALs into different directories > according to the server name of the WAL. > Suggestions are welcomed~ Thanks -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-14247) Separate the old WALs into different regionserver directories
[ https://issues.apache.org/jira/browse/HBASE-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14709350#comment-14709350 ] Dave Latham commented on HBASE-14247: - {quote} I don't think this will be problem. Currenly the HFileCleaner have to run through the cleaner checks for every store's subdirectory of every region of every table and we didn't see the problem of efficiency. So I won't need to worry about the OldLogCleaner in new layout of old layout. {quote} The HFileCleaner does not need to check anything that is O(regions) or O(servers). It only cleans the archive directory which I believe only has a subdirectory for each column family. But this issue would not affect the HFileCleaner - only the OldLogCleaner which currently only has to process a single batch. It has a different set of cleaner checks - including the ReplicationLogCleaner. {quote} What's more, we can turn up the cleaner period by config hbase.master.cleaner.interval. {quote} We've had a problem before where the cleaner cannot clean old files as fast as the new ones are created. (See HBASE-9208 for one example). When that happens, it doesn't matter what the hbase.master.cleaner.interval is. Separate the old WALs into different regionserver directories - Key: HBASE-14247 URL: https://issues.apache.org/jira/browse/HBASE-14247 Project: HBase Issue Type: Improvement Components: wal Reporter: Liu Shaohui Assignee: Liu Shaohui Priority: Minor Fix For: 2.0.0 Attachments: HBASE-14247-v001.diff, HBASE-14247-v002.diff, HBASE-14247-v003.diff Currently all old WALs of regionservers are achieved into the single directory of oldWALs. In big clusters, because of long TTL of WAL or disabled replications, the number of files under oldWALs may reach the max-directory-items limit of HDFS, which will make the hbase cluster crashed. {quote} Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemsExceededException): The directory item limit of /hbase/lgprc-xiaomi/.oldlogs is exceeded: limit=1048576 items=1048576 {quote} A simple solution is to separate the old WALs into different directories according to the server name of the WAL. Suggestions are welcomed~ Thanks -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14247) Separate the old WALs into different regionserver directories
[ https://issues.apache.org/jira/browse/HBASE-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14708910#comment-14708910 ] Liu Shaohui commented on HBASE-14247: - [~davelatham] {quote} One concern with this change is that the OldLogCleaner as it is implemented will now have to run through the cleaner checks for every regionserver's subdirectory instead of doing them all in one batch. It will likely make the cleaner chore much slower and may not be able to keep up for large clusters. {quote} I don't think this will be problem. Currenly the HFileCleaner have to run through the cleaner checks for every store's subdirectory of every region of every table and we didn't see the problem of efficiency. So I won't need to worry about the OldLogCleaner in new layout of old layout. What's more, we can turn up the cleaner period by config hbase.master.cleaner.interval. Separate the old WALs into different regionserver directories - Key: HBASE-14247 URL: https://issues.apache.org/jira/browse/HBASE-14247 Project: HBase Issue Type: Improvement Reporter: Liu Shaohui Assignee: Liu Shaohui Priority: Minor Fix For: 2.0.0 Attachments: HBASE-14247-v001.diff, HBASE-14247-v002.diff Currently all old WALs of regionservers are achieved into the single directory of oldWALs. In big clusters, because of long TTL of WAL or disabled replications, the number of files under oldWALs may reach the max-directory-items limit of HDFS, which will make the hbase cluster crashed. {quote} Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemsExceededException): The directory item limit of /hbase/lgprc-xiaomi/.oldlogs is exceeded: limit=1048576 items=1048576 {quote} A simple solution is to separate the old WALs into different directories according to the server name of the WAL. Suggestions are welcomed~ Thanks -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14247) Separate the old WALs into different regionserver directories
[ https://issues.apache.org/jira/browse/HBASE-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14706853#comment-14706853 ] Dave Latham commented on HBASE-14247: - One concern with this change is that the OldLogCleaner as it is implemented will now have to run through the cleaner checks for every region's subdirectory instead of doing them all in one batch. It will likely make the cleaner chore much slower and may not be able to keep up for large clusters. Separate the old WALs into different regionserver directories - Key: HBASE-14247 URL: https://issues.apache.org/jira/browse/HBASE-14247 Project: HBase Issue Type: Improvement Reporter: Liu Shaohui Assignee: Liu Shaohui Priority: Minor Fix For: 2.0.0 Attachments: HBASE-14247-v001.diff, HBASE-14247-v002.diff Currently all old WALs of regionservers are achieved into the single directory of oldWALs. In big clusters, because of long TTL of WAL or disabled replications, the number of files under oldWALs may reach the max-directory-items limit of HDFS, which will make the hbase cluster crashed. {quote} Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemsExceededException): The directory item limit of /hbase/lgprc-xiaomi/.oldlogs is exceeded: limit=1048576 items=1048576 {quote} A simple solution is to separate the old WALs into different directories according to the server name of the WAL. Suggestions are welcomed~ Thanks -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14247) Separate the old WALs into different regionserver directories
[ https://issues.apache.org/jira/browse/HBASE-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14706859#comment-14706859 ] Dave Latham commented on HBASE-14247: - Oops, I meant to say every *region server's* subdirectory (instead of every *region's* subdirectory). This may be able to be mitigated by batching up cleaning checks across directories. Separate the old WALs into different regionserver directories - Key: HBASE-14247 URL: https://issues.apache.org/jira/browse/HBASE-14247 Project: HBase Issue Type: Improvement Reporter: Liu Shaohui Assignee: Liu Shaohui Priority: Minor Fix For: 2.0.0 Attachments: HBASE-14247-v001.diff, HBASE-14247-v002.diff Currently all old WALs of regionservers are achieved into the single directory of oldWALs. In big clusters, because of long TTL of WAL or disabled replications, the number of files under oldWALs may reach the max-directory-items limit of HDFS, which will make the hbase cluster crashed. {quote} Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemsExceededException): The directory item limit of /hbase/lgprc-xiaomi/.oldlogs is exceeded: limit=1048576 items=1048576 {quote} A simple solution is to separate the old WALs into different directories according to the server name of the WAL. Suggestions are welcomed~ Thanks -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14247) Separate the old WALs into different regionserver directories
[ https://issues.apache.org/jira/browse/HBASE-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14707384#comment-14707384 ] Hadoop QA commented on HBASE-14247: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12751688/HBASE-14247-v002.diff against master branch at commit bcef28eefaf192b0ad48c8011f98b8e944340da5. ATTACHMENT ID: 12751688 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 30 new or modified tests. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.7.0) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 1 warning messages. {color:red}-1 checkstyle{color}. The applied patch generated 1859 checkstyle errors (more than the master's current 1857 errors). {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn post-site goal succeeds with this patch. {color:red}-1 core tests{color}. The patch failed these unit tests: org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster {color:red}-1 core zombie tests{color}. There are 1 zombie test(s): at org.apache.hadoop.hdfs.server.namenode.TestFileTruncate.testTruncateRecovery(TestFileTruncate.java:1059) Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/15198//testReport/ Release Findbugs (version 2.0.3)warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/15198//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/15198//artifact/patchprocess/checkstyle-aggregate.html Javadoc warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/15198//artifact/patchprocess/patchJavadocWarnings.txt Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/15198//console This message is automatically generated. Separate the old WALs into different regionserver directories - Key: HBASE-14247 URL: https://issues.apache.org/jira/browse/HBASE-14247 Project: HBase Issue Type: Improvement Reporter: Liu Shaohui Assignee: Liu Shaohui Priority: Minor Fix For: 2.0.0 Attachments: HBASE-14247-v001.diff, HBASE-14247-v002.diff Currently all old WALs of regionservers are achieved into the single directory of oldWALs. In big clusters, because of long TTL of WAL or disabled replications, the number of files under oldWALs may reach the max-directory-items limit of HDFS, which will make the hbase cluster crashed. {quote} Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemsExceededException): The directory item limit of /hbase/lgprc-xiaomi/.oldlogs is exceeded: limit=1048576 items=1048576 {quote} A simple solution is to separate the old WALs into different directories according to the server name of the WAL. Suggestions are welcomed~ Thanks -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14247) Separate the old WALs into different regionserver directories
[ https://issues.apache.org/jira/browse/HBASE-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14706449#comment-14706449 ] Liu Shaohui commented on HBASE-14247: - [~stack] [~apurtell] What's your suggestions about this change? Separate the old WALs into different regionserver directories - Key: HBASE-14247 URL: https://issues.apache.org/jira/browse/HBASE-14247 Project: HBase Issue Type: Improvement Reporter: Liu Shaohui Assignee: Liu Shaohui Priority: Minor Fix For: 2.0.0 Attachments: HBASE-14247-v001.diff, HBASE-14247-v002.diff Currently all old WALs of regionservers are achieved into the single directory of oldWALs. In big clusters, because of long TTL of WAL or disabled replications, the number of files under oldWALs may reach the max-directory-items limit of HDFS, which will make the hbase cluster crashed. {quote} Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemsExceededException): The directory item limit of /hbase/lgprc-xiaomi/.oldlogs is exceeded: limit=1048576 items=1048576 {quote} A simple solution is to separate the old WALs into different directories according to the server name of the WAL. Suggestions are welcomed~ Thanks -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14247) Separate the old WALs into different regionserver directories
[ https://issues.apache.org/jira/browse/HBASE-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14704537#comment-14704537 ] Liu Shaohui commented on HBASE-14247: - Please help to review at https://reviews.apache.org/r/37641/ Separate the old WALs into different regionserver directories - Key: HBASE-14247 URL: https://issues.apache.org/jira/browse/HBASE-14247 Project: HBase Issue Type: Improvement Reporter: Liu Shaohui Assignee: Liu Shaohui Priority: Minor Fix For: 2.0.0 Attachments: HBASE-14247-v001.diff Currently all old WALs of regionservers are achieved into the single directory of oldWALs. In big clusters, because of long TTL of WAL or disabled replications, the number of files under oldWALs may reach the max-directory-items limit of HDFS, which will make the hbase cluster crashed. {quote} Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemsExceededException): The directory item limit of /hbase/lgprc-xiaomi/.oldlogs is exceeded: limit=1048576 items=1048576 {quote} A simple solution is to separate the old WALs into different directories according to the server name of the WAL. Suggestions are welcomed~ Thanks -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14247) Separate the old WALs into different regionserver directories
[ https://issues.apache.org/jira/browse/HBASE-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702555#comment-14702555 ] Vladimir Rodionov commented on HBASE-14247: --- {quote} Suggestions are welcomed {quote} Increase *dfs.block.size* :) Having separate dir for every server is a good idea. Separate the old WALs into different regionserver directories - Key: HBASE-14247 URL: https://issues.apache.org/jira/browse/HBASE-14247 Project: HBase Issue Type: Improvement Reporter: Liu Shaohui Assignee: Liu Shaohui Priority: Minor Currently all old WALs of regionservers are achieved into the single directory of oldWALs. In big clusters, because of long TTL of WAL or disabled replications, the number of files under oldWALs may reach the max-directory-items limit of HDFS, which will make the hbase cluster crashed. {quote} Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemsExceededException): The directory item limit of /hbase/lgprc-xiaomi/.oldlogs is exceeded: limit=1048576 items=1048576 {quote} A simple solution is to separate the old WALs into different directories according to the server name of the WAL. Suggestions are welcomed~ Thanks -- This message was sent by Atlassian JIRA (v6.3.4#6332)