[jira] [Comment Edited] (HBASE-20844) Duplicate rows returned while hbase snapshot reads
[ https://issues.apache.org/jira/browse/HBASE-20844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16539597#comment-16539597 ] ShivaKumar SS edited comment on HBASE-20844 at 7/11/18 5:58 AM:

This behaviour is not seen in HBase 1.4.5; it turns out the fix below is missing in HBase 1.3.1. In 1.4.5 the method skips regions that are in the middle of a split.

Class: {{org.apache.hadoop.hbase.mapreduce.TableSnapshotInputFormatImpl}}

Method:
{code:java}
public static List<HRegionInfo> getRegionInfosFromManifest(SnapshotManifest manifest) {
  List<SnapshotRegionManifest> regionManifests = manifest.getRegionManifests();
  if (regionManifests == null) {
    throw new IllegalArgumentException("Snapshot seems empty");
  }
  List<HRegionInfo> regionInfos = Lists.newArrayListWithCapacity(regionManifests.size());
  for (SnapshotRegionManifest regionManifest : regionManifests) {
    HRegionInfo hri = HRegionInfo.convert(regionManifest.getRegionInfo());
    if (hri.isOffline() && (hri.isSplit() || hri.isSplitParent())) { // This one.
      continue;
    }
    regionInfos.add(hri);
  }
  return regionInfos;
}
{code}

> Duplicate rows returned while hbase snapshot reads
> --
>
> Key: HBASE-20844
> URL: https://issues.apache.org/jira/browse/HBASE-20844
> Project: HBase
> Issue Type: Bug
> Components: mapreduce, snapshots, spark
> Affects Versions: 1.3.1
> Environment: Cluster Details
> Java 1.7
> Hbase 1.3.1
> Spark 1.6.1
> Reporter: ShivaKumar SS
> Priority: Major
>
> We are trying to take a snapshot from code and read the data using MR and Spark; both approaches return duplicate records.
> On the API side, {{org.apache.hadoop.hbase.mapreduce.TableSnapshotInputFormat}} is used.
> The snapshot was taken while the table was in a region split state.
> We suspect data is being returned for both the parent and daughter regions.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
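The effect of that guard can be shown with a self-contained sketch (toy types standing in for HRegionInfo and the manifest classes; this is illustrative, not HBase code):

```java
import java.util.ArrayList;
import java.util.List;

public class SplitParentFilterSketch {
    // Minimal stand-in for HRegionInfo's offline/split flags.
    static class RegionInfo {
        final String name;
        final boolean offline, split;
        RegionInfo(String name, boolean offline, boolean split) {
            this.name = name; this.offline = offline; this.split = split;
        }
    }

    // Mirrors the 1.4.5 behaviour: skip offline split parents so their rows
    // are not read in addition to the daughters' rows.
    static List<RegionInfo> regionsToScan(List<RegionInfo> manifest) {
        List<RegionInfo> result = new ArrayList<>();
        for (RegionInfo ri : manifest) {
            if (ri.offline && ri.split) continue; // the check missing in 1.3.1
            result.add(ri);
        }
        return result;
    }

    public static void main(String[] args) {
        List<RegionInfo> manifest = new ArrayList<>();
        manifest.add(new RegionInfo("parent", true, true));     // mid-split parent
        manifest.add(new RegionInfo("daughterA", false, false));
        manifest.add(new RegionInfo("daughterB", false, false));
        // Without the check: parent AND daughters are scanned => duplicate rows.
        // With it: only the two daughters are scanned.
        System.out.println(regionsToScan(manifest).size()); // prints 2
    }
}
```

Without the filter, the snapshot manifest yields an input split for the parent as well as both daughters, and every row in the parent's range is read twice.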
[jira] [Updated] (HBASE-20865) CreateTableProcedure is stuck in retry loop in CREATE_TABLE_WRITE_FS_LAYOUT state
[ https://issues.apache.org/jira/browse/HBASE-20865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Toshihiro Suzuki updated HBASE-20865: Status: Patch Available (was: Open)

> CreateTableProcedure is stuck in retry loop in CREATE_TABLE_WRITE_FS_LAYOUT state
> -
>
> Key: HBASE-20865
> URL: https://issues.apache.org/jira/browse/HBASE-20865
> Project: HBase
> Issue Type: Bug
> Components: amv2
> Reporter: Toshihiro Suzuki
> Assignee: Toshihiro Suzuki
> Priority: Major
> Attachments: HBASE-20865.master.001.patch
>
> Similar to HBASE-20616, CreateTableProcedure gets stuck in a retry loop in the CREATE_TABLE_WRITE_FS_LAYOUT state when writing to HDFS fails.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-20865) CreateTableProcedure is stuck in retry loop in CREATE_TABLE_WRITE_FS_LAYOUT state
[ https://issues.apache.org/jira/browse/HBASE-20865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Toshihiro Suzuki updated HBASE-20865: Attachment: HBASE-20865.master.001.patch

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20855) PeerConfigTracker only support one listener will cause problem when there is a recovered replication queue
[ https://issues.apache.org/jira/browse/HBASE-20855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16539579#comment-16539579 ] Hadoop QA commented on HBASE-20855:

(x) *-1 overall*

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 24s | Docker mode activated. |
|| || || || Prechecks ||
| 0 | findbugs | 0m 1s | Findbugs executables are not available. |
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || branch-1 Compile Tests ||
| 0 | mvndep | 0m 23s | Maven dependency ordering for branch |
| +1 | mvninstall | 1m 28s | branch-1 passed |
| +1 | compile | 0m 49s | branch-1 passed with JDK v1.8.0_172 |
| +1 | compile | 0m 56s | branch-1 passed with JDK v1.7.0_181 |
| +1 | checkstyle | 1m 43s | branch-1 passed |
| +1 | shadedjars | 2m 30s | branch has no errors when building our shaded downstream artifacts. |
| +1 | javadoc | 0m 42s | branch-1 passed with JDK v1.8.0_172 |
| +1 | javadoc | 0m 55s | branch-1 passed with JDK v1.7.0_181 |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 14s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 31s | the patch passed |
| +1 | compile | 0m 48s | the patch passed with JDK v1.8.0_172 |
| +1 | javac | 0m 48s | the patch passed |
| +1 | compile | 0m 55s | the patch passed with JDK v1.7.0_181 |
| +1 | javac | 0m 56s | the patch passed |
| -1 | checkstyle | 0m 27s | hbase-client: The patch generated 1 new + 12 unchanged - 0 fixed = 13 total (was 12) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedjars | 2m 26s | patch has no errors when building our shaded downstream artifacts. |
| +1 | hadoopcheck | 1m 32s | Patch does not cause any errors with Hadoop 2.7.4. |
| +1 | javadoc | 0m 43s | the patch passed with JDK v1.8.0_172 |
| +1 | javadoc | 0m 56s | the patch passed with JDK v1.7.0_181 |
|| || || || Other Tests ||
| +1 | unit | 2m 31s | hbase-client in the patch passed. |
| -1 | unit | 135m 39s | hbase-server in the patch failed. |
| +1 | asflicense | 0m 38s | The patch does not generate ASF License warnings. |
| | | 160m 4s | |

|| Reason || Tests ||
| Failed junit tests |
[jira] [Commented] (HBASE-20860) Merged region's RIT state may not be cleaned after master restart
[ https://issues.apache.org/jira/browse/HBASE-20860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16539577#comment-16539577 ] Hadoop QA commented on HBASE-20860:

(/) *+1 overall*

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 25s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || branch-2.0 Compile Tests ||
| +1 | mvninstall | 3m 41s | branch-2.0 passed |
| +1 | compile | 1m 44s | branch-2.0 passed |
| +1 | checkstyle | 1m 2s | branch-2.0 passed |
| +1 | shadedjars | 3m 40s | branch has no errors when building our shaded downstream artifacts. |
| +1 | findbugs | 1m 59s | branch-2.0 passed |
| +1 | javadoc | 0m 26s | branch-2.0 passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 3m 16s | the patch passed |
| +1 | compile | 1m 37s | the patch passed |
| +1 | javac | 1m 37s | the patch passed |
| +1 | checkstyle | 1m 2s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedjars | 3m 39s | patch has no errors when building our shaded downstream artifacts. |
| +1 | hadoopcheck | 10m 20s | Patch does not cause any errors with Hadoop 2.6.5 2.7.4 or 3.0.0. |
| +1 | findbugs | 2m 4s | the patch passed |
| +1 | javadoc | 0m 26s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 151m 30s | hbase-server in the patch passed. |
| +1 | asflicense | 0m 18s | The patch does not generate ASF License warnings. |
| | | 187m 39s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:6f01af0 |
| JIRA Issue | HBASE-20860 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12931100/HBASE-20860.branch-2.0.005.patch |
| Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 9bf01aa6832c 4.4.0-130-generic #156-Ubuntu SMP Thu Jun 14 08:53:28 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | branch-2.0 / cd1ecae0d1 |
| maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC3 |
| Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/13582/testReport/ |
| Max. process+thread count | 4139 (vs. ulimit of 1) |
| modules | C: hbase-server U: hbase-server |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/13582/console |
| Powered by | Apache Yetus 0.7.0 http://yetus.apache.org |

This message was automatically
[jira] [Commented] (HBASE-17885) Backport HBASE-15871 to branch-1
[ https://issues.apache.org/jira/browse/HBASE-17885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16539573#comment-16539573 ] Toshihiro Suzuki commented on HBASE-17885:

Hi [~ram_krish], can I work on this?

> Backport HBASE-15871 to branch-1
>
> Key: HBASE-17885
> URL: https://issues.apache.org/jira/browse/HBASE-17885
> Project: HBase
> Issue Type: Bug
> Components: Scanners
> Affects Versions: 1.3.1, 1.2.5, 1.1.8
> Reporter: ramkrishna.s.vasudevan
> Priority: Major
> Fix For: 1.5.0, 1.3.3, 1.2.8, 1.4.6
>
> Will try to rebase the branch-1 patch at the earliest. Hope the fix versions are correct.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20847) The parent procedure of RegionTransitionProcedure may not have the table lock
[ https://issues.apache.org/jira/browse/HBASE-20847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16539557#comment-16539557 ] Allan Yang commented on HBASE-20847:

{quote}
We need to release the exclusive lock if we are scheduling sub procedures. This is what we do for now.
{quote}
Are we doing this now? IIRC, we hold the exclusive lock for the whole lifetime, even if we schedule sub-procedures (if holdLock = true).

Just reviewed the V3 patch, +1 on it; I think it fully resolves the problem of HBASE-20846. You can add the test case from HBASE-20846 to this patch and resolve HBASE-20846 after this one. For restoring locks after master restarts, I think we can open another issue to implement it, since it is a big change in AMv2.

> The parent procedure of RegionTransitionProcedure may not have the table lock
> -
>
> Key: HBASE-20847
> URL: https://issues.apache.org/jira/browse/HBASE-20847
> Project: HBase
> Issue Type: Sub-task
> Components: proc-v2, Region Assignment
> Reporter: Duo Zhang
> Assignee: Duo Zhang
> Priority: Major
> Attachments: HBASE-20847-v1.patch, HBASE-20847-v2.patch, HBASE-20847-v3.patch, HBASE-20847.patch
>
> For example, SCP can also schedule an AssignProcedure, and obviously it will not hold the table lock.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20847) The parent procedure of RegionTransitionProcedure may not have the table lock
[ https://issues.apache.org/jira/browse/HBASE-20847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16539524#comment-16539524 ] Duo Zhang commented on HBASE-20847:

The root cause for TestAdmin2 is that we only have 5 rpc handlers, so all of them get blocked by createTable requests. Now that an AssignProcedure can hold the shared lock while its parent CreateTableProcedure holds the exclusive lock, all the other blocked CreateTableProcedures hang there waiting for the AssignProcedures; but each AssignProcedure leads to a reportRegionTransition call, which cannot be executed because all the rpc handlers have been consumed by createTable...

It is strange that requests from users are mixed up with system requests... I temporarily fixed it by setting a larger handler count; need to dig in later...

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
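The starvation described above can be sketched with a toy pool: a Semaphore standing in for the 5 RPC handlers (all names here are illustrative, not HBase's):

```java
import java.util.concurrent.Semaphore;

public class HandlerStarvationSketch {
    // Hypothetical stand-in for the RPC handler pool: 5 permits = 5 handlers.
    static final Semaphore handlers = new Semaphore(5);

    public static void main(String[] args) {
        // Five concurrent createTable requests each occupy a handler, then block
        // inside the procedure waiting on sub-procedure progress (never release).
        for (int i = 0; i < 5; i++) {
            if (!handlers.tryAcquire()) {
                throw new IllegalStateException("expected a free handler");
            }
        }
        // The AssignProcedure's reportRegionTransition call also needs a handler,
        // but none is free: the system call starves behind the user calls.
        boolean served = handlers.tryAcquire();
        System.out.println(served ? "served" : "starved"); // prints "starved"
    }
}
```

This is why separating user request handlers from system/priority handlers (or simply raising the handler count, as the temporary fix does) breaks the cycle.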
[jira] [Updated] (HBASE-20847) The parent procedure of RegionTransitionProcedure may not have the table lock
[ https://issues.apache.org/jira/browse/HBASE-20847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Duo Zhang updated HBASE-20847: Attachment: HBASE-20847-v3.patch

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-20860) Merged region's RIT state may not be cleaned after master restart
[ https://issues.apache.org/jira/browse/HBASE-20860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Duo Zhang updated HBASE-20860: Fix Version/s: (was: 2.1.0) 2.1.1

> Merged region's RIT state may not be cleaned after master restart
> -
>
> Key: HBASE-20860
> URL: https://issues.apache.org/jira/browse/HBASE-20860
> Project: HBase
> Issue Type: Sub-task
> Affects Versions: 3.0.0, 2.1.0, 2.0.1
> Reporter: Allan Yang
> Assignee: Allan Yang
> Priority: Major
> Fix For: 3.0.0, 2.0.2, 2.1.1
> Attachments: HBASE-20860.branch-2.0.002.patch, HBASE-20860.branch-2.0.003.patch, HBASE-20860.branch-2.0.004.patch, HBASE-20860.branch-2.0.005.patch, HBASE-20860.branch-2.0.patch
>
> In MergeTableRegionsProcedure, we issue UnassignProcedures to offline the regions to merge. But if the master is restarted just after MergeTableRegionsProcedure finishes these two UnassignProcedures and before it can delete their meta entries, the new master will find these two regions CLOSED with no procedures attached to them. They will be regarded as RIT regions, and nobody will clean up their RIT state later.
> A quick way to resolve this stuck situation in a production environment is to restart the master again, since the meta entries are deleted in MergeTableRegionsProcedure. Here, I offer a fix for this problem.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
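The stuck condition described above can be modelled with toy state maps (illustrative names only; this is not the AssignmentManager's API): a region left CLOSED with no owning procedure after the restart is flagged as RIT, and nothing ever clears it.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class MergedRegionRitSketch {
    // A region looks "in transition with no owner" after restart if it is
    // CLOSED but no procedure claims it -- and merged parents whose meta
    // entries were not yet deleted hit exactly this case.
    static Set<String> ritWithNoOwner(Map<String, String> regionStates,
                                      Set<String> regionsOwnedByProcedures) {
        Set<String> stuck = new HashSet<>();
        for (Map.Entry<String, String> e : regionStates.entrySet()) {
            if (e.getValue().equals("CLOSED")
                    && !regionsOwnedByProcedures.contains(e.getKey())) {
                stuck.add(e.getKey());
            }
        }
        return stuck;
    }

    public static void main(String[] args) {
        Map<String, String> states = new HashMap<>();
        states.put("parentA", "CLOSED"); // unassigned by the merge, meta entry still present
        states.put("parentB", "CLOSED");
        states.put("other", "OPEN");
        // After the restart, the merge's UnassignProcedures are gone,
        // so nothing owns the two parents any more: both stay stuck as RIT.
        Set<String> owned = new HashSet<>();
        System.out.println(ritWithNoOwner(states, owned));
    }
}
```

A second restart "fixes" it only because by then the parents' meta entries are gone, so they are no longer loaded as regions at all.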
[jira] [Updated] (HBASE-20697) Can't cache All region locations of the specify table by calling table.getRegionLocator().getAllRegionLocations()
[ https://issues.apache.org/jira/browse/HBASE-20697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guanghao Zhang updated HBASE-20697: Fix Version/s: (was: 1.3.3) (was: 1.2.7) (was: 2.1.0) 2.1.1 2.2.0 1.5.0 3.0.0

> Can't cache All region locations of the specify table by calling
> table.getRegionLocator().getAllRegionLocations()
> -
>
> Key: HBASE-20697
> URL: https://issues.apache.org/jira/browse/HBASE-20697
> Project: HBase
> Issue Type: Bug
> Affects Versions: 1.3.1, 1.2.6, 2.0.1
> Reporter: zhaoyuan
> Assignee: zhaoyuan
> Priority: Major
> Fix For: 3.0.0, 1.5.0, 1.4.6, 2.0.2, 2.2.0, 2.1.1
> Attachments: HBASE-20697.branch-1.2.001.patch, HBASE-20697.branch-1.2.002.patch, HBASE-20697.branch-1.2.003.patch, HBASE-20697.branch-1.2.004.patch, HBASE-20697.master.001.patch, HBASE-20697.master.002.patch, HBASE-20697.master.002.patch, HBASE-20697.master.003.patch
>
> When we upgrade and restart a new version of an application that reads and writes to HBase, we see some operation timeouts. The timeouts are expected, because when the application restarts it holds no region location cache and has to talk to zk and the meta regionserver to get region locations.
> We wanted to avoid these timeouts, so we did warmup work; as far as I understand, the method table.getRegionLocator().getAllRegionLocations() should fetch all region locations and cache them. However, it didn't work well: there were still a lot of timeouts, which confused me.
> I dug into the source code and found the following:
> {code:java}
> public List<HRegionLocation> getAllRegionLocations() throws IOException {
>   TableName tableName = getName();
>   NavigableMap<HRegionInfo, ServerName> locations =
>       MetaScanner.allTableRegions(this.connection, tableName);
>   ArrayList<HRegionLocation> regions = new ArrayList<>(locations.size());
>   for (Entry<HRegionInfo, ServerName> entry : locations.entrySet()) {
>     regions.add(new HRegionLocation(entry.getKey(), entry.getValue()));
>   }
>   if (regions.size() > 0) {
>     connection.cacheLocation(tableName, new RegionLocations(regions));
>   }
>   return regions;
> }
>
> // In MetaCache
> public void cacheLocation(final TableName tableName, final RegionLocations locations) {
>   byte[] startKey = locations.getRegionLocation().getRegionInfo().getStartKey();
>   ConcurrentMap<byte[], RegionLocations> tableLocations = getTableLocations(tableName);
>   RegionLocations oldLocation = tableLocations.putIfAbsent(startKey, locations);
>   boolean isNewCacheEntry = (oldLocation == null);
>   if (isNewCacheEntry) {
>     if (LOG.isTraceEnabled()) {
>       LOG.trace("Cached location: " + locations);
>     }
>     addToCachedServers(locations);
>     return;
>   }
> {code}
> It collects all regions into one RegionLocations object and caches it only under the first region's start key. Then, when we put or get to HBase, we call getCachedLocation():
> {code:java}
> public RegionLocations getCachedLocation(final TableName tableName, final byte[] row) {
>   ConcurrentNavigableMap<byte[], RegionLocations> tableLocations = getTableLocations(tableName);
>   Entry<byte[], RegionLocations> e = tableLocations.floorEntry(row);
>   if (e == null) {
>     if (metrics != null) metrics.incrMetaCacheMiss();
>     return null;
>   }
>   RegionLocations possibleRegion = e.getValue();
>   // make sure that the end key is greater than the row we're looking
>   // for, otherwise the row actually belongs in the next region, not
>   // this one. the exception case is when the endkey is
>   // HConstants.EMPTY_END_ROW, signifying that the region we're
>   // checking is actually the last region in the table.
>   byte[] endKey = possibleRegion.getRegionLocation().getRegionInfo().getEndKey();
>   if (Bytes.equals(endKey, HConstants.EMPTY_END_ROW) ||
>       getRowComparator(tableName).compareRows(
>           endKey, 0, endKey.length, row, 0, row.length) > 0) {
>     if (metrics != null) metrics.incrMetaCacheHit();
>     return possibleRegion;
>   }
>   // Passed all the way through, so we got nothing - complete cache miss
>   if (metrics != null) metrics.incrMetaCacheMiss();
>   return null;
> }
> {code}
> It picks the first (and only) cached entry as possibleRegion, and for any row beyond the first region's end key the check fails, so the lookup is a cache miss.
> So did I forget something, or may I be wrong somewhere? If this is indeed a bug, I think it is not very hard to fix.
> Hope committers and PMC review this!

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
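The mismatch can be reproduced in isolation with a plain TreeMap standing in for MetaCache (string keys instead of byte[]; a sketch of the floorEntry lookup, not HBase code):

```java
import java.util.Map;
import java.util.TreeMap;

public class FloorEntrySketch {
    // Toy version of getCachedLocation: value = end key of the cached entry's
    // first region. Returns true if the cache can serve 'row'.
    static boolean cachedLookupHits(TreeMap<String, String> cache, String row) {
        Map.Entry<String, String> e = cache.floorEntry(row);
        if (e == null) return false;
        String endKey = e.getValue();
        // "" stands in for HConstants.EMPTY_END_ROW (last region, open-ended).
        return endKey.isEmpty() || endKey.compareTo(row) > 0;
    }

    public static void main(String[] args) {
        TreeMap<String, String> cache = new TreeMap<>();
        // The 1.x warmup caches the WHOLE table as ONE entry under the first
        // region's start key, so only that region's end key is ever consulted.
        cache.put("", "bbb"); // first region covers [ "", "bbb" )

        System.out.println(cachedLookupHits(cache, "aaa")); // true: in first region
        System.out.println(cachedLookupHits(cache, "ccc")); // false: later region, miss
    }
}
```

Caching one entry per region (each under its own start key) instead of one entry for the whole table makes every row's floorEntry land on its own region, which is essentially what the attached patches do.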
[jira] [Commented] (HBASE-20869) Endpoint-based Export use incorrect user to write to destination
[ https://issues.apache.org/jira/browse/HBASE-20869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16539479#comment-16539479 ] Chia-Ping Tsai commented on HBASE-20869:

Thanks for catching that!

> Endpoint-based Export use incorrect user to write to destination
>
> Key: HBASE-20869
> URL: https://issues.apache.org/jira/browse/HBASE-20869
> Project: HBase
> Issue Type: Bug
> Components: Coprocessors
> Affects Versions: 2.0.0
> Environment: Hadoop 3.0.0 + HBase 2.0.0, Kerberos.
> Reporter: Wei-Chiu Chuang
> Assignee: Wei-Chiu Chuang
> Priority: Major
>
> HBASE-15806 implemented an endpoint-based export. It gets the caller's HDFS delegation token, and the RegionServer is supposed to write out the exported files as the caller.
> Everything works fine if you run the export as the hbase user. However, once you use a different user to export, it fails.
> To reproduce, add the coprocessor class org.apache.hadoop.hbase.coprocessor.Export to the configuration key hbase.coprocessor.region.classes.
> Create a table t1 and assign permissions to a user foo:
> {noformat}
> hbase(main):004:0> user_permission 't1'
> User Namespace,Table,Family,Qualifier:Permission
> hbase default,t1,,: [Permission: actions=READ,WRITE,EXEC,CREATE,ADMIN]
> foo default,t1,,: [Permission: actions=READ,WRITE,EXEC,CREATE,ADMIN]{noformat}
> As user foo, execute the following command:
> {noformat}
> $ hdfs dfs -mkdir /tmp/export_hbase2
> $ hbase org.apache.hadoop.hbase.coprocessor.Export t1 /tmp/export_hbase2/t2/
> 18/07/10 14:03:59 INFO client.RpcRetryingCallerImpl: Call exception, tries=6, retries=6, started=4457 ms ago, cancelled=false, msg=org.apache.hadoop.security.AccessControlException: Permission denied: user=hbase, access=WRITE, inode="/tmp/export_hbase2/t2":foo:supergroup:drwxr-xr-x
> at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:400)
> at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:256)
> at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:194)
> at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1846)
> at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1830)
> at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkAncestorAccess(FSDirectory.java:1789)
> at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.resolvePathForStartFile(FSDirWriteFileOp.java:316)
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2411)
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2343)
> at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:764)
> at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:451)
> at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1685)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)
> at sun.reflect.GeneratedConstructorAccessor25.newInstance(Unknown Source)
> at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
> at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:121)
> at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:88)
> at org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:278)
> at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1195)
> at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1174)
> at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1112)
> at org.apache.hadoop.hdfs.DistributedFileSystem$8.doCall(DistributedFileSystem.java:462)
> at org.apache.hadoop.hdfs.DistributedFileSystem$8.doCall(DistributedFileSystem.java:459)
> at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at
[jira] [Commented] (HBASE-20838) Move all setStorage related UT cases from TestFSUtils to TestCommonFSUtils
[ https://issues.apache.org/jira/browse/HBASE-20838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16539477#comment-16539477 ] Hadoop QA commented on HBASE-20838:

(!) A patch to the testing environment has been detected. Re-executing against the patched versions to perform further tests. The console is at https://builds.apache.org/job/PreCommit-HBASE-Build/13584/console in case of problems.

> Move all setStorage related UT cases from TestFSUtils to TestCommonFSUtils
> --
>
> Key: HBASE-20838
> URL: https://issues.apache.org/jira/browse/HBASE-20838
> Project: HBase
> Issue Type: Test
> Reporter: Yu Li
> Assignee: Yu Li
> Priority: Major
> Attachments: HBASE-20838.patch, HBASE-20838.patch, HBASE-20838.v2.patch
>
> As per [discussed|https://issues.apache.org/jira/browse/HBASE-20691?focusedCommentId=16517662=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16517662] in HBASE-20691, since the setStoragePolicy code is in CommonFSUtils, the test should be in TestCommonFSUtils

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20838) Move all setStorage related UT cases from TestFSUtils to TestCommonFSUtils
[ https://issues.apache.org/jira/browse/HBASE-20838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16539476#comment-16539476 ] Yu Li commented on HBASE-20838: --- Patch v2 fixes shellcheck and checks whether root is included in CHANGED_MODULES before adding hbase-server > Move all setStorage related UT cases from TestFSUtils to TestCommonFSUtils > -- > > Key: HBASE-20838 > URL: https://issues.apache.org/jira/browse/HBASE-20838 > Project: HBase > Issue Type: Test >Reporter: Yu Li >Assignee: Yu Li >Priority: Major > Attachments: HBASE-20838.patch, HBASE-20838.patch, > HBASE-20838.v2.patch > > > As per > [discussed|https://issues.apache.org/jira/browse/HBASE-20691?focusedCommentId=16517662&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16517662] > in HBASE-20691, since the setStoragePolicy code is in CommonFSUtils, the > test should be in TestCommonFSUtils -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-20838) Move all setStorage related UT cases from TestFSUtils to TestCommonFSUtils
[ https://issues.apache.org/jira/browse/HBASE-20838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yu Li updated HBASE-20838: -- Attachment: HBASE-20838.v2.patch > Move all setStorage related UT cases from TestFSUtils to TestCommonFSUtils > -- > > Key: HBASE-20838 > URL: https://issues.apache.org/jira/browse/HBASE-20838 > Project: HBase > Issue Type: Test >Reporter: Yu Li >Assignee: Yu Li >Priority: Major > Attachments: HBASE-20838.patch, HBASE-20838.patch, > HBASE-20838.v2.patch > > > As per > [discussed|https://issues.apache.org/jira/browse/HBASE-20691?focusedCommentId=16517662&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16517662] > in HBASE-20691, since the setStoragePolicy code is in CommonFSUtils, the > test should be in TestCommonFSUtils -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-20869) Endpoint-based Export use incorrect user to write to destination
Wei-Chiu Chuang created HBASE-20869: --- Summary: Endpoint-based Export use incorrect user to write to destination Key: HBASE-20869 URL: https://issues.apache.org/jira/browse/HBASE-20869 Project: HBase Issue Type: Bug Components: Coprocessors Affects Versions: 2.0.0 Environment: Hadoop 3.0.0 + HBase 2.0.0, Kerberos. Reporter: Wei-Chiu Chuang Assignee: Wei-Chiu Chuang HBASE-15806 implemented an endpoint-based export. It gets the caller's HDFS delegation token, and the RegionServer is supposed to write out the exported files as the caller. Everything works fine if you run the export as the hbase user. However, once you use a different user to export, it fails. To reproduce, add the coprocessor class org.apache.hadoop.hbase.coprocessor.Export to the configuration key hbase.coprocessor.region.classes, then create a table t1 and assign permission to a user foo: {noformat} hbase(main):004:0> user_permission 't1' User Namespace,Table,Family,Qualifier:Permission hbase default,t1,,: [Permission: actions=READ,WRITE,EXEC,CREATE,ADMIN] foo default,t1,,: [Permission: actions=READ,WRITE,EXEC,CREATE,ADMIN]{noformat} As user foo, execute the following command: {noformat} $ hdfs dfs -mkdir /tmp/export_hbase2 $ hbase org.apache.hadoop.hbase.coprocessor.Export t1 /tmp/export_hbase2/t2/ 18/07/10 14:03:59 INFO client.RpcRetryingCallerImpl: Call exception, tries=6, retries=6, started=4457 ms ago, cancelled=false, msg=org.apache.hadoop.security.AccessControlException: Permission denied: user=hbase, access=WRITE, inode="/tmp/export_hbase2/t2":foo:supergroup:drwxr-xr-x at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:400) at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:256) at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:194) at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1846) at 
org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1830) at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkAncestorAccess(FSDirectory.java:1789) at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.resolvePathForStartFile(FSDirWriteFileOp.java:316) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2411) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2343) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:764) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:451) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869) at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1685) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675) at sun.reflect.GeneratedConstructorAccessor25.newInstance(Unknown Source) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:121) at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:88) at org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:278) at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1195) at 
org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1174) at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1112) at org.apache.hadoop.hdfs.DistributedFileSystem$8.doCall(DistributedFileSystem.java:462) at org.apache.hadoop.hdfs.DistributedFileSystem$8.doCall(DistributedFileSystem.java:459) at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:473) at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:400) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1103) at org.apache.hadoop.io.SequenceFile$Writer.&lt;init&gt;(SequenceFile.java:1168) at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:285) at org.apache.hadoop.hbase.coprocessor.Export$SecureWriter.&lt;init&gt;(Export.java:445) at
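The trace above shows the SequenceFile writer being created under the RegionServer's own identity ("hbase") instead of inside the caller's security context. The mechanics of the fix direction — performing the write inside a doAs block for the remote caller — can be illustrated with a self-contained toy model. The ThreadLocal "user", createFile, and doAs helpers below are illustrative stand-ins, not Hadoop APIs:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;

public class ExportUserSketch {
    // Hypothetical stand-in for the effective user seen by the filesystem;
    // defaults to the RegionServer's login user, "hbase".
    static final ThreadLocal<String> CURRENT_USER = ThreadLocal.withInitial(() -> "hbase");

    // Hypothetical stand-in for an HDFS client: records who performed each create().
    static final List<String> CREATE_AUDIT = new ArrayList<>();

    static void createFile(String path) {
        CREATE_AUDIT.add(CURRENT_USER.get());
    }

    // Sketch of the UserGroupInformation.doAs pattern: run an action as another user.
    static <T> T doAs(String user, Callable<T> action) throws Exception {
        String previous = CURRENT_USER.get();
        CURRENT_USER.set(user);
        try {
            return action.call();
        } finally {
            CURRENT_USER.set(previous);
        }
    }

    public static void main(String[] args) throws Exception {
        // Buggy path: the endpoint creates the output file directly, so the
        // filesystem sees the server user "hbase" and the caller's directory
        // permissions (foo:supergroup drwxr-xr-x) reject the write.
        createFile("/tmp/export_hbase2/t2/part-0");

        // Fixed path: wrap the writer creation in doAs for the caller "foo".
        doAs("foo", () -> {
            createFile("/tmp/export_hbase2/t2/part-1");
            return null;
        });

        System.out.println(CREATE_AUDIT); // [hbase, foo]
    }
}
```

In the real coprocessor the equivalent step would be resolving the caller's UserGroupInformation from the RPC context plus the shipped delegation token before opening the writer.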
[jira] [Updated] (HBASE-20855) PeerConfigTracker only support one listener will cause problem when there is a recovered replication queue
[ https://issues.apache.org/jira/browse/HBASE-20855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingyun Tian updated HBASE-20855: - Attachment: HBASE-20855.branch-1.004.patch
> PeerConfigTracker only support one listener will cause problem when there is a recovered replication queue
> --
>
> Key: HBASE-20855
> URL: https://issues.apache.org/jira/browse/HBASE-20855
> Project: HBase
> Issue Type: Bug
> Affects Versions: 1.3.0, 1.4.0, 1.5.0
> Reporter: Jingyun Tian
> Assignee: Jingyun Tian
> Priority: Major
> Attachments: HBASE-20855.branch-1.001.patch, HBASE-20855.branch-1.002.patch, HBASE-20855.branch-1.003.patch, HBASE-20855.branch-1.004.patch
>
> {code}
> public void init(Context context) throws IOException {
>   this.ctx = context;
>   if (this.ctx != null) {
>     ReplicationPeer peer = this.ctx.getReplicationPeer();
>     if (peer != null) {
>       peer.trackPeerConfigChanges(this);
>     } else {
>       LOG.warn("Not tracking replication peer config changes for Peer Id " + this.ctx.getPeerId() +
>           " because there's no such peer");
>     }
>   }
> }
> {code}
> As we know, a replication source registers itself with the PeerConfigTracker in ReplicationPeer. When there are one or more recovered queues, each queue generates a new replication source, but they all share the same ReplicationPeer. Then, when setListener is called, the newly generated listener replaces the older one, so only one listener will receive the peer config change notification.
> {code}
> public synchronized void setListener(ReplicationPeerConfigListener listener) {
>   this.listener = listener;
> }
> {code}
> To solve this, PeerConfigTracker needs to support multiple listeners, and a listener should be removed when its replication endpoint terminates.
> I will upload a patch later with the fix and a UT.
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
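The fix direction described above — a tracker that keeps every registered listener and lets a terminating endpoint deregister itself — can be sketched as follows. The class and method names here are illustrative stand-ins, not the actual HBase patch:

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class MultiListenerTrackerSketch {
    // Hypothetical interface mirroring the role of ReplicationPeerConfigListener.
    interface ConfigListener {
        void peerConfigUpdated(String newConfig);
    }

    // A tracker that keeps every registered listener instead of only the last one.
    static class PeerConfigTracker {
        private final List<ConfigListener> listeners = new CopyOnWriteArrayList<>();

        void addListener(ConfigListener l) {
            listeners.add(l);
        }

        // Called when a replication endpoint terminates (e.g. a recovered
        // queue finishes), so stale listeners do not accumulate.
        void removeListener(ConfigListener l) {
            listeners.remove(l);
        }

        void notifyConfigChange(String newConfig) {
            for (ConfigListener l : listeners) {
                l.peerConfigUpdated(newConfig);
            }
        }
    }

    // Registers one listener per queue and delivers a config change to all of them.
    static String deliver(String newConfig) {
        PeerConfigTracker tracker = new PeerConfigTracker();
        StringBuilder seen = new StringBuilder();
        tracker.addListener(c -> seen.append("normal:").append(c).append(' '));
        tracker.addListener(c -> seen.append("recovered:").append(c).append(' '));
        tracker.notifyConfigChange(newConfig);
        return seen.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println(deliver("cfg-v2")); // normal:cfg-v2 recovered:cfg-v2
    }
}
```

CopyOnWriteArrayList keeps notification iteration safe while sources register and deregister concurrently, which matches the lifecycle the report describes.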
[jira] [Commented] (HBASE-20847) The parent procedure of RegionTransitionProcedure may not have the table lock
[ https://issues.apache.org/jira/browse/HBASE-20847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16539462#comment-16539462 ] Duo Zhang commented on HBASE-20847: --- The problem is that the ProcedureExecutor will schedule the sub procedures before releasing the lock (if needed, i.e., when holdLock is false). This breaks our assumption above, where we assume that the parent procedure will not release the lock until the sub procedures finish... Anyway, I think removing the assumption is still fine, as in Java you are free to release the write lock before releasing the read lock. Let me change the code. > The parent procedure of RegionTransitionProcedure may not have the table lock > - > > Key: HBASE-20847 > URL: https://issues.apache.org/jira/browse/HBASE-20847 > Project: HBase > Issue Type: Sub-task > Components: proc-v2, Region Assignment >Reporter: Duo Zhang >Assignee: Duo Zhang >Priority: Major > Attachments: HBASE-20847-v1.patch, HBASE-20847-v2.patch, > HBASE-20847.patch > > > For example, SCP can also schedule an AssignProcedure, and obviously it will not > hold the table lock. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
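The Java behaviour referenced in the comment — releasing a write lock while still holding the read lock, i.e. lock downgrading — is directly supported by `java.util.concurrent`. A minimal demonstration (illustrative only; HBase's procedure locks are its own implementation, not `ReentrantReadWriteLock`):

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class LockDowngradeSketch {
    public static void main(String[] args) {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

        lock.writeLock().lock();   // exclusive access, like the parent procedure's lock
        lock.readLock().lock();    // downgrade: acquire the read lock while still writing
        lock.writeLock().unlock(); // release the write lock first...

        System.out.println(lock.getReadLockCount()); // 1: the read lock survives

        lock.readLock().unlock();  // ...then release the read lock
        System.out.println(lock.getReadLockCount()); // 0
    }
}
```

The reverse, upgrading from a read lock to a write lock without releasing first, is what `ReentrantReadWriteLock` forbids (it deadlocks), so dropping the "parent holds until children finish" assumption stays within what the JDK's lock semantics allow.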
[jira] [Commented] (HBASE-20860) Merged region's RIT state may not be cleaned after master restart
[ https://issues.apache.org/jira/browse/HBASE-20860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16539451#comment-16539451 ] Allan Yang commented on HBASE-20860: {code} Hard part is figuring if MTRP is on-going {code} Yes, it is hard to know there is an MTRP for those regions when starting. {code} Would it be cleaner calling removeFromOfflineRegions inside in markRegionAsMerged rather than after markRegionAsMerged in MergeTableRegionsProcedure? {code} Modified the patch as you advised. Thanks for reviewing, [~stack]. > Merged region's RIT state may not be cleaned after master restart > - > > Key: HBASE-20860 > URL: https://issues.apache.org/jira/browse/HBASE-20860 > Project: HBase > Issue Type: Sub-task >Affects Versions: 3.0.0, 2.1.0, 2.0.1 >Reporter: Allan Yang >Assignee: Allan Yang >Priority: Major > Fix For: 3.0.0, 2.1.0, 2.0.2 > > Attachments: HBASE-20860.branch-2.0.002.patch, > HBASE-20860.branch-2.0.003.patch, HBASE-20860.branch-2.0.004.patch, > HBASE-20860.branch-2.0.005.patch, HBASE-20860.branch-2.0.patch > > > In MergeTableRegionsProcedure, we issue UnassignProcedures to offline the regions > to merge. But if we restart the master just after MergeTableRegionsProcedure > has finished these two UnassignProcedures and before it can delete their meta > entries, the new master will find these two regions are CLOSED but no > procedures are attached to them. They will be regarded as RIT regions, and > nobody will clean the RIT state for them later. > A quick way to resolve this stuck situation in a production env is > restarting the master again, since the meta entries are deleted in > MergeTableRegionsProcedure. Here, I offer a fix for this problem. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-20860) Merged region's RIT state may not be cleaned after master restart
[ https://issues.apache.org/jira/browse/HBASE-20860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allan Yang updated HBASE-20860: --- Attachment: HBASE-20860.branch-2.0.005.patch > Merged region's RIT state may not be cleaned after master restart > - > > Key: HBASE-20860 > URL: https://issues.apache.org/jira/browse/HBASE-20860 > Project: HBase > Issue Type: Sub-task >Affects Versions: 3.0.0, 2.1.0, 2.0.1 >Reporter: Allan Yang >Assignee: Allan Yang >Priority: Major > Fix For: 3.0.0, 2.1.0, 2.0.2 > > Attachments: HBASE-20860.branch-2.0.002.patch, > HBASE-20860.branch-2.0.003.patch, HBASE-20860.branch-2.0.004.patch, > HBASE-20860.branch-2.0.005.patch, HBASE-20860.branch-2.0.patch > > > In MergeTableRegionsProcedure, we issue UnassignProcedures to offline the regions > to merge. But if we restart the master just after MergeTableRegionsProcedure > has finished these two UnassignProcedures and before it can delete their meta > entries, the new master will find these two regions are CLOSED but no > procedures are attached to them. They will be regarded as RIT regions, and > nobody will clean the RIT state for them later. > A quick way to resolve this stuck situation in a production env is > restarting the master again, since the meta entries are deleted in > MergeTableRegionsProcedure. Here, I offer a fix for this problem. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-20697) Can't cache All region locations of the specify table by calling table.getRegionLocator().getAllRegionLocations()
[ https://issues.apache.org/jira/browse/HBASE-20697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhaoyuan updated HBASE-20697: - Affects Version/s: 2.0.1 > Can't cache All region locations of the specify table by calling > table.getRegionLocator().getAllRegionLocations() > - > > Key: HBASE-20697 > URL: https://issues.apache.org/jira/browse/HBASE-20697 > Project: HBase > Issue Type: Bug >Affects Versions: 1.3.1, 1.2.6, 2.0.1 >Reporter: zhaoyuan >Assignee: zhaoyuan >Priority: Major > Fix For: 2.1.0, 1.2.7, 1.3.3, 1.4.6, 2.0.2 > > Attachments: HBASE-20697.branch-1.2.001.patch, > HBASE-20697.branch-1.2.002.patch, HBASE-20697.branch-1.2.003.patch, > HBASE-20697.branch-1.2.004.patch, HBASE-20697.master.001.patch, > HBASE-20697.master.002.patch, HBASE-20697.master.002.patch, > HBASE-20697.master.003.patch > > > When we upgrade and restart a new version of an application which reads and > writes to HBase, we get some operation timeouts. The timeouts are expected > because when the application restarts, it does not hold any region location > cache and must communicate with ZK and the meta regionserver to get region > locations. > We want to avoid these timeouts, so we do warmup work; as far as I am > concerned, the method table.getRegionLocator().getAllRegionLocations() should > fetch all region locations and cache them. However, it didn't work well. > There were still a lot of timeouts, which confused me. 
> I dug into the source code and found the following:
> {code:java}
> public List<HRegionLocation> getAllRegionLocations() throws IOException {
>   TableName tableName = getName();
>   NavigableMap<HRegionInfo, ServerName> locations =
>       MetaScanner.allTableRegions(this.connection, tableName);
>   ArrayList<HRegionLocation> regions = new ArrayList<>(locations.size());
>   for (Entry<HRegionInfo, ServerName> entry : locations.entrySet()) {
>     regions.add(new HRegionLocation(entry.getKey(), entry.getValue()));
>   }
>   if (regions.size() > 0) {
>     connection.cacheLocation(tableName, new RegionLocations(regions));
>   }
>   return regions;
> }
>
> // In MetaCache
> public void cacheLocation(final TableName tableName, final RegionLocations locations) {
>   byte[] startKey = locations.getRegionLocation().getRegionInfo().getStartKey();
>   ConcurrentMap<byte[], RegionLocations> tableLocations = getTableLocations(tableName);
>   RegionLocations oldLocation = tableLocations.putIfAbsent(startKey, locations);
>   boolean isNewCacheEntry = (oldLocation == null);
>   if (isNewCacheEntry) {
>     if (LOG.isTraceEnabled()) {
>       LOG.trace("Cached location: " + locations);
>     }
>     addToCachedServers(locations);
>     return;
>   }
> {code}
> It will collect all regions into one RegionLocations object and only cache the first non-null region location, and then when we put or get to hbase, we do getCachedLocation():
> {code:java}
> public RegionLocations getCachedLocation(final TableName tableName, final byte[] row) {
>   ConcurrentNavigableMap<byte[], RegionLocations> tableLocations = getTableLocations(tableName);
>   Entry<byte[], RegionLocations> e = tableLocations.floorEntry(row);
>   if (e == null) {
>     if (metrics != null) metrics.incrMetaCacheMiss();
>     return null;
>   }
>   RegionLocations possibleRegion = e.getValue();
>   // make sure that the end key is greater than the row we're looking
>   // for, otherwise the row actually belongs in the next region, not
>   // this one. the exception case is when the endkey is
>   // HConstants.EMPTY_END_ROW, signifying that the region we're
>   // checking is actually the last region in the table. 
>   byte[] endKey = possibleRegion.getRegionLocation().getRegionInfo().getEndKey();
>   if (Bytes.equals(endKey, HConstants.EMPTY_END_ROW) ||
>       getRowComparator(tableName).compareRows(
>           endKey, 0, endKey.length, row, 0, row.length) > 0) {
>     if (metrics != null) metrics.incrMetaCacheHit();
>     return possibleRegion;
>   }
>   // Passed all the way through, so we got nothing - complete cache miss
>   if (metrics != null) metrics.incrMetaCacheMiss();
>   return null;
> }
> {code}
> It will choose the first location as possibleRegion, and possibly it will mismatch.
> So did I forget something, or am I wrong somewhere? If this is indeed a bug, I think it is not very hard to fix.
> Hope committers and the PMC review this!
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
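The miss the reporter describes can be reproduced with a toy model of the MetaCache lookup: caching only one entry keyed by the first region's start key makes `floorEntry` find that entry for every row, and the end-key validity check then rejects rows belonging to later regions. This is a simplified stand-in using strings, not HBase code:

```java
import java.util.Map;
import java.util.TreeMap;

public class MetaCacheMissSketch {
    // Simplified model of the MetaCache lookup: the cache maps a region start key
    // to that region's end key (the real code maps byte[] to RegionLocations).
    static boolean cachedHit(TreeMap<String, String> cache, String row) {
        Map.Entry<String, String> e = cache.floorEntry(row);
        if (e == null) {
            return false; // no entry at or below this row: cache miss
        }
        // Mirror of getCachedLocation(): the cached region's end key must be
        // greater than the row (an empty end key means "last region in the table").
        String endKey = e.getValue();
        return endKey.isEmpty() || endKey.compareTo(row) > 0;
    }

    public static void main(String[] args) {
        TreeMap<String, String> cache = new TreeMap<>();
        // Table regions: [a,f), [f,m), [m,"") -- but getAllRegionLocations()
        // caches a single entry keyed by the FIRST region's start key only.
        cache.put("a", "f");

        System.out.println(cachedHit(cache, "b"));    // true: row in the first region
        System.out.println(cachedHit(cache, "moon")); // false: later regions were never cached
    }
}
```

Caching one entry per region start key instead of one entry for the whole table would make the second lookup a hit, which matches the direction of the attached patches.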
[jira] [Updated] (HBASE-20697) Can't cache All region locations of the specify table by calling table.getRegionLocator().getAllRegionLocations()
[ https://issues.apache.org/jira/browse/HBASE-20697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhaoyuan updated HBASE-20697: - Fix Version/s: 2.0.2 1.4.6 2.1.0 > Can't cache All region locations of the specify table by calling > table.getRegionLocator().getAllRegionLocations() > - > > Key: HBASE-20697 > URL: https://issues.apache.org/jira/browse/HBASE-20697 > Project: HBase > Issue Type: Bug >Affects Versions: 1.3.1, 1.2.6 >Reporter: zhaoyuan >Assignee: zhaoyuan >Priority: Major > Fix For: 2.1.0, 1.2.7, 1.3.3, 1.4.6, 2.0.2 > > Attachments: HBASE-20697.branch-1.2.001.patch, > HBASE-20697.branch-1.2.002.patch, HBASE-20697.branch-1.2.003.patch, > HBASE-20697.branch-1.2.004.patch, HBASE-20697.master.001.patch, > HBASE-20697.master.002.patch, HBASE-20697.master.002.patch, > HBASE-20697.master.003.patch > > > When we upgrade and restart a new version of an application which reads and > writes to HBase, we get some operation timeouts. The timeouts are expected > because when the application restarts, it does not hold any region location > cache and must communicate with ZK and the meta regionserver to get region > locations. > We want to avoid these timeouts, so we do warmup work; as far as I am > concerned, the method table.getRegionLocator().getAllRegionLocations() should > fetch all region locations and cache them. However, it didn't work well. > There were still a lot of timeouts, which confused me. 
> I dug into the source code and found the following:
> {code:java}
> public List<HRegionLocation> getAllRegionLocations() throws IOException {
>   TableName tableName = getName();
>   NavigableMap<HRegionInfo, ServerName> locations =
>       MetaScanner.allTableRegions(this.connection, tableName);
>   ArrayList<HRegionLocation> regions = new ArrayList<>(locations.size());
>   for (Entry<HRegionInfo, ServerName> entry : locations.entrySet()) {
>     regions.add(new HRegionLocation(entry.getKey(), entry.getValue()));
>   }
>   if (regions.size() > 0) {
>     connection.cacheLocation(tableName, new RegionLocations(regions));
>   }
>   return regions;
> }
>
> // In MetaCache
> public void cacheLocation(final TableName tableName, final RegionLocations locations) {
>   byte[] startKey = locations.getRegionLocation().getRegionInfo().getStartKey();
>   ConcurrentMap<byte[], RegionLocations> tableLocations = getTableLocations(tableName);
>   RegionLocations oldLocation = tableLocations.putIfAbsent(startKey, locations);
>   boolean isNewCacheEntry = (oldLocation == null);
>   if (isNewCacheEntry) {
>     if (LOG.isTraceEnabled()) {
>       LOG.trace("Cached location: " + locations);
>     }
>     addToCachedServers(locations);
>     return;
>   }
> {code}
> It will collect all regions into one RegionLocations object and only cache the first non-null region location, and then when we put or get to hbase, we do getCachedLocation():
> {code:java}
> public RegionLocations getCachedLocation(final TableName tableName, final byte[] row) {
>   ConcurrentNavigableMap<byte[], RegionLocations> tableLocations = getTableLocations(tableName);
>   Entry<byte[], RegionLocations> e = tableLocations.floorEntry(row);
>   if (e == null) {
>     if (metrics != null) metrics.incrMetaCacheMiss();
>     return null;
>   }
>   RegionLocations possibleRegion = e.getValue();
>   // make sure that the end key is greater than the row we're looking
>   // for, otherwise the row actually belongs in the next region, not
>   // this one. the exception case is when the endkey is
>   // HConstants.EMPTY_END_ROW, signifying that the region we're
>   // checking is actually the last region in the table. 
>   byte[] endKey = possibleRegion.getRegionLocation().getRegionInfo().getEndKey();
>   if (Bytes.equals(endKey, HConstants.EMPTY_END_ROW) ||
>       getRowComparator(tableName).compareRows(
>           endKey, 0, endKey.length, row, 0, row.length) > 0) {
>     if (metrics != null) metrics.incrMetaCacheHit();
>     return possibleRegion;
>   }
>   // Passed all the way through, so we got nothing - complete cache miss
>   if (metrics != null) metrics.incrMetaCacheMiss();
>   return null;
> }
> {code}
> It will choose the first location as possibleRegion, and possibly it will mismatch.
> So did I forget something, or am I wrong somewhere? If this is indeed a bug, I think it is not very hard to fix.
> Hope committers and the PMC review this!
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20866) HBase 1.x scan performance degradation compared to 0.98 version
[ https://issues.apache.org/jira/browse/HBASE-20866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16539439#comment-16539439 ] Vladimir Rodionov commented on HBASE-20866: --- {quote} Looking forward to the patch. {quote} That sounds quite optimistic, [~yuzhih...@gmail.com]. Getting back 0.98 perf (which, in turn, is worse than 0.94) will require a lot of patches, I presume. > HBase 1.x scan performance degradation compared to 0.98 version > --- > > Key: HBASE-20866 > URL: https://issues.apache.org/jira/browse/HBASE-20866 > Project: HBase > Issue Type: Bug >Affects Versions: 1.3.2 >Reporter: Vikas Vishwakarma >Assignee: Vikas Vishwakarma >Priority: Critical > Fix For: 1.5.0, 1.2.7, 1.3.3, 1.4.6 > > > Internally, while testing 1.3 as part of our migration from 0.98 to 1.3, we > observed degradation in scan performance for Phoenix queries varying from a > few tens of percent up to 200%, depending on the query being executed. We tried a > simple native HBase scan, and there also we saw up to 40% degradation in > performance when the number of column qualifiers is high (40-50+). > To identify the root cause of the performance diff between 0.98 and 1.3, we > carried out a lot of experiments with profiling and git bisect iterations; > however, we were not able to identify any particular source of scan > performance degradation, and it looked like this is an accumulated degradation > of 5-10% over various enhancements and refactorings. > We identified a few major enhancements, like partialResult handling, > ScannerContext with heartbeat processing, time/size limiting, and RPC > refactoring, each of which could have contributed a small degradation in > performance, which put together could lead to a large overall degradation. > One of the changes is > [HBASE-11544|https://jira.apache.org/jira/browse/HBASE-11544], which > implements partialResult handling. 
In ClientScanner.java, the results received from the server are cached on the client side by converting the result array into an ArrayList. This function gets called in a loop, depending on the number of rows in the scan result. For example, for tens of millions of rows scanned, it can be called on the order of millions of times.
> In almost all cases (99% of the time, except for handling partial results, etc.) we are just taking resultsFromServer, converting it into an ArrayList resultsToAddToCache in addResultsToList(..), and then iterating over the list again and adding it to the cache in loadCache(..), as given in the code path below:
> In ClientScanner → loadCache(..) → getResultsToAddToCache(..) → addResultsToList(..)
> {code:java}
> loadCache() {
>   ...
>   List<Result> resultsToAddToCache =
>       getResultsToAddToCache(values, callable.isHeartbeatMessage());
>   ...
>   for (Result rs : resultsToAddToCache) {
>     rs = filterLoadedCell(rs);
>     cache.add(rs);
>     ...
>   }
> }
>
> getResultsToAddToCache(..) {
>   ..
>   final boolean isBatchSet = scan != null && scan.getBatch() > 0;
>   final boolean allowPartials = scan != null && scan.getAllowPartialResults();
>   ..
>   if (allowPartials || isBatchSet) {
>     addResultsToList(resultsToAddToCache, resultsFromServer, 0,
>         (null == resultsFromServer ? 0 : resultsFromServer.length));
>     return resultsToAddToCache;
>   }
>   ...
> }
>
> private void addResultsToList(List<Result> outputList, Result[] inputArray,
>     int start, int end) {
>   if (inputArray == null || start < 0 || end > inputArray.length) return;
>   for (int i = start; i < end; i++) {
>     outputList.add(inputArray[i]);
>   }
> }
> {code}
>
> It looks like we can avoid the result-array-to-ArrayList conversion (resultsFromServer --> resultsToAddToCache) for the first case, which is also the most frequent case, and instead directly take the values array returned by the callable and add it to the cache without converting it into an ArrayList. 
> I have taken both of these flags, allowPartials and isBatchSet, out in loadCache(), and I directly add values to the scanner cache if the above condition passes, instead of converting them into an ArrayList by calling getResultsToAddToCache(). For example:
> {code:java}
> protected void loadCache() throws IOException {
>   Result[] values = null;
>   ..
>   final boolean isBatchSet = scan != null && scan.getBatch() > 0;
>   final boolean allowPartials = scan != null && scan.getAllowPartialResults();
>   ..
>   for (;;) {
>     try {
>       values = call(callable, caller, scannerTimeout);
>       ..
>     } catch (DoNotRetryIOException | NeedUnmanagedConnectionException e) {
>       ..
>     }
>     if (allowPartials || isBatchSet) { // DIRECTLY COPY values TO CACHE
>       if (values != null) {
>         for (int v = 0; v < values.length; v++) {
>           Result rs = values[v];
>           cache.add(rs);
>           ...
>     } else { // DO ALL THE REGULAR
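The allocation being eliminated can be shown in isolation. This is a simplified stand-in using strings rather than HBase `Result` objects; it contrasts the two-pass copy through an intermediate list with a direct one-pass copy from the RPC result array into the client cache:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

public class ScanCacheCopySketch {
    // Existing path: RPC result array -> intermediate ArrayList -> client cache.
    // Two passes over the results plus one list allocation per loadCache() call.
    static Queue<String> viaIntermediateList(String[] valuesFromServer) {
        Queue<String> cache = new ArrayDeque<>();
        List<String> resultsToAddToCache = new ArrayList<>(valuesFromServer.length);
        for (String r : valuesFromServer) {
            resultsToAddToCache.add(r);
        }
        for (String r : resultsToAddToCache) {
            cache.add(r);
        }
        return cache;
    }

    // Proposed path for the common (non-partial, non-batch) case: copy straight
    // from the result array into the cache, skipping the intermediate list.
    static Queue<String> direct(String[] valuesFromServer) {
        Queue<String> cache = new ArrayDeque<>();
        for (String r : valuesFromServer) {
            cache.add(r);
        }
        return cache;
    }

    public static void main(String[] args) {
        String[] values = { "row1", "row2", "row3" };
        System.out.println(new ArrayList<>(viaIntermediateList(values))); // [row1, row2, row3]
        System.out.println(new ArrayList<>(direct(values)));              // [row1, row2, row3]
    }
}
```

Per call the saving is small, but as the report notes the method runs once per RPC batch, so over tens of millions of rows the avoided allocation and extra pass add up.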
[jira] [Commented] (HBASE-20649) Validate HFiles do not have PREFIX_TREE DataBlockEncoding
[ https://issues.apache.org/jira/browse/HBASE-20649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16539423#comment-16539423 ] Duo Zhang commented on HBASE-20649: --- I think the tool is a good start. We can add the steps to our ref guide for operators. Maybe an improvement could be to output the suggested operations at the end, for example, which tables need to be major compacted, and which snapshots are dirty and need to be dropped or reconstructed. We can do this in a follow-on issue. Thanks. > Validate HFiles do not have PREFIX_TREE DataBlockEncoding > - > > Key: HBASE-20649 > URL: https://issues.apache.org/jira/browse/HBASE-20649 > Project: HBase > Issue Type: New Feature >Reporter: Peter Somogyi >Assignee: Balazs Meszaros >Priority: Minor > Fix For: 3.0.0 > > Attachments: HBASE-20649.master.001.patch, > HBASE-20649.master.002.patch, HBASE-20649.master.003.patch, > HBASE-20649.master.004.patch, HBASE-20649.master.005.patch > > > HBASE-20592 adds a tool to check that column families on the cluster do not have > PREFIX_TREE encoding. > Since it is possible that the DataBlockEncoding was already changed but the HFiles > are not yet rewritten, we would need a tool that can verify the content of > hfiles in the cluster. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-20651) Master, prevents hbck or shell command to reassign the split parent region
[ https://issues.apache.org/jira/browse/HBASE-20651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] huaxiang sun updated HBASE-20651: - Resolution: Fixed Fix Version/s: 1.5.0 Status: Resolved (was: Patch Available) Thanks [~esteban] and [~mdrob] for the review, I pushed the patch to branch-1. > Master, prevents hbck or shell command to reassign the split parent region > -- > > Key: HBASE-20651 > URL: https://issues.apache.org/jira/browse/HBASE-20651 > Project: HBase > Issue Type: Improvement > Components: master >Affects Versions: 1.2.6 >Reporter: huaxiang sun >Assignee: huaxiang sun >Priority: Minor > Fix For: 1.5.0 > > Attachments: HBASE-20651-branch-1-v001.patch, > HBASE-20651-branch-1-v002.patch, HBASE-20651-branch-1-v003.patch > > > We are seeing that hbck brings back the split parent region, and this causes > region inconsistency. More details will be filled in, as the reproduction is still > ongoing. We might need to do something in hbck or the master to prevent this from > happening. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20771) PUT operation fail with "No server address listed in hbase:meta for region xxxxx"
[ https://issues.apache.org/jira/browse/HBASE-20771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16539337#comment-16539337 ] Hudson commented on HBASE-20771: Results for branch branch-1.3 [build #388 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/388/]: (x) *{color:red}-1 overall{color}* details (if available): (x) {color:red}-1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/388//General_Nightly_Build_Report/] (x) {color:red}-1 jdk7 checks{color} -- For more information [see jdk7 report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/388//JDK7_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.3/388//JDK8_Nightly_Build_Report_(Hadoop2)/] (x) {color:red}-1 source release artifact{color} -- See build output for details. > PUT operation fail with "No server address listed in hbase:meta for region > x" > - > > Key: HBASE-20771 > URL: https://issues.apache.org/jira/browse/HBASE-20771 > Project: HBase > Issue Type: Bug > Components: Client >Affects Versions: 1.5.0 >Reporter: Pankaj Kumar >Assignee: Pankaj Kumar >Priority: Major > Fix For: 1.5.0, 1.2.7, 1.3.3, 1.4.6 > > Attachments: HBASE-20771.branch-1.001.patch, > HBASE-20771.branch-1.002.patch > > > 1) Create a table with 1 region
> 2) AM sends an RPC to the RS to open the region,
> {code}
> // invoke assignment (async)
> ArrayList<HRegionInfo> userRegionSet = new ArrayList<HRegionInfo>(regions);
> for (Map.Entry<ServerName, List<HRegionInfo>> plan: bulkPlan.entrySet()) {
>   if (!assign(plan.getKey(), plan.getValue())) {
>     for (HRegionInfo region: plan.getValue()) {
>       if (!regionStates.isRegionOnline(region)) {
>         invokeAssign(region);
>         if (!region.getTable().isSystemTable()) {
>           userRegionSet.add(region);
>         }
>       }
>     }
>   }
> }
> {code}
> In the above code, if assignment fails (due to some problem) then the AM will submit > 
the assign request to the thread pool and wait for some duration (here it will wait for 20 sec; the region count is 1).
> 3) After 20 sec, CreateTableProcedure will set the table state to ENABLED in ZK and finish the procedure.
> {code}
> // Mark the table as Enabling
> assignmentManager.getTableStateManager().setTableState(tableName,
>     ZooKeeperProtos.Table.State.ENABLING);
> // Trigger immediate assignment of the regions in round-robin fashion
> ModifyRegionUtils.assignRegions(assignmentManager, regions);
> // Enable table
> assignmentManager.getTableStateManager()
>     .setTableState(tableName, ZooKeeperProtos.Table.State.ENABLED);
> {code}
> 4) At the client side, CreateTableFuture.waitProcedureResult(...) is waiting for the procedure to finish,
> {code}
> // If the procedure is no longer running, we should have a result
> if (response != null && response.getState() !=
>     GetProcedureResultResponse.State.RUNNING) {
>   procResultFound = response.getState() !=
>       GetProcedureResultResponse.State.NOT_FOUND;
>   return convertResult(response);
> }
> {code}
> Here we wait for the operation result only when the procedure result is not found, but in this scenario that is wrong because the region assignment didn't complete,
> {code}
> // if we don't have a proc result, try the compatibility wait
> if (!procResultFound) {
>   result = waitOperationResult(deadlineTs);
> }
> {code}
> Since HBaseAdmin didn't wait for the operation result (successful region assignment), a client PUT operation issued before the region is successfully opened will fail, because the "info:server" entry won't be there in meta yet. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
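The client-side behaviour the report argues for, falling back to a wait on the actual region assignment rather than trusting the procedure result alone, amounts to a bounded poll on meta's "info:server" column. A toy sketch of that wait (the `SERVER_IN_META` field is a hypothetical stand-in for a meta lookup, not an HBase API):

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

public class WaitForAssignmentSketch {
    // Hypothetical stand-in for the "info:server" column of one region in meta.
    static final AtomicReference<String> SERVER_IN_META = new AtomicReference<>(null);

    // Poll until the region has a server address in meta, or the deadline passes.
    static boolean waitUntilAssigned(long deadlineMs) throws InterruptedException {
        while (SERVER_IN_META.get() == null && System.currentTimeMillis() < deadlineMs) {
            TimeUnit.MILLISECONDS.sleep(10);
        }
        return SERVER_IN_META.get() != null;
    }

    public static void main(String[] args) throws Exception {
        // Simulate the delayed assignment completing on another thread, as when
        // the AM's async invokeAssign() finishes after the procedure has ended.
        new Thread(() -> {
            try {
                TimeUnit.MILLISECONDS.sleep(50);
            } catch (InterruptedException ignored) {
            }
            SERVER_IN_META.set("rs1.example.com,16020");
        }).start();

        boolean assigned = waitUntilAssigned(System.currentTimeMillis() + 5000);
        System.out.println(assigned); // true once "info:server" appears
    }
}
```

With such a wait in place, the first PUT only proceeds once a server address exists for the region, avoiding the "No server address listed in hbase:meta" failure.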
[jira] [Commented] (HBASE-20649) Validate HFiles do not have PREFIX_TREE DataBlockEncoding
[ https://issues.apache.org/jira/browse/HBASE-20649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16539329#comment-16539329 ] Zach York commented on HBASE-20649: --- Trying to get up to speed on this all. Overall looks like a handy upgrade tool! [~busbey] Your steps are what we want to document as an operator? It would be awesome if we could provide more info when running the specific tool (if it fails in root dir, suggest trying a major compaction if data encoding for the table is correct. If it fails in archive dir, see if any Snapshots reference these files). Could we have a tool/script to help automate determining which snapshot is 'dirty' and help to automatically clean it? It just seems like a lot of manual steps to get your cluster upgrade ready (imagine if you had a number of incremental snapshots). > Validate HFiles do not have PREFIX_TREE DataBlockEncoding > - > > Key: HBASE-20649 > URL: https://issues.apache.org/jira/browse/HBASE-20649 > Project: HBase > Issue Type: New Feature >Reporter: Peter Somogyi >Assignee: Balazs Meszaros >Priority: Minor > Fix For: 3.0.0 > > Attachments: HBASE-20649.master.001.patch, > HBASE-20649.master.002.patch, HBASE-20649.master.003.patch, > HBASE-20649.master.004.patch, HBASE-20649.master.005.patch > > > HBASE-20592 adds a tool to check column families on the cluster do not have > PREFIX_TREE encoding. > Since it is possible that DataBlockEncoding was already changed but HFiles > are not rewritten yet we would need a tool that can verify the content of > hfiles in the cluster. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
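The triage asked for above (determine which snapshot is "dirty", i.e. still references a PREFIX_TREE-encoded file) reduces to a join between per-file encodings and per-snapshot file lists. The sketch below is illustrative only, assuming the encodings have already been read out of the HFile trailers; `PrefixTreeTriage` and `dirtySnapshots` are not the pre-upgrade tool's API.

```java
import java.util.*;
import java.util.stream.Collectors;

// Hypothetical helper: flag PREFIX_TREE files and report which snapshots
// still reference them, so an operator knows what to clean before upgrade.
public class PrefixTreeTriage {
    public static Map<String, List<String>> dirtySnapshots(
            Map<String, String> fileToEncoding,
            Map<String, Set<String>> snapshotToFiles) {
        // Files whose data block encoding is still PREFIX_TREE.
        Set<String> offenders = fileToEncoding.entrySet().stream()
            .filter(e -> "PREFIX_TREE".equals(e.getValue()))
            .map(Map.Entry::getKey)
            .collect(Collectors.toSet());
        // Snapshot -> the offending files it pins in the archive dir.
        Map<String, List<String>> dirty = new TreeMap<>();
        for (Map.Entry<String, Set<String>> s : snapshotToFiles.entrySet()) {
            List<String> hits = s.getValue().stream()
                .filter(offenders::contains)
                .sorted()
                .collect(Collectors.toList());
            if (!hits.isEmpty()) {
                dirty.put(s.getKey(), hits);
            }
        }
        return dirty;
    }

    public static void main(String[] args) {
        Map<String, String> enc = new HashMap<>();
        enc.put("hfile-a", "FAST_DIFF");
        enc.put("hfile-b", "PREFIX_TREE");
        Map<String, Set<String>> snaps = new HashMap<>();
        snaps.put("snap-1", new HashSet<>(Arrays.asList("hfile-a")));
        snaps.put("snap-2", new HashSet<>(Arrays.asList("hfile-a", "hfile-b")));
        System.out.println(dirtySnapshots(enc, snaps)); // {snap-2=[hfile-b]}
    }
}
```

An empty result means a major compaction plus snapshot cleanup has removed every offender.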
[jira] [Commented] (HBASE-20771) PUT operation fail with "No server address listed in hbase:meta for region xxxxx"
[ https://issues.apache.org/jira/browse/HBASE-20771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16539319#comment-16539319 ] Hudson commented on HBASE-20771: Results for branch branch-1.4 [build #381 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/381/]: (x) *{color:red}-1 overall{color}* details (if available): (x) {color:red}-1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/381//General_Nightly_Build_Report/] (x) {color:red}-1 jdk7 checks{color} -- For more information [see jdk7 report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/381//JDK7_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/381//JDK8_Nightly_Build_Report_(Hadoop2)/] (x) {color:red}-1 source release artifact{color} -- See build output for details. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20557) Backport HBASE-17215 to branch-1
[ https://issues.apache.org/jira/browse/HBASE-20557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16539317#comment-16539317 ] Hudson commented on HBASE-20557: Results for branch branch-1.4 [build #381 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/381/]: (x) *{color:red}-1 overall{color}* details (if available): (x) {color:red}-1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/381//General_Nightly_Build_Report/] (x) {color:red}-1 jdk7 checks{color} -- For more information [see jdk7 report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/381//JDK7_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/381//JDK8_Nightly_Build_Report_(Hadoop2)/] (x) {color:red}-1 source release artifact{color} -- See build output for details. > Backport HBASE-17215 to branch-1 > > > Key: HBASE-20557 > URL: https://issues.apache.org/jira/browse/HBASE-20557 > Project: HBase > Issue Type: Sub-task > Components: HFile, master >Affects Versions: 1.4.4, 1.4.5 >Reporter: Tak Lon (Stephen) Wu >Assignee: Tak Lon (Stephen) Wu >Priority: Major > Fix For: 1.5.0, 1.4.6 > > Attachments: HBASE-20557.branch-1.001.patch, > HBASE-20557.branch-1.002.patch, HBASE-20557.branch-1.003.patch > > > As part of HBASE-20555, HBASE-17215 is the second patch that is needed for > backporting HBASE-18083 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20806) Split style journal for flushes and compactions
[ https://issues.apache.org/jira/browse/HBASE-20806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16539316#comment-16539316 ] Hudson commented on HBASE-20806: Results for branch branch-1.4 [build #381 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/381/]: (x) *{color:red}-1 overall{color}* details (if available): (x) {color:red}-1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/381//General_Nightly_Build_Report/] (x) {color:red}-1 jdk7 checks{color} -- For more information [see jdk7 report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/381//JDK7_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/381//JDK8_Nightly_Build_Report_(Hadoop2)/] (x) {color:red}-1 source release artifact{color} -- See build output for details. > Split style journal for flushes and compactions > --- > > Key: HBASE-20806 > URL: https://issues.apache.org/jira/browse/HBASE-20806 > Project: HBase > Issue Type: Improvement >Reporter: Abhishek Singh Chouhan >Assignee: Abhishek Singh Chouhan >Priority: Minor > Fix For: 3.0.0, 2.1.0, 1.5.0, 1.2.7, 1.3.3, 1.4.6, 2.0.2 > > Attachments: HBASE-20806.branch-1.001.patch, > HBASE-20806.branch-1.002.patch, HBASE-20806.branch-1.003.patch, > HBASE-20806.branch-2.001.patch, HBASE-20806.master.001.patch, > HBASE-20806.master.002.patch, HBASE-20806.master.003.patch > > > In 1.x we have split transaction journal that gives a clear picture of when > various stages of splits took place. We should have a similar thing for > flushes and compactions so as to have insights into time spent in various > stages, which we can use to identify regressions that might creep up. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
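The improvement proposed above, recording a timestamp at each stage of a flush or compaction and reporting the deltas, can be sketched in a few lines. `StageJournal` and its method names are illustrative, not the patch's actual classes.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal split-transaction-style journal for flushes and compactions:
// record (stage, timestamp) pairs, then render the time spent per stage.
public class StageJournal {
    private static final class Entry {
        final String stage;
        final long timestampMs;
        Entry(String stage, long timestampMs) {
            this.stage = stage;
            this.timestampMs = timestampMs;
        }
    }

    private final List<Entry> entries = new ArrayList<>();

    public void record(String stage, long timestampMs) {
        entries.add(new Entry(stage, timestampMs));
    }

    // Durations between consecutive stages: the insight the journal is
    // meant to give into where a flush or compaction spent its time.
    public String report() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < entries.size(); i++) {
            Entry e = entries.get(i);
            long delta = i == 0 ? 0 : e.timestampMs - entries.get(i - 1).timestampMs;
            sb.append(e.stage).append(" (+").append(delta).append("ms)");
            if (i < entries.size() - 1) {
                sb.append(" -> ");
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        StageJournal j = new StageJournal();
        j.record("FLUSH_START", 0);
        j.record("SNAPSHOT_MEMSTORE", 5);
        j.record("FLUSH_FINISH", 125);
        System.out.println(j.report());
        // FLUSH_START (+0ms) -> SNAPSHOT_MEMSTORE (+5ms) -> FLUSH_FINISH (+120ms)
    }
}
```

A regression then shows up as one stage's delta growing between releases.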
[jira] [Commented] (HBASE-17215) Separate small/large file delete threads in HFileCleaner to accelerate archived hfile cleanup speed
[ https://issues.apache.org/jira/browse/HBASE-17215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16539318#comment-16539318 ] Hudson commented on HBASE-17215: Results for branch branch-1.4 [build #381 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/381/]: (x) *{color:red}-1 overall{color}* details (if available): (x) {color:red}-1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/381//General_Nightly_Build_Report/] (x) {color:red}-1 jdk7 checks{color} -- For more information [see jdk7 report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/381//JDK7_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.4/381//JDK8_Nightly_Build_Report_(Hadoop2)/] (x) {color:red}-1 source release artifact{color} -- See build output for details. > Separate small/large file delete threads in HFileCleaner to accelerate > archived hfile cleanup speed > --- > > Key: HBASE-17215 > URL: https://issues.apache.org/jira/browse/HBASE-17215 > Project: HBase > Issue Type: Improvement >Reporter: Yu Li >Assignee: Yu Li >Priority: Major > Fix For: 2.0.0 > > Attachments: HBASE-17215.patch, HBASE-17215.v2.patch, > HBASE-17215.v3.patch > > > When using PCIe-SSD the flush speed will be really quick, and although we > have per CF flush, we still have the > {{hbase.regionserver.optionalcacheflushinterval}} setting and some other > mechanism to avoid data kept in memory for too long to flush small hfiles. In > our online environment we found the single thread cleaner kept cleaning > earlier flushed small files while large files got no chance, which caused > disk full then many other problems. 
> Deleting hfiles in parallel with too many threads will also increase the > workload of the namenode, so here we propose to separate large/small hfile > cleaner threads just like we do for compaction, and it turned out to work > well in our cluster. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
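The separation proposed above amounts to routing each archived file into a small-file or large-file delete queue by a size threshold, so a burst of small flushed files cannot starve deletion of large files (or vice versa). A minimal sketch with illustrative names; the real HFileCleaner takes its threshold from configuration.

```java
import java.util.*;

// Hypothetical partitioning step of a small/large-file cleaner: each queue
// would then be drained by its own delete thread(s).
public class CleanerQueues {
    public static Map<String, List<Long>> partitionBySize(List<Long> fileSizes, long thresholdBytes) {
        Map<String, List<Long>> queues = new HashMap<>();
        queues.put("small", new ArrayList<>());
        queues.put("large", new ArrayList<>());
        for (long size : fileSizes) {
            // Files at or above the threshold go to the large-file queue.
            queues.get(size >= thresholdBytes ? "large" : "small").add(size);
        }
        return queues;
    }

    public static void main(String[] args) {
        Map<String, List<Long>> q =
            partitionBySize(Arrays.asList(1024L, 64L * 1024 * 1024, 2048L), 1L << 20);
        System.out.println(q.get("small").size() + " small, " + q.get("large").size() + " large");
        // 2 small, 1 large
    }
}
```

With separate queues, a flood of tiny flushed files only delays other small deletions, never the large ones.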
[jira] [Commented] (HBASE-20771) PUT operation fail with "No server address listed in hbase:meta for region xxxxx"
[ https://issues.apache.org/jira/browse/HBASE-20771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16539308#comment-16539308 ] Hudson commented on HBASE-20771: Results for branch branch-1.2 [build #392 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.2/392/]: (x) *{color:red}-1 overall{color}* details (if available): (x) {color:red}-1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.2/392//General_Nightly_Build_Report/] (x) {color:red}-1 jdk7 checks{color} -- For more information [see jdk7 report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.2/392//JDK7_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1.2/392//JDK8_Nightly_Build_Report_(Hadoop2)/] (x) {color:red}-1 source release artifact{color} -- See build output for details. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20697) Can't cache All region locations of the specify table by calling table.getRegionLocator().getAllRegionLocations()
[ https://issues.apache.org/jira/browse/HBASE-20697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16539295#comment-16539295 ] Andrew Purtell commented on HBASE-20697: +1 for all of the branch-1s from me > Can't cache All region locations of the specify table by calling > table.getRegionLocator().getAllRegionLocations() > - > > Key: HBASE-20697 > URL: https://issues.apache.org/jira/browse/HBASE-20697 > Project: HBase > Issue Type: Bug >Affects Versions: 1.3.1, 1.2.6 >Reporter: zhaoyuan >Assignee: zhaoyuan >Priority: Major > Fix For: 1.2.7, 1.3.3 > > Attachments: HBASE-20697.branch-1.2.001.patch, > HBASE-20697.branch-1.2.002.patch, HBASE-20697.branch-1.2.003.patch, > HBASE-20697.branch-1.2.004.patch, HBASE-20697.master.001.patch, > HBASE-20697.master.002.patch, HBASE-20697.master.002.patch, > HBASE-20697.master.003.patch > > > When we upgrade and restart a new version of an application which reads and > writes to HBase, we see some operation timeouts. The timeouts are expected > because when the application restarts, it holds no region location cache > and must communicate with ZK and the meta regionserver to get region > locations. > We want to avoid these timeouts, so we do warmup work; as far as I can tell, > the method table.getRegionLocator().getAllRegionLocations() will > fetch all region locations and cache them. However, it didn't work well. > There are still a lot of timeouts, which confused me. 
> I dug into the source code and found the following > {code:java} > public List<HRegionLocation> getAllRegionLocations() throws IOException { > TableName tableName = getName(); > NavigableMap<HRegionInfo, ServerName> locations = > MetaScanner.allTableRegions(this.connection, tableName); > ArrayList<HRegionLocation> regions = new ArrayList<>(locations.size()); > for (Entry<HRegionInfo, ServerName> entry : locations.entrySet()) { > regions.add(new HRegionLocation(entry.getKey(), entry.getValue())); > } > if (regions.size() > 0) { > connection.cacheLocation(tableName, new RegionLocations(regions)); > } > return regions; > } > {code} > In MetaCache: > {code:java} > public void cacheLocation(final TableName tableName, final RegionLocations > locations) { > byte [] startKey = > locations.getRegionLocation().getRegionInfo().getStartKey(); > ConcurrentMap<byte[], RegionLocations> tableLocations = > getTableLocations(tableName); > RegionLocations oldLocation = tableLocations.putIfAbsent(startKey, > locations); > boolean isNewCacheEntry = (oldLocation == null); > if (isNewCacheEntry) { > if (LOG.isTraceEnabled()) { > LOG.trace("Cached location: " + locations); > } > addToCachedServers(locations); > return; > } > {code} > It will collect all regions into one RegionLocations object and only cache > the first non-null region location, and then when we put or get to hbase, we > do getCachedLocation() > {code:java} > public RegionLocations getCachedLocation(final TableName tableName, final > byte [] row) { > ConcurrentNavigableMap<byte[], RegionLocations> tableLocations = > getTableLocations(tableName); > Entry<byte[], RegionLocations> e = tableLocations.floorEntry(row); > if (e == null) { > if (metrics != null) metrics.incrMetaCacheMiss(); > return null; > } > RegionLocations possibleRegion = e.getValue(); > // make sure that the end key is greater than the row we're looking > // for, otherwise the row actually belongs in the next region, not > // this one. the exception case is when the endkey is > // HConstants.EMPTY_END_ROW, signifying that the region we're > // checking is actually the last region in the table. 
> byte[] endKey = > possibleRegion.getRegionLocation().getRegionInfo().getEndKey(); > if (Bytes.equals(endKey, HConstants.EMPTY_END_ROW) || > getRowComparator(tableName).compareRows( > endKey, 0, endKey.length, row, 0, row.length) > 0) { > if (metrics != null) metrics.incrMetaCacheHit(); > return possibleRegion; > } > // Passed all the way through, so we got nothing - complete cache miss > if (metrics != null) metrics.incrMetaCacheMiss(); > return null; > } > {code} > It will take the first cached location as possibleRegion, so the lookup can > mismatch. Did I forget something, or am I wrong somewhere? If this is indeed > a bug, I think it is not very hard to fix. > Hope committers and the PMC review this! > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
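The lookup described above works only if every region is cached under its own start key; caching all regions under the first region's start key makes floorEntry(row) return the wrong entry for most rows. A self-contained illustration of the per-region invariant (names here are stand-ins, not HBase classes; Arrays.compareUnsigned, Java 9+, stands in for Bytes.BYTES_COMPARATOR):

```java
import java.util.Arrays;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Illustrative region cache: one entry per region, keyed by start key,
// so floorEntry(row) yields the region that actually covers the row.
public class RegionCacheDemo {
    public static final class Region {
        public final byte[] startKey;
        public final byte[] endKey;
        public Region(byte[] startKey, byte[] endKey) {
            this.startKey = startKey;
            this.endKey = endKey;
        }
    }

    private final NavigableMap<byte[], Region> cache = new TreeMap<>(Arrays::compareUnsigned);

    public void cacheRegion(Region region) {
        cache.put(region.startKey, region);
    }

    // Mirrors getCachedLocation: floor by row, then verify the end key
    // covers the row (an empty end key marks the table's last region).
    public Region locate(byte[] row) {
        Map.Entry<byte[], Region> entry = cache.floorEntry(row);
        if (entry == null) {
            return null;
        }
        Region region = entry.getValue();
        boolean covers = region.endKey.length == 0
            || Arrays.compareUnsigned(region.endKey, row) > 0;
        return covers ? region : null;
    }

    public static void main(String[] args) {
        RegionCacheDemo demo = new RegionCacheDemo();
        demo.cacheRegion(new Region(new byte[0], "m".getBytes()));  // ["", "m")
        demo.cacheRegion(new Region("m".getBytes(), new byte[0]));  // ["m", "")
        System.out.println(new String(demo.locate("q".getBytes()).startKey)); // m
    }
}
```

If both regions were cached under the empty start key instead, a lookup for "q" would floor to the first region and miss, which is the behavior the reporter observed.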
[jira] [Commented] (HBASE-20868) Fix TestCheckTestClasses on HBASE-18477
[ https://issues.apache.org/jira/browse/HBASE-20868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16539278#comment-16539278 ] Zach York commented on HBASE-20868: --- [~yuzhih...@gmail.com] Can you take a look when you get a chance? It's a simple annotation fix. > Fix TestCheckTestClasses on HBASE-18477 > --- > > Key: HBASE-20868 > URL: https://issues.apache.org/jira/browse/HBASE-20868 > Project: HBase > Issue Type: Sub-task >Affects Versions: HBASE-18477 >Reporter: Zach York >Assignee: Zach York >Priority: Minor > Fix For: HBASE-18477 > > Attachments: HBASE-20868.HBASE-18477.001.patch, > HBASE-20868.HBASE-18477.002.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20868) Fix TestCheckTestClasses on HBASE-18477
[ https://issues.apache.org/jira/browse/HBASE-20868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16539273#comment-16539273 ] Hadoop QA commented on HBASE-20868: --- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} HBASE-18477 Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 37s{color} | {color:green} HBASE-18477 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 30s{color} | {color:green} HBASE-18477 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 24s{color} | {color:green} HBASE-18477 passed {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 0s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 35s{color} | {color:green} HBASE-18477 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s{color} | {color:green} HBASE-18477 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 21s{color} | {color:green} hbase-common: The patch generated 0 new + 3 unchanged - 1 fixed = 3 total (was 4) {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 3m 59s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 9m 38s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 43s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 16s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 34s{color} | {color:green} hbase-common in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 11s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 36m 48s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b | | JIRA Issue | HBASE-20868 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12931068/HBASE-20868.HBASE-18477.002.patch | | Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile | | uname | Linux 6d0f7815e63b 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh | | git revision | HBASE-18477 / c402868642 | | maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) | | Default Java | 1.8.0_171 | | findbugs | v3.1.0-RC3 | | Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/13581/testReport/ | | Max. process+thread count | 325 (vs. ulimit of 1) | | modules | C: hbase-common U: hbase-common | | Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/13581/console | | Powered by | Apache Yetus 0.7.0
[jira] [Commented] (HBASE-20838) Move all setStorage related UT cases from TestFSUtils to TestCommonFSUtils
[ https://issues.apache.org/jira/browse/HBASE-20838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16539248#comment-16539248 ] Hadoop QA commented on HBASE-20838: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:blue}0{color} | {color:blue} shelldocs {color} | {color:blue} 0m 3s{color} | {color:blue} Shelldocs was not available. {color} | | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 23s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 34s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 48s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 56s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 19s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. 
{color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: . {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 34s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 45s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 28s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 53s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 53s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 55s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} shellcheck {color} | {color:red} 0m 0s{color} | {color:red} The patch generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 18s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 10m 20s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: . 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 49s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}229m 51s{color} | {color:red} root in the patch failed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green}201m 22s{color} | {color:green} hbase-server in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 44s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}485m 47s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce
[jira] [Updated] (HBASE-20868) Fix TestCheckTestClasses on HBASE-18477
[ https://issues.apache.org/jira/browse/HBASE-20868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zach York updated HBASE-20868: -- Attachment: HBASE-20868.HBASE-18477.002.patch > Fix TestCheckTestClasses on HBASE-18477 > --- > > Key: HBASE-20868 > URL: https://issues.apache.org/jira/browse/HBASE-20868 > Project: HBase > Issue Type: Sub-task >Affects Versions: HBASE-18477 >Reporter: Zach York >Assignee: Zach York >Priority: Minor > Fix For: HBASE-18477 > > Attachments: HBASE-20868.HBASE-18477.001.patch, > HBASE-20868.HBASE-18477.002.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20868) Fix TestCheckTestClasses on HBASE-18477
[ https://issues.apache.org/jira/browse/HBASE-20868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16539184#comment-16539184 ] Hadoop QA commented on HBASE-20868:
---
| (x) *{color:red}-1 overall{color}* |
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} HBASE-18477 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 10s{color} | {color:green} HBASE-18477 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 32s{color} | {color:green} HBASE-18477 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 26s{color} | {color:green} HBASE-18477 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 33s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 34s{color} | {color:green} HBASE-18477 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s{color} | {color:green} HBASE-18477 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 23s{color} | {color:green} hbase-common: The patch generated 0 new + 3 unchanged - 1 fixed = 3 total (was 4) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 32s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 10m 15s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 16s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 2m 16s{color} | {color:red} hbase-common in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 13s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 39m 8s{color} | {color:black} {color} |
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.util.TestReadReplicaClustersTableNameUtil |
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-20868 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12931049/HBASE-20868.HBASE-18477.001.patch |
| Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux f2a6d951700a 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | HBASE-18477 / c402868642 |
| maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC3 |
| unit | https://builds.apache.org/job/PreCommit-HBASE-Build/13580/artifact/patchprocess/patch-unit-hbase-common.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/13580/testReport/ |
| Max.
[jira] [Commented] (HBASE-18477) Umbrella JIRA for HBase Read Replica clusters
[ https://issues.apache.org/jira/browse/HBASE-18477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16539167#comment-16539167 ] Hudson commented on HBASE-18477: Results for branch HBASE-18477 [build #260 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-18477/260/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-18477/260//General_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-18477/260//JDK8_Nightly_Build_Report_(Hadoop2)/] (x) {color:red}-1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-18477/260//JDK8_Nightly_Build_Report_(Hadoop3)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. (x) {color:red}-1 client integration test{color} --Failed when running client tests on top of Hadoop 2. [see log for details|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-18477/260//artifact/output-integration/hadoop-2.log]. (note that this means we didn't run on Hadoop 3) > Umbrella JIRA for HBase Read Replica clusters > - > > Key: HBASE-18477 > URL: https://issues.apache.org/jira/browse/HBASE-18477 > Project: HBase > Issue Type: New Feature >Reporter: Zach York >Assignee: Zach York >Priority: Major > Attachments: HBase Read-Replica Clusters Scope doc.docx, HBase > Read-Replica Clusters Scope doc.pdf, HBase Read-Replica Clusters Scope > doc_v2.docx, HBase Read-Replica Clusters Scope doc_v2.pdf > > > Recently, changes (such as HBASE-17437) have unblocked HBase to run with a > root directory external to the cluster (such as in Amazon S3). 
> This means that the data is stored outside of the cluster and can be accessed after the cluster has been terminated. One use case that is often asked about is pointing multiple clusters at one root directory (sharing the data) to have read resiliency in the case of a cluster failure.
>
> This JIRA is an umbrella JIRA to contain all the tasks necessary to create a read-replica HBase cluster that is pointed at the same root directory.
>
> This requires:
> Making the Read-Replica cluster Read-Only (no metadata operations or data operations).
> Separating the hbase:meta table for each cluster (otherwise HBase gets confused with multiple clusters trying to update the meta table with their IP addresses).
> Adding refresh functionality for the meta table to ensure new metadata is picked up on the read replica cluster.
> Adding refresh functionality for HFiles of a given table to ensure new data is picked up on the read replica cluster.
>
> This can be used with any existing cluster that is backed by an external filesystem.
>
> Please note that this feature is still quite manual (with the potential for automation later).
>
> More information on this particular feature can be found here:
> https://aws.amazon.com/blogs/big-data/setting-up-read-replica-clusters-with-hbase-on-amazon-s3/
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-19682) Use Collections.emptyList() For Empty List Values
[ https://issues.apache.org/jira/browse/HBASE-19682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16539148#comment-16539148 ] BELUGA BEHR commented on HBASE-19682: - Please accept this submission to the project.
> Use Collections.emptyList() For Empty List Values
> -
>
> Key: HBASE-19682
> URL: https://issues.apache.org/jira/browse/HBASE-19682
> Project: HBase
> Issue Type: Improvement
> Affects Versions: 3.0.0
> Reporter: BELUGA BEHR
> Assignee: BELUGA BEHR
> Priority: Minor
> Attachments: HBASE-19682.1.patch, HBASE-19682.2.patch, HBASE-19682.3.1.patch, HBASE-19682.4.patch, HBASE-19682.5.patch, example.patch
>
> Use {{Collections.emptyList()}} for returning an empty list instead of {{return new ArrayList<>()}}. The default constructor creates a buffer of size 10 for _ArrayList_; therefore, returning this static value saves some memory and GC pressure, and saves the time of allocating a new internal buffer for each instantiation.
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
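As a quick illustration of the pattern this issue proposes — a sketch, not the actual patch; the class and method names below are invented:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical example of the pattern discussed in HBASE-19682: return the
// shared, immutable empty list instead of allocating a fresh ArrayList for
// the empty case.
public class EmptyListExample {

    // Before: every empty result allocates a new ArrayList (and, on first
    // add, its backing buffer).
    public static List<String> labelsBefore(boolean populated) {
        if (!populated) {
            return new ArrayList<>();
        }
        return Arrays.asList("a", "b");
    }

    // After: the empty case returns Collections.emptyList(), a single
    // immutable instance shared across all callers, so no allocation occurs.
    public static List<String> labelsAfter(boolean populated) {
        if (!populated) {
            return Collections.emptyList();
        }
        return Arrays.asList("a", "b");
    }
}
```

One caveat worth noting: {{Collections.emptyList()}} is immutable, so the substitution is only safe at call sites where the returned list is treated as read-only; a caller that tries to add to it gets an {{UnsupportedOperationException}}.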
[jira] [Updated] (HBASE-20868) Fix TestCheckTestClasses on HBASE-18477
[ https://issues.apache.org/jira/browse/HBASE-20868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zach York updated HBASE-20868: -- Status: Patch Available (was: Open) > Fix TestCheckTestClasses on HBASE-18477 > --- > > Key: HBASE-20868 > URL: https://issues.apache.org/jira/browse/HBASE-20868 > Project: HBase > Issue Type: Sub-task >Affects Versions: HBASE-18477 >Reporter: Zach York >Assignee: Zach York >Priority: Minor > Fix For: HBASE-18477 > > Attachments: HBASE-20868.HBASE-18477.001.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-20868) Fix TestCheckTestClasses on HBASE-18477
[ https://issues.apache.org/jira/browse/HBASE-20868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zach York updated HBASE-20868: -- Attachment: HBASE-20868.HBASE-18477.001.patch > Fix TestCheckTestClasses on HBASE-18477 > --- > > Key: HBASE-20868 > URL: https://issues.apache.org/jira/browse/HBASE-20868 > Project: HBase > Issue Type: Sub-task >Affects Versions: HBASE-18477 >Reporter: Zach York >Assignee: Zach York >Priority: Minor > Fix For: HBASE-18477 > > Attachments: HBASE-20868.HBASE-18477.001.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-20868) Fix TestCheckTestClasses on HBASE-18477
Zach York created HBASE-20868: - Summary: Fix TestCheckTestClasses on HBASE-18477 Key: HBASE-20868 URL: https://issues.apache.org/jira/browse/HBASE-20868 Project: HBase Issue Type: Sub-task Affects Versions: HBASE-18477 Reporter: Zach York Assignee: Zach York Fix For: HBASE-18477 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20832) Generate CHANGES.md and RELEASENOTES.md for 2.1.0
[ https://issues.apache.org/jira/browse/HBASE-20832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16539035#comment-16539035 ] Hudson commented on HBASE-20832: Results for branch branch-2.1 [build #46 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/46/]: (/) *{color:green}+1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/46//General_Nightly_Build_Report/] (/) {color:green}+1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/46//JDK8_Nightly_Build_Report_(Hadoop2)/] (/) {color:green}+1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/46//JDK8_Nightly_Build_Report_(Hadoop3)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. (/) {color:green}+1 client integration test{color} > Generate CHANGES.md and RELEASENOTES.md for 2.1.0 > - > > Key: HBASE-20832 > URL: https://issues.apache.org/jira/browse/HBASE-20832 > Project: HBase > Issue Type: Sub-task > Components: documentation >Reporter: Duo Zhang >Assignee: Duo Zhang >Priority: Major > Fix For: 2.1.0 > > Attachments: HBASE-20832-branch-2.1-addendum-v1.patch, > HBASE-20832-branch-2.1-addendum-v2.patch, > HBASE-20832-branch-2.1-addendum-v2.patch, > HBASE-20832-branch-2.1-addendum.patch, HBASE-20832-v1.patch, HBASE-20832.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20862) Address 2.1.0 Compatibility Report Issues
[ https://issues.apache.org/jira/browse/HBASE-20862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16539034#comment-16539034 ] Hudson commented on HBASE-20862: Results for branch branch-2.1 [build #46 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/46/]: (/) *{color:green}+1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/46//General_Nightly_Build_Report/] (/) {color:green}+1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/46//JDK8_Nightly_Build_Report_(Hadoop2)/] (/) {color:green}+1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/46//JDK8_Nightly_Build_Report_(Hadoop3)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. (/) {color:green}+1 client integration test{color} > Address 2.1.0 Compatibility Report Issues > - > > Key: HBASE-20862 > URL: https://issues.apache.org/jira/browse/HBASE-20862 > Project: HBase > Issue Type: Task > Components: compatibility >Affects Versions: 2.1.0 >Reporter: Mike Drob >Assignee: Mike Drob >Priority: Blocker > Fix For: 2.1.0, 2.2.0 > > Attachments: HBASE-20862.branch-2.001.patch > > > > https://dist.apache.org/repos/dist/dev/hbase/2.1.0RC0/compatibility_report_2.0.0vs2.1.0RC0.html -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20651) Master, prevents hbck or shell command to reassign the split parent region
[ https://issues.apache.org/jira/browse/HBASE-20651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16539010#comment-16539010 ] huaxiang sun commented on HBASE-20651: -- I am going to commit as [~esteban] gave his +1, thanks.
> Master, prevents hbck or shell command to reassign the split parent region
> --
>
> Key: HBASE-20651
> URL: https://issues.apache.org/jira/browse/HBASE-20651
> Project: HBase
> Issue Type: Improvement
> Components: master
> Affects Versions: 1.2.6
> Reporter: huaxiang sun
> Assignee: huaxiang sun
> Priority: Minor
> Attachments: HBASE-20651-branch-1-v001.patch, HBASE-20651-branch-1-v002.patch, HBASE-20651-branch-1-v003.patch
>
> We are seeing that hbck brings back the split parent region, and this causes region inconsistency. More details will be filled in as the reproduction is still ongoing. We might need to do something in hbck or the master to prevent this from happening.
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
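The fix being discussed boils down to a simple predicate: an offline split parent must never be handed back to a RegionServer. The sketch below is hypothetical — a stand-in class, not the real {{org.apache.hadoop.hbase.HRegionInfo}} — and only illustrates the shape of such a guard:

```java
// Hypothetical stand-in for HRegionInfo, reduced to the two flags the guard
// needs. In real HBase these correspond to HRegionInfo.isOffline() and
// HRegionInfo.isSplitParent().
class RegionInfoSketch {
    final boolean offline;
    final boolean splitParent;

    RegionInfoSketch(boolean offline, boolean splitParent) {
        this.offline = offline;
        this.splitParent = splitParent;
    }
}

class AssignmentGuard {
    // An offline split parent has already been replaced by its daughter
    // regions; reassigning it would make the parent and the daughters serve
    // the same keyspace (the inconsistency described in the report), so the
    // master or hbck should refuse the assignment.
    static boolean safeToAssign(RegionInfoSketch hri) {
        return !(hri.offline && hri.splitParent);
    }
}
```

The same offline-split-parent predicate appears in {{TableSnapshotInputFormatImpl.getRegionInfosFromManifest}}, where skipping such regions prevents duplicate rows in snapshot reads.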
[jira] [Commented] (HBASE-20862) Address 2.1.0 Compatibility Report Issues
[ https://issues.apache.org/jira/browse/HBASE-20862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16538985#comment-16538985 ] Hudson commented on HBASE-20862: Results for branch branch-2 [build #967 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/967/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/967//General_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/967//JDK8_Nightly_Build_Report_(Hadoop2)/] (/) {color:green}+1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/967//JDK8_Nightly_Build_Report_(Hadoop3)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. (/) {color:green}+1 client integration test{color} > Address 2.1.0 Compatibility Report Issues > - > > Key: HBASE-20862 > URL: https://issues.apache.org/jira/browse/HBASE-20862 > Project: HBase > Issue Type: Task > Components: compatibility >Affects Versions: 2.1.0 >Reporter: Mike Drob >Assignee: Mike Drob >Priority: Blocker > Fix For: 2.1.0, 2.2.0 > > Attachments: HBASE-20862.branch-2.001.patch > > > > https://dist.apache.org/repos/dist/dev/hbase/2.1.0RC0/compatibility_report_2.0.0vs2.1.0RC0.html -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20649) Validate HFiles do not have PREFIX_TREE DataBlockEncoding
[ https://issues.apache.org/jira/browse/HBASE-20649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16538966#comment-16538966 ] Hadoop QA commented on HBASE-20649:
---
| (x) *{color:red}-1 overall{color}* |
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 20s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 15s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 27s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 10s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 21s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} refguide {color} | {color:blue} 6m 55s{color} | {color:blue} branch has no errors when building the reference guide. See footer for rendered docs, which you should manually inspect. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 6m 39s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: . {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 8s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 8m 24s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 26s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 5m 0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:blue}0{color} | {color:blue} refguide {color} | {color:blue} 5m 8s{color} | {color:blue} patch has no errors when building the reference guide. See footer for rendered docs, which you should manually inspect. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 41s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 10m 0s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: . {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 55s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}229m 22s{color} | {color:red} root in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 33s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}311m 2s{color} | {color:black} {color} |
|| Reason || Tests ||
| Failed junit tests |
[jira] [Assigned] (HBASE-20866) HBase 1.x scan performance degradation compared to 0.98 version
[ https://issues.apache.org/jira/browse/HBASE-20866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell reassigned HBASE-20866: -- Assignee: Vikas Vishwakarma
> HBase 1.x scan performance degradation compared to 0.98 version
> ---
>
> Key: HBASE-20866
> URL: https://issues.apache.org/jira/browse/HBASE-20866
> Project: HBase
> Issue Type: Bug
> Affects Versions: 1.3.2
> Reporter: Vikas Vishwakarma
> Assignee: Vikas Vishwakarma
> Priority: Critical
> Fix For: 1.5.0, 1.2.7, 1.3.3, 1.4.6
>
> Internally, while testing 1.3 as part of a migration from 0.98 to 1.3, we observed degradation in scan performance for Phoenix queries varying from a few tens of percent up to 200%, depending on the query being executed. We also tried a simple native HBase scan, and there too we saw up to 40% degradation in performance when the number of column qualifiers is high (40-50+).
> To identify the root cause of the performance difference between 0.98 and 1.3 we carried out a lot of experiments with profiling and git bisect iterations; however, we were not able to identify any single source of scan performance degradation, and it looks like an accumulated degradation of 5-10% over various enhancements and refactorings.
> We identified a few major enhancements, such as partialResult handling, ScannerContext with heartbeat processing, time/size limiting, and RPC refactoring, that could each have contributed a small degradation which, put together, could lead to a large overall degradation.
> One of the changes is [HBASE-11544|https://jira.apache.org/jira/browse/HBASE-11544], which implements partialResult handling. In ClientScanner.java the results received from the server are cached on the client side by converting the result array into an ArrayList. This function gets called in a loop, depending on the number of rows in the scan result. For example, for tens of millions of rows scanned, it can be called on the order of millions of times.
> In almost all cases, 99% of the time (except when handling partial results, etc.), we are just taking resultsFromServer, converting it into an ArrayList resultsToAddToCache in addResultsToList(..), and then iterating over the list again and adding it to the cache in loadCache(..), as in the code path below:
> ClientScanner → loadCache(..) → getResultsToAddToCache(..) → addResultsToList(..)
> {code:java}
> loadCache() {
>   ...
>   List<Result> resultsToAddToCache =
>       getResultsToAddToCache(values, callable.isHeartbeatMessage());
>   ...
>   for (Result rs : resultsToAddToCache) {
>     rs = filterLoadedCell(rs);
>     cache.add(rs);
>     ...
>   }
> }
>
> getResultsToAddToCache(..) {
>   ..
>   final boolean isBatchSet = scan != null && scan.getBatch() > 0;
>   final boolean allowPartials = scan != null && scan.getAllowPartialResults();
>   ..
>   if (allowPartials || isBatchSet) {
>     addResultsToList(resultsToAddToCache, resultsFromServer, 0,
>         (null == resultsFromServer ? 0 : resultsFromServer.length));
>     return resultsToAddToCache;
>   }
>   ...
> }
>
> private void addResultsToList(List<Result> outputList, Result[] inputArray,
>     int start, int end) {
>   if (inputArray == null || start < 0 || end > inputArray.length) return;
>   for (int i = start; i < end; i++) {
>     outputList.add(inputArray[i]);
>   }
> }
> {code}
>
> It looks like we can avoid the array-to-ArrayList conversion (resultsFromServer --> resultsToAddToCache) for the first case, which is also the most frequent case, and instead directly take the values array returned by the callable and add it to the cache without converting it into an ArrayList.
> I have taken both flags, allowPartials and isBatchSet, out into loadCache(), and I directly add values to the scanner cache if the above condition passes, instead of converting into an ArrayList by calling getResultsToAddToCache(). For example:
> {code:java}
> protected void loadCache() throws IOException {
>   Result[] values = null;
>   ..
>   final boolean isBatchSet = scan != null && scan.getBatch() > 0;
>   final boolean allowPartials = scan != null && scan.getAllowPartialResults();
>   ..
>   for (;;) {
>     try {
>       values = call(callable, caller, scannerTimeout);
>       ..
>     } catch (DoNotRetryIOException | NeedUnmanagedConnectionException e) {
>       ..
>     }
>     if (allowPartials || isBatchSet) { // DIRECTLY COPY values TO CACHE
>       if (values != null) {
>         for (int v = 0; v < values.length; v++) {
>           Result rs = values[v];
>           cache.add(rs);
>           ...
>         }
>       }
>     } else { // DO ALL THE REGULAR PARTIAL RESULT HANDLING ..
>       List<Result> resultsToAddToCache =
>           getResultsToAddToCache(values, callable.isHeartbeatMessage());
>       for (Result rs : resultsToAddToCache) {
>         cache.add(rs);
>         ...
>       }
>     }
>   }
> }
> {code}
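The double copy described in the report can be reproduced in miniature. This is a hedged sketch, not ClientScanner itself: `String` stands in for `Result`, an `ArrayDeque` for the client-side cache, and both method names are invented. It only demonstrates that the direct path adds the same elements while skipping the intermediate ArrayList.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Miniature model of the scan caching path. "results" plays the role of the
// Result[] returned by the server; "cache" is the client-side queue.
class ScanCacheSketch {
    final Queue<String> cache = new ArrayDeque<>();

    // Mirrors the existing path: array -> intermediate ArrayList -> cache
    // (two passes and one temporary list per server response).
    void addViaIntermediateList(String[] results) {
        if (results == null) {
            return;
        }
        List<String> toAdd = new ArrayList<>(results.length);
        for (String r : results) {
            toAdd.add(r);   // first pass: copy into the temporary list
        }
        for (String r : toAdd) {
            cache.add(r);   // second pass: copy into the cache
        }
    }

    // Mirrors the proposed path: one pass, no temporary list allocated.
    void addDirectly(String[] results) {
        if (results == null) {
            return;
        }
        for (String r : results) {
            cache.add(r);
        }
    }
}
```

Per RPC response the saving is one short-lived ArrayList and one extra pass over the results; over tens of millions of rows that matches the "accumulated small degradations" theme of the report.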
[jira] [Commented] (HBASE-20866) HBase 1.x scan performance degradation compared to 0.98 version
[ https://issues.apache.org/jira/browse/HBASE-20866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16538962#comment-16538962 ] Andrew Purtell commented on HBASE-20866: [~reidchan] the problem is an accumulation of multiple changes over time in the set of all changes introduced from 0.98 through to 1.3. Many thanks to [~vik.karma] for finding this one. There will be more, I am sure. As Vikas said {quote} To identify the root cause of performance diff between 0.98 and 1.3 we carried out lot of experiments with profiling and git bisect iterations, however we were not able to identify any particular source of scan performance degradation and it looked like this is an accumulated degradation of 5-10% over various enhancements and refactoring. {quote} I'm raising the priority to Critical because this impacts anyone seriously considering a migration off of 0.98 to a 1.x, and anyone who already has.
> HBase 1.x scan performance degradation compared to 0.98 version
> ---
>
> Key: HBASE-20866
> URL: https://issues.apache.org/jira/browse/HBASE-20866
> Project: HBase
> Issue Type: Bug
> Affects Versions: 1.3.2
> Reporter: Vikas Vishwakarma
> Priority: Major
> Fix For: 1.5.0, 1.2.7, 1.3.3, 1.4.6
[jira] [Updated] (HBASE-20866) HBase 1.x scan performance degradation compared to 0.98 version
[ https://issues.apache.org/jira/browse/HBASE-20866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-20866: --- Priority: Critical (was: Major) Fix Version/s: 1.4.6 1.3.3 1.2.7 1.5.0 > HBase 1.x scan performance degradation compared to 0.98 version > --- > > Key: HBASE-20866 > URL: https://issues.apache.org/jira/browse/HBASE-20866 > Project: HBase > Issue Type: Bug >Affects Versions: 1.3.2 >Reporter: Vikas Vishwakarma >Priority: Critical > Fix For: 1.5.0, 1.2.7, 1.3.3, 1.4.6 > > > Internally while testing 1.3 as part of migration from 0.98 to 1.3 we > observed perf degradation in scan performance for phoenix queries varying > from few 10's to upto 200% depending on the query being executed. We tried > simple native HBase scan and there also we saw upto 40% degradation in > performance when the number of column qualifiers are high (40-50+) > To identify the root cause of performance diff between 0.98 and 1.3 we > carried out lot of experiments with profiling and git bisect iterations, > however we were not able to identify any particular source of scan > performance degradation and it looked like this is an accumulated degradation > of 5-10% over various enhancements and refactoring. > We identified few major enhancements like partialResult handling, > ScannerContext with heartbeat processing, time/size limiting, RPC > refactoring, etc that could have contributed to small degradation in > performance which put together could be leading to large overall degradation. > One of the changes is > [HBASE-11544|https://jira.apache.org/jira/browse/HBASE-11544] which > implements partialResult handling. In ClientScanner.java the results received > from server are cached on the client side by converting the result array into > an ArrayList. This function gets called in a loop depending on the number of > rows in the scan result. Example for ten’s of millions of rows scanned, this > can be called in the order of millions of times. 
> In almost all cases (99% of the time, except when handling partial results,
> etc.) we are just taking resultsFromServer, converting it into an ArrayList
> resultsToAddToCache in addResultsToList(..), and then iterating over the
> list again and adding it to the cache in loadCache(..), as in the code path
> below:
> ClientScanner → loadCache(..) → getResultsToAddToCache(..) →
> addResultsToList(..)
> {code:java}
> loadCache() {
>   ...
>   List<Result> resultsToAddToCache =
>       getResultsToAddToCache(values, callable.isHeartbeatMessage());
>   ...
>   for (Result rs : resultsToAddToCache) {
>     rs = filterLoadedCell(rs);
>     cache.add(rs);
>     ...
>   }
> }
>
> getResultsToAddToCache(..) {
>   ..
>   final boolean isBatchSet = scan != null && scan.getBatch() > 0;
>   final boolean allowPartials = scan != null && scan.getAllowPartialResults();
>   ..
>   if (allowPartials || isBatchSet) {
>     addResultsToList(resultsToAddToCache, resultsFromServer, 0,
>         (null == resultsFromServer ? 0 : resultsFromServer.length));
>     return resultsToAddToCache;
>   }
>   ...
> }
>
> private void addResultsToList(List<Result> outputList, Result[] inputArray,
>     int start, int end) {
>   if (inputArray == null || start < 0 || end > inputArray.length) return;
>   for (int i = start; i < end; i++) {
>     outputList.add(inputArray[i]);
>   }
> }{code}
>
> It looks like we can avoid the result-array-to-ArrayList conversion
> (resultsFromServer --> resultsToAddToCache) for the first case, which is
> also the most frequent case, and instead directly take the values array
> returned by callable and add it to the cache without converting it into an
> ArrayList.
> I have taken both flags, allowPartials and isBatchSet, out into loadCache()
> and I am directly adding values to the scanner cache if the above condition
> passes, instead of converting into an ArrayList by calling
> getResultsToAddToCache(). For example:
> {code:java}
> protected void loadCache() throws IOException {
>   Result[] values = null;
>   ..
>   final boolean isBatchSet = scan != null && scan.getBatch() > 0;
>   final boolean allowPartials = scan != null && scan.getAllowPartialResults();
>   ..
>   for (;;) {
>     try {
>       values = call(callable, caller, scannerTimeout);
>       ..
>     } catch (DoNotRetryIOException | NeedUnmanagedConnectionException e) {
>       ..
>     }
>     if (allowPartials || isBatchSet) { // DIRECTLY COPY values TO CACHE
>       if (values != null) {
>         for (int v = 0; v < values.length; v++) {
>           Result rs = values[v];
>           cache.add(rs);
>           ...
>         }
>       }
>     } else { // DO ALL THE REGULAR PARTIAL RESULT HANDLING ..
>       List<Result> resultsToAddToCache =
>           getResultsToAddToCache(values, callable.isHeartbeatMessage());
>       for (Result rs : resultsToAddToCache) {
>         rs = filterLoadedCell(rs);
>         cache.add(rs);
>         ...
>       }
>     }
>     ..
>   }
> }{code}
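The two copy paths can be illustrated outside HBase with a small standalone sketch. This is plain Java with a stand-in Result record, not the actual ClientScanner code or org.apache.hadoop.hbase.client.Result: both paths fill the cache identically, but the direct one skips one temporary ArrayList and one extra pass per RPC response.

```java
import java.util.ArrayList;
import java.util.List;

public class DirectCacheCopy {
    // Stand-in for HBase's Result class, for illustration only.
    record Result(int row) {}

    // Indirect path (current code): array -> intermediate ArrayList -> cache.
    static List<Result> viaIntermediateList(Result[] fromServer, List<Result> cache) {
        List<Result> toAdd = new ArrayList<>(fromServer.length);
        for (Result r : fromServer) toAdd.add(r);   // addResultsToList(..)
        for (Result r : toAdd) cache.add(r);        // second pass in loadCache(..)
        return cache;
    }

    // Direct path (proposed): copy straight from the server array into the cache.
    static List<Result> direct(Result[] fromServer, List<Result> cache) {
        if (fromServer != null) {
            for (Result r : fromServer) cache.add(r);
        }
        return cache;
    }

    public static void main(String[] args) {
        Result[] values = { new Result(1), new Result(2), new Result(3) };
        List<Result> a = viaIntermediateList(values, new ArrayList<>());
        List<Result> b = direct(values, new ArrayList<>());
        // Both paths populate the cache identically; the direct one allocates
        // no temporary list, which matters when this runs millions of times.
        System.out.println(a.equals(b));  // true
    }
}
```

Per scan this saves one list allocation and one traversal per RPC response, which is exactly the overhead the comment above proposes to eliminate for the common (non-partial) case.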
[jira] [Commented] (HBASE-20649) Validate HFiles do not have PREFIX_TREE DataBlockEncoding
[ https://issues.apache.org/jira/browse/HBASE-20649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16538864#comment-16538864 ] Hadoop QA commented on HBASE-20649: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 19s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 23s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 24s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 43s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 59s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} refguide {color} | {color:blue} 4m 58s{color} | {color:blue} branch has no errors when building the reference guide. See footer for rendered docs, which you should manually inspect. 
{color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 8s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: . {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 2s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 55s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 9s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 9s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 4s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:blue}0{color} | {color:blue} refguide {color} | {color:blue} 5m 13s{color} | {color:blue} patch has no errors when building the reference guide. See footer for rendered docs, which you should manually inspect. {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 31s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. 
{color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 10m 9s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: . {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 18s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 10s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green}288m 3s{color} | {color:green} root in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 48s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}356m 39s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker |
[jira] [Commented] (HBASE-20649) Validate HFiles do not have PREFIX_TREE DataBlockEncoding
[ https://issues.apache.org/jira/browse/HBASE-20649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16538850#comment-16538850 ] Hadoop QA commented on HBASE-20649: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 13s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 21s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 53s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 13s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 39s{color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} refguide {color} | {color:blue} 5m 56s{color} | {color:blue} branch has no errors when building the reference guide. See footer for rendered docs, which you should manually inspect. 
{color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 59s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: . {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 24s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 57s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 5m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 8m 20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 36s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:blue}0{color} | {color:blue} refguide {color} | {color:blue} 5m 51s{color} | {color:blue} patch has no errors when building the reference guide. See footer for rendered docs, which you should manually inspect. {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 5m 1s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. 
{color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 11m 50s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: . {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 0s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 57s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green}202m 5s{color} | {color:green} root in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 55s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}283m 47s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker |
[jira] [Commented] (HBASE-20860) Merged region's RIT state may not be cleaned after master restart
[ https://issues.apache.org/jira/browse/HBASE-20860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16538798#comment-16538798 ] Hadoop QA commented on HBASE-20860: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} branch-2.0 Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 36s{color} | {color:green} branch-2.0 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 13s{color} | {color:green} branch-2.0 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 21s{color} | {color:green} branch-2.0 passed {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 44s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 27s{color} | {color:green} branch-2.0 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 37s{color} | {color:green} branch-2.0 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 9s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 55s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 55s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 21s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 37s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 13m 39s{color} | {color:green} Patch does not cause any errors with Hadoop 2.6.5 2.7.4 or 3.0.0. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 33s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 35s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 62m 26s{color} | {color:red} hbase-server in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 11s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}108m 14s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hbase.regionserver.TestRegionReplicas | | | hadoop.hbase.regionserver.TestRegionServerHostname | | | hadoop.hbase.regionserver.TestHRegion | | | hadoop.hbase.regionserver.TestAtomicOperation | | | hadoop.hbase.regionserver.TestMajorCompaction | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:6f01af0 | | JIRA Issue | HBASE-20860 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12931004/HBASE-20860.branch-2.0.004.patch | | Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile | | uname | Linux 0cbc57444feb 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 10:45:36 UTC 2018 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh | | git revision | branch-2.0 / cd1ecae0d1 | | maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) | | Default Java | 1.8.0_171 | | findbugs | v3.1.0-RC3 | | unit |
[jira] [Assigned] (HBASE-20857) JMX - add Balancer status = enabled / disabled
[ https://issues.apache.org/jira/browse/HBASE-20857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Monani Mihir reassigned HBASE-20857: Assignee: Monani Mihir
> JMX - add Balancer status = enabled / disabled
> --
>
> Key: HBASE-20857
> URL: https://issues.apache.org/jira/browse/HBASE-20857
> Project: HBase
> Issue Type: Improvement
> Components: API, master, metrics, REST, tooling, Usability
>Reporter: Hari Sekhon
>Assignee: Monani Mihir
>Priority: Major
>
> Add the HBase Balancer enabled/disabled status to the JMX API on the HMaster.
> Right now the HMaster will give a warning near the top of the HMaster UI if
> the balancer is disabled, but scraping this for monitoring integration is
> not nice; it should be available in the JMX API, as there is already a
> Master,sub=Balancer bean with metrics for the balancer ops etc. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
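A minimal standalone sketch of what such a JMX attribute could look like. The ObjectName, the BalancerEnabled attribute, and the MBean classes below are all illustrative assumptions for this issue, not HBase's actual Master,sub=Balancer bean:

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class BalancerStatusJmx {
    // Standard MBean interface: the "BalancerEnabled" attribute is derived
    // from the isBalancerEnabled() getter by JMX naming conventions.
    public interface BalancerStatusMBean {
        boolean isBalancerEnabled();
    }

    public static class BalancerStatus implements BalancerStatusMBean {
        private volatile boolean enabled = true;
        public boolean isBalancerEnabled() { return enabled; }
        public void setEnabled(boolean e) { enabled = e; }
    }

    public static void main(String[] args) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        // Hypothetical bean name, loosely modeled on the existing balancer bean.
        ObjectName name = new ObjectName("Hadoop:service=HBase,name=Master,sub=Balancer");
        server.registerMBean(new BalancerStatus(), name);
        // A monitoring agent would read the attribute instead of scraping the UI:
        boolean enabled = (Boolean) server.getAttribute(name, "BalancerEnabled");
        System.out.println(enabled);  // true
    }
}
```

Exposing the flag this way lets existing JMX collectors pick it up alongside the balancer op metrics, with no HTML scraping.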
[jira] [Updated] (HBASE-20867) RS may got killed while master restarts
[ https://issues.apache.org/jira/browse/HBASE-20867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allan Yang updated HBASE-20867: --- Attachment: HBASE-20867.branch-2.0.001.patch
> RS may got killed while master restarts
> ---
>
> Key: HBASE-20867
> URL: https://issues.apache.org/jira/browse/HBASE-20867
> Project: HBase
> Issue Type: Bug
>Affects Versions: 3.0.0, 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
> Attachments: HBASE-20867.branch-2.0.001.patch
>
>
> If the master is dispatching an RPC call to an RS when aborting, a
> connection exception may be thrown by the RPC layer (an IOException with a
> "Connection closed" message in this case). The RSProcedureDispatcher will
> regard it as an un-retryable exception and pass it to
> UnassignProcedure.remoteCallFailed, which will expire the RS.
> Actually, the RS is very healthy; only the master is restarting.
> I think we should deal with those kinds of connection exceptions in
> RSProcedureDispatcher and retry the RPC call. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20833) Modify pre-upgrade coprocessor validator to support table level coprocessors
[ https://issues.apache.org/jira/browse/HBASE-20833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16538743#comment-16538743 ] Hudson commented on HBASE-20833: Results for branch master [build #392 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/392/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/master/392//General_Nightly_Build_Report/] (/) {color:green}+1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/master/392//JDK8_Nightly_Build_Report_(Hadoop2)/] (x) {color:red}-1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/master/392//JDK8_Nightly_Build_Report_(Hadoop3)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. (/) {color:green}+1 client integration test{color} > Modify pre-upgrade coprocessor validator to support table level coprocessors > > > Key: HBASE-20833 > URL: https://issues.apache.org/jira/browse/HBASE-20833 > Project: HBase > Issue Type: New Feature > Components: Coprocessors >Reporter: Balazs Meszaros >Assignee: Balazs Meszaros >Priority: Major > Fix For: 3.0.0, 2.1.0, 2.0.2, 2.2.0 > > Attachments: HBASE-20833.master.001.patch, > HBASE-20833.master.003.patch, HBASE-20833.master.004.patch > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20784) Will lose the SNAPSHOT suffix if we get the version of RS from ServerManager
[ https://issues.apache.org/jira/browse/HBASE-20784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16538744#comment-16538744 ] Hudson commented on HBASE-20784: Results for branch master [build #392 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/392/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/master/392//General_Nightly_Build_Report/] (/) {color:green}+1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/master/392//JDK8_Nightly_Build_Report_(Hadoop2)/] (x) {color:red}-1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/master/392//JDK8_Nightly_Build_Report_(Hadoop3)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. (/) {color:green}+1 client integration test{color}
> Will lose the SNAPSHOT suffix if we get the version of RS from ServerManager
>
>
> Key: HBASE-20784
> URL: https://issues.apache.org/jira/browse/HBASE-20784
> Project: HBase
> Issue Type: Bug
> Components: master, UI
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Minor
> Fix For: 3.0.0, 2.1.0, 2.2.0
>
> Attachments: HBASE-20784.patch
>
>
> In HBASE-20722 we removed the usage of RegionServerTracker when getting
> information for a region server. The version in ServerManager is an int, and
> we convert it to a String when displaying it on the master UI, so we will
> lose the SNAPSHOT suffix. Not a big one, as this is not a problem for normal
> releases. Opening an issue for it. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20806) Split style journal for flushes and compactions
[ https://issues.apache.org/jira/browse/HBASE-20806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16538741#comment-16538741 ] Hudson commented on HBASE-20806: Results for branch master [build #392 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/392/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/master/392//General_Nightly_Build_Report/] (/) {color:green}+1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/master/392//JDK8_Nightly_Build_Report_(Hadoop2)/] (x) {color:red}-1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/master/392//JDK8_Nightly_Build_Report_(Hadoop3)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. (/) {color:green}+1 client integration test{color}
> Split style journal for flushes and compactions
> ---
>
> Key: HBASE-20806
> URL: https://issues.apache.org/jira/browse/HBASE-20806
> Project: HBase
> Issue Type: Improvement
>Reporter: Abhishek Singh Chouhan
>Assignee: Abhishek Singh Chouhan
>Priority: Minor
> Fix For: 3.0.0, 2.1.0, 1.5.0, 1.2.7, 1.3.3, 1.4.6, 2.0.2
>
> Attachments: HBASE-20806.branch-1.001.patch,
> HBASE-20806.branch-1.002.patch, HBASE-20806.branch-1.003.patch,
> HBASE-20806.branch-2.001.patch, HBASE-20806.master.001.patch,
> HBASE-20806.master.002.patch, HBASE-20806.master.003.patch
>
>
> In 1.x we have a split transaction journal that gives a clear picture of
> when the various stages of a split took place. We should have a similar
> thing for flushes and compactions so as to have insight into the time spent
> in the various stages, which we can use to identify regressions that might
> creep in. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-20867) RS may got killed while master restarts
[ https://issues.apache.org/jira/browse/HBASE-20867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allan Yang updated HBASE-20867: --- Status: Patch Available (was: Open)
> RS may got killed while master restarts
> ---
>
> Key: HBASE-20867
> URL: https://issues.apache.org/jira/browse/HBASE-20867
> Project: HBase
> Issue Type: Bug
>Affects Versions: 2.0.1, 3.0.0, 2.1.0
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
>
> If the master is dispatching an RPC call to an RS when aborting, a
> connection exception may be thrown by the RPC layer (an IOException with a
> "Connection closed" message in this case). The RSProcedureDispatcher will
> regard it as an un-retryable exception and pass it to
> UnassignProcedure.remoteCallFailed, which will expire the RS.
> Actually, the RS is very healthy; only the master is restarting.
> I think we should deal with those kinds of connection exceptions in
> RSProcedureDispatcher and retry the RPC call. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-20867) RS may got killed while master restarts
[ https://issues.apache.org/jira/browse/HBASE-20867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allan Yang updated HBASE-20867: --- Description: If the master is dispatching an RPC call to an RS when aborting, a connection exception may be thrown by the RPC layer (an IOException with a "Connection closed" message in this case). The RSProcedureDispatcher will regard it as an un-retryable exception and pass it to UnassignProcedure.remoteCallFailed, which will expire the RS. Actually, the RS is very healthy; only the master is restarting. I think we should deal with those kinds of connection exceptions in RSProcedureDispatcher and retry the RPC call. was: If the master is dispatching a RPC call to RS when aborting. A connection exception may be thrown by the RPC layer(A IOException with "Connection closed" message in this case). The RSProcedureDispatcher will regard is as an un-retryable exception and pass it to UnassignProcedue.remoteCallFailed, which will expire the RS. Actually, the RS is very healthy, only the master is restarting. I think we should deal with those kinds of connection exceptions in RSProcedureDispatcher and retry the rpc call
> RS may got killed while master restarts
> ---
>
> Key: HBASE-20867
> URL: https://issues.apache.org/jira/browse/HBASE-20867
> Project: HBase
> Issue Type: Bug
>Affects Versions: 3.0.0, 2.1.0, 2.0.1
>Reporter: Allan Yang
>Assignee: Allan Yang
>Priority: Major
>
> If the master is dispatching an RPC call to an RS when aborting, a
> connection exception may be thrown by the RPC layer (an IOException with a
> "Connection closed" message in this case). The RSProcedureDispatcher will
> regard it as an un-retryable exception and pass it to
> UnassignProcedure.remoteCallFailed, which will expire the RS.
> Actually, the RS is very healthy; only the master is restarting.
> I think we should deal with those kinds of connection exceptions in
> RSProcedureDispatcher and retry the RPC call. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-20867) RS may got killed while master restarts
Allan Yang created HBASE-20867: -- Summary: RS may got killed while master restarts Key: HBASE-20867 URL: https://issues.apache.org/jira/browse/HBASE-20867 Project: HBase Issue Type: Bug Affects Versions: 2.0.1, 3.0.0, 2.1.0 Reporter: Allan Yang Assignee: Allan Yang If the master is dispatching an RPC call to an RS when aborting, a connection exception may be thrown by the RPC layer (an IOException with a "Connection closed" message in this case). The RSProcedureDispatcher will regard it as an un-retryable exception and pass it to UnassignProcedure.remoteCallFailed, which will expire the RS. Actually, the RS is very healthy; only the master is restarting. I think we should deal with those kinds of connection exceptions in RSProcedureDispatcher and retry the RPC call. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
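The proposed fix — retry the remote call when the failure is only a transient connection error, instead of expiring the region server — can be sketched in plain Java. The helper names (isRetryableConnectionError, dispatchWithRetry) and the message-based check are illustrative assumptions, not the actual RSProcedureDispatcher API:

```java
import java.io.IOException;
import java.util.concurrent.Callable;

public class RetryOnConnectionError {
    // Assumption for this sketch: classify a "Connection closed" IOException
    // as transient (the master may simply be restarting).
    static boolean isRetryableConnectionError(IOException e) {
        String msg = e.getMessage();
        return msg != null && msg.contains("Connection closed");
    }

    static <T> T dispatchWithRetry(Callable<T> call, int maxRetries) throws Exception {
        IOException last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return call.call();
            } catch (IOException e) {
                if (!isRetryableConnectionError(e)) throw e;  // genuinely fatal: propagate
                last = e;  // transient: retry instead of expiring the RS
            }
        }
        throw last;  // retries exhausted
    }

    public static void main(String[] args) throws Exception {
        int[] failures = {2};  // fail the first two attempts, then succeed
        String result = dispatchWithRetry(() -> {
            if (failures[0]-- > 0) throw new IOException("Connection closed");
            return "dispatched";
        }, 3);
        System.out.println(result);  // dispatched
    }
}
```

The key design point is the classification step: only failures known to be transient are retried, so a real remote failure still reaches UnassignProcedure.remoteCallFailed as before.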
[jira] [Commented] (HBASE-20649) Validate HFiles do not have PREFIX_TREE DataBlockEncoding
[ https://issues.apache.org/jira/browse/HBASE-20649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16538704#comment-16538704 ] Sean Busbey commented on HBASE-20649: - what do y'all think about the outlined steps [~zyork] or [~Apache9]?
> Validate HFiles do not have PREFIX_TREE DataBlockEncoding
> -
>
> Key: HBASE-20649
> URL: https://issues.apache.org/jira/browse/HBASE-20649
> Project: HBase
> Issue Type: New Feature
>Reporter: Peter Somogyi
>Assignee: Balazs Meszaros
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: HBASE-20649.master.001.patch,
> HBASE-20649.master.002.patch, HBASE-20649.master.003.patch,
> HBASE-20649.master.004.patch, HBASE-20649.master.005.patch
>
>
> HBASE-20592 adds a tool to check that column families on the cluster do not
> have PREFIX_TREE encoding.
> Since it is possible that the DataBlockEncoding was already changed but the
> HFiles have not been rewritten yet, we would need a tool that can verify the
> content of the HFiles in the cluster. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20847) The parent procedure of RegionTransitionProcedure may not have the table lock
[ https://issues.apache.org/jira/browse/HBASE-20847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16538686#comment-16538686 ] Duo Zhang commented on HBASE-20847: --- OK the failed UTs are related. Seems we have something wrong with the count of sharedLock accounting. Let me dig more... > The parent procedure of RegionTransitionProcedure may not have the table lock > - > > Key: HBASE-20847 > URL: https://issues.apache.org/jira/browse/HBASE-20847 > Project: HBase > Issue Type: Sub-task > Components: proc-v2, Region Assignment >Reporter: Duo Zhang >Assignee: Duo Zhang >Priority: Major > Attachments: HBASE-20847-v1.patch, HBASE-20847-v2.patch, > HBASE-20847.patch > > > For example, SCP can also schedule AssignProcedure and obviously it will not > hold the table lock. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (HBASE-20860) Merged region's RIT state may not be cleaned after master restart
[ https://issues.apache.org/jira/browse/HBASE-20860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16538654#comment-16538654 ] stack edited comment on HBASE-20860 at 7/10/18 2:19 PM: bq. Normally, if the region is marked as CLOSED, and no procedure is attached to it, it is a bug, we need to treat it as RIT so we can see them from the master web clearly Ok. Or, rather, in this case, there is an associated procedure -- the MergeTableRegionsProcedure. We check whether the table is disabling or disabled if the state is CLOSED. Would it be better not to schedule the RIT since we have the MTRP going on? The hard part is figuring out whether an MTRP is on-going. So, for now, the patch is good. Would it be cleaner to call removeFromOfflineRegions inside markRegionAsMerged rather than after markRegionAsMerged in MergeTableRegionsProcedure? Thanks [~allan163] was (Author: stack): bq. Normally, if the region is marked as CLOSED, and no procedure is attached to it, it is a bug, we need to treat it as RIT so we can see them from the master web clearly Ok. > Merged region's RIT state may not be cleaned after master restart > - > > Key: HBASE-20860 > URL: https://issues.apache.org/jira/browse/HBASE-20860 > Project: HBase > Issue Type: Sub-task >Affects Versions: 3.0.0, 2.1.0, 2.0.1 >Reporter: Allan Yang >Assignee: Allan Yang >Priority: Major > Fix For: 3.0.0, 2.1.0, 2.0.2 > > Attachments: HBASE-20860.branch-2.0.002.patch, > HBASE-20860.branch-2.0.003.patch, HBASE-20860.branch-2.0.004.patch, > HBASE-20860.branch-2.0.patch > > > In MergeTableRegionsProcedure, we issue UnassignProcedures to offline the regions > to merge. But if we restart the master just after MergeTableRegionsProcedure > has finished these two UnassignProcedures and before it can delete their meta > entries, the new master will find these two regions are CLOSED but no > procedures are attached to them. They will be regarded as RIT regions and > nobody will clean the RIT state for them later.
> A quick way to resolve this stuck situation in the production env is > restarting the master again, since the meta entries are deleted in > MergeTableRegionsProcedure. Here, I offer a fix for this problem. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
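stack's question above is about ordering: whether the offline-map cleanup belongs inside markRegionAsMerged or after it returns in MergeTableRegionsProcedure. The toy sketch below (hypothetical names and data structures, not the real HBase internals) illustrates why folding the cleanup into the state transition leaves no window in which a merged region is still listed offline:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Toy model of the cleanup-ordering question. Class and method names mirror
// the discussion, but the bodies are illustrative, not HBase code.
public class MergeCleanupSketch {
    static final Map<String, String> regionStates = new ConcurrentHashMap<>();
    static final Map<String, String> offlineRegions = new ConcurrentHashMap<>();

    // Cleanup folded into the state transition: no observer can ever see a
    // region marked MERGED while it is still present in the offline map --
    // the kind of window a restarting master could misread as a stuck RIT.
    static void markRegionAsMerged(String merged, String... parents) {
        regionStates.put(merged, "OPEN");
        for (String parent : parents) {
            regionStates.put(parent, "MERGED");
            offlineRegions.remove(parent); // removeFromOfflineRegions analogue
        }
    }

    public static boolean demo() {
        regionStates.put("regionA", "CLOSED");
        regionStates.put("regionB", "CLOSED");
        offlineRegions.put("regionA", "CLOSED");
        offlineRegions.put("regionB", "CLOSED");
        markRegionAsMerged("regionAB", "regionA", "regionB");
        // After the transition there is nothing left to misread as RIT.
        return offlineRegions.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```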
[jira] [Commented] (HBASE-20838) Move all setStorage related UT cases from TestFSUtils to TestCommonFSUtils
[ https://issues.apache.org/jira/browse/HBASE-20838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16538669#comment-16538669 ] Sean Busbey commented on HBASE-20838: - {quote} +1 unit 202m 31s root in the patch passed. +1 unit 127m 41s hbase-server in the patch passed. {quote} we shouldn't add hbase-server if we're going to run the tests in root. > Move all setStorage related UT cases from TestFSUtils to TestCommonFSUtils > -- > > Key: HBASE-20838 > URL: https://issues.apache.org/jira/browse/HBASE-20838 > Project: HBase > Issue Type: Test >Reporter: Yu Li >Assignee: Yu Li >Priority: Major > Attachments: HBASE-20838.patch, HBASE-20838.patch > > > As per > [discussed|https://issues.apache.org/jira/browse/HBASE-20691?focusedCommentId=16517662=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16517662] > in HBASE-20691, since the setStoragePolicy code is in CommonFSUtils, the > test should be in TestCommonFSUtils -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20860) Merged region's RIT state may not be cleaned after master restart
[ https://issues.apache.org/jira/browse/HBASE-20860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16538654#comment-16538654 ] stack commented on HBASE-20860: --- bq. Normally, if the region is marked as CLOSED, and no procedure is attached to it, it is a bug, we need to treat it as RIT so we can see them from the master web clearly Ok. > Merged region's RIT state may not be cleaned after master restart > - > > Key: HBASE-20860 > URL: https://issues.apache.org/jira/browse/HBASE-20860 > Project: HBase > Issue Type: Sub-task >Affects Versions: 3.0.0, 2.1.0, 2.0.1 >Reporter: Allan Yang >Assignee: Allan Yang >Priority: Major > Fix For: 3.0.0, 2.1.0, 2.0.2 > > Attachments: HBASE-20860.branch-2.0.002.patch, > HBASE-20860.branch-2.0.003.patch, HBASE-20860.branch-2.0.004.patch, > HBASE-20860.branch-2.0.patch > > > In MergeTableRegionsProcedure, we issue UnassignProcedures to offline the regions > to merge. But if we restart the master just after MergeTableRegionsProcedure > has finished these two UnassignProcedures and before it can delete their meta > entries, the new master will find these two regions are CLOSED but no > procedures are attached to them. They will be regarded as RIT regions and > nobody will clean the RIT state for them later. > A quick way to resolve this stuck situation in the production env is > restarting the master again, since the meta entries are deleted in > MergeTableRegionsProcedure. Here, I offer a fix for this problem. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20847) The parent procedure of RegionTransitionProcedure may not have the table lock
[ https://issues.apache.org/jira/browse/HBASE-20847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16538649#comment-16538649 ] stack commented on HBASE-20847: --- I +1'd this patch up on rb. It's cleanup, and the refactors to lockAndQueue are improvements, pushing down the messy parent checking that was distributed around the codebase. I agree with the trySharedLock change. > The parent procedure of RegionTransitionProcedure may not have the table lock > - > > Key: HBASE-20847 > URL: https://issues.apache.org/jira/browse/HBASE-20847 > Project: HBase > Issue Type: Sub-task > Components: proc-v2, Region Assignment >Reporter: Duo Zhang >Assignee: Duo Zhang >Priority: Major > Attachments: HBASE-20847-v1.patch, HBASE-20847-v2.patch, > HBASE-20847.patch > > > For example, SCP can also schedule AssignProcedure and obviously it will not > hold the table lock. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20847) The parent procedure of RegionTransitionProcedure may not have the table lock
[ https://issues.apache.org/jira/browse/HBASE-20847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16538646#comment-16538646 ] Duo Zhang commented on HBASE-20847: --- {quote} You are arguing we should never take an exclusive lock on a table? For a modifytableprocedure? {quote} We need to release the exclusive lock if we are scheduling sub procedures. This is what we do for now. {quote} Assign can only proceed after WAL logs have been split... so if an SCP and a ModifyTableProcedure at same time, MTP should wait until SCP has finished log splitting before proceeding SCP should wait till MTP has done assigning/unassiging before it tries assigning? {quote} We need to hold the lock. You can think of it from the opposite direction: what if the SCP schedules an AssignProcedure and then a ModifyTableProcedure comes in? Here we do not need to wait so long if the MTP comes first. We just schedule the AssignProcedures, but they will be blocked if the MTP holds the exclusive lock. And after the MTP schedules its Unassign/AssignProcedures, it will release the exclusive lock and we can go on. The lock recovery after master restarts will be addressed by HBASE-20846; it is another story. Here the problem is that we need to acquire the shared lock every time, even if the procedure has a parent. > The parent procedure of RegionTransitionProcedure may not have the table lock > - > > Key: HBASE-20847 > URL: https://issues.apache.org/jira/browse/HBASE-20847 > Project: HBase > Issue Type: Sub-task > Components: proc-v2, Region Assignment >Reporter: Duo Zhang >Assignee: Duo Zhang >Priority: Major > Attachments: HBASE-20847-v1.patch, HBASE-20847-v2.patch, > HBASE-20847.patch > > > For example, SCP can also schedule AssignProcedure and obviously it will not > hold the table lock. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
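The trade-off Duo Zhang describes can be sketched with a toy lock model (illustrative names only, not the real proc-v2 ProcedureScheduler): a child procedure whose parent already owns the table's exclusive lock should be allowed through without queuing for the shared lock, while an unrelated procedure still blocks behind the exclusive holder:

```java
import java.util.HashSet;
import java.util.Set;

// Toy model of the scheduling question in this thread. A region procedure
// normally takes the table's shared lock, but when its parent (e.g. an MTP
// holding the exclusive lock) already owns the table lock, queuing again
// would block it forever behind its own ancestor.
public class TableLockSketch {
    static Long exclusiveOwner = null;             // procId holding the exclusive lock
    static final Set<Long> sharedHolders = new HashSet<>();

    // A child inherits its ancestor's lock access.
    static boolean hasLockAccess(long procId, Long parentId) {
        return exclusiveOwner != null
                && (exclusiveOwner == procId
                    || (parentId != null && exclusiveOwner.longValue() == parentId));
    }

    static boolean trySharedLock(long procId, Long parentId) {
        if (hasLockAccess(procId, parentId)) {
            return true;                           // parent owns the table: proceed
        }
        if (exclusiveOwner != null) {
            return false;                          // blocked behind the exclusive holder
        }
        sharedHolders.add(procId);
        return true;
    }

    public static boolean demo() {
        exclusiveOwner = 1L;                       // an MTP (procId 1) holds the exclusive lock
        boolean child = trySharedLock(2L, 1L);     // its child UnassignProcedure proceeds
        boolean stranger = trySharedLock(3L, null);// an unrelated AssignProcedure blocks
        return child && !stranger;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```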
[jira] [Commented] (HBASE-20855) PeerConfigTracker only support one listener will cause problem when there is a recovered replication queue
[ https://issues.apache.org/jira/browse/HBASE-20855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16538643#comment-16538643 ] Hadoop QA commented on HBASE-20855: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 1s{color} | {color:blue} Findbugs executables are not available. {color} | | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. 
{color} | || || || || {color:brown} branch-1 Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 31s{color} | {color:green} branch-1 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 50s{color} | {color:green} branch-1 passed with JDK v1.8.0_172 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 56s{color} | {color:green} branch-1 passed with JDK v1.7.0_181 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 47s{color} | {color:green} branch-1 passed {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 2m 38s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 42s{color} | {color:green} branch-1 passed with JDK v1.8.0_172 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 57s{color} | {color:green} branch-1 passed with JDK v1.7.0_181 {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 49s{color} | {color:green} the patch passed with JDK v1.8.0_172 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 49s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 57s{color} | {color:green} the patch passed with JDK 
v1.7.0_181 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 57s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 47s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 2m 37s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 1m 32s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 43s{color} | {color:green} the patch passed with JDK v1.8.0_172 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 57s{color} | {color:green} the patch passed with JDK v1.7.0_181 {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 20s{color} | {color:green} hbase-client in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red}104m 26s{color} | {color:red} hbase-server in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 35s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}128m 52s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hbase.replication.TestReplicationSource | | |
[jira] [Commented] (HBASE-20847) The parent procedure of RegionTransitionProcedure may not have the table lock
[ https://issues.apache.org/jira/browse/HBASE-20847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16538623#comment-16538623 ] stack commented on HBASE-20847: --- bq. For table lock this should not happen, as region assign can happen at any time if there is a server crash as it needs to hold the shared lock, so if we hold the exclusive lock for the whole life of a table procedure then it will hurt the availability... You are arguing we should never take an exclusive lock on a table? For a modifytableprocedure? bq. as region assign can happen at any time if there is a server crash Assign can only proceed after WAL logs have been split... so if an SCP and a ModifyTableProcedure run at the same time, the MTP should wait until the SCP has finished log splitting before proceeding, and the SCP should wait until the MTP has done assigning/unassigning before it tries assigning? bq. But for other procedures, such as peer related procedures, we will hold the lock for the whole life time. Post-crash, how does this lock get reinstated? Currently there is no means, right? Scheduling the parent to run before the sub-procedure would go against the Pv2 grain. Does this mean that as part of the load process we should be re-instituting locks? (Is this what HBASE-20846 is supposed to be doing? Sounds like it. If so, great) > The parent procedure of RegionTransitionProcedure may not have the table lock > - > > Key: HBASE-20847 > URL: https://issues.apache.org/jira/browse/HBASE-20847 > Project: HBase > Issue Type: Sub-task > Components: proc-v2, Region Assignment >Reporter: Duo Zhang >Assignee: Duo Zhang >Priority: Major > Attachments: HBASE-20847-v1.patch, HBASE-20847-v2.patch, > HBASE-20847.patch > > > For example, SCP can also schedule AssignProcedure and obviously it will not > hold the table lock. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20697) Can't cache All region locations of the specify table by calling table.getRegionLocator().getAllRegionLocations()
[ https://issues.apache.org/jira/browse/HBASE-20697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16538581#comment-16538581 ] Duo Zhang commented on HBASE-20697: --- +1. This is a bug fix, so please include it in 2.1. > Can't cache All region locations of the specify table by calling > table.getRegionLocator().getAllRegionLocations() > - > > Key: HBASE-20697 > URL: https://issues.apache.org/jira/browse/HBASE-20697 > Project: HBase > Issue Type: Bug >Affects Versions: 1.3.1, 1.2.6 >Reporter: zhaoyuan >Assignee: zhaoyuan >Priority: Major > Fix For: 1.2.7, 1.3.3 > > Attachments: HBASE-20697.branch-1.2.001.patch, > HBASE-20697.branch-1.2.002.patch, HBASE-20697.branch-1.2.003.patch, > HBASE-20697.branch-1.2.004.patch, HBASE-20697.master.001.patch, > HBASE-20697.master.002.patch, HBASE-20697.master.002.patch, > HBASE-20697.master.003.patch > > > When we upgrade and restart a new version of the application, which reads and > writes to HBase, we get some operation timeouts. The timeouts are expected > because when the application restarts, it does not hold any region location > cache and has to communicate with ZK and the meta region server to get region > locations. > We want to avoid these timeouts, so we do warmup work, and as far as I am > concerned, the method table.getRegionLocator().getAllRegionLocations() will > fetch all region locations and cache them. However, it didn't work well. > There are still a lot of timeouts, so it confused me.
> I dug into the source code and found something below > {code:java} > public List<HRegionLocation> getAllRegionLocations() throws IOException { > TableName tableName = getName(); > NavigableMap<HRegionInfo, ServerName> locations = > MetaScanner.allTableRegions(this.connection, tableName); > ArrayList<HRegionLocation> regions = new ArrayList<>(locations.size()); > for (Entry<HRegionInfo, ServerName> entry : locations.entrySet()) { > regions.add(new HRegionLocation(entry.getKey(), entry.getValue())); > } > if (regions.size() > 0) { > connection.cacheLocation(tableName, new RegionLocations(regions)); > } > return regions; > } > In MetaCache > public void cacheLocation(final TableName tableName, final RegionLocations > locations) { > byte[] startKey = > locations.getRegionLocation().getRegionInfo().getStartKey(); > ConcurrentMap<byte[], RegionLocations> tableLocations = > getTableLocations(tableName); > RegionLocations oldLocation = tableLocations.putIfAbsent(startKey, > locations); > boolean isNewCacheEntry = (oldLocation == null); > if (isNewCacheEntry) { > if (LOG.isTraceEnabled()) { > LOG.trace("Cached location: " + locations); > } > addToCachedServers(locations); > return; > } > {code} > It will collect all regions into one RegionLocations object and only cache > the first non-null region location, and then when we put or get to hbase, we > do getCachedLocation() > {code:java} > public RegionLocations getCachedLocation(final TableName tableName, final > byte[] row) { > ConcurrentNavigableMap<byte[], RegionLocations> tableLocations = > getTableLocations(tableName); > Entry<byte[], RegionLocations> e = tableLocations.floorEntry(row); > if (e == null) { > if (metrics != null) metrics.incrMetaCacheMiss(); > return null; > } > RegionLocations possibleRegion = e.getValue(); > // make sure that the end key is greater than the row we're looking > // for, otherwise the row actually belongs in the next region, not > // this one. the exception case is when the endkey is > // HConstants.EMPTY_END_ROW, signifying that the region we're > // checking is actually the last region in the table.
> byte[] endKey = > possibleRegion.getRegionLocation().getRegionInfo().getEndKey(); > if (Bytes.equals(endKey, HConstants.EMPTY_END_ROW) || > getRowComparator(tableName).compareRows( > endKey, 0, endKey.length, row, 0, row.length) > 0) { > if (metrics != null) metrics.incrMetaCacheHit(); > return possibleRegion; > } > // Passed all the way through, so we got nothing - complete cache miss > if (metrics != null) metrics.incrMetaCacheMiss(); > return null; > } > {code} > It will choose the first location to be possibleRegion, and possibly it will > mismatch. > So did I forget something, or am I wrong somewhere? If this is indeed a bug, > I think it is not very hard to fix. > Hope committers and PMC review this! > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
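The reported behaviour can be reproduced with a simplified model of the MetaCache lookup. The sketch below uses String keys and a hypothetical cache shape instead of the real byte[]-keyed RegionLocations, but the floorEntry check mirrors the getCachedLocation logic quoted above: warming up the cache under only the table's first start key yields misses for rows in later regions, while caching each region under its own start key yields hits:

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Minimal model of the MetaCache behaviour described in the issue. Region
// boundaries and the cache shape are simplified stand-ins; only the
// floorEntry-based lookup mirrors the real getCachedLocation.
public class MetaCacheSketch {
    // startKey -> {startKey, endKey} of the cached region ("" = open-ended)
    static final TreeMap<String, String[]> cache = new TreeMap<>();

    static String[] getCachedLocation(String row) {
        Map.Entry<String, String[]> e = cache.floorEntry(row);
        if (e == null) return null;                  // complete cache miss
        String endKey = e.getValue()[1];
        // hit only if the row really falls inside the cached region
        return (endKey.isEmpty() || endKey.compareTo(row) > 0) ? e.getValue() : null;
    }

    public static boolean demo() {
        List<String[]> regions = List.of(
            new String[] { "", "b" },                // region 1: [-inf, b)
            new String[] { "b", "d" },               // region 2: [b, d)
            new String[] { "d", "" });               // region 3: [d, +inf)

        // Buggy warmup (what the quoted getAllRegionLocations effectively
        // did): only one entry, stored under the first region's start key.
        cache.clear();
        cache.put(regions.get(0)[0], regions.get(0));
        boolean missForLaterRegion = getCachedLocation("c") == null;

        // Fixed warmup: cache every region under its own start key.
        cache.clear();
        for (String[] r : regions) cache.put(r[0], r);
        boolean hitForLaterRegion = getCachedLocation("c") != null;

        return missForLaterRegion && hitForLaterRegion;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```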
[jira] [Commented] (HBASE-20847) The parent procedure of RegionTransitionProcedure may not have the table lock
[ https://issues.apache.org/jira/browse/HBASE-20847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16538577#comment-16538577 ] Duo Zhang commented on HBASE-20847: --- Let me check the failed UT. Seems related. > The parent procedure of RegionTransitionProcedure may not have the table lock > - > > Key: HBASE-20847 > URL: https://issues.apache.org/jira/browse/HBASE-20847 > Project: HBase > Issue Type: Sub-task > Components: proc-v2, Region Assignment >Reporter: Duo Zhang >Assignee: Duo Zhang >Priority: Major > Attachments: HBASE-20847-v1.patch, HBASE-20847-v2.patch, > HBASE-20847.patch > > > For example, SCP can also schedule AssignProcedure and obviously it will not > hold the table lock. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-20860) Merged region's RIT state may not be cleaned after master restart
[ https://issues.apache.org/jira/browse/HBASE-20860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allan Yang updated HBASE-20860: --- Attachment: HBASE-20860.branch-2.0.004.patch > Merged region's RIT state may not be cleaned after master restart > - > > Key: HBASE-20860 > URL: https://issues.apache.org/jira/browse/HBASE-20860 > Project: HBase > Issue Type: Sub-task >Affects Versions: 3.0.0, 2.1.0, 2.0.1 >Reporter: Allan Yang >Assignee: Allan Yang >Priority: Major > Fix For: 3.0.0, 2.1.0, 2.0.2 > > Attachments: HBASE-20860.branch-2.0.002.patch, > HBASE-20860.branch-2.0.003.patch, HBASE-20860.branch-2.0.004.patch, > HBASE-20860.branch-2.0.patch > > > In MergeTableRegionsProcedure, we issue UnassignProcedures to offline the regions > to merge. But if we restart the master just after MergeTableRegionsProcedure > has finished these two UnassignProcedures and before it can delete their meta > entries, the new master will find these two regions are CLOSED but no > procedures are attached to them. They will be regarded as RIT regions and > nobody will clean the RIT state for them later. > A quick way to resolve this stuck situation in the production env is > restarting the master again, since the meta entries are deleted in > MergeTableRegionsProcedure. Here, I offer a fix for this problem. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20838) Move all setStorage related UT cases from TestFSUtils to TestCommonFSUtils
[ https://issues.apache.org/jira/browse/HBASE-20838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16538575#comment-16538575 ] Hadoop QA commented on HBASE-20838: --- (!) A patch to the testing environment has been detected. Re-executing against the patched versions to perform further tests. The console is at https://builds.apache.org/job/PreCommit-HBASE-Build/13577/console in case of problems. > Move all setStorage related UT cases from TestFSUtils to TestCommonFSUtils > -- > > Key: HBASE-20838 > URL: https://issues.apache.org/jira/browse/HBASE-20838 > Project: HBase > Issue Type: Test >Reporter: Yu Li >Assignee: Yu Li >Priority: Major > Attachments: HBASE-20838.patch, HBASE-20838.patch > > > As per > [discussed|https://issues.apache.org/jira/browse/HBASE-20691?focusedCommentId=16517662=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16517662] > in HBASE-20691, since the setStoragePolicy code is in CommonFSUtils, the > test should be in TestCommonFSUtils -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20847) The parent procedure of RegionTransitionProcedure may not have the table lock
[ https://issues.apache.org/jira/browse/HBASE-20847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16538574#comment-16538574 ] Hadoop QA commented on HBASE-20847: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 43s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 3s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 24s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 31s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 42s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 43s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 44s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 4s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 4s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 13s{color} | {color:green} hbase-procedure: The patch generated 0 new + 1 unchanged - 5 fixed = 1 total (was 6) {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 10s{color} | {color:green} hbase-server: The patch generated 0 new + 8 unchanged - 1 fixed = 8 total (was 9) {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 29s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 10m 3s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 46s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 40s{color} | {color:green} hbase-procedure in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red}129m 44s{color} | {color:red} hbase-server in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 40s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}176m 46s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hbase.regionserver.TestHStore | | | hadoop.hbase.client.TestMobRestoreSnapshotFromClient | | | hadoop.hbase.client.TestAdmin2 | | | hadoop.hbase.client.TestRestoreSnapshotFromClientWithRegionReplicas | | | hadoop.hbase.client.TestRestoreSnapshotFromClient | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b | | JIRA Issue | HBASE-20847 | | JIRA Patch URL |
[jira] [Updated] (HBASE-20838) Move all setStorage related UT cases from TestFSUtils to TestCommonFSUtils
[ https://issues.apache.org/jira/browse/HBASE-20838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yu Li updated HBASE-20838: -- Attachment: HBASE-20838.patch > Move all setStorage related UT cases from TestFSUtils to TestCommonFSUtils > -- > > Key: HBASE-20838 > URL: https://issues.apache.org/jira/browse/HBASE-20838 > Project: HBase > Issue Type: Test >Reporter: Yu Li >Assignee: Yu Li >Priority: Major > Attachments: HBASE-20838.patch, HBASE-20838.patch > > > As per > [discussed|https://issues.apache.org/jira/browse/HBASE-20691?focusedCommentId=16517662=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16517662] > in HBASE-20691, since the setStoragePolicy code is in CommonFSUtils, the > test should be in TestCommonFSUtils -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20860) Merged region's RIT state may not be cleaned after master restart
[ https://issues.apache.org/jira/browse/HBASE-20860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16538567#comment-16538567 ] Hadoop QA commented on HBASE-20860: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 21s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} branch-2.0 Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 22s{color} | {color:green} branch-2.0 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 48s{color} | {color:green} branch-2.0 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 10s{color} | {color:green} branch-2.0 passed {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 3m 53s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 55s{color} | {color:green} branch-2.0 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 38s{color} | {color:green} branch-2.0 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 25s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 42s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 3s{color} | {color:red} hbase-server: The patch generated 1 new + 28 unchanged - 0 fixed = 29 total (was 28) {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 3m 55s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 10m 46s{color} | {color:green} Patch does not cause any errors with Hadoop 2.6.5 2.7.4 or 3.0.0. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 3s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green}171m 27s{color} | {color:green} hbase-server in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 20s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}211m 41s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:6f01af0 | | JIRA Issue | HBASE-20860 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12930974/HBASE-20860.branch-2.0.003.patch | | Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile | | uname | Linux 824c57a834b0 4.4.0-130-generic #156-Ubuntu SMP Thu Jun 14 08:53:28 UTC 2018 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh | | git revision | branch-2.0 / cd1ecae0d1 | | maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) | | Default Java | 1.8.0_171 | | findbugs | v3.1.0-RC3 | | checkstyle | https://builds.apache.org/job/PreCommit-HBASE-Build/13571/artifact/patchprocess/diff-checkstyle-hbase-server.txt | | Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/13571/testReport/ | | Max. process+thread count | 4161 (vs. ulimit of 1) | | modules | C: hbase-server U: hbase-server |
[jira] [Commented] (HBASE-20838) Move all setStorage related UT cases from TestFSUtils to TestCommonFSUtils
[ https://issues.apache.org/jira/browse/HBASE-20838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16538566#comment-16538566 ] Yu Li commented on HBASE-20838: --- Thanks for the notes on the local test [~busbey], will try it later. From the HadoopQA result, the patch works fine and could trigger the hbase-server unit test. The shellcheck report is broken; let me re-attach the patch and see what kind of warning shellcheck reports. > Move all setStorage related UT cases from TestFSUtils to TestCommonFSUtils > -- > > Key: HBASE-20838 > URL: https://issues.apache.org/jira/browse/HBASE-20838 > Project: HBase > Issue Type: Test >Reporter: Yu Li >Assignee: Yu Li >Priority: Major > Attachments: HBASE-20838.patch, HBASE-20838.patch > > > As per > [discussed|https://issues.apache.org/jira/browse/HBASE-20691?focusedCommentId=16517662=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16517662] > in HBASE-20691, since the setStoragePolicy code is in CommonFSUtils, the > test should be in TestCommonFSUtils -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20784) Will lose the SNAPSHOT suffix if we get the version of RS from ServerManager
[ https://issues.apache.org/jira/browse/HBASE-20784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16538502#comment-16538502 ] Hudson commented on HBASE-20784: Results for branch branch-2 [build #966 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/966/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/966//General_Nightly_Build_Report/] (/) {color:green}+1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/966//JDK8_Nightly_Build_Report_(Hadoop2)/] (/) {color:green}+1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/966//JDK8_Nightly_Build_Report_(Hadoop3)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. (/) {color:green}+1 client integration test{color}
> Will lose the SNAPSHOT suffix if we get the version of RS from ServerManager
> --
>
> Key: HBASE-20784
> URL: https://issues.apache.org/jira/browse/HBASE-20784
> Project: HBase
> Issue Type: Bug
> Components: master, UI
> Reporter: Duo Zhang
> Assignee: Duo Zhang
> Priority: Minor
> Fix For: 3.0.0, 2.1.0, 2.2.0
> Attachments: HBASE-20784.patch
>
> In HBASE-20722 we removed the usage of RegionServerTracker when getting information for region servers. The version in ServerManager is an int, and we convert it to a String when displaying it on the master UI, so we lose the SNAPSHOT suffix. Not a big deal, as this is not a problem for normal releases. Opening an issue for it.
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
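The lost-suffix behavior described above is easy to reproduce in isolation: once a version string is packed into an int, only the numeric components survive the round trip. Below is a minimal sketch; the class name, method names, and bit layout are illustrative stand-ins, not HBase's actual VersionInfoUtil encoding:

```java
public class VersionSketch {
    // Pack "major.minor.patch" into one int; any "-SNAPSHOT" suffix is dropped
    // because there is simply no room for it in the numeric encoding.
    static int encode(String version) {
        String numeric = version.split("-")[0];   // "2.1.0-SNAPSHOT" -> "2.1.0"
        String[] parts = numeric.split("\\.");
        return (Integer.parseInt(parts[0]) << 20)
             | (Integer.parseInt(parts[1]) << 10)
             | Integer.parseInt(parts[2]);
    }

    // Decode back to a display string: the suffix is gone for good.
    static String decode(int packed) {
        return (packed >> 20) + "." + ((packed >> 10) & 0x3FF) + "." + (packed & 0x3FF);
    }

    public static void main(String[] args) {
        int v = encode("2.1.0-SNAPSHOT");
        System.out.println(decode(v)); // prints "2.1.0" -- SNAPSHOT suffix lost
    }
}
```

As the issue notes, this only matters for SNAPSHOT builds; release version strings round-trip unchanged.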
[jira] [Commented] (HBASE-20855) PeerConfigTracker only support one listener will cause problem when there is a recovered replication queue
[ https://issues.apache.org/jira/browse/HBASE-20855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16538493#comment-16538493 ] Ted Yu commented on HBASE-20855: From https://builds.apache.org/job/PreCommit-HBASE-Build/13559/artifact/patchprocess/patch-unit-hbase-server.txt, it looks like the test environment had some issues. Please run the above tests locally to see if they pass.
> PeerConfigTracker only support one listener will cause problem when there is
> a recovered replication queue
> --
>
> Key: HBASE-20855
> URL: https://issues.apache.org/jira/browse/HBASE-20855
> Project: HBase
> Issue Type: Bug
> Affects Versions: 1.3.0, 1.4.0, 1.5.0
> Reporter: Jingyun Tian
> Assignee: Jingyun Tian
> Priority: Major
> Attachments: HBASE-20855.branch-1.001.patch, HBASE-20855.branch-1.002.patch, HBASE-20855.branch-1.003.patch
>
> {code}
> public void init(Context context) throws IOException {
>   this.ctx = context;
>   if (this.ctx != null) {
>     ReplicationPeer peer = this.ctx.getReplicationPeer();
>     if (peer != null) {
>       peer.trackPeerConfigChanges(this);
>     } else {
>       LOG.warn("Not tracking replication peer config changes for Peer Id " + this.ctx.getPeerId() +
>           " because there's no such peer");
>     }
>   }
> }
> {code}
> As we know, a replication source sets itself as the listener on the PeerConfigTracker in ReplicationPeer. When there are one or more recovered queues, each queue generates a new replication source, but they all share the same ReplicationPeer. Then, when each source calls setListener, the newly generated listener overwrites the older one. Thus only one listener will receive the peer config change notification.
> {code}
> public synchronized void setListener(ReplicationPeerConfigListener listener) {
>   this.listener = listener;
> }
> {code}
> To solve this, PeerConfigTracker needs to support multiple listeners, and a listener should be removed when the replication endpoint is terminated.
> I will upload a patch later with the fix and a UT.
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
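The fix direction described in HBASE-20855 above (a tracker that keeps a set of listeners and removes one when its endpoint terminates) can be sketched as follows. The interface and class names here are simplified stand-ins for illustration, not the actual HBase types or the attached patch:

```java
import java.util.Set;
import java.util.concurrent.CopyOnWriteArraySet;
import java.util.concurrent.atomic.AtomicInteger;

// Simplified stand-in for ReplicationPeerConfigListener.
interface ConfigListener {
    void peerConfigUpdated(String newConfig);
}

// A PeerConfigTracker-style holder that keeps a *set* of listeners, so a
// replication source created for a recovered queue no longer overwrites
// the listener registered by the original source.
class MultiListenerTracker {
    private final Set<ConfigListener> listeners = new CopyOnWriteArraySet<>();

    void addListener(ConfigListener l)    { listeners.add(l); }
    void removeListener(ConfigListener l) { listeners.remove(l); }  // on endpoint termination

    void notifyConfigChange(String newConfig) {
        for (ConfigListener l : listeners) {
            l.peerConfigUpdated(newConfig);  // every registered source sees the change
        }
    }

    public static void main(String[] args) {
        MultiListenerTracker tracker = new MultiListenerTracker();
        AtomicInteger notified = new AtomicInteger();
        ConfigListener original  = cfg -> notified.incrementAndGet();
        ConfigListener recovered = cfg -> notified.incrementAndGet();
        tracker.addListener(original);
        tracker.addListener(recovered);    // with a single-listener field this would clobber 'original'
        tracker.notifyConfigChange("peer-1");
        System.out.println(notified.get()); // prints 2: both sources were notified
        tracker.removeListener(recovered);  // recovered queue drained, endpoint terminated
        tracker.notifyConfigChange("peer-1");
        System.out.println(notified.get()); // prints 3: only the original source remains
    }
}
```

A copy-on-write set keeps notification iteration safe while sources register and deregister concurrently, which matches the lifecycle described in the issue (recovered-queue sources come and go while the peer lives on).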
[jira] [Commented] (HBASE-20838) Move all setStorage related UT cases from TestFSUtils to TestCommonFSUtils
[ https://issues.apache.org/jira/browse/HBASE-20838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16538485#comment-16538485 ] Sean Busbey commented on HBASE-20838: - Sure, testing locally works, as does attaching a patch here that has multiple commits: one for your change and then others to alter files. I test locally by using the "Mac Homebrew" instructions at the [bottom of the Yetus download page|http://yetus.apache.org/downloads/]. Then I test with a local patch file that changes things the hbase personality will respond to, either on a branch with my changes or by pointing at a copy of the personality with the changes. E.g., if I want to know if the unit tests change: {code} $ test-patch --personality=dev-support/hbase-personality.sh --plugins=maven,java,compile,mvninstall,unit /some/path/to/a/test/FOOBAR-1234.patch {code}
> Move all setStorage related UT cases from TestFSUtils to TestCommonFSUtils
> --
>
> Key: HBASE-20838
> URL: https://issues.apache.org/jira/browse/HBASE-20838
> Project: HBase
> Issue Type: Test
> Reporter: Yu Li
> Assignee: Yu Li
> Priority: Major
> Attachments: HBASE-20838.patch
>
> As per [discussed|https://issues.apache.org/jira/browse/HBASE-20691?focusedCommentId=16517662=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16517662] in HBASE-20691, since the setStoragePolicy code is in CommonFSUtils, the test should be in TestCommonFSUtils
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-20649) Validate HFiles do not have PREFIX_TREE DataBlockEncoding
[ https://issues.apache.org/jira/browse/HBASE-20649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balazs Meszaros updated HBASE-20649: Attachment: HBASE-20649.master.005.patch
> Validate HFiles do not have PREFIX_TREE DataBlockEncoding
> -
>
> Key: HBASE-20649
> URL: https://issues.apache.org/jira/browse/HBASE-20649
> Project: HBase
> Issue Type: New Feature
> Reporter: Peter Somogyi
> Assignee: Balazs Meszaros
> Priority: Minor
> Fix For: 3.0.0
> Attachments: HBASE-20649.master.001.patch, HBASE-20649.master.002.patch, HBASE-20649.master.003.patch, HBASE-20649.master.004.patch, HBASE-20649.master.005.patch
>
> HBASE-20592 adds a tool to check that column families on the cluster do not have PREFIX_TREE encoding.
> Since it is possible that the DataBlockEncoding was already changed but HFiles have not been rewritten yet, we would need a tool that can verify the content of HFiles in the cluster.
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-20649) Validate HFiles do not have PREFIX_TREE DataBlockEncoding
[ https://issues.apache.org/jira/browse/HBASE-20649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balazs Meszaros updated HBASE-20649: Attachment: (was: HBASE-20649.master.005.patch)
> Validate HFiles do not have PREFIX_TREE DataBlockEncoding
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20697) Can't cache All region locations of the specify table by calling table.getRegionLocator().getAllRegionLocations()
[ https://issues.apache.org/jira/browse/HBASE-20697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16538460#comment-16538460 ] Guanghao Zhang commented on HBASE-20697: Ping [~apurtell] for 1.4.
> Can't cache All region locations of the specify table by calling
> table.getRegionLocator().getAllRegionLocations()
> -
>
> Key: HBASE-20697
> URL: https://issues.apache.org/jira/browse/HBASE-20697
> Project: HBase
> Issue Type: Bug
> Affects Versions: 1.3.1, 1.2.6
> Reporter: zhaoyuan
> Assignee: zhaoyuan
> Priority: Major
> Fix For: 1.2.7, 1.3.3
> Attachments: HBASE-20697.branch-1.2.001.patch, HBASE-20697.branch-1.2.002.patch, HBASE-20697.branch-1.2.003.patch, HBASE-20697.branch-1.2.004.patch, HBASE-20697.master.001.patch, HBASE-20697.master.002.patch, HBASE-20697.master.002.patch, HBASE-20697.master.003.patch
>
> When we upgrade and restart a new version of an application which reads from and writes to HBase, we get some operation timeouts. The timeouts are expected because when the application restarts, it does not hold any region location cache and has to communicate with ZK and the meta regionserver to get region locations.
> We want to avoid these timeouts, so we do warmup work, and as far as I am concerned, the method table.getRegionLocator().getAllRegionLocations() should fetch all region locations and cache them. However, it didn't work well. There are still a lot of timeouts, so it confused me.
> I dug into the source code and found the following:
> {code:java}
> // code placeholder
> public List<HRegionLocation> getAllRegionLocations() throws IOException {
>   TableName tableName = getName();
>   NavigableMap<HRegionInfo, ServerName> locations =
>       MetaScanner.allTableRegions(this.connection, tableName);
>   ArrayList<HRegionLocation> regions = new ArrayList<>(locations.size());
>   for (Entry<HRegionInfo, ServerName> entry : locations.entrySet()) {
>     regions.add(new HRegionLocation(entry.getKey(), entry.getValue()));
>   }
>   if (regions.size() > 0) {
>     connection.cacheLocation(tableName, new RegionLocations(regions));
>   }
>   return regions;
> }
>
> // In MetaCache
> public void cacheLocation(final TableName tableName, final RegionLocations locations) {
>   byte[] startKey = locations.getRegionLocation().getRegionInfo().getStartKey();
>   ConcurrentMap<byte[], RegionLocations> tableLocations = getTableLocations(tableName);
>   RegionLocations oldLocation = tableLocations.putIfAbsent(startKey, locations);
>   boolean isNewCacheEntry = (oldLocation == null);
>   if (isNewCacheEntry) {
>     if (LOG.isTraceEnabled()) {
>       LOG.trace("Cached location: " + locations);
>     }
>     addToCachedServers(locations);
>     return;
>   }
> {code}
> It collects all regions into one RegionLocations object and caches only the first non-null region location; then when we put or get to HBase, we call getCachedLocation()
> {code:java}
> // code placeholder
> public RegionLocations getCachedLocation(final TableName tableName, final byte[] row) {
>   ConcurrentNavigableMap<byte[], RegionLocations> tableLocations = getTableLocations(tableName);
>   Entry<byte[], RegionLocations> e = tableLocations.floorEntry(row);
>   if (e == null) {
>     if (metrics != null) metrics.incrMetaCacheMiss();
>     return null;
>   }
>   RegionLocations possibleRegion = e.getValue();
>   // make sure that the end key is greater than the row we're looking
>   // for, otherwise the row actually belongs in the next region, not
>   // this one. the exception case is when the endkey is
>   // HConstants.EMPTY_END_ROW, signifying that the region we're
>   // checking is actually the last region in the table.
>   byte[] endKey = possibleRegion.getRegionLocation().getRegionInfo().getEndKey();
>   if (Bytes.equals(endKey, HConstants.EMPTY_END_ROW) ||
>       getRowComparator(tableName).compareRows(
>           endKey, 0, endKey.length, row, 0, row.length) > 0) {
>     if (metrics != null) metrics.incrMetaCacheHit();
>     return possibleRegion;
>   }
>   // Passed all the way through, so we got nothing - complete cache miss
>   if (metrics != null) metrics.incrMetaCacheMiss();
>   return null;
> }
> {code}
> It will choose the first location as possibleRegion, and possibly it will mismatch.
> So did I forget something, or am I wrong somewhere? If this is indeed a bug, I think it is not very hard to fix.
> Hope committers and PMC review this!
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
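The mismatch described in HBASE-20697 above comes from caching a single RegionLocations entry, keyed by the first region's start key, for the whole batch; the natural fix is to cache each region under its own start key so that floorEntry lookups land on the correct region. A toy sketch of that behavior follows (String keys instead of byte[], simplified names; not the actual HBase patch):

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model of a region: covers [startKey, endKey) of the row space;
// an empty endKey marks the last region of the table.
class RegionLoc {
    final String startKey, endKey;
    RegionLoc(String startKey, String endKey) { this.startKey = startKey; this.endKey = endKey; }
}

class SimpleMetaCache {
    private final TreeMap<String, RegionLoc> byStartKey = new TreeMap<>();

    // The fix: cache every region under its own start key, instead of one
    // entry keyed by the first region's start key for the whole table.
    void cacheAll(List<RegionLoc> regions) {
        for (RegionLoc r : regions) byStartKey.put(r.startKey, r);
    }

    // Mirrors MetaCache.getCachedLocation: floorEntry plus the end-key range check.
    RegionLoc lookup(String row) {
        Map.Entry<String, RegionLoc> e = byStartKey.floorEntry(row);
        if (e == null) return null;              // complete cache miss
        RegionLoc r = e.getValue();
        boolean lastRegion = r.endKey.isEmpty(); // empty endKey = last region
        return (lastRegion || r.endKey.compareTo(row) > 0) ? r : null;
    }

    public static void main(String[] args) {
        SimpleMetaCache cache = new SimpleMetaCache();
        cache.cacheAll(Arrays.asList(
            new RegionLoc("", "b"), new RegionLoc("b", "m"), new RegionLoc("m", "")));
        System.out.println(cache.lookup("c").startKey);  // prints "b": middle region found
        System.out.println(cache.lookup("z").startKey);  // prints "m": last region via empty endKey
    }
}
```

With the single-entry caching the issue describes, the lookup for row "c" would floor to the lone entry keyed by the first region's start key and then fail the end-key check, producing the meta round trips (and timeouts) the reporter observed.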
[jira] [Commented] (HBASE-20697) Can't cache All region locations of the specify table by calling table.getRegionLocator().getAllRegionLocations()
[ https://issues.apache.org/jira/browse/HBASE-20697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16538457#comment-16538457 ] Guanghao Zhang commented on HBASE-20697: +1 for 003 patch. Will commit it if no objections. Ping [~Apache9] for 2.1 and ping [~stack] for 2.0.
> Can't cache All region locations of the specify table by calling
> table.getRegionLocator().getAllRegionLocations()
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-20855) PeerConfigTracker only support one listener will cause problem when there is a recovered replication queue
[ https://issues.apache.org/jira/browse/HBASE-20855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingyun Tian updated HBASE-20855: - Attachment: (was: HBASE-20855.branch-1.003.patch)
> PeerConfigTracker only support one listener will cause problem when there is
> a recovered replication queue
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HBASE-20855) PeerConfigTracker only support one listener will cause problem when there is a recovered replication queue
[ https://issues.apache.org/jira/browse/HBASE-20855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jingyun Tian updated HBASE-20855: - Attachment: HBASE-20855.branch-1.003.patch
> PeerConfigTracker only support one listener will cause problem when there is
> a recovered replication queue
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20697) Can't cache All region locations of the specify table by calling table.getRegionLocator().getAllRegionLocations()
[ https://issues.apache.org/jira/browse/HBASE-20697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16538449#comment-16538449 ] zhaoyuan commented on HBASE-20697: -- ping [~zghaobac], I have fixed the checkstyle problems.
> Can't cache All region locations of the specify table by calling
> table.getRegionLocator().getAllRegionLocations()
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20649) Validate HFiles do not have PREFIX_TREE DataBlockEncoding
[ https://issues.apache.org/jira/browse/HBASE-20649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16538429#comment-16538429 ] Balazs Meszaros commented on HBASE-20649: - Thanks for the deep testing [~busbey]. I did a rebase and some concurrency fixes.
> Validate HFiles do not have PREFIX_TREE DataBlockEncoding
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20697) Can't cache All region locations of the specify table by calling table.getRegionLocator().getAllRegionLocations()
[ https://issues.apache.org/jira/browse/HBASE-20697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16538422#comment-16538422 ] Hadoop QA commented on HBASE-20697: --- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 19s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 15s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 18s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 32s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 49s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 47s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 22s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 57s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 16s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 5m 5s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 47s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 45s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 10m 25s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 17s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 48s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 3s{color} | {color:green} hbase-client in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green}172m 40s{color} | {color:green} hbase-server in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 48s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}228m 25s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b | | JIRA Issue | HBASE-20697 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12930957/HBASE-20697.master.003.patch | | Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile | | uname | Linux c8b23ad67314 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 10:45:36 UTC 2018 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh | | git revision | master / d7561cee50 | | maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) | | Default Java | 1.8.0_171 | | findbugs |
[jira] [Commented] (HBASE-20866) HBase 1.x scan performance degradation compared to 0.98 version
[ https://issues.apache.org/jira/browse/HBASE-20866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16538417#comment-16538417 ] Ted Yu commented on HBASE-20866: Nice findings, Vikas. Looking forward to the patch. > HBase 1.x scan performance degradation compared to 0.98 version > --- > > Key: HBASE-20866 > URL: https://issues.apache.org/jira/browse/HBASE-20866 > Project: HBase > Issue Type: Bug >Affects Versions: 1.3.2 >Reporter: Vikas Vishwakarma >Priority: Major > > Internally while testing 1.3 as part of migration from 0.98 to 1.3 we > observed perf degradation in scan performance for phoenix queries varying > from few 10's to upto 200% depending on the query being executed. We tried > simple native HBase scan and there also we saw upto 40% degradation in > performance when the number of column qualifiers are high (40-50+) > To identify the root cause of performance diff between 0.98 and 1.3 we > carried out lot of experiments with profiling and git bisect iterations, > however we were not able to identify any particular source of scan > performance degradation and it looked like this is an accumulated degradation > of 5-10% over various enhancements and refactoring. > We identified few major enhancements like partialResult handling, > ScannerContext with heartbeat processing, time/size limiting, RPC > refactoring, etc that could have contributed to small degradation in > performance which put together could be leading to large overall degradation. > One of the changes is > [HBASE-11544|https://jira.apache.org/jira/browse/HBASE-11544] which > implements partialResult handling. In ClientScanner.java the results received > from server are cached on the client side by converting the result array into > an ArrayList. This function gets called in a loop depending on the number of > rows in the scan result. Example for ten’s of millions of rows scanned, this > can be called in the order of millions of times. 
> In almost all cases (99% of the time, except for handling partial results, > etc.) we are just taking resultsFromServer, converting it into an ArrayList > resultsToAddToCache in addResultsToList(..), and then iterating over the list > again and adding it to the cache in loadCache(..), as shown in the code path below: > In ClientScanner → loadCache(..) → getResultsToAddToCache(..) → > addResultsToList(..)
> {code:java}
> loadCache() {
>   ...
>   List<Result> resultsToAddToCache =
>       getResultsToAddToCache(values, callable.isHeartbeatMessage());
>   ...
>   for (Result rs : resultsToAddToCache) {
>     rs = filterLoadedCell(rs);
>     cache.add(rs);
>     ...
>   }
> }
>
> getResultsToAddToCache(..) {
>   ..
>   final boolean isBatchSet = scan != null && scan.getBatch() > 0;
>   final boolean allowPartials = scan != null &&
>       scan.getAllowPartialResults();
>   ..
>   if (allowPartials || isBatchSet) {
>     addResultsToList(resultsToAddToCache, resultsFromServer, 0,
>         (null == resultsFromServer ? 0 : resultsFromServer.length));
>     return resultsToAddToCache;
>   }
>   ...
> }
>
> private void addResultsToList(List<Result> outputList, Result[] inputArray,
>     int start, int end) {
>   if (inputArray == null || start < 0 || end > inputArray.length) return;
>   for (int i = start; i < end; i++) {
>     outputList.add(inputArray[i]);
>   }
> }{code}
>
> It looks like we can avoid the result-array-to-ArrayList conversion > (resultsFromServer --> resultsToAddToCache) for the first case, which is also > the most frequent case, and instead directly take the values array returned > by the callable and add it to the cache without converting it into an ArrayList. > I have taken both flags, allowPartials and isBatchSet, out into loadCache(), > and I directly add values to the scanner cache if the above condition passes, > instead of converting it into an ArrayList by calling > getResultsToAddToCache(). For example:
> {code:java}
> protected void loadCache() throws IOException {
>   Result[] values = null;
>   ..
>   final boolean isBatchSet = scan != null && scan.getBatch() > 0;
>   final boolean allowPartials = scan != null && scan.getAllowPartialResults();
>   ..
>   for (;;) {
>     try {
>       values = call(callable, caller, scannerTimeout);
>       ..
>     } catch (DoNotRetryIOException | NeedUnmanagedConnectionException e) {
>       ..
>     }
>     if (allowPartials || isBatchSet) { // DIRECTLY COPY values TO CACHE
>       if (values != null) {
>         for (int v = 0; v < values.length; v++) {
>           Result rs = values[v];
>           cache.add(rs);
>           ...
>         }
>       }
>     } else { // DO ALL THE REGULAR PARTIAL RESULT HANDLING ..
>       List<Result> resultsToAddToCache =
>           getResultsToAddToCache(values, callable.isHeartbeatMessage());
>       for (Result rs : resultsToAddToCache) {
>         cache.add(rs);
>         ...
>       }
>     }
> {code}
>
> I am seeing up to 10% improvement in scan time with these changes.
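To make the trade-off in the quoted change concrete, here is a small standalone sketch contrasting the two paths: copying the server's Result[] through an intermediate ArrayList versus adding it to the client cache directly. This is plain Java for illustration only, not the actual ClientScanner code; the class, the stand-in Result type, and the method names are invented.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Standalone sketch (not HBase's ClientScanner): the old path materializes an
// intermediate ArrayList and walks the results twice; the proposed path adds
// the server's array straight into the client-side cache in one pass.
public class DirectCacheSketch {
    // Stand-in for org.apache.hadoop.hbase.client.Result.
    static class Result {
        final int row;
        Result(int row) { this.row = row; }
    }

    // Old path: array -> ArrayList -> cache (two passes, one extra allocation).
    static int loadCacheViaList(Queue<Result> cache, Result[] fromServer) {
        if (fromServer == null) return 0;
        List<Result> toAdd = new ArrayList<>(fromServer.length);
        for (Result r : fromServer) toAdd.add(r);   // like addResultsToList(..)
        for (Result r : toAdd) cache.add(r);        // like the loadCache(..) loop
        return toAdd.size();
    }

    // Proposed path: array -> cache directly (one pass, no intermediate list).
    static int loadCacheDirect(Queue<Result> cache, Result[] fromServer) {
        if (fromServer == null) return 0;
        for (Result r : fromServer) cache.add(r);
        return fromServer.length;
    }

    public static void main(String[] args) {
        Result[] batch = { new Result(1), new Result(2), new Result(3) };
        Queue<Result> cache = new ArrayDeque<>();
        int n = loadCacheDirect(cache, batch);
        System.out.println(n + " " + cache.size()); // prints "3 3"
    }
}
```

Both paths produce the same cache contents; the direct path only removes the per-batch allocation and the second iteration, which is why the win shows up when loadCache() runs millions of times.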
[jira] [Updated] (HBASE-20649) Validate HFiles do not have PREFIX_TREE DataBlockEncoding
[ https://issues.apache.org/jira/browse/HBASE-20649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balazs Meszaros updated HBASE-20649: Attachment: HBASE-20649.master.005.patch > Validate HFiles do not have PREFIX_TREE DataBlockEncoding > - > > Key: HBASE-20649 > URL: https://issues.apache.org/jira/browse/HBASE-20649 > Project: HBase > Issue Type: New Feature >Reporter: Peter Somogyi >Assignee: Balazs Meszaros >Priority: Minor > Fix For: 3.0.0 > > Attachments: HBASE-20649.master.001.patch, > HBASE-20649.master.002.patch, HBASE-20649.master.003.patch, > HBASE-20649.master.004.patch, HBASE-20649.master.005.patch > > > HBASE-20592 adds a tool to check that column families on the cluster do not have > PREFIX_TREE encoding. > Since it is possible that the DataBlockEncoding was already changed but the HFiles > have not been rewritten yet, we need a tool that can verify the content of > HFiles in the cluster. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20784) Will lose the SNAPSHOT suffix if we get the version of RS from ServerManager
[ https://issues.apache.org/jira/browse/HBASE-20784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16538402#comment-16538402 ] Hudson commented on HBASE-20784: Results for branch branch-2.1 [build #45 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/45/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/45//General_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/45//JDK8_Nightly_Build_Report_(Hadoop2)/] (/) {color:green}+1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/45//JDK8_Nightly_Build_Report_(Hadoop3)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. (/) {color:green}+1 client integration test{color} > Will lose the SNAPSHOT suffix if we get the version of RS from ServerManager > > > Key: HBASE-20784 > URL: https://issues.apache.org/jira/browse/HBASE-20784 > Project: HBase > Issue Type: Bug > Components: master, UI >Reporter: Duo Zhang >Assignee: Duo Zhang >Priority: Minor > Fix For: 3.0.0, 2.1.0, 2.2.0 > > Attachments: HBASE-20784.patch > > > In HBASE-20722 we removed the usage of RegionServerTracker when getting > information for region server. And version in server manager is a int, and we > convert it to a String when displaying it on the master ui, so we will lose > the SNAPSHOT suffix. Not a big one as this is not a problem for normal > releases. Open a issue for it. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20866) HBase 1.x scan performance degradation compared to 0.98 version
[ https://issues.apache.org/jira/browse/HBASE-20866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16538386#comment-16538386 ] Reid Chan commented on HBASE-20866: --- bq. I am seeing upto 10% improvement in scan time with these changes But it's still worse compared to 0.98? > HBase 1.x scan performance degradation compared to 0.98 version > --- > > Key: HBASE-20866 > URL: https://issues.apache.org/jira/browse/HBASE-20866 > Project: HBase > Issue Type: Bug >Affects Versions: 1.3.2 >Reporter: Vikas Vishwakarma >Priority: Major
[jira] [Updated] (HBASE-20866) HBase 1.x scan performance degradation compared to 0.98 version
[ https://issues.apache.org/jira/browse/HBASE-20866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikas Vishwakarma updated HBASE-20866: -- Description: Internally, while testing 1.3 as part of the migration from 0.98 to 1.3, we observed degradation in scan performance for Phoenix queries, varying from a few tens of percent up to 200% depending on the query being executed. We tried a simple native HBase scan and there we also saw up to 40% degradation in performance when the number of column qualifiers is high (40-50+). To identify the root cause of the performance difference between 0.98 and 1.3 we carried out a lot of experiments with profiling and git-bisect iterations; however, we were not able to identify any single source of scan performance degradation, and it looks like an accumulated degradation of 5-10% across various enhancements and refactorings. We identified a few major enhancements, such as partialResult handling, ScannerContext with heartbeat processing, time/size limiting, and RPC refactoring, that could each have contributed a small degradation in performance, which put together could lead to the large overall degradation. One of the changes is [HBASE-11544|https://jira.apache.org/jira/browse/HBASE-11544], which implements partialResult handling. In ClientScanner.java the results received from the server are cached on the client side by converting the result array into an ArrayList. This function is called in a loop, depending on the number of rows in the scan result. For example, for tens of millions of rows scanned, it can be called on the order of millions of times. In almost all cases (99% of the time, except for handling partial results, etc.) we are just taking resultsFromServer, converting it into an ArrayList resultsToAddToCache in addResultsToList(..), and then iterating over the list again and adding it to the cache in loadCache(..), as shown in the code path below. In ClientScanner → loadCache(..) → getResultsToAddToCache(..) → addResultsToList(..)
{code:java}
loadCache() {
  ...
  List<Result> resultsToAddToCache =
      getResultsToAddToCache(values, callable.isHeartbeatMessage());
  ...
  for (Result rs : resultsToAddToCache) {
    rs = filterLoadedCell(rs);
    cache.add(rs);
    ...
  }
}

getResultsToAddToCache(..) {
  ..
  final boolean isBatchSet = scan != null && scan.getBatch() > 0;
  final boolean allowPartials = scan != null && scan.getAllowPartialResults();
  ..
  if (allowPartials || isBatchSet) {
    addResultsToList(resultsToAddToCache, resultsFromServer, 0,
        (null == resultsFromServer ? 0 : resultsFromServer.length));
    return resultsToAddToCache;
  }
  ...
}

private void addResultsToList(List<Result> outputList, Result[] inputArray,
    int start, int end) {
  if (inputArray == null || start < 0 || end > inputArray.length) return;
  for (int i = start; i < end; i++) {
    outputList.add(inputArray[i]);
  }
}{code}
It looks like we can avoid the result-array-to-ArrayList conversion (resultsFromServer --> resultsToAddToCache) for the first case, which is also the most frequent case, and instead directly take the values array returned by the callable and add it to the cache without converting it into an ArrayList. I have taken both flags, allowPartials and isBatchSet, out into loadCache(), and I directly add values to the scanner cache if the above condition passes, instead of converting it into an ArrayList by calling getResultsToAddToCache(). For example:
{code:java}
protected void loadCache() throws IOException {
  Result[] values = null;
  ..
  final boolean isBatchSet = scan != null && scan.getBatch() > 0;
  final boolean allowPartials = scan != null && scan.getAllowPartialResults();
  ..
  for (;;) {
    try {
      values = call(callable, caller, scannerTimeout);
      ..
    } catch (DoNotRetryIOException | NeedUnmanagedConnectionException e) {
      ..
    }
    if (allowPartials || isBatchSet) { // DIRECTLY COPY values TO CACHE
      if (values != null) {
        for (int v = 0; v < values.length; v++) {
          Result rs = values[v];
          cache.add(rs);
          ...
        }
      }
    } else { // DO ALL THE REGULAR PARTIAL RESULT HANDLING ..
      List<Result> resultsToAddToCache =
          getResultsToAddToCache(values, callable.isHeartbeatMessage());
      for (Result rs : resultsToAddToCache) {
        cache.add(rs);
        ...
}
}
{code}
I am seeing up to 10% improvement in scan time with these changes; sample PE execution results are given below.
||PE (1M, 1 thread)||with addResultsToList||without addResultsToList||%improvement||
|ScanTest|9228|8448|9|
|RandomScanWithRange10Test|393413|378222|4|
|RandomScanWithRange100Test|1041860|980147|6|
Similarly, we are observing up to 10% improvement in a simple native HBase scan test used internally that just scans through a large region, filtering all the rows. I still have to do the Phoenix query tests with this change. Posting the initial observations for feedback/comments and suggestions.
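As a quick sanity check on the table above, the %improvement column is consistent with computing (with − without) / without × 100 and rounding; this formula is inferred from the numbers, not stated in the issue, and the helper below is an invented illustration, not part of PE.

```java
// Hypothetical helper (not from HBase or PerformanceEvaluation): percentage
// improvement of the new path, computed relative to the new ("without") time.
public class PeImprovement {
    static long improvementPct(long withList, long withoutList) {
        return Math.round(100.0 * (withList - withoutList) / withoutList);
    }

    public static void main(String[] args) {
        // Times taken from the PE table above (with/without addResultsToList).
        System.out.println(improvementPct(9228, 8448));      // ScanTest
        System.out.println(improvementPct(393413, 378222));  // RandomScanWithRange10Test
        System.out.println(improvementPct(1041860, 980147)); // RandomScanWithRange100Test
    }
}
```

Reproducing 9, 4, and 6 for the three rows confirms the reported percentages are internally consistent.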
[jira] [Updated] (HBASE-20864) RS was killed due to master thought the region should be on a already dead server
[ https://issues.apache.org/jira/browse/HBASE-20864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Duo Zhang updated HBASE-20864: -- Issue Type: Sub-task (was: Bug) Parent: HBASE-20828 > RS was killed due to master thought the region should be on a already dead > server > - > > Key: HBASE-20864 > URL: https://issues.apache.org/jira/browse/HBASE-20864 > Project: HBase > Issue Type: Sub-task >Affects Versions: 2.0.0 >Reporter: Allan Yang >Priority: Major > Attachments: log.zip > > > When I was running ITBLL with our internal 2.0.0 version (with 2.0.1 > backported and two other issues: HBASE-20706, HBASE-20752), I found two > of my RSes killed by the master, since the master had a different region state from > those RSes. It is very strange that the master thought these regions should be on an > already dead server. There might be a serious bug, but I haven't found it > yet. Here is the process: > 1. e010125048153.bja,60020,1531137365840 crashed, and clearly > 4423e4182457c5b573729be4682cc3a3 was assigned to > e010125049164.bja,60020,1531136465378 during the ServerCrashProcedure
> {code:java}
> 2018-07-09 20:03:32,443 INFO [PEWorker-10] procedure.ServerCrashProcedure:
> Start pid=2303, state=RUNNABLE:SERVER_CRASH_START; ServerCrashProcedure
> server=e010125048153.bja,60020,1531137365840, splitWal=true, meta=false
> 2018-07-09 20:03:39,220 DEBUG
> [RpcServer.default.FPBQ.Fifo.handler=294,queue=24,port=6]
> assignment.RegionTransitionProcedure: Received report OPENED seqId=16021,
> pid=2305, ppid=2303, state=RUNNABLE:REGION_TRANSITION_DISPATCH;
> AssignProcedure table=IntegrationTestBigLinkedList,
> region=4423e4182457c5b573729be4682cc3a3; rit=OPENING,
> location=e010125049164.bja,60020,1531136465378
> 2018-07-09 20:03:39,220 INFO [PEWorker-13] assignment.RegionStateStore:
> pid=2305 updating hbase:meta row=4423e4182457c5b573729be4682cc3a3,
> regionState=OPEN, openSeqNum=16021,
> regionLocation=e010125049164.bja,60020,1531136465378
> 2018-07-09 20:03:43,190 INFO
> [PEWorker-12] procedure2.ProcedureExecutor:
> Finished pid=2303, state=SUCCESS; ServerCrashProcedure
> server=e010125048153.bja,60020,1531137365840, splitWal=true, meta=false in
> 10.7490sec
> {code}
> 2. A modify table happened later, and 4423e4182457c5b573729be4682cc3a3 was > reopened on e010125049164.bja,60020,1531136465378
> {code:java}
> 2018-07-09 20:04:39,929 DEBUG
> [RpcServer.default.FPBQ.Fifo.handler=295,queue=25,port=6]
> assignment.RegionTransitionProcedure: Received report OPENED seqId=16024,
> pid=2351, ppid=2314, state=RUNNABLE:REGION_TRANSITION_DISPATCH;
> AssignProcedure table=IntegrationTestBigLinkedList,
> region=4423e4182457c5b573729be4682cc3a3,
> target=e010125049164.bja,60020,1531136465378; rit=OPENING,
> location=e010125049164.bja,60020,1531136465378
> 2018-07-09 20:04:40,554 INFO [PEWorker-6] assignment.RegionStateStore:
> pid=2351 updating hbase:meta row=4423e4182457c5b573729be4682cc3a3,
> regionState=OPEN, openSeqNum=16024,
> regionLocation=e010125049164.bja,60020,1531136465378
> {code}
> 3. The active master was killed and the backup master took over, but when loading the > meta entry, it clearly showed 4423e4182457c5b573729be4682cc3a3 on the > previous dead server e010125048153.bja,60020,1531137365840. That is very very > strange!!!
> {code:java}
> 2018-07-09 20:06:17,985 INFO [master/e010125048016:6]
> assignment.RegionStateStore: Load hbase:meta entry
> region=4423e4182457c5b573729be4682cc3a3, regionState=OPEN,
> lastHost=e010125049164.bja,60020,1531136465378,
> regionLocation=e010125048153.bja,60020,1531137365840, openSeqNum=16024
> {code}
> 4. The RS was killed
> {code:java}
> 2018-07-09 20:06:20,265 WARN
> [RpcServer.default.FPBQ.Fifo.handler=297,queue=27,port=6]
> assignment.AssignmentManager: Killing e010125049164.bja,60020,1531136465378:
> rit=OPEN, location=e010125048153.bja,60020,1531137365840,
> table=IntegrationTestBigLinkedList,
> region=4423e4182457c5b573729be4682cc3a3reported OPEN on
> server=e010125049164.bja,60020,1531136465378 but state has otherwise.
> {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
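The kill in step 4 boils down to a location consistency check: a region server reports a region OPEN on itself, but the master's (stale) in-memory state records a different location. A minimal standalone sketch of that check follows; it is illustrative only, with invented names, and is not HBase's actual AssignmentManager code.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of the consistency check described in step 4 above
// (invented names, not the real AssignmentManager implementation).
public class RegionLocationCheck {
    // Master-side view: encoded region name -> server recorded in hbase:meta.
    private final Map<String, String> regionLocation = new HashMap<>();

    void loadMetaEntry(String region, String server) {
        regionLocation.put(region, server);
    }

    // An RS reports a region OPEN on itself. If the master's recorded
    // location disagrees, the master treats the RS as inconsistent and
    // expires (kills) it.
    boolean shouldKillReporter(String region, String reportingServer) {
        String recorded = regionLocation.get(region);
        return recorded != null && !recorded.equals(reportingServer);
    }
}
```

In the logs above, the backup master loaded a stale meta entry pointing at the already-dead e010125048153.bja, so the report from the healthy e010125049164.bja failed this check and that RS was killed.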
[jira] [Updated] (HBASE-20649) Validate HFiles do not have PREFIX_TREE DataBlockEncoding
[ https://issues.apache.org/jira/browse/HBASE-20649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Balazs Meszaros updated HBASE-20649: Attachment: (was: HBASE-20649.master.005.patch) > Validate HFiles do not have PREFIX_TREE DataBlockEncoding > - > > Key: HBASE-20649 > URL: https://issues.apache.org/jira/browse/HBASE-20649 > Project: HBase > Issue Type: New Feature >Reporter: Peter Somogyi >Assignee: Balazs Meszaros >Priority: Minor > Fix For: 3.0.0 > > Attachments: HBASE-20649.master.001.patch, > HBASE-20649.master.002.patch, HBASE-20649.master.003.patch, > HBASE-20649.master.004.patch -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20847) The parent procedure of RegionTransitionProcedure may not have the table lock
[ https://issues.apache.org/jira/browse/HBASE-20847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16538355#comment-16538355 ] Duo Zhang commented on HBASE-20847: --- Any other concerns? [~stack] And [~allan163], the patch here is similar to yours in HBASE-20846, where we always need to acquire shared lock, no matter whether we have a parent or not. Could you please also take a simple look? Thanks. > The parent procedure of RegionTransitionProcedure may not have the table lock > - > > Key: HBASE-20847 > URL: https://issues.apache.org/jira/browse/HBASE-20847 > Project: HBase > Issue Type: Sub-task > Components: proc-v2, Region Assignment >Reporter: Duo Zhang >Assignee: Duo Zhang >Priority: Major > Attachments: HBASE-20847-v1.patch, HBASE-20847-v2.patch, > HBASE-20847.patch > > > For example, SCP can also schedule AssignProcedure and obviously it will not > hold the table lock. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
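The locking point in the comment above can be illustrated with plain java.util.concurrent primitives; this is an analogy with invented names, not HBase's procedure-framework lock code. Every region-transition step takes the table lock in shared mode, whether or not a parent procedure scheduled it, while table DDL needs the exclusive mode.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Analogy only (not HBase's actual procedure lock code): a table-level
// read/write lock. Region transitions take the shared (read) side -- always,
// even when scheduled by a parent such as a ServerCrashProcedure, which does
// not itself hold the table lock. DDL takes the exclusive (write) side.
public class TableLockSketch {
    private final ReentrantReadWriteLock tableLock = new ReentrantReadWriteLock();

    // hasParent is deliberately ignored: the shared lock is acquired
    // unconditionally, which is the point of the fix being discussed.
    boolean tryAcquireForRegionTransition(boolean hasParent) {
        return tableLock.readLock().tryLock();
    }

    boolean tryAcquireForDdl() {
        return tableLock.writeLock().tryLock();
    }

    void releaseRegionTransition() {
        tableLock.readLock().unlock();
    }
}
```

The shared mode lets many assigns on one table proceed concurrently while blocking a concurrent DDL (such as a modify table) until they finish; skipping the shared acquisition when a parent exists is only safe if the parent actually holds the table lock, which an SCP does not.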