[jira] [Commented] (HBASE-23349) Expose max refCount among all compacted store files of a region
[ https://issues.apache.org/jira/browse/HBASE-23349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16989474#comment-16989474 ]

Viraj Jasani commented on HBASE-23349:
--------------------------------------

It seems that a metric based on max refCount, at least, is not going to help here. Let me repurpose this Jira; the patches attached so far (up to 002) can be ignored.

The major issue is clients holding the read lock on compacted files for a long time. If we really want to tackle this gracefully, we might have to remove the file reader from open scanners and let the open scanners use the new store files, which seems tricky though.

On the other hand, once the reader lock is released, we could check whether archival of the compacted files was blocked on it and, if so, force the discharger thread to run then and there, before someone else takes the read lock. This might also be tricky, and I am not sure it would be performant.

[~apurtell] [~anoop.hbase]

> Expose max refCount among all compacted store files of a region
> ---------------------------------------------------------------
>
> Key: HBASE-23349
> URL: https://issues.apache.org/jira/browse/HBASE-23349
> Project: HBase
> Issue Type: Improvement
> Affects Versions: 3.0.0, 2.3.0, 1.6.0
> Reporter: Viraj Jasani
> Assignee: Viraj Jasani
> Priority: Major
> Fix For: 3.0.0, 2.3.0, 1.6.0
>
> Attachments: HBASE-23349.master.000.patch,
> HBASE-23349.master.001.patch, HBASE-23349.master.002.patch
>
>
> We should expose a region level metric that represents max refCount among
> refCounts of all compacted store files under the region. For successful
> archival of compacted store files, it is important for this metric count to
> be 0 eventually if not immediately. If it is >0 for a considerably high
> amount of time, it indicates some issue i.e. reader refCount leak on some
> compacted store files and in such case, archival would not be successful.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
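The second idea in the comment above (run the discharger as soon as the last reader lets go of a file whose archival was blocked) could look roughly like the sketch below. Everything in it is an assumption made for illustration: `CompactedFileHandle`, `tryArchive`, and the `dischargeNow` callback are hypothetical names, not HBase classes, and the sketch ignores the race with a reader that re-acquires the file between the release and the discharge.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a compacted-away store file; not an HBase class. */
final class CompactedFileHandle {
  private final AtomicInteger refCount = new AtomicInteger();
  private final AtomicBoolean archivalBlocked = new AtomicBoolean();
  private final Runnable dischargeNow; // assumed hook that runs the discharger immediately

  CompactedFileHandle(Runnable dischargeNow) {
    this.dischargeNow = dischargeNow;
  }

  /** A scanner starts reading the file. */
  void acquire() {
    refCount.incrementAndGet();
  }

  /** Called by the periodic discharger; returns true if the file may be archived now. */
  boolean tryArchive() {
    if (refCount.get() == 0) {
      return true;
    }
    archivalBlocked.set(true); // remember that a reader blocked this attempt
    return false;
  }

  /** A scanner finishes; the last reader out triggers a discharge if one was blocked. */
  void release() {
    int remaining = refCount.decrementAndGet();
    if (remaining == 0 && archivalBlocked.compareAndSet(true, false)) {
      dischargeNow.run();
    }
  }
}
```

The `compareAndSet` keeps the forced run to at most one per blocked attempt, so a file that was never blocked does not trigger extra discharger work on release.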
[jira] [Commented] (HBASE-23349) Expose max refCount among all compacted store files of a region
[ https://issues.apache.org/jira/browse/HBASE-23349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987603#comment-16987603 ]

Anoop Sam John commented on HBASE-23349:
----------------------------------------

bq. Agree, because of this, our auto-reopen of region is not happening based on max refCount among compacted store files.

I fear a decision system based on ref count won't work at all! What value would you give for this config? When a compaction has happened and the file becomes eligible for archival, there can be one or more (any number of) scanners over this file, so the ref count would be 1 or more. Later, it is possible that all but one of these scanners are properly closed, so the ref count becomes 1. Any reopen decision based on a value larger than 1 won't help us then!

The point is that once the compaction is over, these files are moved out of this Store's HFiles. They just sit there. Newer scanners won't even touch them, and so cannot increase the ref counts. So leaks won't make this ref count grow large over time. That said, any decision based on ref count won't work!

My thinking is that we should make a time-based decision. We have the Discharger thread running at intervals, which archives the files with 0 ref count. Even if the ref count is >0, if the file was compacted away a long (configurable) time ago, we should force archive it. We should NOT do a region reopen to solve this at all. In fact, there is no need for that. A region reopen will cause many other issues, like invalidation of its data from the block cache!

In the past, before this Discharger thread way of archiving files, we used to archive compacted-away files immediately after a compaction op. The current scanners used to reopen on the new files. So it is possible that even if some valid old scans are still running over the compacted-away files, they can still continue. So we would need to bring back the old way (not completely though).

> Expose max refCount among all compacted store files of a region
> ---------------------------------------------------------------
>
> Key: HBASE-23349
> URL: https://issues.apache.org/jira/browse/HBASE-23349
> Project: HBase
> Issue Type: Improvement
> Affects Versions: 3.0.0, 2.3.0, 1.6.0
> Reporter: Viraj Jasani
> Assignee: Viraj Jasani
> Priority: Major
> Fix For: 3.0.0, 2.3.0, 1.6.0
>
> Attachments: HBASE-23349.master.000.patch,
> HBASE-23349.master.001.patch, HBASE-23349.master.002.patch
>
>
> We should expose a region level metric that represents max refCount among
> refCounts of all compacted store files under the region. For successful
> archival of compacted store files, it is important for this metric count to
> be 0 eventually if not immediately. If it is >0 for a considerably high
> amount of time, it indicates some issue i.e. reader refCount leak on some
> compacted store files and in such case, archival would not be successful.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
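The time-based rule proposed in this comment can be sketched as below. This is only an illustration under stated assumptions: `ArchivalPolicy`, `shouldArchive`, and the `maxAge` threshold are hypothetical names, not existing HBase classes or configuration keys, and the sketch says nothing about what happens to scanners still reading a force-archived file.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical policy object; not part of HBase. */
final class ArchivalPolicy {
  private final Duration maxAge; // assumed configurable "compacted away long ago" threshold

  ArchivalPolicy(Duration maxAge) {
    this.maxAge = maxAge;
  }

  /**
   * Decides whether a compacted-away file should be archived on this discharger run.
   * A file with no readers is always archived; a file that still has readers is
   * force-archived once it has been compacted away for at least maxAge.
   */
  boolean shouldArchive(int refCount, Instant compactedAt, Instant now) {
    if (refCount == 0) {
      return true;
    }
    Duration age = Duration.between(compactedAt, now);
    return age.compareTo(maxAge) >= 0;
  }
}
```

With a one-hour threshold, a file with one lingering reader is skipped one minute after compaction and archived regardless of its ref count an hour after compaction.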
[jira] [Commented] (HBASE-23349) Expose max refCount among all compacted store files of a region
[ https://issues.apache.org/jira/browse/HBASE-23349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16986530#comment-16986530 ]

HBase QA commented on HBASE-23349:
----------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 2m 21s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | dupname | 0m 1s | No case conflicting files found. |
| 0 | prototool | 0m 0s | prototool was not available. |
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 3 new or modified test files. |
|| || || || master Compile Tests ||
| 0 | mvndep | 0m 47s | Maven dependency ordering for branch |
| +1 | mvninstall | 8m 12s | master passed |
| +1 | compile | 4m 4s | master passed |
| +1 | checkstyle | 3m 21s | master passed |
| 0 | refguide | 7m 29s | branch has no errors when building the reference guide. See footer for rendered docs, which you should manually inspect. |
| +1 | shadedjars | 6m 10s | branch has no errors when building our shaded downstream artifacts. |
| +1 | javadoc | 5m 56s | master passed |
| 0 | spotbugs | 17m 56s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 30m 55s | master passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 14s | Maven dependency ordering for patch |
| +1 | mvninstall | 7m 0s | the patch passed |
| +1 | compile | 4m 5s | the patch passed |
| +1 | cc | 4m 5s | the patch passed |
| +1 | javac | 4m 5s | the patch passed |
| +1 | checkstyle | 3m 22s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | xml | 0m 2s | The patch has no ill-formed XML file. |
| 0 | refguide | 7m 50s | patch has no errors when building the reference guide. See footer for rendered docs, which you should manually inspect. |
| +1 | shadedjars | 6m 11s | patch has no errors when building our shaded downstream artifacts. |
| +1 | hadoopcheck | 23m 28s | Patch does not cause any errors with Hadoop 2.8.5 2.9.2 or 3.1.2. |
| +1 | hbaseprotoc | 11m 47s | the patch passed |
| +1 | javadoc | 6m 34s | the patch passed |
| +1 | findbugs | 31m 14s | the patch passed |
|| || || || Other Tests ||
| -1 |
[jira] [Commented] (HBASE-23349) Expose max refCount among all compacted store files of a region
[ https://issues.apache.org/jira/browse/HBASE-23349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16986363#comment-16986363 ]

HBase QA commented on HBASE-23349:
----------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 1m 21s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | dupname | 0m 1s | No case conflicting files found. |
| 0 | prototool | 0m 0s | prototool was not available. |
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 3 new or modified test files. |
|| || || || master Compile Tests ||
| 0 | mvndep | 0m 26s | Maven dependency ordering for branch |
| +1 | mvninstall | 5m 32s | master passed |
| +1 | compile | 2m 46s | master passed |
| +1 | checkstyle | 2m 44s | master passed |
| +1 | shadedjars | 4m 59s | branch has no errors when building our shaded downstream artifacts. |
| +1 | javadoc | 1m 37s | master passed |
| 0 | spotbugs | 4m 29s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 10m 17s | master passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 12s | Maven dependency ordering for patch |
| +1 | mvninstall | 5m 32s | the patch passed |
| +1 | compile | 2m 47s | the patch passed |
| +1 | cc | 2m 47s | the patch passed |
| +1 | javac | 2m 47s | the patch passed |
| +1 | checkstyle | 2m 44s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedjars | 5m 2s | patch has no errors when building our shaded downstream artifacts. |
| +1 | hadoopcheck | 17m 13s | Patch does not cause any errors with Hadoop 2.8.5 2.9.2 or 3.1.2. |
| +1 | hbaseprotoc | 2m 46s | the patch passed |
| +1 | javadoc | 1m 39s | the patch passed |
| +1 | findbugs | 11m 8s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 0m 41s | hbase-protocol-shaded in the patch passed. |
| +1 | unit | 0m 28s | hbase-hadoop-compat in the patch passed. |
| +1 | unit | 0m 35s | hbase-hadoop2-compat in the patch passed. |
| +1 | unit | 0m 24s | hbase-protocol in the patch passed. |
| +1 |
[jira] [Commented] (HBASE-23349) Expose max refCount among all compacted store files of a region
[ https://issues.apache.org/jira/browse/HBASE-23349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16986034#comment-16986034 ]

Viraj Jasani commented on HBASE-23349:
--------------------------------------

{quote}Because the metric reports the max ref count across all the store files in the Region. This list will not include the compacted away files.
{quote}
Agreed. Because of this, our auto-reopen of the region does not happen based on the max refCount among compacted store files.

Sure, let me update the patch and include this change as part of this Jira. I have also linked the previous issues.

FYI [~apurtell]

Thanks

> Expose max refCount among all compacted store files of a region
> ---------------------------------------------------------------
>
> Key: HBASE-23349
> URL: https://issues.apache.org/jira/browse/HBASE-23349
> Project: HBase
> Issue Type: Task
> Affects Versions: 3.0.0, 2.3.0, 1.6.0
> Reporter: Viraj Jasani
> Assignee: Viraj Jasani
> Priority: Major
> Fix For: 3.0.0, 2.3.0, 1.6.0
>
> Attachments: HBASE-23349.master.000.patch
>
>
> We should expose a region level metric that represents max refCount among
> refCounts of all compacted store files under the region. For successful
> archival of compacted store files, it is important for this metric count to
> be 0 eventually if not immediately. If it is >0 for a considerably high
> amount of time, it indicates some issue i.e. reader refCount leak on some
> compacted store files and in such case, archival would not be successful.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Commented] (HBASE-23349) Expose max refCount among all compacted store files of a region
[ https://issues.apache.org/jira/browse/HBASE-23349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16985819#comment-16985819 ]

Anoop Sam John commented on HBASE-23349:
----------------------------------------

Actually, the feature that reopens a region based on the metric (to allow the compacted files to get archived) is not working as expected now. That is because the metric reports the max ref count across all the store files in the Region, and this list does not include the compacted-away files. So any decision based on that metric does not make much sense.

This new metric instead tells us the max ref count across all the compacted-away files. Making the decision and doing the reopen based on this makes sense. Can you include this change as part of this Jira as well? It should be a smaller change, I believe. Please also link the older issues here (I think there are 3 or so).

> Expose max refCount among all compacted store files of a region
> ---------------------------------------------------------------
>
> Key: HBASE-23349
> URL: https://issues.apache.org/jira/browse/HBASE-23349
> Project: HBase
> Issue Type: Task
> Affects Versions: 3.0.0, 2.3.0, 1.6.0
> Reporter: Viraj Jasani
> Assignee: Viraj Jasani
> Priority: Major
> Fix For: 3.0.0, 2.3.0, 1.6.0
>
> Attachments: HBASE-23349.master.000.patch
>
>
> We should expose a region level metric that represents max refCount among
> refCounts of all compacted store files under the region. For successful
> archival of compacted store files, it is important for this metric count to
> be 0 eventually if not immediately. If it is >0 for a considerably high
> amount of time, it indicates some issue i.e. reader refCount leak on some
> compacted store files and in such case, archival would not be successful.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
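The difference between the two metrics discussed in this comment can be shown with a small sketch. It assumes a simplified snapshot of a region's files; `FileRefSnapshot` and `RefCountMetrics` are hypothetical names used only for illustration and do not correspond to the classes touched by the attached patches.

```java
import java.util.List;

/** Hypothetical snapshot of one store file's reader state; not an HBase class. */
final class FileRefSnapshot {
  final String name;
  final int refCount;
  final boolean compactedAway;

  FileRefSnapshot(String name, int refCount, boolean compactedAway) {
    this.name = name;
    this.refCount = refCount;
    this.compactedAway = compactedAway;
  }
}

final class RefCountMetrics {
  private RefCountMetrics() {
  }

  /** Max refCount over the live store files only (what the existing metric is described as reporting). */
  static int maxLiveRefCount(List<FileRefSnapshot> files) {
    int max = 0;
    for (FileRefSnapshot f : files) {
      if (!f.compactedAway) {
        max = Math.max(max, f.refCount);
      }
    }
    return max;
  }

  /** Max refCount over the compacted-away files only (the metric proposed in this issue). */
  static int maxCompactedRefCount(List<FileRefSnapshot> files) {
    int max = 0;
    for (FileRefSnapshot f : files) {
      if (f.compactedAway) {
        max = Math.max(max, f.refCount);
      }
    }
    return max;
  }
}
```

A leaked reader on a compacted-away file shows up only in `maxCompactedRefCount`, which is why a reopen decision keyed on the live-file metric never fires for it.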
[jira] [Commented] (HBASE-23349) Expose max refCount among all compacted store files of a region
[ https://issues.apache.org/jira/browse/HBASE-23349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16985461#comment-16985461 ]

HBase QA commented on HBASE-23349:
----------------------------------

| (/) *+1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 1m 32s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | dupname | 0m 0s | No case conflicting files found. |
| 0 | prototool | 0m 0s | prototool was not available. |
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 3 new or modified test files. |
|| || || || master Compile Tests ||
| 0 | mvndep | 0m 31s | Maven dependency ordering for branch |
| +1 | mvninstall | 5m 5s | master passed |
| +1 | compile | 2m 48s | master passed |
| +1 | checkstyle | 2m 39s | master passed |
| +1 | shadedjars | 4m 35s | branch has no errors when building our shaded downstream artifacts. |
| +1 | javadoc | 1m 47s | master passed |
| 0 | spotbugs | 4m 14s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 10m 6s | master passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 14s | Maven dependency ordering for patch |
| +1 | mvninstall | 4m 59s | the patch passed |
| +1 | compile | 2m 58s | the patch passed |
| +1 | cc | 2m 58s | the patch passed |
| +1 | javac | 2m 58s | the patch passed |
| +1 | checkstyle | 2m 37s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedjars | 4m 52s | patch has no errors when building our shaded downstream artifacts. |
| +1 | hadoopcheck | 16m 16s | Patch does not cause any errors with Hadoop 2.8.5 2.9.2 or 3.1.2. |
| +1 | hbaseprotoc | 2m 52s | the patch passed |
| +1 | javadoc | 1m 47s | the patch passed |
| +1 | findbugs | 10m 26s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 0m 39s | hbase-protocol-shaded in the patch passed. |
| +1 | unit | 0m 31s | hbase-hadoop-compat in the patch passed. |
| +1 | unit | 0m 38s | hbase-hadoop2-compat in the patch passed. |
| +1 | unit | 0m 25s | hbase-protocol in the patch passed. |
| +1 |
[jira] [Commented] (HBASE-23349) Expose max refCount among all compacted store files of a region
[ https://issues.apache.org/jira/browse/HBASE-23349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984754#comment-16984754 ]

Viraj Jasani commented on HBASE-23349:
--------------------------------------

For now, I just want to expose the metric and am not thinking of reopening the region, since very low counts (e.g. 1 or 2) might not justify a region reopen. We can probably build a time-duration-based check or a better recovery mechanism on top of this, but it is still better to expose the refCount per region. At the least, manual action can be taken based on this metric at the client, coprocessor, or server side before it is too late.

> Expose max refCount among all compacted store files of a region
> ---------------------------------------------------------------
>
> Key: HBASE-23349
> URL: https://issues.apache.org/jira/browse/HBASE-23349
> Project: HBase
> Issue Type: Task
> Affects Versions: 3.0.0, 2.3.0, 1.6.0
> Reporter: Viraj Jasani
> Assignee: Viraj Jasani
> Priority: Major
>
> We should expose a region level metric that represents max refCount among
> refCounts of all compacted store files under the region. For successful
> archival of compacted store files, it is important for this metric count to
> be 0 eventually if not immediately. If it is >0 for a considerably high
> amount of time, it indicates some issue i.e. reader refCount leak on some
> compacted store files and in such case, archival would not be successful.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
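A time-duration-based check on top of the exposed metric, as mentioned in this comment, might be sketched as follows. `NonZeroDurationTracker` and its `record` method are hypothetical names introduced for this illustration, and the caller is assumed to supply timestamps in milliseconds from a single clock.

```java
/** Hypothetical helper that tracks how long a metric has stayed above zero; not an HBase class. */
final class NonZeroDurationTracker {
  private long nonZeroSinceMillis = -1; // -1 means the metric is currently 0

  /**
   * Records one sample of the max refCount among compacted store files.
   * Returns how many milliseconds the metric has been continuously greater than 0,
   * or 0 if the current sample is 0 or is the first non-zero sample.
   */
  long record(int maxCompactedRefCount, long nowMillis) {
    if (maxCompactedRefCount <= 0) {
      nonZeroSinceMillis = -1;
      return 0;
    }
    if (nonZeroSinceMillis < 0) {
      nonZeroSinceMillis = nowMillis;
    }
    return nowMillis - nonZeroSinceMillis;
  }
}
```

An operator-side monitor could compare the returned duration against a threshold of its own choosing and alert on a suspected reader leak, without the server having to decide on a region reopen.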
[jira] [Commented] (HBASE-23349) Expose max refCount among all compacted store files of a region
[ https://issues.apache.org/jira/browse/HBASE-23349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16984740#comment-16984740 ]

Anoop Sam John commented on HBASE-23349:
----------------------------------------

So do you just want the metric to be exposed, or do you also want to reopen the region based on some condition?

> Expose max refCount among all compacted store files of a region
> ---------------------------------------------------------------
>
> Key: HBASE-23349
> URL: https://issues.apache.org/jira/browse/HBASE-23349
> Project: HBase
> Issue Type: Task
> Affects Versions: 3.0.0, 2.3.0, 1.6.0
> Reporter: Viraj Jasani
> Assignee: Viraj Jasani
> Priority: Major
>
> We should expose a region level metric that represents max refCount among
> refCounts of all compacted store files under the region. For successful
> archival of compacted store files, it is important for this metric count to
> be 0 eventually if not immediately. If it is >0 for a considerably high
> amount of time, it indicates some issue i.e. reader refCount leak on some
> compacted store files and in such case, archival would not be successful.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)