[jira] [Commented] (HBASE-11288) Splittable Meta
[ https://issues.apache.org/jira/browse/HBASE-11288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16937468#comment-16937468 ]

Francis Liu commented on HBASE-11288:
---

Let me write one up tomorrow, [~stack] and [~ram_krish]. Apologies: I decided it was better to push things out in pieces than to take ages getting everything out in one go.

> Splittable Meta
> ---
>
> Key: HBASE-11288
> URL: https://issues.apache.org/jira/browse/HBASE-11288
> Project: HBase
> Issue Type: Sub-task
> Reporter: Francis Liu
> Assignee: Francis Liu
> Priority: Major

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [hbase] Apache-HBase commented on issue #655: HBASE-23055 Alter hbase:meta
Apache-HBase commented on issue #655: HBASE-23055 Alter hbase:meta
URL: https://github.com/apache/hbase/pull/655#issuecomment-534886824

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|--------:|:--------|
| :blue_heart: | reexec | 1m 20s | Docker mode activated. |
||| _ Prechecks _ |
| :green_heart: | dupname | 0m 0s | No case conflicting files found. |
| :green_heart: | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. |
| :green_heart: | test4tests | 0m 0s | The patch appears to include 8 new or modified test files. |
||| _ branch-2 Compile Tests _ |
| :blue_heart: | mvndep | 0m 14s | Maven dependency ordering for branch |
| :green_heart: | mvninstall | 6m 4s | branch-2 passed |
| :green_heart: | compile | 2m 16s | branch-2 passed |
| :green_heart: | checkstyle | 3m 8s | branch-2 passed |
| :green_heart: | shadedjars | 4m 52s | branch has no errors when building our shaded downstream artifacts. |
| :green_heart: | javadoc | 1m 45s | branch-2 passed |
| :blue_heart: | spotbugs | 0m 40s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| :green_heart: | findbugs | 7m 14s | branch-2 passed |
||| _ Patch Compile Tests _ |
| :blue_heart: | mvndep | 0m 16s | Maven dependency ordering for patch |
| :green_heart: | mvninstall | 5m 36s | the patch passed |
| :green_heart: | compile | 2m 19s | the patch passed |
| :green_heart: | javac | 2m 19s | the patch passed |
| :green_heart: | checkstyle | 0m 28s | The patch passed checkstyle in hbase-common |
| :green_heart: | checkstyle | 0m 44s | hbase-client: The patch generated 0 new + 168 unchanged - 5 fixed = 168 total (was 173) |
| :green_heart: | checkstyle | 0m 14s | The patch passed checkstyle in hbase-zookeeper |
| :green_heart: | checkstyle | 1m 39s | hbase-server: The patch generated 0 new + 394 unchanged - 16 fixed = 394 total (was 410) |
| :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. |
| :green_heart: | shadedjars | 4m 51s | patch has no errors when building our shaded downstream artifacts. |
| :green_heart: | hadoopcheck | 17m 45s | Patch does not cause any errors with Hadoop 2.8.5 2.9.2 or 3.1.2. |
| :green_heart: | javadoc | 1m 38s | the patch passed |
| :green_heart: | findbugs | 7m 5s | the patch passed |
||| _ Other Tests _ |
| :green_heart: | unit | 2m 53s | hbase-common in the patch passed. |
| :green_heart: | unit | 3m 31s | hbase-client in the patch passed. |
| :green_heart: | unit | 0m 52s | hbase-zookeeper in the patch passed. |
| :broken_heart: | unit | 275m 29s | hbase-server in the patch failed. |
| :green_heart: | asflicense | 1m 40s | The patch does not generate ASF License warnings. |
| | | 361m 46s | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.hbase.util.TestFSTableDescriptors |
| | hadoop.hbase.client.TestMetaWithReplicas |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | Client=18.09.7 Server=18.09.7 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-655/5/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hbase/pull/655 |
| Optional Tests | dupname asflicense javac javadoc unit spotbugs findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 5c4cbc6a105e 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/HBase-PreCommit-GitHub-PR_PR-655/out/precommit/personality/provided.sh |
| git revision | branch-2 / c5a5bf7f48 |
| Default Java | 1.8.0_181 |
| unit | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-655/5/artifact/out/patch-unit-hbase-server.txt |
| Test Results | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-655/5/testReport/ |
| Max. process+thread count | 4749 (vs. ulimit of 1) |
| modules | C: hbase-common hbase-client hbase-zookeeper hbase-server U: . |
| Console output | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-655/5/console |
| versions | git=2.11.0 maven=2018-06-17T18:33:14Z) findbugs=3.1.11 |
| Powered by | Apache Yetus 0.11.0 https://yetus.apache.org |

This message was automatically generated.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
[jira] [Commented] (HBASE-23038) Provide consistent and clear logging about disabling chores
[ https://issues.apache.org/jira/browse/HBASE-23038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16937470#comment-16937470 ]

Viraj Jasani commented on HBASE-23038:
---

nit: we can pass the variable parameters to the logger with {} placeholders:
{code:java}
LOG.info("The period is {} seconds, {} is disabled", chore.getPeriod(), chore.getName());
{code}

> Provide consistent and clear logging about disabling chores
> ---
>
> Key: HBASE-23038
> URL: https://issues.apache.org/jira/browse/HBASE-23038
> Project: HBase
> Issue Type: Improvement
> Components: master, regionserver
> Reporter: Sean Busbey
> Assignee: Sanjeet Nishad
> Priority: Minor
> Labels: beginner
> Attachments: HBASE-23038.001.patch
>
> Right now, if you want to disable any of our chores, you can set the period to be <= 0. Sometimes, if you do this, you get a nice message:
> {code}
> 2019-09-16 22:10:16,756 INFO [master-1:16000.activeMasterManager] master.HMaster: The period is 0 seconds, MobCompactionChore is disabled
> {code}
> And sometimes you get an opaque message:
> {code}
> 2019-09-16 22:09:45,333 INFO [master-1:16000.activeMasterManager] hbase.ChoreService: Could not successfully schedule chore: LogsCleaner
> 2019-09-16 22:09:45,340 INFO [master-1:16000.activeMasterManager] hbase.ChoreService: Could not successfully schedule chore: HFileCleaner
> {code}
> This is because sometimes we just blindly submit to ChoreService, which submits to a Java ScheduledExecutorService and then catches the IllegalArgumentException.
> We should remove the one-offs and make it so ChoreService checks the period before accepting a submittal and produces a consistent "Foo is disabled" message.
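The behaviour the issue asks for (check the period before submitting, and log one consistent message) can be sketched roughly as follows. This is a minimal illustration built only on `java.util.concurrent`, not the actual ChoreService code: the class `ChoreSchedulingSketch` and its methods `schedule` and `disabledMessage` are hypothetical names, and `System.out` stands in for the logger.

```java
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; not the HBase ChoreService implementation.
class ChoreSchedulingSketch {

  // Builds the consistent "Foo is disabled" message quoted in the issue.
  static String disabledMessage(String choreName, long periodSeconds) {
    return String.format("The period is %d seconds, %s is disabled", periodSeconds, choreName);
  }

  // Checks the period up front instead of letting scheduleAtFixedRate
  // throw IllegalArgumentException for a non-positive period.
  // Returns true if the chore was scheduled, false if it was treated as disabled.
  static boolean schedule(ScheduledExecutorService pool, String choreName,
      long periodSeconds, Runnable work) {
    if (periodSeconds <= 0) {
      System.out.println(disabledMessage(choreName, periodSeconds));
      return false;
    }
    pool.scheduleAtFixedRate(work, 0, periodSeconds, TimeUnit.SECONDS);
    return true;
  }
}
```

With this shape, every caller gets the same message for a disabled chore, and the opaque "Could not successfully schedule chore" path is no longer reached for a period of 0.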
[GitHub] [hbase] virajjasani commented on a change in pull request #600: HBASE-22460 : Reopen regions with very high Store Ref Counts
virajjasani commented on a change in pull request #600: HBASE-22460 : Reopen regions with very high Store Ref Counts
URL: https://github.com/apache/hbase/pull/600#discussion_r327724014

## File path: hbase-client/src/main/java/org/apache/hadoop/hbase/ServerMetricsBuilder.java ##
@@ -358,6 +360,8 @@ public String toString() {
 for (RegionMetrics r : getRegionMetrics().values()) {
 storeCount += r.getStoreCount();
 storeFileCount += r.getStoreFileCount();
+storeRefCount += r.getStoreRefCount();
+maxStoreFileRefCount += r.getMaxStoreFileRefCount();

Review comment:
This sums all the counts for toString(), right? I thought of using max(), but I felt all these counts should be summed across all regionMetrics, for the sake of toString() only. Still, I agree it is better to report the max of all storeFileRefCount values.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
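The difference between the two aggregations discussed in this review comment can be shown with a small sketch. The `RegionStats` holder and both helper methods are hypothetical stand-ins for the per-region metrics, not the RegionMetrics API.

```java
import java.util.List;

// Hypothetical sketch; RegionStats stands in for per-region metrics.
class RefCountAggregationSketch {

  static final class RegionStats {
    final int storeRefCount;
    final int maxStoreFileRefCount;

    RegionStats(int storeRefCount, int maxStoreFileRefCount) {
      this.storeRefCount = storeRefCount;
      this.maxStoreFileRefCount = maxStoreFileRefCount;
    }
  }

  // A plain count is summed across regions, as in the diff above.
  static int totalStoreRefCount(List<RegionStats> regions) {
    int total = 0;
    for (RegionStats r : regions) {
      total += r.storeRefCount;
    }
    return total;
  }

  // A "max" metric keeps its meaning only if the aggregate is also a max.
  static int maxStoreFileRefCount(List<RegionStats> regions) {
    int max = 0;
    for (RegionStats r : regions) {
      max = Math.max(max, r.maxStoreFileRefCount);
    }
    return max;
  }
}
```

For three regions with per-region maxima 2, 4 and 1, summing would report 7, a value no store file actually has, whereas the max aggregation reports 4.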
[GitHub] [hbase] chenxu14 commented on a change in pull request #656: HBASE-23063 Add an option to enable multiget in parallel
chenxu14 commented on a change in pull request #656: HBASE-23063 Add an option to enable multiget in parallel
URL: https://github.com/apache/hbase/pull/656#discussion_r327989118

## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RpcCallContext.java ##
@@ -89,18 +91,17 @@
 */
 long getResponseCellSize();
+ long getNumsOfGet();

Review comment:
Please forgive my poor English.
[GitHub] [hbase] chenxu14 commented on a change in pull request #656: HBASE-23063 Add an option to enable multiget in parallel
chenxu14 commented on a change in pull request #656: HBASE-23063 Add an option to enable multiget in parallel
URL: https://github.com/apache/hbase/pull/656#discussion_r327989314

## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java ##
@@ -934,13 +939,61 @@ private Result increment(final HRegion region, final OperationQuota quota,
 builder.addResultOrException(resultOrExceptionBuilder.build());
 }
 }
+// do multiget in parallel
+if (getCtxs != null && !getCtxs.isEmpty()) {
+ doParallelGet(getCtxs, cellsToReturn, builder);
+}
 // Finish up any outstanding mutations
 if (!CollectionUtils.isEmpty(mutations)) {
 doNonAtomicBatchOp(builder, region, quota, mutations, cellScanner, spaceQuotaEnforcement);
 }
 return cellsToReturn;
 }
+ private void doParallelGet(List getCtxs, List cellsToReturn,
+ RegionActionResult.Builder builder) throws ServiceException {
+ResultOrException.Builder resultOrExceptionBuilder = null;
+CountDownLatch latch = new CountDownLatch(getCtxs.size());
+List handlers = new ArrayList<>(getCtxs.size());
+for (GetContext getCtx : getCtxs) {
+ GetActionHandler handler = new GetActionHandler(getCtx, latch);
+ this.regionServer.executorService.submit(handler);
+ handlers.add(handler);
+}
+try {
+ latch.await();

Review comment:
Sorry, I don't quite understand the impact you are worried about. Do you mean isAboveQuota will not happen in that case?
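The fan-out-and-wait pattern in this hunk (submit one handler per get, then block on a CountDownLatch) can be sketched in isolation as below. This is an illustrative reduction using plain strings and a `java.util.concurrent.ExecutorService`; `ParallelGetSketch`, `parallelGet` and the `lookup` function are hypothetical and do not reproduce the patch's GetActionHandler or HBase's own executor service.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.function.Function;

// Hypothetical sketch of the latch-based fan-out; not the code from the pull request.
class ParallelGetSketch {

  // Runs one lookup per row on the pool and returns results in input order.
  // A lookup that throws leaves null in its slot.
  static List<String> parallelGet(ExecutorService pool, List<String> rows,
      Function<String, String> lookup) throws InterruptedException {
    CountDownLatch latch = new CountDownLatch(rows.size());
    String[] results = new String[rows.size()];
    for (int i = 0; i < rows.size(); i++) {
      final int index = i;
      final String row = rows.get(i);
      pool.execute(() -> {
        try {
          // Each task writes only to its own slot, so no extra locking is needed.
          results[index] = lookup.apply(row);
        } finally {
          // Always count down, otherwise await() below would block forever.
          latch.countDown();
        }
      });
    }
    // countDown() happens-before the return of await(), so the slots are visible here.
    latch.await();
    return new ArrayList<>(Arrays.asList(results));
  }
}
```

One consequence of this shape is that the calling thread stays blocked until the slowest get finishes, so any per-request limit that was previously checked between gets has to be enforced some other way.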
[GitHub] [hbase] chenxu14 commented on a change in pull request #656: HBASE-23063 Add an option to enable multiget in parallel
chenxu14 commented on a change in pull request #656: HBASE-23063 Add an option to enable multiget in parallel
URL: https://github.com/apache/hbase/pull/656#discussion_r327989668

## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java ##
@@ -1300,6 +1356,9 @@ public void onConfigurationChange(Configuration newConf) {
 if (rpcServer instanceof ConfigurationObserver) {
 ((ConfigurationObserver)rpcServer).onConfigurationChange(newConf);
 }
+parallelGetEnable = newConf.getBoolean(HConstants.RS_PARALLEL_GET_ENABLED, false);

Review comment:
OK, will rename it.
[GitHub] [hbase] chenxu14 commented on a change in pull request #656: HBASE-23063 Add an option to enable multiget in parallel
chenxu14 commented on a change in pull request #656: HBASE-23063 Add an option to enable multiget in parallel
URL: https://github.com/apache/hbase/pull/656#discussion_r327989914

## File path: hbase-common/src/main/java/org/apache/hadoop/hbase/HConstants.java ##
@@ -969,6 +969,11 @@
 public static final String HBASE_REGION_SPLIT_POLICY_KEY = "hbase.regionserver.region.split.policy";
+ public static final String RS_PARALLEL_GET_ENABLED = "hbase.server.parallel.get.enabled";

Review comment:
Yeah, I think this is the easiest way: use the org.apache.hadoop.hbase.executor.ExecutorService that already exists.
[GitHub] [hbase] chenxu14 commented on a change in pull request #656: HBASE-23063 Add an option to enable multiget in parallel
chenxu14 commented on a change in pull request #656: HBASE-23063 Add an option to enable multiget in parallel
URL: https://github.com/apache/hbase/pull/656#discussion_r327990702

## File path: hbase-common/src/main/java/org/apache/hadoop/hbase/HConstants.java ##
@@ -969,6 +969,11 @@
 public static final String HBASE_REGION_SPLIT_POLICY_KEY = "hbase.regionserver.region.split.policy";
+ public static final String RS_PARALLEL_GET_ENABLED = "hbase.server.parallel.get.enabled";
+ public static final String RS_PARALLEL_GET_THREADS = "hbase.server.parallel.get.threads";

Review comment:
> How do I get feedback on how many threads I should run with? What do I look at? Is there a log message or a metric I can see?

How about adding an ExecutorStatusChore to collect the executor metrics? It would be useful for all the executors opened by the RS.
[GitHub] [hbase] chenxu14 commented on a change in pull request #656: HBASE-23063 Add an option to enable multiget in parallel
chenxu14 commented on a change in pull request #656: HBASE-23063 Add an option to enable multiget in parallel
URL: https://github.com/apache/hbase/pull/656#discussion_r327991994

## File path: hbase-common/src/main/java/org/apache/hadoop/hbase/HConstants.java ##
@@ -969,6 +969,11 @@
 public static final String HBASE_REGION_SPLIT_POLICY_KEY = "hbase.regionserver.region.split.policy";
+ public static final String RS_PARALLEL_GET_ENABLED = "hbase.server.parallel.get.enabled";
+ public static final String RS_PARALLEL_GET_THREADS = "hbase.server.parallel.get.threads";

Review comment:
> How does this bloat up our thread count? Already the RegionServer has too many. How many threads will this add?

It will add 20 threads by default; this can be configured with hbase.server.parallel.get.threads.
[jira] [Commented] (HBASE-23054) Remove synchronization block from MetaTableMetrics and fix LossyCounting algorithm
[ https://issues.apache.org/jira/browse/HBASE-23054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16937520#comment-16937520 ]

Sakthi commented on HBASE-23054:
---

Looking into the patch.

> Remove synchronization block from MetaTableMetrics and fix LossyCounting algorithm
> ---
>
> Key: HBASE-23054
> URL: https://issues.apache.org/jira/browse/HBASE-23054
> Project: HBase
> Issue Type: Bug
> Affects Versions: 2.1.5
> Reporter: Ankit Singhal
> Assignee: Ankit Singhal
> Priority: Major
> Attachments: HBASE-23054.master.001.patch
>
> While trying to use LossyCounting for HBASE-15519, found the following bugs in the current implementation:
> – Remove synchronization block from MetaTableMetrics to avoid congestion at the code
> – Fix license format
> – Fix LossyCounting algorithm as per [http://www.vldb.org/conf/2002/S10P03.pdf|http://www.vldb.org/conf/2002/S10P03.pdf]
> -- Avoid doing sweep on every insert in LossyCounting
> – Remove extra redundant data structures from MetaTableMetrics.
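For readers unfamiliar with the algorithm, here is a minimal sketch of lossy counting in the textbook form described in the paper linked above, where the sweep runs only at bucket boundaries rather than on every insert. It is an illustration under that reading, not the patched HBase LossyCounting class; the class name, method names and the use of `String` keys are assumptions.

```java
import java.util.HashMap;
import java.util.Map;

// Textbook-style lossy counting sketch; not the HBase LossyCounting class.
class LossyCountingSketch {

  private static final class Entry {
    long count;        // occurrences seen since the entry was created
    final long delta;  // maximum possible undercount at creation time

    Entry(long count, long delta) {
      this.count = count;
      this.delta = delta;
    }
  }

  private final Map<String, Entry> entries = new HashMap<>();
  private final long bucketWidth;   // w = ceil(1 / errorRate)
  private long totalSeen;           // N, number of items processed so far
  private long currentBucket = 1;   // ceil(N / w)

  LossyCountingSketch(double errorRate) {
    this.bucketWidth = (long) Math.ceil(1.0 / errorRate);
  }

  void add(String key) {
    totalSeen++;
    Entry e = entries.get(key);
    if (e != null) {
      e.count++;
    } else {
      entries.put(key, new Entry(1, currentBucket - 1));
    }
    // Sweep only when a bucket fills up, not on every insert.
    if (totalSeen % bucketWidth == 0) {
      sweep();
      currentBucket++;
    }
  }

  // Drops entries whose count plus possible undercount is at most the current bucket id.
  private void sweep() {
    final long bucket = currentBucket;
    entries.values().removeIf(e -> e.count + e.delta <= bucket);
  }

  boolean contains(String key) {
    return entries.containsKey(key);
  }

  long count(String key) {
    Entry e = entries.get(key);
    return e == null ? 0 : e.count;
  }
}
```

With an error rate of 0.25 the bucket width is 4, so a sweep happens after every fourth insert; infrequent keys are dropped at those boundaries while frequent ones survive with their counts.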
[GitHub] [hbase] chenxu14 commented on a change in pull request #656: HBASE-23063 Add an option to enable multiget in parallel
chenxu14 commented on a change in pull request #656: HBASE-23063 Add an option to enable multiget in parallel
URL: https://github.com/apache/hbase/pull/656#discussion_r327998371

## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/ServerCall.java ##
@@ -431,31 +436,42 @@ public boolean isClientCellBlockSupported() {
 @Override
 public long getResponseCellSize() {
-return responseCellSize;
+return responseCellSize.get();
 }
 @Override
- public void incrementResponseCellSize(long cellSize) {

Review comment:
The logic has been moved to RpcCallContext#addResultSize now.
[GitHub] [hbase] chenxu14 commented on a change in pull request #656: HBASE-23063 Add an option to enable multiget in parallel
chenxu14 commented on a change in pull request #656: HBASE-23063 Add an option to enable multiget in parallel
URL: https://github.com/apache/hbase/pull/656#discussion_r327998696

## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java ##
@@ -408,10 +411,10 @@ public void run() throws IOException {
 * An RpcCallBack that creates a list of scanners that needs to perform callBack operation on
 * completion of multiGets.
 */
- static class RegionScannersCloseCallBack implements RpcCallback {
+ public static class RegionScannersCloseCallBack implements RpcCallback {
 private final List scanners = new ArrayList<>();
-public void addScanner(RegionScanner scanner) {
+public synchronized void addScanner(RegionScanner scanner) {

Review comment:
Because it may be called concurrently from different GetActionHandler instances.
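Why the `synchronized` keyword matters here can be illustrated with a small sketch: an `ArrayList` is not safe for concurrent `add` calls, so a callback that collects resources from several worker threads needs some form of locking. The class below is a hypothetical reduction that uses `AutoCloseable` in place of RegionScanner; it is not the RegionScannersCloseCallBack class itself.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; AutoCloseable stands in for RegionScanner.
class ScannerCloseCallbackSketch {

  private final List<AutoCloseable> scanners = new ArrayList<>();

  // May be called from several handler threads at once, hence synchronized.
  public synchronized void addScanner(AutoCloseable scanner) {
    scanners.add(scanner);
  }

  public synchronized int size() {
    return scanners.size();
  }

  // Closes everything that was registered; intended to run once all handlers are done.
  public synchronized void closeAll() throws Exception {
    for (AutoCloseable scanner : scanners) {
      scanner.close();
    }
    scanners.clear();
  }
}
```

Without the lock, concurrent `add` calls on the underlying `ArrayList` could lose elements or throw, which would leave some scanners unclosed.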
[GitHub] [hbase] chenxu14 commented on issue #656: HBASE-23063 Add an option to enable multiget in parallel
chenxu14 commented on issue #656: HBASE-23063 Add an option to enable multiget in parallel
URL: https://github.com/apache/hbase/pull/656#issuecomment-534917614

@saintstack @ramkrish86 About the removal of RpcCallContext#getResponseBlockSize: this is what HBASE-14978 introduced, in order to limit the number of blocks each Multi can use. IMHO it is no longer suitable, since a bbCell may be backed by a shared ByteBuffer no matter how many blocks it uses. Restricting the number of rows to be returned may be a more suitable alternative.
[jira] [Created] (HBASE-23074) scan#setVersion is invalid.
Bo Cui created HBASE-23074:
---

Summary: scan#setVersion is invalid.
Key: HBASE-23074
URL: https://issues.apache.org/jira/browse/HBASE-23074
Project: HBase
Issue Type: Bug
Affects Versions: 2.1.1
Reporter: Bo Cui

I found a problem; it could be a mistake.

Steps to reproduce in the hbase shell:
1. create 't11', \{NAME => 'f1', VERSIONS => 1}
2. put 't11','r1','f1:q1','f1'
3. flush 't11'
4. put 't11','r1','f1:q1','f2'
5. flush 't11'
6. scan 't11', \{RAW => true, VERSIONS => 10, FILTER => "(QualifierFilter (>=, 'binary:f1'))"}

The result:

# 1.3.1 version
hbase(main):011:0> scan 't11', {RAW => true, VERSIONS => 10, FILTER => "(QualifierFilter (>=, 'binary:f1'))"}
ROW COLUMN+CELL
2019-09-25 16:31:22,289 INFO [hconnection-0x7459a21e-shared--pool3-t15] ipc.AbstractRpcClient: RPC Server Kerberos principal name for service=ClientService is hbase/hadoop.hadoop1@hadoop1.com
r1 column=f1:q1, timestamp=1569400085570, value=f2
r1 column=f1:q1, timestamp=1569400068958, value=f1

# 2. in 2.1.1 version
hbase(main):023:0> scan 't11', \{RAW => true, VERSIONS => 10, FILTER => "(QualifierFilter (>=, 'binary:f1'))"}
ROW COLUMN+CELL
r1 column=f1:q1, timestamp=1569400122280, value=f2
1 row(s)
Took 0.0800 seconds
[jira] [Updated] (HBASE-23074) scan#setVersion is invalid.
[ https://issues.apache.org/jira/browse/HBASE-23074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bo Cui updated HBASE-23074:
---
Attachment: image-2019-09-25-16-45-37-780.png

> scan#setVersion is invalid.
> ---
>
> Key: HBASE-23074
> URL: https://issues.apache.org/jira/browse/HBASE-23074
> Project: HBase
> Issue Type: Bug
> Affects Versions: 2.1.1
> Reporter: Bo Cui
> Priority: Critical
> Attachments: image-2019-09-25-16-45-08-870.png, image-2019-09-25-16-45-37-780.png
[jira] [Updated] (HBASE-23074) scan#setVersion is invalid.
[ https://issues.apache.org/jira/browse/HBASE-23074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bo Cui updated HBASE-23074:
---
Attachment: image-2019-09-25-16-45-08-870.png

> scan#setVersion is invalid.
> ---
>
> Key: HBASE-23074
> URL: https://issues.apache.org/jira/browse/HBASE-23074
> Project: HBase
> Issue Type: Bug
> Affects Versions: 2.1.1
> Reporter: Bo Cui
> Priority: Critical
> Attachments: image-2019-09-25-16-45-08-870.png, image-2019-09-25-16-45-37-780.png
[jira] [Commented] (HBASE-23074) scan#setVersion is invalid.
[ https://issues.apache.org/jira/browse/HBASE-23074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16937536#comment-16937536 ]

Bo Cui commented on HBASE-23074:
---

!image-2019-09-25-16-45-08-870.png!

!image-2019-09-25-16-45-37-780.png!

This happens because the ScanQueryMatcher constructor differs between the two versions. But I don't know why. Why was the priority of userScan.getMaxVersions() reduced in 2.1.1? Is this a bug?
{code:java}
resultMaxVersion = Math.min(userScan.getMaxVersions(), scanInfo.getMaxVersions());
maxVersionToCheck = userScan.hasFilter() ? scanInfo.getMaxVersions() : resultMaxVersion;
{code}

> scan#setVersion is invalid.
> ---
>
> Key: HBASE-23074
> URL: https://issues.apache.org/jira/browse/HBASE-23074
> Project: HBase
> Issue Type: Bug
> Affects Versions: 2.1.1
> Reporter: Bo Cui
> Priority: Critical
> Attachments: image-2019-09-25-16-45-08-870.png, image-2019-09-25-16-45-37-780.png
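To make the two quoted expressions concrete, the sketch below evaluates them for the values in the reproduction steps (a scan asking for 10 versions against a column family created with VERSIONS => 1, with a filter set). Only the two expressions come from the comment; the class and method names are hypothetical, and the sketch does not model anything else the matcher does, such as raw-scan handling.

```java
// Hypothetical sketch that only evaluates the two expressions quoted in the comment.
class MaxVersionsSketch {

  // resultMaxVersion = Math.min(userScan.getMaxVersions(), scanInfo.getMaxVersions())
  static int resultMaxVersion(int userScanMaxVersions, int scanInfoMaxVersions) {
    return Math.min(userScanMaxVersions, scanInfoMaxVersions);
  }

  // maxVersionToCheck = userScan.hasFilter() ? scanInfo.getMaxVersions() : resultMaxVersion
  static int maxVersionToCheck(boolean hasFilter, int userScanMaxVersions,
      int scanInfoMaxVersions) {
    return hasFilter
        ? scanInfoMaxVersions
        : resultMaxVersion(userScanMaxVersions, scanInfoMaxVersions);
  }
}
```

With the values from the report, `resultMaxVersion(10, 1)` is 1 and `maxVersionToCheck(true, 10, 1)` is also 1, so under these two expressions the column family limit, not the scan's requested version count, sets both bounds.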
[jira] [Comment Edited] (HBASE-23074) scan#setVersion is invalid.
[ https://issues.apache.org/jira/browse/HBASE-23074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16937536#comment-16937536 ]

Bo Cui edited comment on HBASE-23074 at 9/25/19 8:52 AM:
---

1.3.1

!image-2019-09-25-16-45-08-870.png!

2.1.1

!image-2019-09-25-16-45-37-780.png!

This happens because the ScanQueryMatcher constructor differs between the two versions. But I don't know why. Why was the priority of userScan.getMaxVersions() reduced in 2.1.1? Is this a bug?
{code:java}
resultMaxVersion = Math.min(userScan.getMaxVersions(), scanInfo.getMaxVersions());
maxVersionToCheck = userScan.hasFilter() ? scanInfo.getMaxVersions() : resultMaxVersion;
{code}

> scan#setVersion is invalid.
> ---
>
> Key: HBASE-23074
> URL: https://issues.apache.org/jira/browse/HBASE-23074
> Project: HBase
> Issue Type: Bug
> Affects Versions: 2.1.1
> Reporter: Bo Cui
> Priority: Critical
> Attachments: image-2019-09-25-16-45-08-870.png, image-2019-09-25-16-45-37-780.png
[jira] [Updated] (HBASE-23074) scan#setVersion is invalid.
[ https://issues.apache.org/jira/browse/HBASE-23074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bo Cui updated HBASE-23074:
---
Description:
I found a problem, it could be a mistake.
reproduce steps in hbase shell:
1. create 't11', \{NAME => 'f1', VERSIONS => 1}
2. put 't11','r1','f1:q1','f1'
3. flush 't11'
4. put 't11','r1','f1:q1','f2'
5. flush 't11'
6. scan 't11', \{RAW => true, VERSIONS => 10, FILTER => "(QualifierFilter (>=, 'binary:f1'))"}

the result:

# 1.3.1 version
hbase(main):011:0> scan 't11', \{RAW => true, VERSIONS => 10, FILTER => "(QualifierFilter (>=, 'binary:q1'))"}
ROW COLUMN+CELL
2019-09-25 16:31:22,289 INFO [hconnection-0x7459a21e-shared--pool3-t15] ipc.AbstractRpcClient: RPC Server Kerberos principal name for service=ClientService is hbase/hadoop.hadoop1@hadoop1.com
r1 column=f1:q1, timestamp=1569400085570, value=f2
r1 column=f1:q1, timestamp=1569400068958, value=f1

# 2. in 2.1.1 version
hbase(main):023:0> scan 't11', \{RAW => true, VERSIONS => 10, FILTER => "(QualifierFilter (>=, 'binary:q1'))"}
ROW COLUMN+CELL
r1 column=f1:q1, timestamp=1569400122280, value=f2
1 row(s)
Took 0.0800 seconds

> scan#setVersion is invalid.
> ---
>
> Key: HBASE-23074
> URL: https://issues.apache.org/jira/browse/HBASE-23074
> Project: HBase
> Issue Type: Bug
> Affects Versions: 2.1.1
> Reporter: Bo Cui
> Priority: Critical
> Attachments: image-2019-09-25-16-45-08-870.png, image-2019-09-25-16-45-37-780.png
[jira] [Updated] (HBASE-23074) scan#setVersion is invalid.
[ https://issues.apache.org/jira/browse/HBASE-23074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bo Cui updated HBASE-23074: --- Description: I found a problem, it could be a mistake.. reproduce steps in hbase shell: 1. create 't11', \{NAME => 'f1', VERSIONS => 1} 2.put 't11','r1','f1:q1','f1' 3.flush 't11' 4.put 't11','r1','f1:q1','f2' 5.flush 't11' 6.scan 't11', \{RAW => true, VERSIONS => 10, FILTER => "(QualifierFilter (>=, 'binary:f1'))"} the result: # 1.3.1 version hbase(main):011:0> scan 't11', \{RAW => true, VERSIONS => 10, FILTER => "(QualifierFilter (>=, 'binary:q1'))"} ROW COLUMN+CELL r1 column=f1:q1, timestamp=1569400085570, value=f2 r1 column=f1:q1, timestamp=1569400068958, value=f1 # 2. in 2.1.1 version hbase(main):023:0> scan 't11', \{RAW => true, VERSIONS => 10, FILTER => "(QualifierFilter (>=, 'binary:q1'))"} ROW COLUMN+CELL r1 column=f1:q1, timestamp=1569400122280, value=f2 1 row(s) Took 0.0800 seconds was: I found a problem, it could be a mistake.. reproduce steps in hbase shell: 1. create 't11', \{NAME => 'f1', VERSIONS => 1} 2.put 't11','r1','f1:q1','f1' 3.flush 't11' 4.put 't11','r1','f1:q1','f2' 5.flush 't11' 6.scan 't11', \{RAW => true, VERSIONS => 10, FILTER => "(QualifierFilter (>=, 'binary:f1'))"} the result: # 1.3.1 version hbase(main):011:0> scan 't11', \{RAW => true, VERSIONS => 10, FILTER => "(QualifierFilter (>=, 'binary:q1'))"} ROW COLUMN+CELL 2019-09-25 16:31:22,289 INFO [hconnection-0x7459a21e-shared--pool3-t15] ipc.AbstractRpcClient: RPC Server Kerberos principal name for service=ClientService is hbase/hadoop.hadoop1@hadoop1.com r1 column=f1:q1, timestamp=1569400085570, value=f2 r1 column=f1:q1, timestamp=1569400068958, value=f1 # 2. in 2.1.1 version hbase(main):023:0> scan 't11', \{RAW => true, VERSIONS => 10, FILTER => "(QualifierFilter (>=, 'binary:q1'))"} ROW COLUMN+CELL r1 column=f1:q1, timestamp=1569400122280, value=f2 1 row(s) Took 0.0800 seconds > scan#setVersion is invalid. 
> --- > > Key: HBASE-23074 > URL: https://issues.apache.org/jira/browse/HBASE-23074 > Project: HBase > Issue Type: Bug >Affects Versions: 2.1.1 >Reporter: Bo Cui >Priority: Critical > Attachments: image-2019-09-25-16-45-08-870.png, > image-2019-09-25-16-45-37-780.png > > > I found a problem, it could be a mistake.. > reproduce steps in hbase shell: > 1. create 't11', \{NAME => 'f1', VERSIONS => 1} > 2.put 't11','r1','f1:q1','f1' > 3.flush 't11' > 4.put 't11','r1','f1:q1','f2' > 5.flush 't11' > 6.scan 't11', \{RAW => true, VERSIONS => 10, FILTER => "(QualifierFilter (>=, > 'binary:f1'))"} > > the result: > # 1.3.1 version > hbase(main):011:0> scan 't11', \{RAW => true, VERSIONS => 10, FILTER => > "(QualifierFilter (>=, 'binary:q1'))"} > ROW COLUMN+CELL > r1 column=f1:q1, timestamp=1569400085570, value=f2 > r1 column=f1:q1, timestamp=1569400068958, value=f1 > # 2. in 2.1.1 version > hbase(main):023:0> scan 't11', \{RAW => true, VERSIONS => 10, FILTER => > "(QualifierFilter (>=, 'binary:q1'))"} > ROW COLUMN+CELL > r1 column=f1:q1, timestamp=1569400122280, value=f2 > 1 row(s) > Took 0.0800 seconds > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-23074) scan#setVersion is invalid.
[ https://issues.apache.org/jira/browse/HBASE-23074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bo Cui updated HBASE-23074: --- Description: I found a problem, it could be a mistake.. reproduce steps in hbase shell: 1. create 't11', \{NAME => 'f1', VERSIONS => 1} 2.put 't11','r1','f1:q1','f1' 3.flush 't11' 4.put 't11','r1','f1:q1','f2' 5.flush 't11' 6.scan 't11', \{RAW => true, VERSIONS => 10, FILTER => "(QualifierFilter (>=, 'binary:f1'))"} the result: 1. 1.3.1 version hbase(main):011:0> scan 't11', \{RAW => true, VERSIONS => 10, FILTER => "(QualifierFilter (>=, 'binary:q1'))"} ROW COLUMN+CELL r1 column=f1:q1, timestamp=1569400085570, value=f2 r1 column=f1:q1, timestamp=1569400068958, value=f1 2. in 2.1.1 version hbase(main):023:0> scan 't11', \{RAW => true, VERSIONS => 10, FILTER => "(QualifierFilter (>=, 'binary:q1'))"} ROW COLUMN+CELL r1 column=f1:q1, timestamp=1569400122280, value=f2 1 row(s) Took 0.0800 seconds was: I found a problem, it could be a mistake.. reproduce steps in hbase shell: 1. create 't11', \{NAME => 'f1', VERSIONS => 1} 2.put 't11','r1','f1:q1','f1' 3.flush 't11' 4.put 't11','r1','f1:q1','f2' 5.flush 't11' 6.scan 't11', \{RAW => true, VERSIONS => 10, FILTER => "(QualifierFilter (>=, 'binary:f1'))"} the result: # 1.3.1 version hbase(main):011:0> scan 't11', \{RAW => true, VERSIONS => 10, FILTER => "(QualifierFilter (>=, 'binary:q1'))"} ROW COLUMN+CELL r1 column=f1:q1, timestamp=1569400085570, value=f2 r1 column=f1:q1, timestamp=1569400068958, value=f1 # 2. in 2.1.1 version hbase(main):023:0> scan 't11', \{RAW => true, VERSIONS => 10, FILTER => "(QualifierFilter (>=, 'binary:q1'))"} ROW COLUMN+CELL r1 column=f1:q1, timestamp=1569400122280, value=f2 1 row(s) Took 0.0800 seconds > scan#setVersion is invalid. 
> --- > > Key: HBASE-23074 > URL: https://issues.apache.org/jira/browse/HBASE-23074 > Project: HBase > Issue Type: Bug >Affects Versions: 2.1.1 >Reporter: Bo Cui >Priority: Critical > Attachments: image-2019-09-25-16-45-08-870.png, > image-2019-09-25-16-45-37-780.png > > > I found a problem, it could be a mistake.. > reproduce steps in hbase shell: > 1. create 't11', \{NAME => 'f1', VERSIONS => 1} > 2.put 't11','r1','f1:q1','f1' > 3.flush 't11' > 4.put 't11','r1','f1:q1','f2' > 5.flush 't11' > 6.scan 't11', \{RAW => true, VERSIONS => 10, FILTER => "(QualifierFilter (>=, > 'binary:f1'))"} > > the result: > 1. 1.3.1 version > hbase(main):011:0> scan 't11', \{RAW => true, VERSIONS => 10, FILTER => > "(QualifierFilter (>=, 'binary:q1'))"} > ROW COLUMN+CELL > r1 column=f1:q1, timestamp=1569400085570, value=f2 > r1 column=f1:q1, timestamp=1569400068958, value=f1 > 2. in 2.1.1 version > hbase(main):023:0> scan 't11', \{RAW => true, VERSIONS => 10, FILTER => > "(QualifierFilter (>=, 'binary:q1'))"} > ROW COLUMN+CELL > r1 column=f1:q1, timestamp=1569400122280, value=f2 > 1 row(s) > Took 0.0800 seconds > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-23074) scan#setVersion is invalid.
[ https://issues.apache.org/jira/browse/HBASE-23074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bo Cui updated HBASE-23074: --- Affects Version/s: master > scan#setVersion is invalid. > --- > > Key: HBASE-23074 > URL: https://issues.apache.org/jira/browse/HBASE-23074 > Project: HBase > Issue Type: Bug >Affects Versions: 2.1.1, master >Reporter: Bo Cui >Priority: Critical > Attachments: image-2019-09-25-16-45-08-870.png, > image-2019-09-25-16-45-37-780.png > > > I found a problem, it could be a mistake.. > reproduce steps in hbase shell: > 1. create 't11', \{NAME => 'f1', VERSIONS => 1} > 2.put 't11','r1','f1:q1','f1' > 3.flush 't11' > 4.put 't11','r1','f1:q1','f2' > 5.flush 't11' > 6.scan 't11', \{RAW => true, VERSIONS => 10, FILTER => "(QualifierFilter (>=, > 'binary:f1'))"} > > the result: > 1. 1.3.1 version > hbase(main):011:0> scan 't11', \{RAW => true, VERSIONS => 10, FILTER => > "(QualifierFilter (>=, 'binary:q1'))"} > ROW COLUMN+CELL > r1 column=f1:q1, timestamp=1569400085570, value=f2 > r1 column=f1:q1, timestamp=1569400068958, value=f1 > 2. in 2.1.1 version > hbase(main):023:0> scan 't11', \{RAW => true, VERSIONS => 10, FILTER => > "(QualifierFilter (>=, 'binary:q1'))"} > ROW COLUMN+CELL > r1 column=f1:q1, timestamp=1569400122280, value=f2 > 1 row(s) > Took 0.0800 seconds > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-23038) Provide consistent and clear logging about disabling chores
[ https://issues.apache.org/jira/browse/HBASE-23038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sanjeet Nishad updated HBASE-23038:
---
Attachment: HBASE-23038.002.patch

> Provide consistent and clear logging about disabling chores
> ---
>
> Key: HBASE-23038
> URL: https://issues.apache.org/jira/browse/HBASE-23038
> Project: HBase
> Issue Type: Improvement
> Components: master, regionserver
> Reporter: Sean Busbey
> Assignee: Sanjeet Nishad
> Priority: Minor
> Labels: beginner
> Attachments: HBASE-23038.001.patch, HBASE-23038.002.patch
>
> Right now if you want to disable any of our chores you can set the period to be <= 0. Sometimes, if you do this you get a nice message:
> {code}
> 2019-09-16 22:10:16,756 INFO [master-1:16000.activeMasterManager] master.HMaster: The period is 0 seconds, MobCompactionChore is disabled
> {code}
> And sometimes you get an opaque message:
> {code}
> 2019-09-16 22:09:45,333 INFO [master-1:16000.activeMasterManager] hbase.ChoreService: Could not successfully schedule chore: LogsCleaner
> 2019-09-16 22:09:45,340 INFO [master-1:16000.activeMasterManager] hbase.ChoreService: Could not successfully schedule chore: HFileCleaner
> {code}
> This is because sometimes we just blindly submit to ChoreService which submits to a java ScheduledExecutorService and then catches the IllegalArgumentException.
> We should remove the one-offs and make it so ChoreService checks the period before accepting a submittal and produces a consistent "Foo is disabled" message.
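The proposal in the description above can be sketched roughly as follows. This is a minimal illustration, not HBase's `ChoreService`: the class `PeriodCheckingScheduler`, its methods, and the message wording are hypothetical. The one fact relied on is that the JDK's `ScheduledExecutorService.scheduleAtFixedRate` throws `IllegalArgumentException` for a period <= 0, which is the exception the description says is currently caught.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for ChoreService; all names here are illustrative only. */
final class PeriodCheckingScheduler {
  private final ScheduledExecutorService pool = Executors.newScheduledThreadPool(1);

  /** Returns the message that would be logged, so the behaviour is easy to inspect. */
  String schedule(String choreName, Runnable chore, long periodMillis) {
    if (periodMillis <= 0) {
      // Check up front instead of letting scheduleAtFixedRate throw
      return "The period is " + periodMillis + " ms, " + choreName + " is disabled";
    }
    pool.scheduleAtFixedRate(chore, 0, periodMillis, TimeUnit.MILLISECONDS);
    return choreName + " scheduled every " + periodMillis + " ms";
  }

  /** Shows what the JDK does when the period is not checked first. */
  static boolean rawSubmitThrows(long periodMillis) {
    ScheduledExecutorService raw = Executors.newScheduledThreadPool(1);
    try {
      raw.scheduleAtFixedRate(() -> { }, 0, periodMillis, TimeUnit.MILLISECONDS);
      return false;
    } catch (IllegalArgumentException e) {
      return true;
    } finally {
      raw.shutdownNow();
    }
  }

  void shutdown() {
    pool.shutdownNow();
  }

  public static void main(String[] args) {
    PeriodCheckingScheduler scheduler = new PeriodCheckingScheduler();
    System.out.println(scheduler.schedule("LogsCleaner", () -> { }, 0));
    System.out.println(scheduler.schedule("HFileCleaner", () -> { }, 60_000));
    System.out.println("raw submit with period 0 throws: " + rawSubmitThrows(0));
    scheduler.shutdown();
  }
}
```

Doing the check in one place means every disabled chore gets the same message, whichever caller submitted it.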
[jira] [Assigned] (HBASE-22942) move snapshot verification during restore / clone into the async master side handling
[ https://issues.apache.org/jira/browse/HBASE-22942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sanjeet Nishad reassigned HBASE-22942:
--
Assignee: Sanjeet Nishad

> move snapshot verification during restore / clone into the async master side handling
> -
>
> Key: HBASE-22942
> URL: https://issues.apache.org/jira/browse/HBASE-22942
> Project: HBase
> Issue Type: Improvement
> Components: snapshots
> Affects Versions: 1.3.0, 1.4.0, 2.1.0, 2.2.0
> Reporter: Sean Busbey
> Assignee: Sanjeet Nishad
> Priority: Minor
>
> Right now we do snapshot verification prior to queueing the restore / clone request from the client. That means the initial call from the client has to block until we're done. On large manifests (~single digit millions) this easily takes longer than the default timeout of 20 minutes.
>
> Instead we should handle verification as one of the steps in the relevant procedure (hbase 2+) or table handler (hbase 1) so that the master can do it in the background and report status.
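The shape of the change described above might look like the sketch below: the client call returns an id immediately, and verification becomes the first background step whose state can be polled. This is only an illustration built on the JDK's `CompletableFuture`. `AsyncRestoreSketch`, its `State` enum, and its method names are hypothetical and do not correspond to the HBase procedure framework or table handlers.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.BooleanSupplier;

/** Hypothetical master-side handler; not the HBase procedure framework. */
final class AsyncRestoreSketch {
  enum State { VERIFYING, RESTORING, FINISHED, FAILED }

  private final Map<Long, State> states = new ConcurrentHashMap<>();
  private final Map<Long, CompletableFuture<Void>> running = new ConcurrentHashMap<>();
  private final AtomicLong nextId = new AtomicLong(1);

  /** Returns an id right away; verification runs as the first background step. */
  long submitRestore(BooleanSupplier verifySnapshot, Runnable restore) {
    long id = nextId.getAndIncrement();
    states.put(id, State.VERIFYING);
    CompletableFuture<Void> work = CompletableFuture.runAsync(() -> {
      if (!verifySnapshot.getAsBoolean()) {
        states.put(id, State.FAILED);
        return;
      }
      states.put(id, State.RESTORING);
      restore.run();
      states.put(id, State.FINISHED);
    });
    running.put(id, work);
    return id;
  }

  /** What a client would poll instead of blocking on the initial call. */
  State status(long id) {
    return states.get(id);
  }

  /** Helper for the demo: block until the background work for this id is done. */
  State await(long id) {
    running.get(id).join();
    return states.get(id);
  }

  public static void main(String[] args) {
    AsyncRestoreSketch master = new AsyncRestoreSketch();
    long good = master.submitRestore(() -> true, () -> { });
    long bad = master.submitRestore(() -> false, () -> { });
    System.out.println(good + " -> " + master.await(good));
    System.out.println(bad + " -> " + master.await(bad));
  }
}
```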
[GitHub] [hbase] brfrn169 commented on issue #647: HBASE-22988 Backport HBASE-11062 "hbtop" to branch-1
brfrn169 commented on issue #647: HBASE-22988 Backport HBASE-11062 "hbtop" to branch-1 URL: https://github.com/apache/hbase/pull/647#issuecomment-534936171 I changed the commons-lang3 version to 3.8.1 to support java7: https://commons.apache.org/proper/commons-lang/changes-report.html This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [hbase] brfrn169 commented on issue #647: HBASE-22988 Backport HBASE-11062 "hbtop" to branch-1
brfrn169 commented on issue #647: HBASE-22988 Backport HBASE-11062 "hbtop" to branch-1 URL: https://github.com/apache/hbase/pull/647#issuecomment-534936440 Thank you for reviewing! @Reidd
[jira] [Commented] (HBASE-23074) scan#setVersion is invalid.
[ https://issues.apache.org/jira/browse/HBASE-23074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16937559#comment-16937559 ]

Bo Cui commented on HBASE-23074:

[~zghao] My scenario: the value of the cell is a file path. When we update this cell, we write the new cell, then find the old cell and delete the file.

> scan#setVersion is invalid.
> ---
>
> Key: HBASE-23074
> URL: https://issues.apache.org/jira/browse/HBASE-23074
> Project: HBase
> Issue Type: Bug
> Affects Versions: 2.1.1, master
> Reporter: Bo Cui
> Priority: Critical
> Attachments: image-2019-09-25-16-45-08-870.png, image-2019-09-25-16-45-37-780.png
>
> I found a problem, it could be a mistake..
> reproduce steps in hbase shell:
> 1. create 't11', {NAME => 'f1', VERSIONS => 1}
> 2.put 't11','r1','f1:q1','f1'
> 3.flush 't11'
> 4.put 't11','r1','f1:q1','f2'
> 5.flush 't11'
> 6.scan 't11', {RAW => true, VERSIONS => 10, FILTER => "(QualifierFilter (>=, 'binary:f1'))"}
>
> the result:
> 1. 1.3.1 version
> hbase(main):011:0> scan 't11', {RAW => true, VERSIONS => 10, FILTER => "(QualifierFilter (>=, 'binary:q1'))"}
> ROW COLUMN+CELL
> r1 column=f1:q1, timestamp=1569400085570, value=f2
> r1 column=f1:q1, timestamp=1569400068958, value=f1
> 2. in 2.1.1 version
> hbase(main):023:0> scan 't11', {RAW => true, VERSIONS => 10, FILTER => "(QualifierFilter (>=, 'binary:q1'))"}
> ROW COLUMN+CELL
> r1 column=f1:q1, timestamp=1569400122280, value=f2
> 1 row(s)
> Took 0.0800 seconds
[jira] [Comment Edited] (HBASE-23074) scan#setVersion is invalid.
[ https://issues.apache.org/jira/browse/HBASE-23074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16937559#comment-16937559 ]

Bo Cui edited comment on HBASE-23074 at 9/25/19 9:46 AM:
-
[~zghao] My scenario: the value of the cell is a file path. When we update this cell, we write the new cell, then find the old cell and delete the file.

Could we add a configuration option that controls whether the HBASE-17125 behavior is applied?

was (Author: bo cui):
[~zghao] My scenario: the value of the cell is a file path. When we update this cell, we write the new cell, then find the old cell and delete the file.

> scan#setVersion is invalid.
> ---
>
> Key: HBASE-23074
> URL: https://issues.apache.org/jira/browse/HBASE-23074
> Project: HBase
> Issue Type: Bug
> Affects Versions: 2.1.1, master
> Reporter: Bo Cui
> Priority: Critical
> Attachments: image-2019-09-25-16-45-08-870.png, image-2019-09-25-16-45-37-780.png
>
> I found a problem, it could be a mistake..
> reproduce steps in hbase shell:
> 1. create 't11', {NAME => 'f1', VERSIONS => 1}
> 2.put 't11','r1','f1:q1','f1'
> 3.flush 't11'
> 4.put 't11','r1','f1:q1','f2'
> 5.flush 't11'
> 6.scan 't11', {RAW => true, VERSIONS => 10, FILTER => "(QualifierFilter (>=, 'binary:f1'))"}
>
> the result:
> 1. 1.3.1 version
> hbase(main):011:0> scan 't11', {RAW => true, VERSIONS => 10, FILTER => "(QualifierFilter (>=, 'binary:q1'))"}
> ROW COLUMN+CELL
> r1 column=f1:q1, timestamp=1569400085570, value=f2
> r1 column=f1:q1, timestamp=1569400068958, value=f1
> 2. in 2.1.1 version
> hbase(main):023:0> scan 't11', {RAW => true, VERSIONS => 10, FILTER => "(QualifierFilter (>=, 'binary:q1'))"}
> ROW COLUMN+CELL
> r1 column=f1:q1, timestamp=1569400122280, value=f2
> 1 row(s)
> Took 0.0800 seconds
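To make the use case in the comment concrete, here is a small sketch of why it depends on the scan returning older versions. It is an assumption about how such cleanup logic could be written, not code from the reporter: `StaleFileFinder` and `CellVersion` are hypothetical names, and the inputs are simply the timestamps and values from the two shell outputs quoted above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Illustrative only; CellVersion and StaleFileFinder are hypothetical names. */
final class StaleFileFinder {
  static final class CellVersion {
    final long timestamp;
    final String path; // the cell value, treated as a file path

    CellVersion(long timestamp, String path) {
      this.timestamp = timestamp;
      this.path = path;
    }
  }

  /** Returns the paths held by every version except the newest one. */
  static List<String> stalePaths(List<CellVersion> versions) {
    List<CellVersion> sorted = new ArrayList<>(versions);
    sorted.sort(Comparator.comparingLong((CellVersion c) -> c.timestamp).reversed());
    List<String> stale = new ArrayList<>();
    for (int i = 1; i < sorted.size(); i++) {
      stale.add(sorted.get(i).path);
    }
    return stale;
  }

  public static void main(String[] args) {
    // Two versions, as in the 1.3.1 output quoted above
    List<CellVersion> both = Arrays.asList(
        new CellVersion(1569400085570L, "f2"),
        new CellVersion(1569400068958L, "f1"));
    System.out.println(stalePaths(both)); // prints [f1]

    // Only the newest version, as in the 2.1.1 output quoted above
    List<CellVersion> newestOnly = Arrays.asList(new CellVersion(1569400122280L, "f2"));
    System.out.println(stalePaths(newestOnly)); // prints []
  }
}
```

With only the newest version returned, the list of files to delete is empty, so the old file would never be found by this approach.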
[jira] [Commented] (HBASE-23038) Provide consistent and clear logging about disabling chores
[ https://issues.apache.org/jira/browse/HBASE-23038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16937587#comment-16937587 ]

HBase QA commented on HBASE-23038:
--
| (/) *+1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 1m 12s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | dupname | 0m 0s | No case conflicting files found. |
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -0 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| || || || master Compile Tests ||
| +1 | mvninstall | 5m 41s | master passed |
| +1 | compile | 0m 20s | master passed |
| +1 | checkstyle | 0m 27s | master passed |
| +1 | shadedjars | 4m 58s | branch has no errors when building our shaded downstream artifacts. |
| +1 | javadoc | 0m 18s | master passed |
| 0 | spotbugs | 0m 48s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 0m 47s | master passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 5m 22s | the patch passed |
| +1 | compile | 0m 20s | the patch passed |
| +1 | javac | 0m 20s | the patch passed |
| +1 | checkstyle | 0m 26s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedjars | 4m 51s | patch has no errors when building our shaded downstream artifacts. |
| +1 | hadoopcheck | 16m 54s | Patch does not cause any errors with Hadoop 2.8.5 2.9.2 or 3.1.2. |
| +1 | javadoc | 0m 18s | the patch passed |
| +1 | findbugs | 0m 53s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 2m 52s | hbase-common in the patch passed. |
| +1 | asflicense | 0m 9s | The patch does not generate ASF License warnings. |
| | | 51m 46s | |

|| Subsystem || Report/Notes ||
| Docker | Client=19.03.0 Server=19.03.0 base: https://builds.apache.org/job/PreCommit-HBASE-Build/921/artifact/patchprocess/Dockerfile |
| JIRA Issue | HBASE-23038 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12981300/HBASE-23038.002.patch |
| Optional Tests | dupname asflicense javac javadoc unit spotbugs findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux d19a2d13ce85 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / 52f5a85bfc |
| Default Java | 1.8.0_181 |
| Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/921/test
[jira] [Commented] (HBASE-23038) Provide consistent and clear logging about disabling chores
[ https://issues.apache.org/jira/browse/HBASE-23038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16937611#comment-16937611 ]

Sanjeet Nishad commented on HBASE-23038:

Hi [~vjasani], I have updated the patch. Please have a look.

> Provide consistent and clear logging about disabling chores
> ---
>
> Key: HBASE-23038
> URL: https://issues.apache.org/jira/browse/HBASE-23038
> Project: HBase
> Issue Type: Improvement
> Components: master, regionserver
> Reporter: Sean Busbey
> Assignee: Sanjeet Nishad
> Priority: Minor
> Labels: beginner
> Attachments: HBASE-23038.001.patch, HBASE-23038.002.patch
>
> Right now if you want to disable any of our chores you can set the period to be <= 0. Sometimes, if you do this you get a nice message:
> {code}
> 2019-09-16 22:10:16,756 INFO [master-1:16000.activeMasterManager] master.HMaster: The period is 0 seconds, MobCompactionChore is disabled
> {code}
> And sometimes you get an opaque message:
> {code}
> 2019-09-16 22:09:45,333 INFO [master-1:16000.activeMasterManager] hbase.ChoreService: Could not successfully schedule chore: LogsCleaner
> 2019-09-16 22:09:45,340 INFO [master-1:16000.activeMasterManager] hbase.ChoreService: Could not successfully schedule chore: HFileCleaner
> {code}
> This is because sometimes we just blindly submit to ChoreService which submits to a java ScheduledExecutorService and then catches the IllegalArgumentException.
> We should remove the one-offs and make it so ChoreService checks the period before accepting a submittal and produces a consistent "Foo is disabled" message.
[jira] [Commented] (HBASE-22514) Move rsgroup feature into core of HBase
[ https://issues.apache.org/jira/browse/HBASE-22514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16937634#comment-16937634 ]

Hudson commented on HBASE-22514:

Results for branch HBASE-22514
[build #127 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-22514/127/]: (x) *-1 overall*

details (if available):

(x) -1 general checks
-- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-22514/127//General_Nightly_Build_Report/]

(x) -1 jdk8 hadoop2 checks
-- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-22514/127//JDK8_Nightly_Build_Report_(Hadoop2)/]

(x) -1 jdk8 hadoop3 checks
-- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-22514/127//JDK8_Nightly_Build_Report_(Hadoop3)/]

(/) +1 source release artifact
-- See build output for details.

(x) -1 client integration test
-- Failed when running client tests on top of Hadoop 2. [see log for details|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-22514/127//artifact/output-integration/hadoop-2.log]. (note that this means we didn't run on Hadoop 3)

> Move rsgroup feature into core of HBase
> ---
>
> Key: HBASE-22514
> URL: https://issues.apache.org/jira/browse/HBASE-22514
> Project: HBase
> Issue Type: Umbrella
> Components: Admin, Client, rsgroup
> Reporter: Yechao Chen
> Assignee: Duo Zhang
> Priority: Major
> Attachments: HBASE-22514.master.001.patch, image-2019-05-31-18-25-38-217.png
>
> The class RSGroupAdminClient is not public
> we need to use java api RSGroupAdminClient to manager RSG
> so RSGroupAdminClient should be public
[GitHub] [hbase] petersomogyi commented on a change in pull request #592: HBASE-22982: region server suspend/resume and graceful rolling restart actions
petersomogyi commented on a change in pull request #592: HBASE-22982: region server suspend/resume and graceful rolling restart actions
URL: https://github.com/apache/hbase/pull/592#discussion_r32805

## File path: hbase-it/src/test/java/org/apache/hadoop/hbase/chaos/factories/SlowDeterministicMonkeyFactory.java
##
@@ -128,56 +136,65 @@ public ChaosMonkey build() {
 private void loadProperties() {
- action1Period = Long.parseLong(this.properties.getProperty(
-MonkeyConstants.PERIODIC_ACTION1_PERIOD,
-MonkeyConstants.DEFAULT_PERIODIC_ACTION1_PERIOD + ""));
- action2Period = Long.parseLong(this.properties.getProperty(
-MonkeyConstants.PERIODIC_ACTION2_PERIOD,
-MonkeyConstants.DEFAULT_PERIODIC_ACTION2_PERIOD + ""));
- action3Period = Long.parseLong(this.properties.getProperty(
-MonkeyConstants.COMPOSITE_ACTION3_PERIOD,
-MonkeyConstants.DEFAULT_COMPOSITE_ACTION3_PERIOD + ""));
- action4Period = Long.parseLong(this.properties.getProperty(
-MonkeyConstants.PERIODIC_ACTION4_PERIOD,
-MonkeyConstants.DEFAULT_PERIODIC_ACTION4_PERIOD + ""));
- moveRegionsMaxTime = Long.parseLong(this.properties.getProperty(
-MonkeyConstants.MOVE_REGIONS_MAX_TIME,
-MonkeyConstants.DEFAULT_MOVE_REGIONS_MAX_TIME + ""));
- moveRegionsSleepTime = Long.parseLong(this.properties.getProperty(
-MonkeyConstants.MOVE_REGIONS_SLEEP_TIME,
-MonkeyConstants.DEFAULT_MOVE_REGIONS_SLEEP_TIME + ""));
- moveRandomRegionSleepTime = Long.parseLong(this.properties.getProperty(
-MonkeyConstants.MOVE_RANDOM_REGION_SLEEP_TIME,
-MonkeyConstants.DEFAULT_MOVE_RANDOM_REGION_SLEEP_TIME + ""));
- restartRandomRSSleepTime = Long.parseLong(this.properties.getProperty(
-MonkeyConstants.RESTART_RANDOM_RS_SLEEP_TIME,
-MonkeyConstants.DEFAULT_RESTART_RANDOM_RS_SLEEP_TIME + ""));
- batchRestartRSSleepTime = Long.parseLong(this.properties.getProperty(
-MonkeyConstants.BATCH_RESTART_RS_SLEEP_TIME,
-MonkeyConstants.DEFAULT_BATCH_RESTART_RS_SLEEP_TIME + ""));
- batchRestartRSRatio = Float.parseFloat(this.properties.getProperty(
-MonkeyConstants.BATCH_RESTART_RS_RATIO,
-MonkeyConstants.DEFAULT_BATCH_RESTART_RS_RATIO + ""));
- restartActiveMasterSleepTime = Long.parseLong(this.properties.getProperty(
-MonkeyConstants.RESTART_ACTIVE_MASTER_SLEEP_TIME,
-MonkeyConstants.DEFAULT_RESTART_ACTIVE_MASTER_SLEEP_TIME + ""));
- rollingBatchRestartRSSleepTime = Long.parseLong(this.properties.getProperty(
+action1Period = Long.parseLong(this.properties.getProperty(

Review comment:
Please do not reformat unrelated lines.
[GitHub] [hbase] petersomogyi commented on a change in pull request #592: HBASE-22982: region server suspend/resume and graceful rolling restart actions
petersomogyi commented on a change in pull request #592: HBASE-22982: region server suspend/resume and graceful rolling restart actions
URL: https://github.com/apache/hbase/pull/592#discussion_r328015807

## File path: hbase-it/src/test/java/org/apache/hadoop/hbase/chaos/actions/Action.java
##
@@ -150,78 +151,106 @@ public void perform() throws Exception {
 }
 }
 protected void killMaster(ServerName server) throws IOException {
-LOG.info("Killing master " + server);
+LOG.info("Killing master {}", server);
 cluster.killMaster(server);
 cluster.waitForMasterToStop(server, killMasterTimeout);
 LOG.info("Killed master " + server);
 }
 protected void startMaster(ServerName server) throws IOException {
-LOG.info("Starting master " + server.getHostname());
+LOG.info("Starting master {}", server.getHostname());
 cluster.startMaster(server.getHostname(), server.getPort());
 cluster.waitForActiveAndReadyMaster(startMasterTimeout);
 LOG.info("Started master " + server.getHostname());
 }
+ protected void stopRs(ServerName server) throws IOException {
+LOG.info("Stopping regionserver {}", server);
+cluster.stopRegionServer(server);
+cluster.waitForRegionServerToStop(server, killRsTimeout);
+LOG.info("Stoppiong regionserver {}. Reported num of rs:{}", server,
+cluster.getClusterMetrics().getLiveServerMetrics().size());
+ }
+
+ protected void suspendRs(ServerName server) throws IOException {
+LOG.info("Suspending regionserver {}", server);
+cluster.suspendRegionServer(server);
+if(!(cluster instanceof MiniHBaseCluster)){

Review comment:
What happens on non-minicluster?
[GitHub] [hbase] petersomogyi commented on a change in pull request #592: HBASE-22982: region server suspend/resume and graceful rolling restart actions
petersomogyi commented on a change in pull request #592: HBASE-22982: region server suspend/resume and graceful rolling restart actions
URL: https://github.com/apache/hbase/pull/592#discussion_r328030516

## File path: hbase-it/src/test/java/org/apache/hadoop/hbase/chaos/actions/GracefulRollingRestartRsAction.java
##
@@ -0,0 +1,72 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hbase.chaos.actions;
+
+import java.io.IOException;
+import java.util.Arrays;
+import java.util.List;
+import org.apache.commons.lang3.RandomUtils;
+import org.apache.hadoop.hbase.ServerName;
+import org.apache.hadoop.hbase.util.RegionMover;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * Gracefully restarts every non-admin regionserver in a rolling fashion. At each step, it unloads,
+ * restarts the loads every rs server sleeping randomly (0-sleepTime) in between servers.
+ */
+public class GracefulRollingRestartRsAction extends RestartActionBaseAction {
+ private static final Logger LOG = LoggerFactory.getLogger(GracefulRollingRestartRsAction.class);
+
+ public GracefulRollingRestartRsAction(long sleepTime) {
+super(sleepTime);
+ }
+
+ @Override
+ public void perform() throws Exception {
+LOG.info("Performing action: Rolling restarting non-master region servers");
+List<ServerName> selectedServers = selectServers();
+
+LOG.info("Disabling balancer to make unloading possible");
+setBalancer(false, false);
+
+for(ServerName server : selectedServers){
+ String rsName = server.getAddress().toString();
+ try (RegionMover rm =
+ new RegionMover.RegionMoverBuilder(rsName, getConf()).ack(true).build()) {
+LOG.info("Unloading " + server);

Review comment:
nit: parameterized logging
[GitHub] [hbase] petersomogyi commented on a change in pull request #592: HBASE-22982: region server suspend/resume and graceful rolling restart actions
petersomogyi commented on a change in pull request #592: HBASE-22982: region server suspend/resume and graceful rolling restart actions URL: https://github.com/apache/hbase/pull/592#discussion_r328058038 ## File path: hbase-it/src/test/java/org/apache/hadoop/hbase/chaos/actions/RollingBatchSuspendResumeRsAction.java ## @@ -0,0 +1,116 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.hbase.chaos.actions; + +import java.io.IOException; +import java.util.LinkedList; +import java.util.List; +import java.util.Queue; + +import org.apache.commons.lang3.RandomUtils; +import org.apache.hadoop.hbase.ServerName; +import org.apache.hadoop.hbase.chaos.monkies.PolicyBasedChaosMonkey; +import org.apache.hadoop.hbase.util.Threads; +import org.apache.hadoop.util.Shell; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +/** + * Suspend then resume a ratio of the regionservers in a rolling fashion. At each step, either + * suspend a server, or resume one, sleeping (sleepTime) in between steps. The parameter + * maxSuspendedServers limits the maximum number of servers that can be down at the same time + * during rolling restarts. 
+ */ +public class RollingBatchSuspendResumeRsAction extends Action { + private static final Logger LOG = + LoggerFactory.getLogger(RollingBatchSuspendResumeRsAction.class); + private float ratio; + private long sleepTime; + private int maxSuspendedServers; // number of maximum suspended servers at any given time. + + public RollingBatchSuspendResumeRsAction(long sleepTime, float ratio) { +this(sleepTime, ratio, 5); + } + + public RollingBatchSuspendResumeRsAction(long sleepTime, float ratio, int maxSuspendedServers) { +this.ratio = ratio; +this.sleepTime = sleepTime; +this.maxSuspendedServers = maxSuspendedServers; + } + + enum SuspendOrResume { +SUSPEND, RESUME + } + + @Override public void perform() throws Exception { Review comment: @Override in new line
[GitHub] [hbase] petersomogyi commented on a change in pull request #592: HBASE-22982: region server suspend/resume and graceful rolling restart actions
petersomogyi commented on a change in pull request #592: HBASE-22982: region server suspend/resume and graceful rolling restart actions URL: https://github.com/apache/hbase/pull/592#discussion_r328020894 ## File path: hbase-it/src/test/java/org/apache/hadoop/hbase/chaos/actions/GracefulRollingRestartRsAction.java ## @@ -0,0 +1,72 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.hbase.chaos.actions; + +import java.io.IOException; +import java.util.Arrays; +import java.util.List; +import org.apache.commons.lang3.RandomUtils; +import org.apache.hadoop.hbase.ServerName; +import org.apache.hadoop.hbase.util.RegionMover; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +/** + * Gracefully restarts every non-admin regionserver in a rolling fashion. At each step, it unloads, + * restarts the loads every rs server sleeping randomly (0-sleepTime) in between servers. 
+ */ +public class GracefulRollingRestartRsAction extends RestartActionBaseAction { + private static final Logger LOG = LoggerFactory.getLogger(GracefulRollingRestartRsAction.class); + + public GracefulRollingRestartRsAction(long sleepTime) { +super(sleepTime); + } + + @Override + public void perform() throws Exception { +LOG.info("Performing action: Rolling restarting non-master region servers"); +List selectedServers = selectServers(); + +LOG.info("Disabling balancer to make unloading possible"); +setBalancer(false, false); Review comment: This is an async call this way. Don't you need to wait for balancer to finish?
[GitHub] [hbase] petersomogyi commented on a change in pull request #592: HBASE-22982: region server suspend/resume and graceful rolling restart actions
petersomogyi commented on a change in pull request #592: HBASE-22982: region server suspend/resume and graceful rolling restart actions
URL: https://github.com/apache/hbase/pull/592#discussion_r328031685

## File path: hbase-it/src/test/java/org/apache/hadoop/hbase/chaos/actions/RestartActionBaseAction.java ##
@@ -50,6 +50,23 @@ void restartMaster(ServerName server, long sleepTime) throws IOException {
     startMaster(server);
   }
+  /**
+   * Stop and then restart the region server instaedof killing it.

Review comment:
  typo instaedof
[GitHub] [hbase] petersomogyi commented on a change in pull request #592: HBASE-22982: region server suspend/resume and graceful rolling restart actions
petersomogyi commented on a change in pull request #592: HBASE-22982: region server suspend/resume and graceful rolling restart actions
URL: https://github.com/apache/hbase/pull/592#discussion_r328019811

## File path: hbase-it/src/test/java/org/apache/hadoop/hbase/chaos/actions/GracefulRollingRestartRsAction.java ##
@@ -0,0 +1,72 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hbase.chaos.actions;
+
+import java.io.IOException;
+import java.util.Arrays;
+import java.util.List;
+import org.apache.commons.lang3.RandomUtils;
+import org.apache.hadoop.hbase.ServerName;
+import org.apache.hadoop.hbase.util.RegionMover;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * Gracefully restarts every non-admin regionserver in a rolling fashion. At each step, it unloads,

Review comment:
  What is non-admin regionserver?
[GitHub] [hbase] petersomogyi commented on a change in pull request #592: HBASE-22982: region server suspend/resume and graceful rolling restart actions
petersomogyi commented on a change in pull request #592: HBASE-22982: region server suspend/resume and graceful rolling restart actions
URL: https://github.com/apache/hbase/pull/592#discussion_r328019053

## File path: hbase-it/src/test/java/org/apache/hadoop/hbase/chaos/actions/Action.java ##
@@ -269,6 +298,19 @@ protected void forceBalancer() throws Exception {
     }
   }
+  protected void setBalancer(boolean onOrOff, boolean synchronous) throws Exception {
+    Admin admin = this.context.getHBaseIntegrationTestingUtility().getAdmin();
+    boolean result = false;
+    try {
+      result = admin.balancerSwitch(onOrOff, synchronous);
+    } catch (Exception e) {
+      LOG.warn("Got exception while switching balance ", e);
+    }
+    if (!result) {

Review comment:
  admin.balancerSwitch returns the previous state. If you disable balancer `result` will be `true` and logs the error.
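Because `balancerSwitch` returns the previous balancer state, its return value cannot double as a success flag. The following is a minimal sketch of one way to separate the two concerns. `BalancerAdmin` and `FakeAdmin` are hypothetical stand-ins introduced only so the example runs without a cluster; they are not HBase classes, and the revised `setBalancer` shape is an assumption, not the change that was eventually committed.

```java
public class BalancerSwitchSketch {

  /** Hypothetical stand-in for the single Admin method discussed in the review. */
  interface BalancerAdmin {
    /** Switches the balancer and returns the PREVIOUS state, as the review describes. */
    boolean balancerSwitch(boolean onOrOff, boolean synchronous) throws Exception;
  }

  /** In-memory fake so the sketch can run without a cluster. */
  static final class FakeAdmin implements BalancerAdmin {
    private boolean enabled;

    FakeAdmin(boolean initiallyEnabled) {
      this.enabled = initiallyEnabled;
    }

    @Override
    public boolean balancerSwitch(boolean onOrOff, boolean synchronous) {
      boolean previous = enabled;
      enabled = onOrOff;
      return previous;
    }
  }

  /**
   * Returns true when the switch call completed without an exception.
   * The previous state is reported for information only and is never
   * interpreted as success or failure.
   */
  static boolean setBalancer(BalancerAdmin admin, boolean onOrOff, boolean synchronous) {
    try {
      boolean previous = admin.balancerSwitch(onOrOff, synchronous);
      System.out.println("Balancer set to " + onOrOff + " (previous state: " + previous + ")");
      return true;
    } catch (Exception e) {
      System.out.println("Got exception while switching balancer: " + e);
      return false;
    }
  }

  public static void main(String[] args) {
    FakeAdmin admin = new FakeAdmin(true);
    // Disabling an enabled balancer: the call returns the previous state (true),
    // which the original `if (!result)` check would have read as a status flag.
    setBalancer(admin, false, false);
  }
}
```

Success is signalled by the absence of an exception, so disabling an enabled balancer and disabling an already disabled one are both treated as successful calls.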
[GitHub] [hbase] petersomogyi commented on a change in pull request #592: HBASE-22982: region server suspend/resume and graceful rolling restart actions
petersomogyi commented on a change in pull request #592: HBASE-22982: region server suspend/resume and graceful rolling restart actions
URL: https://github.com/apache/hbase/pull/592#discussion_r328061190

## File path: hbase-server/src/test/java/org/apache/hadoop/hbase/MiniHBaseCluster.java ##
@@ -489,6 +499,32 @@ public String abortRegionServer(int serverNumber) {
     return server;
   }
+  /**
+   * Suspend the specified region server
+   * @param serverNumber Used as index into a list.
+   * @return
+   */
+  public JVMClusterUtil.RegionServerThread suspendRegionServer(int serverNumber) {
+    JVMClusterUtil.RegionServerThread server =
+        hbaseCluster.getRegionServers().get(serverNumber);
+    LOG.info("Suspending " + server.toString());

Review comment:
  parameterized logging
[GitHub] [hbase] petersomogyi commented on a change in pull request #592: HBASE-22982: region server suspend/resume and graceful rolling restart actions
petersomogyi commented on a change in pull request #592: HBASE-22982: region server suspend/resume and graceful rolling restart actions URL: https://github.com/apache/hbase/pull/592#discussion_r328060329 ## File path: hbase-it/src/test/java/org/apache/hadoop/hbase/chaos/actions/RollingBatchSuspendResumeRsAction.java ## @@ -0,0 +1,116 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.hbase.chaos.actions; + +import java.io.IOException; +import java.util.LinkedList; +import java.util.List; +import java.util.Queue; + +import org.apache.commons.lang3.RandomUtils; +import org.apache.hadoop.hbase.ServerName; +import org.apache.hadoop.hbase.chaos.monkies.PolicyBasedChaosMonkey; +import org.apache.hadoop.hbase.util.Threads; +import org.apache.hadoop.util.Shell; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +/** + * Suspend then resume a ratio of the regionservers in a rolling fashion. At each step, either + * suspend a server, or resume one, sleeping (sleepTime) in between steps. The parameter + * maxSuspendedServers limits the maximum number of servers that can be down at the same time + * during rolling restarts. 
+ */ +public class RollingBatchSuspendResumeRsAction extends Action { + private static final Logger LOG = + LoggerFactory.getLogger(RollingBatchSuspendResumeRsAction.class); + private float ratio; + private long sleepTime; + private int maxSuspendedServers; // number of maximum suspended servers at any given time. + + public RollingBatchSuspendResumeRsAction(long sleepTime, float ratio) { +this(sleepTime, ratio, 5); + } + + public RollingBatchSuspendResumeRsAction(long sleepTime, float ratio, int maxSuspendedServers) { +this.ratio = ratio; +this.sleepTime = sleepTime; +this.maxSuspendedServers = maxSuspendedServers; + } + + enum SuspendOrResume { +SUSPEND, RESUME + } + + @Override public void perform() throws Exception { +LOG.info(String.format("Performing action: Rolling batch restarting %d%% of region servers", +(int) (ratio * 100))); +List selectedServers = selectServers(); + +Queue serversToBeSuspended = new LinkedList<>(selectedServers); +Queue suspendedServers = new LinkedList<>(); + +// loop while there are servers to be suspended or suspended servers to be resumed +while ((!serversToBeSuspended.isEmpty() || !suspendedServers.isEmpty()) && !context +.isStopping()) { + SuspendOrResume action; + + if (serversToBeSuspended.isEmpty()) { // no more servers to suspend +action = SuspendOrResume.RESUME; + } else if (suspendedServers.isEmpty()) { +action = SuspendOrResume.SUSPEND; // no more servers to resume + } else if (suspendedServers.size() >= maxSuspendedServers) { +// we have too many suspended servers. Don't suspend any more +action = SuspendOrResume.RESUME; + } else { +// do a coin toss +action = RandomUtils.nextBoolean() ? 
SuspendOrResume.SUSPEND : SuspendOrResume.RESUME; + } + + ServerName server; + switch (action) { +case SUSPEND: + server = serversToBeSuspended.remove(); + try { +suspendRs(server); + } catch (Shell.ExitCodeException e) { +LOG.info("Problem suspending but presume successful; code=" + e.getExitCode(), e); + } + suspendedServers.add(server); + break; +case RESUME: + server = suspendedServers.remove(); + try { +resumeRs(server); + } catch (Shell.ExitCodeException e) { +LOG.info("Problem resuming, will retry; code=" + e.getExitCode(), e); Review comment: ditto
[GitHub] [hbase] petersomogyi commented on a change in pull request #592: HBASE-22982: region server suspend/resume and graceful rolling restart actions
petersomogyi commented on a change in pull request #592: HBASE-22982: region server suspend/resume and graceful rolling restart actions URL: https://github.com/apache/hbase/pull/592#discussion_r328021113 ## File path: hbase-it/src/test/java/org/apache/hadoop/hbase/chaos/actions/GracefulRollingRestartRsAction.java ## @@ -0,0 +1,72 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.hbase.chaos.actions; + +import java.io.IOException; +import java.util.Arrays; +import java.util.List; +import org.apache.commons.lang3.RandomUtils; +import org.apache.hadoop.hbase.ServerName; +import org.apache.hadoop.hbase.util.RegionMover; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +/** + * Gracefully restarts every non-admin regionserver in a rolling fashion. At each step, it unloads, + * restarts the loads every rs server sleeping randomly (0-sleepTime) in between servers. 
+ */ +public class GracefulRollingRestartRsAction extends RestartActionBaseAction { + private static final Logger LOG = LoggerFactory.getLogger(GracefulRollingRestartRsAction.class); + + public GracefulRollingRestartRsAction(long sleepTime) { +super(sleepTime); + } + + @Override + public void perform() throws Exception { +LOG.info("Performing action: Rolling restarting non-master region servers"); +List selectedServers = selectServers(); + +LOG.info("Disabling balancer to make unloading possible"); +setBalancer(false, false); + +for(ServerName server : selectedServers){ Review comment: nit: add spaces around ()
[GitHub] [hbase] petersomogyi commented on a change in pull request #592: HBASE-22982: region server suspend/resume and graceful rolling restart actions
petersomogyi commented on a change in pull request #592: HBASE-22982: region server suspend/resume and graceful rolling restart actions URL: https://github.com/apache/hbase/pull/592#discussion_r328030997 ## File path: hbase-it/src/test/java/org/apache/hadoop/hbase/chaos/actions/GracefulRollingRestartRsAction.java ## @@ -0,0 +1,72 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.hbase.chaos.actions; + +import java.io.IOException; +import java.util.Arrays; +import java.util.List; +import org.apache.commons.lang3.RandomUtils; +import org.apache.hadoop.hbase.ServerName; +import org.apache.hadoop.hbase.util.RegionMover; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +/** + * Gracefully restarts every non-admin regionserver in a rolling fashion. At each step, it unloads, + * restarts the loads every rs server sleeping randomly (0-sleepTime) in between servers. 
+ */ +public class GracefulRollingRestartRsAction extends RestartActionBaseAction { + private static final Logger LOG = LoggerFactory.getLogger(GracefulRollingRestartRsAction.class); + + public GracefulRollingRestartRsAction(long sleepTime) { +super(sleepTime); + } + + @Override + public void perform() throws Exception { +LOG.info("Performing action: Rolling restarting non-master region servers"); +List selectedServers = selectServers(); + +LOG.info("Disabling balancer to make unloading possible"); +setBalancer(false, false); + +for(ServerName server : selectedServers){ + String rsName = server.getAddress().toString(); + try (RegionMover rm = + new RegionMover.RegionMoverBuilder(rsName, getConf()).ack(true).build()) { +LOG.info("Unloading " + server); +rm.unload(); +LOG.info("Restarting " + server); +gracefulRestartRs(server, sleepTime); +LOG.info("Loading " + server); +rm.load(); + } catch (org.apache.hadoop.util.Shell.ExitCodeException e) { Review comment: Import the class.
[GitHub] [hbase] petersomogyi commented on a change in pull request #592: HBASE-22982: region server suspend/resume and graceful rolling restart actions
petersomogyi commented on a change in pull request #592: HBASE-22982: region server suspend/resume and graceful rolling restart actions
URL: https://github.com/apache/hbase/pull/592#discussion_r328015214

## File path: hbase-it/src/test/java/org/apache/hadoop/hbase/chaos/actions/Action.java ##
@@ -150,78 +151,106 @@ public void perform() throws Exception {
     }
   }
   protected void killMaster(ServerName server) throws IOException {
-    LOG.info("Killing master " + server);
+    LOG.info("Killing master {}", server);
     cluster.killMaster(server);
     cluster.waitForMasterToStop(server, killMasterTimeout);
     LOG.info("Killed master " + server);
   }
   protected void startMaster(ServerName server) throws IOException {
-    LOG.info("Starting master " + server.getHostname());
+    LOG.info("Starting master {}", server.getHostname());
     cluster.startMaster(server.getHostname(), server.getPort());
     cluster.waitForActiveAndReadyMaster(startMasterTimeout);
     LOG.info("Started master " + server.getHostname());
   }
+  protected void stopRs(ServerName server) throws IOException {
+    LOG.info("Stopping regionserver {}", server);
+    cluster.stopRegionServer(server);
+    cluster.waitForRegionServerToStop(server, killRsTimeout);

Review comment:
  Is it good to use killRsTimeout for stop timeout?
[GitHub] [hbase] petersomogyi commented on a change in pull request #592: HBASE-22982: region server suspend/resume and graceful rolling restart actions
petersomogyi commented on a change in pull request #592: HBASE-22982: region server suspend/resume and graceful rolling restart actions
URL: https://github.com/apache/hbase/pull/592#discussion_r328032967

## File path: hbase-it/src/test/java/org/apache/hadoop/hbase/chaos/actions/RestartActionBaseAction.java ##
@@ -50,6 +50,23 @@ void restartMaster(ServerName server, long sleepTime) throws IOException {
     startMaster(server);
   }
+  /**
+   * Stop and then restart the region server instaedof killing it.
+   * @param server hostname to restart the regionserver on
+   * @param sleepTime number of milliseconds between stop and restart
+   * @throws IOException if something goes wrong
+   */
+  void gracefulRestartRs(ServerName server, long sleepTime) throws IOException {
+    sleepTime = Math.max(sleepTime, 1000);
+    // Don't try the stop if we're stopping already
+    if (context.isStopping()) {
+      return;
+    }
+    stopRs(server);

Review comment:
  Please add logs to see the progress.
[GitHub] [hbase] petersomogyi commented on a change in pull request #592: HBASE-22982: region server suspend/resume and graceful rolling restart actions
petersomogyi commented on a change in pull request #592: HBASE-22982: region server suspend/resume and graceful rolling restart actions URL: https://github.com/apache/hbase/pull/592#discussion_r328060218 ## File path: hbase-it/src/test/java/org/apache/hadoop/hbase/chaos/actions/RollingBatchSuspendResumeRsAction.java ## @@ -0,0 +1,116 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.hbase.chaos.actions; + +import java.io.IOException; +import java.util.LinkedList; +import java.util.List; +import java.util.Queue; + +import org.apache.commons.lang3.RandomUtils; +import org.apache.hadoop.hbase.ServerName; +import org.apache.hadoop.hbase.chaos.monkies.PolicyBasedChaosMonkey; +import org.apache.hadoop.hbase.util.Threads; +import org.apache.hadoop.util.Shell; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +/** + * Suspend then resume a ratio of the regionservers in a rolling fashion. At each step, either + * suspend a server, or resume one, sleeping (sleepTime) in between steps. The parameter + * maxSuspendedServers limits the maximum number of servers that can be down at the same time + * during rolling restarts. 
+ */ +public class RollingBatchSuspendResumeRsAction extends Action { + private static final Logger LOG = + LoggerFactory.getLogger(RollingBatchSuspendResumeRsAction.class); + private float ratio; + private long sleepTime; + private int maxSuspendedServers; // number of maximum suspended servers at any given time. + + public RollingBatchSuspendResumeRsAction(long sleepTime, float ratio) { +this(sleepTime, ratio, 5); + } + + public RollingBatchSuspendResumeRsAction(long sleepTime, float ratio, int maxSuspendedServers) { +this.ratio = ratio; +this.sleepTime = sleepTime; +this.maxSuspendedServers = maxSuspendedServers; + } + + enum SuspendOrResume { +SUSPEND, RESUME + } + + @Override public void perform() throws Exception { +LOG.info(String.format("Performing action: Rolling batch restarting %d%% of region servers", +(int) (ratio * 100))); +List selectedServers = selectServers(); + +Queue serversToBeSuspended = new LinkedList<>(selectedServers); +Queue suspendedServers = new LinkedList<>(); + +// loop while there are servers to be suspended or suspended servers to be resumed +while ((!serversToBeSuspended.isEmpty() || !suspendedServers.isEmpty()) && !context +.isStopping()) { + SuspendOrResume action; + + if (serversToBeSuspended.isEmpty()) { // no more servers to suspend +action = SuspendOrResume.RESUME; + } else if (suspendedServers.isEmpty()) { +action = SuspendOrResume.SUSPEND; // no more servers to resume + } else if (suspendedServers.size() >= maxSuspendedServers) { +// we have too many suspended servers. Don't suspend any more +action = SuspendOrResume.RESUME; + } else { +// do a coin toss +action = RandomUtils.nextBoolean() ? SuspendOrResume.SUSPEND : SuspendOrResume.RESUME; + } + + ServerName server; + switch (action) { +case SUSPEND: + server = serversToBeSuspended.remove(); + try { +suspendRs(server); + } catch (Shell.ExitCodeException e) { +LOG.info("Problem suspending but presume successful; code=" + e.getExitCode(), e); Review comment: Log.warn? 
Use parameterized logging.
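The suspend-or-resume choice described in the Javadoc of the quoted diff can be isolated into a pure function, which makes each of its four branches easy to exercise. This is only a sketch under assumptions: the `choose` helper and its parameters are hypothetical names that do not appear in the patch, and the coin toss is injected as a `BooleanSupplier` in place of `RandomUtils.nextBoolean()` so that the result is deterministic in tests.

```java
import java.util.function.BooleanSupplier;

public class SuspendOrResumeSketch {

  enum SuspendOrResume {
    SUSPEND, RESUME
  }

  /**
   * Picks the next step. Callers must ensure pending + suspended > 0,
   * mirroring the loop condition in the quoted patch.
   *
   * @param pending      servers still waiting to be suspended
   * @param suspended    servers currently suspended
   * @param maxSuspended cap on simultaneously suspended servers
   * @param coinToss     source of randomness; true means SUSPEND
   */
  static SuspendOrResume choose(int pending, int suspended, int maxSuspended,
      BooleanSupplier coinToss) {
    if (pending == 0) {
      return SuspendOrResume.RESUME;   // no more servers to suspend
    }
    if (suspended == 0) {
      return SuspendOrResume.SUSPEND;  // nothing to resume yet
    }
    if (suspended >= maxSuspended) {
      return SuspendOrResume.RESUME;   // cap reached, do not suspend more
    }
    return coinToss.getAsBoolean() ? SuspendOrResume.SUSPEND : SuspendOrResume.RESUME;
  }

  public static void main(String[] args) {
    System.out.println(choose(3, 0, 5, () -> false)); // nothing suspended yet
    System.out.println(choose(3, 5, 5, () -> true));  // cap reached
  }
}
```

The first three branches ignore the coin toss entirely, so only the final branch is random.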
[GitHub] [hbase-operator-tools] petersomogyi opened a new pull request #38: Move version to 1.1.0-SNAPSHOT
petersomogyi opened a new pull request #38: Move version to 1.1.0-SNAPSHOT
URL: https://github.com/apache/hbase-operator-tools/pull/38
[jira] [Created] (HBASE-23075) Upgrade jackson version
Nicholas Jiang created HBASE-23075:
--
Summary: Upgrade jackson version
Key: HBASE-23075
URL: https://issues.apache.org/jira/browse/HBASE-23075
Project: HBase
Issue Type: Improvement
Reporter: Nicholas Jiang

A Polymorphic Typing issue was discovered in FasterXML jackson-databind before 2.9.10. It is related to com.zaxxer.hikari.HikariDataSource. This is a different vulnerability than CVE-2019-14540.
https://nvd.nist.gov/vuln/detail/CVE-2019-16335

A Polymorphic Typing issue was discovered in FasterXML jackson-databind before 2.9.10. It is related to com.zaxxer.hikari.HikariConfig.
https://nvd.nist.gov/vuln/detail/CVE-2019-14540

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [hbase] SteNicholas opened a new pull request #660: HBASE-23075 Upgrade jackson version
SteNicholas opened a new pull request #660: HBASE-23075 Upgrade jackson version
URL: https://github.com/apache/hbase/pull/660

Jackson security issues:

A Polymorphic Typing issue was discovered in FasterXML jackson-databind before 2.9.10. It is related to com.zaxxer.hikari.HikariDataSource. This is a different vulnerability than CVE-2019-14540.
https://nvd.nist.gov/vuln/detail/CVE-2019-16335

A Polymorphic Typing issue was discovered in FasterXML jackson-databind before 2.9.10. It is related to com.zaxxer.hikari.HikariConfig.
https://nvd.nist.gov/vuln/detail/CVE-2019-14540
[jira] [Updated] (HBASE-23031) Upgrade Yetus version in RM scripts
[ https://issues.apache.org/jira/browse/HBASE-23031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Somogyi updated HBASE-23031:
--
Fix Version/s: (was: hbase-operator-tools-2.0.0)
               hbase-operator-tools-1.1.0

> Upgrade Yetus version in RM scripts
> ---
>
> Key: HBASE-23031
> URL: https://issues.apache.org/jira/browse/HBASE-23031
> Project: HBase
> Issue Type: Improvement
> Components: hbase-operator-tools
> Affects Versions: hbase-operator-tools-1.0.0
> Reporter: Peter Somogyi
> Assignee: Peter Somogyi
> Priority: Minor
> Fix For: hbase-operator-tools-1.1.0
>
> The RM scripts use Yetus 0.9.0 to generate release notes. We can upgrade it
> to the latest version.
[GitHub] [hbase-operator-tools] asf-ci commented on issue #38: Move version to 1.1.0-SNAPSHOT
asf-ci commented on issue #38: Move version to 1.1.0-SNAPSHOT URL: https://github.com/apache/hbase-operator-tools/pull/38#issuecomment-534990914 Refer to this link for build results (access rights to CI server needed): https://builds.apache.org/job/PreCommit-HBASE-OPERATOR-TOOLS-Build/111/
[jira] [Updated] (HBASE-22983) [HBCK2] Record executed command and output to log file
[ https://issues.apache.org/jira/browse/HBASE-22983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Somogyi updated HBASE-22983: -- Fix Version/s: hbase-operator-tools-1.1.0 > [HBCK2] Record executed command and output to log file > -- > > Key: HBASE-22983 > URL: https://issues.apache.org/jira/browse/HBASE-22983 > Project: HBase > Issue Type: Sub-task > Components: hbase-operator-tools >Reporter: Peter Somogyi >Priority: Major > Fix For: hbase-operator-tools-1.1.0 > > > HBCK2 operations by default are logged to the console only. For > troubleshooting and tracking it is helpful to write the logs to a file with > the executed command arguments.
[jira] [Updated] (HBASE-21369) [hbase-operator-tools] Create a tool that will write the versions file
[ https://issues.apache.org/jira/browse/HBASE-21369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Somogyi updated HBASE-21369: -- Fix Version/s: hbase-operator-tools-1.1.0 > [hbase-operator-tools] Create a tool that will write the versions file > -- > > Key: HBASE-21369 > URL: https://issues.apache.org/jira/browse/HBASE-21369 > Project: HBase > Issue Type: Task > Components: hbase-operator-tools >Reporter: stack >Priority: Major > Fix For: hbase-operator-tools-1.1.0 > > > hbck1 has the facility for restoring the hbase.version file under /hbase if > it goes missing. This issue is about building a dedicated tool to do this in > hbase-operator-tools or making use of hbck1 as is to do it and document how > in refguide. > For description, see > http://hbase.apache.org/book.html#_special_cases_hbase_version_file_is_missing > As is, we have a regression in our fixup tooling. > Perhaps we create general fs fixup tool? It would do the above and then some.
[jira] [Updated] (HBASE-21163) Support backup-and-restore operations without Hbase Super user privilege
[ https://issues.apache.org/jira/browse/HBASE-21163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Somogyi updated HBASE-21163: -- Component/s: (was: hbase-operator-tools) backup&restore > Support backup-and-restore operations without Hbase Super user privilege > > > Key: HBASE-21163 > URL: https://issues.apache.org/jira/browse/HBASE-21163 > Project: HBase > Issue Type: Improvement > Components: backup&restore >Reporter: Sujit P >Assignee: Vladimir Rodionov >Priority: Critical > Labels: Backup/Restore > Fix For: 3.0.0 > > > Hello Team, > I am opening this Apache Jira to request an analysis of the > following problem statement: > Currently the backup-and-restore utility is designed to work with "hbase" > superuser privileges. > I see at least a couple of concerns with that, and may add more later: > * For smaller organizations with fewer than 20 hbase tables or a couple of > clusters, this is manageable by the hbase admins. However, for larger organizations > or larger clusters, it would mean giving hbase superuser access to many > people to manage such operations, which can be a security risk on the source > cluster. > * In certain scenarios, it may be typical to have one DR cluster in a remote > data center to store backup tables, and having super privileges for all > tables in the remote cluster is another risk for the same reasons as above. > I suggest looking into making backup and restore work without hbase super > privileges. > Tenants or application admins may certainly have admin access to > the relevant tables/namespaces/snapshots. > Here is an example of what I am proposing, from an RDBMS: > [https://docs.oracle.com/cd/E16926_01/doc.121/e16564/configure_users_classes.htm#OBADM144] > Thanks > > PS: Forgive me if I have not opened my second Apache Jira the correct way, happy to > correct it.
[GitHub] [hbase-connectors] asf-ci commented on issue #42: HBASE-22817 Use hbase-shaded dependencies in hbase-spark
asf-ci commented on issue #42: HBASE-22817 Use hbase-shaded dependencies in hbase-spark URL: https://github.com/apache/hbase-connectors/pull/42#issuecomment-534998486 Refer to this link for build results (access rights to CI server needed): https://builds.apache.org/job/PreCommit-HBASE-CONNECTORS-Build/75/
[GitHub] [hbase] BukrosSzabolcs commented on a change in pull request #592: HBASE-22982: region server suspend/resume and graceful rolling restart actions
BukrosSzabolcs commented on a change in pull request #592: HBASE-22982: region server suspend/resume and graceful rolling restart actions URL: https://github.com/apache/hbase/pull/592#discussion_r328095804 ## File path: hbase-it/src/test/java/org/apache/hadoop/hbase/chaos/actions/Action.java ## @@ -150,78 +151,106 @@ public void perform() throws Exception { } } protected void killMaster(ServerName server) throws IOException { -LOG.info("Killing master " + server); +LOG.info("Killing master {}", server); cluster.killMaster(server); cluster.waitForMasterToStop(server, killMasterTimeout); LOG.info("Killed master " + server); } protected void startMaster(ServerName server) throws IOException { -LOG.info("Starting master " + server.getHostname()); +LOG.info("Starting master {}", server.getHostname()); cluster.startMaster(server.getHostname(), server.getPort()); cluster.waitForActiveAndReadyMaster(startMasterTimeout); LOG.info("Started master " + server.getHostname()); } + protected void stopRs(ServerName server) throws IOException { +LOG.info("Stopping regionserver {}", server); +cluster.stopRegionServer(server); +cluster.waitForRegionServerToStop(server, killRsTimeout); Review comment: The default is 1 minute. I assumed that it will be sufficient considering the exec timeout is 30 sec and tried to minimize the number of new properties. If you think it should be separately configurable please let me know.
[GitHub] [hbase] BukrosSzabolcs commented on a change in pull request #592: HBASE-22982: region server suspend/resume and graceful rolling restart actions
BukrosSzabolcs commented on a change in pull request #592: HBASE-22982: region server suspend/resume and graceful rolling restart actions URL: https://github.com/apache/hbase/pull/592#discussion_r328097760 ## File path: hbase-it/src/test/java/org/apache/hadoop/hbase/chaos/actions/Action.java ## @@ -150,78 +151,106 @@ public void perform() throws Exception { } } protected void killMaster(ServerName server) throws IOException { -LOG.info("Killing master " + server); +LOG.info("Killing master {}", server); cluster.killMaster(server); cluster.waitForMasterToStop(server, killMasterTimeout); LOG.info("Killed master " + server); } protected void startMaster(ServerName server) throws IOException { -LOG.info("Starting master " + server.getHostname()); +LOG.info("Starting master {}", server.getHostname()); cluster.startMaster(server.getHostname(), server.getPort()); cluster.waitForActiveAndReadyMaster(startMasterTimeout); LOG.info("Started master " + server.getHostname()); } + protected void stopRs(ServerName server) throws IOException { +LOG.info("Stopping regionserver {}", server); +cluster.stopRegionServer(server); +cluster.waitForRegionServerToStop(server, killRsTimeout); +LOG.info("Stoppiong regionserver {}. Reported num of rs:{}", server, +cluster.getClusterMetrics().getLiveServerMetrics().size()); + } + + protected void suspendRs(ServerName server) throws IOException { +LOG.info("Suspending regionserver {}", server); +cluster.suspendRegionServer(server); +if(!(cluster instanceof MiniHBaseCluster)){ Review comment: We send a SIGSTOP and then make sure the process is suspended before we continue. For MiniCluster it makes no sense to wait because we just suspend the thread.
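The suspend behaviour described in this comment (send the stop signal, then confirm the process is really suspended, but skip the confirmation for an in-process mini cluster) could be sketched roughly as follows. This is only an illustration under assumptions: `SuspendSketch`, `waitFor` and `suspendAndConfirm` are hypothetical names that do not appear in the pull request, and both the signal delivery and the status check are passed in as plain callbacks rather than real HBase cluster calls.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical illustration only; none of these names come from the HBase patch. */
final class SuspendSketch {
  private SuspendSketch() {}

  /** Polls the condition until it holds or the timeout elapses; returns whether it held. */
  static boolean waitFor(BooleanSupplier condition, long timeoutMs, long pollMs) {
    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (true) {
      if (condition.getAsBoolean()) {
        return true;
      }
      if (System.nanoTime() - deadline >= 0) {
        return false;
      }
      try {
        Thread.sleep(pollMs);
      } catch (InterruptedException e) {
        // Preserve the interrupt for the caller and give up waiting.
        Thread.currentThread().interrupt();
        return false;
      }
    }
  }

  /**
   * Sends the suspend request, then waits for confirmation unless the cluster runs
   * in-process, where (per the review comment) there is nothing to wait for.
   */
  static boolean suspendAndConfirm(Runnable sendSuspend, boolean inProcessCluster,
      BooleanSupplier isSuspended, long timeoutMs) {
    sendSuspend.run();
    if (inProcessCluster) {
      return true;
    }
    return waitFor(isSuspended, timeoutMs, 10L);
  }
}
```

Passing the check in as a callback keeps the sketch independent of how a real deployment would inspect process state, which the comment does not spell out.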
[GitHub] [hbase] BukrosSzabolcs commented on a change in pull request #592: HBASE-22982: region server suspend/resume and graceful rolling restart actions
BukrosSzabolcs commented on a change in pull request #592: HBASE-22982: region server suspend/resume and graceful rolling restart actions URL: https://github.com/apache/hbase/pull/592#discussion_r328101050 ## File path: hbase-it/src/test/java/org/apache/hadoop/hbase/chaos/actions/Action.java ## @@ -269,6 +298,19 @@ protected void forceBalancer() throws Exception { } } + protected void setBalancer(boolean onOrOff, boolean synchronous) throws Exception { +Admin admin = this.context.getHBaseIntegrationTestingUtility().getAdmin(); +boolean result = false; +try { + result = admin.balancerSwitch(onOrOff, synchronous); +} catch (Exception e) { + LOG.warn("Got exception while switching balance ", e); +} +if (!result) { Review comment: You are right. I'll move the error message next to the warn.
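A rough sketch of the change agreed to here, with the failure message reported only where the exception is caught, might look like the block below. It rests on assumptions: `BalancerSwitchSketch` and `BalancerAdmin` are hypothetical stand-ins rather than code from the pull request, and the messages are collected in a list instead of going to a logger so the example stays self-contained. The motivation is presumably that the HBase `Admin.balancerSwitch` Javadoc describes the return value as the previous balancer state, so a `false` result does not by itself signal failure.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; not the code from the HBase patch. */
final class BalancerSwitchSketch {
  private BalancerSwitchSketch() {}

  /** Hypothetical stand-in for the single Admin call used in the diff. */
  interface BalancerAdmin {
    boolean balancerSwitch(boolean onOrOff, boolean synchronous) throws Exception;
  }

  /** Returns the messages that would be logged; an empty list means no failure was seen. */
  static List<String> setBalancer(BalancerAdmin admin, boolean onOrOff, boolean synchronous) {
    List<String> logged = new ArrayList<>();
    try {
      // The return value is deliberately not treated as a success flag.
      admin.balancerSwitch(onOrOff, synchronous);
    } catch (Exception e) {
      // Warning and error are reported together, at the point of failure.
      logged.add("WARN Got exception while switching balancer: " + e.getMessage());
      logged.add("ERROR Failed to switch balancer " + (onOrOff ? "on" : "off"));
    }
    return logged;
  }
}
```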
[GitHub] [hbase-operator-tools] petersomogyi merged pull request #38: Move version to 1.1.0-SNAPSHOT
petersomogyi merged pull request #38: Move version to 1.1.0-SNAPSHOT URL: https://github.com/apache/hbase-operator-tools/pull/38
[GitHub] [hbase] BukrosSzabolcs commented on a change in pull request #592: HBASE-22982: region server suspend/resume and graceful rolling restart actions
BukrosSzabolcs commented on a change in pull request #592: HBASE-22982: region server suspend/resume and graceful rolling restart actions URL: https://github.com/apache/hbase/pull/592#discussion_r328103562 ## File path: hbase-it/src/test/java/org/apache/hadoop/hbase/chaos/actions/GracefulRollingRestartRsAction.java ## @@ -0,0 +1,72 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.hbase.chaos.actions; + +import java.io.IOException; +import java.util.Arrays; +import java.util.List; +import org.apache.commons.lang3.RandomUtils; +import org.apache.hadoop.hbase.ServerName; +import org.apache.hadoop.hbase.util.RegionMover; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +/** + * Gracefully restarts every non-admin regionserver in a rolling fashion. At each step, it unloads, Review comment: You are right, it makes no sense. I'll remove the "non-admin" part.
[jira] [Assigned] (HBASE-16822) Enable restore-snapshot and clone-snapshot to use external specified snapshot locatioin
[ https://issues.apache.org/jira/browse/HBASE-16822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sanjeet Nishad reassigned HBASE-16822: -- Assignee: Sanjeet Nishad > Enable restore-snapshot and clone-snapshot to use external specified snapshot > locatioin > > > Key: HBASE-16822 > URL: https://issues.apache.org/jira/browse/HBASE-16822 > Project: HBase > Issue Type: Improvement >Reporter: Jerry He >Assignee: Sanjeet Nishad >Priority: Major > > Currently restore-snapshot and clone-snapshot only work with the snapshots > that are under hbase root.dir. > In combination with export-snapshot, this means the snapshot needs to be > exported out to another hbase root.dir, or back and forth eventually to a > hbase root.dir. > There are a few issues with the approach. > We already know that export-snapshot has a limitation when dealing with a secure > cluster, where the external user needs to have read access to hbase root.dir > data, by-passing the table ACL check. > The second problem is when we try to use or bring back the exported snapshots > for restore/clone. They have to be in the target hbase root.dir, and we need > write permission to get them in there. > Again we will have a permission problem. > This ticket tries to deal with the second problem, clone and restore from > exported snapshots. The exported snapshots can be on the same cluster, but > the user may not have write permission to move them to hbase root.dir. > We should have a solution that allows clone/restore snapshot from an external > path that keeps snapshot backups. And also do it with security permission in > mind.
[jira] [Assigned] (HBASE-16822) Enable restore-snapshot and clone-snapshot to use external specified snapshot locatioin
[ https://issues.apache.org/jira/browse/HBASE-16822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sanjeet Nishad reassigned HBASE-16822: -- Assignee: (was: Sanjeet Nishad) > Enable restore-snapshot and clone-snapshot to use external specified snapshot > locatioin > > > Key: HBASE-16822 > URL: https://issues.apache.org/jira/browse/HBASE-16822 > Project: HBase > Issue Type: Improvement >Reporter: Jerry He >Priority: Major > > Currently restore-snapshot and clone-snapshot only work with the snapshots > that are under hbase root.dir. > In combination with export-snapshot, this means the snapshot needs to be > exported out to another hbase root.dir, or back and forth eventually to a > hbase root.dir. > There are a few issues with the approach. > We already know that export-snapshot has a limitation when dealing with a secure > cluster, where the external user needs to have read access to hbase root.dir > data, by-passing the table ACL check. > The second problem is when we try to use or bring back the exported snapshots > for restore/clone. They have to be in the target hbase root.dir, and we need > write permission to get them in there. > Again we will have a permission problem. > This ticket tries to deal with the second problem, clone and restore from > exported snapshots. The exported snapshots can be on the same cluster, but > the user may not have write permission to move them to hbase root.dir. > We should have a solution that allows clone/restore snapshot from an external > path that keeps snapshot backups. And also do it with security permission in > mind.
[GitHub] [hbase] BukrosSzabolcs commented on a change in pull request #592: HBASE-22982: region server suspend/resume and graceful rolling restart actions
BukrosSzabolcs commented on a change in pull request #592: HBASE-22982: region server suspend/resume and graceful rolling restart actions URL: https://github.com/apache/hbase/pull/592#discussion_r328113442 ## File path: hbase-it/src/test/java/org/apache/hadoop/hbase/chaos/actions/GracefulRollingRestartRsAction.java ## @@ -0,0 +1,72 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.hbase.chaos.actions; + +import java.io.IOException; +import java.util.Arrays; +import java.util.List; +import org.apache.commons.lang3.RandomUtils; +import org.apache.hadoop.hbase.ServerName; +import org.apache.hadoop.hbase.util.RegionMover; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +/** + * Gracefully restarts every non-admin regionserver in a rolling fashion. At each step, it unloads, + * restarts the loads every rs server sleeping randomly (0-sleepTime) in between servers. 
+ */ +public class GracefulRollingRestartRsAction extends RestartActionBaseAction { + private static final Logger LOG = LoggerFactory.getLogger(GracefulRollingRestartRsAction.class); + + public GracefulRollingRestartRsAction(long sleepTime) { +super(sleepTime); + } + + @Override + public void perform() throws Exception { +LOG.info("Performing action: Rolling restarting non-master region servers"); +List selectedServers = selectServers(); + +LOG.info("Disabling balancer to make unloading possible"); +setBalancer(false, false); Review comment: You are right. I set the synchronous flag to true.
[GitHub] [hbase] BukrosSzabolcs commented on a change in pull request #592: HBASE-22982: region server suspend/resume and graceful rolling restart actions
BukrosSzabolcs commented on a change in pull request #592: HBASE-22982: region server suspend/resume and graceful rolling restart actions URL: https://github.com/apache/hbase/pull/592#discussion_r328113713 ## File path: hbase-it/src/test/java/org/apache/hadoop/hbase/chaos/actions/GracefulRollingRestartRsAction.java ## @@ -0,0 +1,72 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.hbase.chaos.actions; + +import java.io.IOException; +import java.util.Arrays; +import java.util.List; +import org.apache.commons.lang3.RandomUtils; +import org.apache.hadoop.hbase.ServerName; +import org.apache.hadoop.hbase.util.RegionMover; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +/** + * Gracefully restarts every non-admin regionserver in a rolling fashion. At each step, it unloads, + * restarts the loads every rs server sleeping randomly (0-sleepTime) in between servers. 
+ */ +public class GracefulRollingRestartRsAction extends RestartActionBaseAction { + private static final Logger LOG = LoggerFactory.getLogger(GracefulRollingRestartRsAction.class); + + public GracefulRollingRestartRsAction(long sleepTime) { +super(sleepTime); + } + + @Override + public void perform() throws Exception { +LOG.info("Performing action: Rolling restarting non-master region servers"); +List selectedServers = selectServers(); + +LOG.info("Disabling balancer to make unloading possible"); +setBalancer(false, false); + +for(ServerName server : selectedServers){ Review comment: Done
[GitHub] [hbase] BukrosSzabolcs commented on a change in pull request #592: HBASE-22982: region server suspend/resume and graceful rolling restart actions
BukrosSzabolcs commented on a change in pull request #592: HBASE-22982: region server suspend/resume and graceful rolling restart actions URL: https://github.com/apache/hbase/pull/592#discussion_r328114387 ## File path: hbase-it/src/test/java/org/apache/hadoop/hbase/chaos/actions/GracefulRollingRestartRsAction.java ## @@ -0,0 +1,72 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.hbase.chaos.actions; + +import java.io.IOException; +import java.util.Arrays; +import java.util.List; +import org.apache.commons.lang3.RandomUtils; +import org.apache.hadoop.hbase.ServerName; +import org.apache.hadoop.hbase.util.RegionMover; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +/** + * Gracefully restarts every non-admin regionserver in a rolling fashion. At each step, it unloads, + * restarts the loads every rs server sleeping randomly (0-sleepTime) in between servers. 
+ */ +public class GracefulRollingRestartRsAction extends RestartActionBaseAction { + private static final Logger LOG = LoggerFactory.getLogger(GracefulRollingRestartRsAction.class); + + public GracefulRollingRestartRsAction(long sleepTime) { +super(sleepTime); + } + + @Override + public void perform() throws Exception { +LOG.info("Performing action: Rolling restarting non-master region servers"); +List selectedServers = selectServers(); + +LOG.info("Disabling balancer to make unloading possible"); +setBalancer(false, false); + +for(ServerName server : selectedServers){ + String rsName = server.getAddress().toString(); + try (RegionMover rm = + new RegionMover.RegionMoverBuilder(rsName, getConf()).ack(true).build()) { +LOG.info("Unloading " + server); Review comment: Thanks I missed these.
[GitHub] [hbase] BukrosSzabolcs commented on a change in pull request #592: HBASE-22982: region server suspend/resume and graceful rolling restart actions
BukrosSzabolcs commented on a change in pull request #592: HBASE-22982: region server suspend/resume and graceful rolling restart actions URL: https://github.com/apache/hbase/pull/592#discussion_r328114798 ## File path: hbase-it/src/test/java/org/apache/hadoop/hbase/chaos/actions/GracefulRollingRestartRsAction.java ## @@ -0,0 +1,72 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.hbase.chaos.actions; + +import java.io.IOException; +import java.util.Arrays; +import java.util.List; +import org.apache.commons.lang3.RandomUtils; +import org.apache.hadoop.hbase.ServerName; +import org.apache.hadoop.hbase.util.RegionMover; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +/** + * Gracefully restarts every non-admin regionserver in a rolling fashion. At each step, it unloads, + * restarts the loads every rs server sleeping randomly (0-sleepTime) in between servers. 
+ */ +public class GracefulRollingRestartRsAction extends RestartActionBaseAction { + private static final Logger LOG = LoggerFactory.getLogger(GracefulRollingRestartRsAction.class); + + public GracefulRollingRestartRsAction(long sleepTime) { +super(sleepTime); + } + + @Override + public void perform() throws Exception { +LOG.info("Performing action: Rolling restarting non-master region servers"); +List selectedServers = selectServers(); + +LOG.info("Disabling balancer to make unloading possible"); +setBalancer(false, false); + +for(ServerName server : selectedServers){ + String rsName = server.getAddress().toString(); + try (RegionMover rm = + new RegionMover.RegionMoverBuilder(rsName, getConf()).ack(true).build()) { +LOG.info("Unloading " + server); +rm.unload(); +LOG.info("Restarting " + server); +gracefulRestartRs(server, sleepTime); +LOG.info("Loading " + server); +rm.load(); + } catch (org.apache.hadoop.util.Shell.ExitCodeException e) { Review comment: stupid copy/paste issue
[GitHub] [hbase] BukrosSzabolcs commented on a change in pull request #592: HBASE-22982: region server suspend/resume and graceful rolling restart actions
BukrosSzabolcs commented on a change in pull request #592: HBASE-22982: region server suspend/resume and graceful rolling restart actions URL: https://github.com/apache/hbase/pull/592#discussion_r328115617 ## File path: hbase-it/src/test/java/org/apache/hadoop/hbase/chaos/actions/RestartActionBaseAction.java ## @@ -50,6 +50,23 @@ void restartMaster(ServerName server, long sleepTime) throws IOException { startMaster(server); } + /** + * Stop and then restart the region server instaedof killing it. Review comment: Thanks, fixed
[GitHub] [hbase-connectors] meszibalu commented on issue #42: HBASE-22817 Use hbase-shaded dependencies in hbase-spark
meszibalu commented on issue #42: HBASE-22817 Use hbase-shaded dependencies in hbase-spark URL: https://github.com/apache/hbase-connectors/pull/42#issuecomment-535018984 I did the rebase but it still fails. This package is definitely in hbase-spark-protocol-shaded package.
[GitHub] [hbase] BukrosSzabolcs commented on a change in pull request #592: HBASE-22982: region server suspend/resume and graceful rolling restart actions
BukrosSzabolcs commented on a change in pull request #592: HBASE-22982: region server suspend/resume and graceful rolling restart actions URL: https://github.com/apache/hbase/pull/592#discussion_r328118752 ## File path: hbase-it/src/test/java/org/apache/hadoop/hbase/chaos/actions/RestartActionBaseAction.java ## @@ -50,6 +50,23 @@ void restartMaster(ServerName server, long sleepTime) throws IOException { startMaster(server); } + /** + * Stop and then restart the region server instaedof killing it. + * @param server hostname to restart the regionserver on + * @param sleepTime number of milliseconds between stop and restart + * @throws IOException if something goes wrong + */ + void gracefulRestartRs(ServerName server, long sleepTime) throws IOException { +sleepTime = Math.max(sleepTime, 1000); +// Don't try the stop if we're stopping already +if (context.isStopping()) { + return; +} +stopRs(server); Review comment: I'll add Stopping/Starting logs to the other methods too. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [hbase] BukrosSzabolcs commented on a change in pull request #592: HBASE-22982: region server suspend/resume and graceful rolling restart actions
BukrosSzabolcs commented on a change in pull request #592: HBASE-22982: region server suspend/resume and graceful rolling restart actions URL: https://github.com/apache/hbase/pull/592#discussion_r328122475 ## File path: hbase-it/src/test/java/org/apache/hadoop/hbase/chaos/factories/SlowDeterministicMonkeyFactory.java ## @@ -128,56 +136,65 @@ public ChaosMonkey build() { private void loadProperties() { - action1Period = Long.parseLong(this.properties.getProperty( -MonkeyConstants.PERIODIC_ACTION1_PERIOD, -MonkeyConstants.DEFAULT_PERIODIC_ACTION1_PERIOD + "")); - action2Period = Long.parseLong(this.properties.getProperty( -MonkeyConstants.PERIODIC_ACTION2_PERIOD, -MonkeyConstants.DEFAULT_PERIODIC_ACTION2_PERIOD + "")); - action3Period = Long.parseLong(this.properties.getProperty( -MonkeyConstants.COMPOSITE_ACTION3_PERIOD, -MonkeyConstants.DEFAULT_COMPOSITE_ACTION3_PERIOD + "")); - action4Period = Long.parseLong(this.properties.getProperty( -MonkeyConstants.PERIODIC_ACTION4_PERIOD, -MonkeyConstants.DEFAULT_PERIODIC_ACTION4_PERIOD + "")); - moveRegionsMaxTime = Long.parseLong(this.properties.getProperty( -MonkeyConstants.MOVE_REGIONS_MAX_TIME, -MonkeyConstants.DEFAULT_MOVE_REGIONS_MAX_TIME + "")); - moveRegionsSleepTime = Long.parseLong(this.properties.getProperty( -MonkeyConstants.MOVE_REGIONS_SLEEP_TIME, -MonkeyConstants.DEFAULT_MOVE_REGIONS_SLEEP_TIME + "")); - moveRandomRegionSleepTime = Long.parseLong(this.properties.getProperty( -MonkeyConstants.MOVE_RANDOM_REGION_SLEEP_TIME, -MonkeyConstants.DEFAULT_MOVE_RANDOM_REGION_SLEEP_TIME + "")); - restartRandomRSSleepTime = Long.parseLong(this.properties.getProperty( -MonkeyConstants.RESTART_RANDOM_RS_SLEEP_TIME, -MonkeyConstants.DEFAULT_RESTART_RANDOM_RS_SLEEP_TIME + "")); - batchRestartRSSleepTime = Long.parseLong(this.properties.getProperty( -MonkeyConstants.BATCH_RESTART_RS_SLEEP_TIME, -MonkeyConstants.DEFAULT_BATCH_RESTART_RS_SLEEP_TIME + "")); - batchRestartRSRatio = Float.parseFloat(this.properties.getProperty( 
-MonkeyConstants.BATCH_RESTART_RS_RATIO, -MonkeyConstants.DEFAULT_BATCH_RESTART_RS_RATIO + "")); - restartActiveMasterSleepTime = Long.parseLong(this.properties.getProperty( -MonkeyConstants.RESTART_ACTIVE_MASTER_SLEEP_TIME, -MonkeyConstants.DEFAULT_RESTART_ACTIVE_MASTER_SLEEP_TIME + "")); - rollingBatchRestartRSSleepTime = Long.parseLong(this.properties.getProperty( +action1Period = Long.parseLong(this.properties.getProperty( Review comment: I know, but the tabulation was all over the place and I couldn't resist. I'll revert these changes.
[GitHub] [hbase-connectors] busbey commented on issue #42: HBASE-22817 Use hbase-shaded dependencies in hbase-spark
busbey commented on issue #42: HBASE-22817 Use hbase-shaded dependencies in hbase-spark URL: https://github.com/apache/hbase-connectors/pull/42#issuecomment-535023466 okay let me dig in and see what's going on.
[jira] [Comment Edited] (HBASE-23038) Provide consistent and clear logging about disabling chores
[ https://issues.apache.org/jira/browse/HBASE-23038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16937611#comment-16937611 ] Sanjeet Nishad edited comment on HBASE-23038 at 9/25/19 1:46 PM: - Hi [~vjasani] [~busbey] , I have updated the patch. Please have a look. was (Author: sanjeetnishad): Hi [~vjasani], I have updated the patch. Please have a look. > Provide consistent and clear logging about disabling chores > --- > > Key: HBASE-23038 > URL: https://issues.apache.org/jira/browse/HBASE-23038 > Project: HBase > Issue Type: Improvement > Components: master, regionserver >Reporter: Sean Busbey >Assignee: Sanjeet Nishad >Priority: Minor > Labels: beginner > Attachments: HBASE-23038.001.patch, HBASE-23038.002.patch > > > Right now if you want to disable any of our chores you can set the period to > be <= 0. Sometimes, if you do this you get a nice message: > {code} > 2019-09-16 22:10:16,756 INFO [master-1:16000.activeMasterManager] > master.HMaster: The period is 0 seconds, MobCompactionChore is disabled > {code} > And sometimes you get an opaque message: > {code} > 2019-09-16 22:09:45,333 INFO [master-1:16000.activeMasterManager] > hbase.ChoreService: Could not successfully schedule chore: LogsCleaner > 2019-09-16 22:09:45,340 INFO [master-1:16000.activeMasterManager] > hbase.ChoreService: Could not successfully schedule chore: HFileCleaner > {code} > This is because sometimes we just blindly submit to ChoreService which > submits to a java ScheduledExecutorService and then catches the > IllegalArgumentException. > We should remove the one-offs and make it so ChoreService checks the period > before accepting a submittal and produces a consistent "Foo is disabled" > message. -- This message was sent by Atlassian Jira (v8.3.4#803005)
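The behaviour requested in HBASE-23038 can be illustrated with a minimal sketch. It is not the actual ChoreService code: the class name `ChorePeriodCheckSketch`, the `schedule` signature, and the exact message wording are assumptions made for illustration. What it relies on from the JDK is that `ScheduledExecutorService.scheduleAtFixedRate` throws `IllegalArgumentException` for a period <= 0, which is why checking the period up front gives a predictable "Foo is disabled" message instead of a caught exception.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a chore scheduler; not the real HBase ChoreService API. */
final class ChorePeriodCheckSketch {
  private final ScheduledExecutorService pool = Executors.newScheduledThreadPool(1);

  /** Message wording is an assumption, modeled on the "Foo is disabled" phrasing in the issue. */
  static String disabledMessage(String choreName, long periodMillis) {
    return "The period is " + periodMillis + " ms, " + choreName + " is disabled";
  }

  /**
   * Checks the period before handing the task to the executor.
   * Returns null (and prints one uniform message) when the period is <= 0,
   * so the executor's IllegalArgumentException is never triggered.
   */
  ScheduledFuture<?> schedule(String choreName, Runnable chore, long periodMillis) {
    if (periodMillis <= 0) {
      System.out.println(disabledMessage(choreName, periodMillis));
      return null;
    }
    return pool.scheduleAtFixedRate(chore, 0, periodMillis, TimeUnit.MILLISECONDS);
  }

  /** Stops the worker thread so the JVM can exit. */
  void shutdown() {
    pool.shutdownNow();
  }

  public static void main(String[] args) {
    ChorePeriodCheckSketch service = new ChorePeriodCheckSketch();
    try {
      service.schedule("LogsCleaner", () -> { }, 0);       // reported as disabled
      service.schedule("HFileCleaner", () -> { }, 60_000); // scheduled normally
    } finally {
      service.shutdown();
    }
  }
}
```

With the check in one place, callers such as HMaster would no longer need their own one-off period tests.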
[GitHub] [hbase] BukrosSzabolcs commented on a change in pull request #592: HBASE-22982: region server suspend/resume and graceful rolling restart actions
BukrosSzabolcs commented on a change in pull request #592: HBASE-22982: region server suspend/resume and graceful rolling restart actions URL: https://github.com/apache/hbase/pull/592#discussion_r328132821 ## File path: hbase-it/src/test/java/org/apache/hadoop/hbase/chaos/actions/RollingBatchSuspendResumeRsAction.java ## @@ -0,0 +1,116 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.hbase.chaos.actions; + +import java.io.IOException; +import java.util.LinkedList; +import java.util.List; +import java.util.Queue; + +import org.apache.commons.lang3.RandomUtils; +import org.apache.hadoop.hbase.ServerName; +import org.apache.hadoop.hbase.chaos.monkies.PolicyBasedChaosMonkey; +import org.apache.hadoop.hbase.util.Threads; +import org.apache.hadoop.util.Shell; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +/** + * Suspend then resume a ratio of the regionservers in a rolling fashion. At each step, either + * suspend a server, or resume one, sleeping (sleepTime) in between steps. The parameter + * maxSuspendedServers limits the maximum number of servers that can be down at the same time + * during rolling restarts. 
+ */ +public class RollingBatchSuspendResumeRsAction extends Action { + private static final Logger LOG = + LoggerFactory.getLogger(RollingBatchSuspendResumeRsAction.class); + private float ratio; + private long sleepTime; + private int maxSuspendedServers; // number of maximum suspended servers at any given time. + + public RollingBatchSuspendResumeRsAction(long sleepTime, float ratio) { +this(sleepTime, ratio, 5); + } + + public RollingBatchSuspendResumeRsAction(long sleepTime, float ratio, int maxSuspendedServers) { +this.ratio = ratio; +this.sleepTime = sleepTime; +this.maxSuspendedServers = maxSuspendedServers; + } + + enum SuspendOrResume { +SUSPEND, RESUME + } + + @Override public void perform() throws Exception { Review comment: Done This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [hbase] BukrosSzabolcs commented on a change in pull request #592: HBASE-22982: region server suspend/resume and graceful rolling restart actions
BukrosSzabolcs commented on a change in pull request #592: HBASE-22982: region server suspend/resume and graceful rolling restart actions URL: https://github.com/apache/hbase/pull/592#discussion_r328133706 ## File path: hbase-it/src/test/java/org/apache/hadoop/hbase/chaos/actions/RollingBatchSuspendResumeRsAction.java ## @@ -0,0 +1,116 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.hbase.chaos.actions; + +import java.io.IOException; +import java.util.LinkedList; +import java.util.List; +import java.util.Queue; + +import org.apache.commons.lang3.RandomUtils; +import org.apache.hadoop.hbase.ServerName; +import org.apache.hadoop.hbase.chaos.monkies.PolicyBasedChaosMonkey; +import org.apache.hadoop.hbase.util.Threads; +import org.apache.hadoop.util.Shell; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +/** + * Suspend then resume a ratio of the regionservers in a rolling fashion. At each step, either + * suspend a server, or resume one, sleeping (sleepTime) in between steps. The parameter + * maxSuspendedServers limits the maximum number of servers that can be down at the same time + * during rolling restarts. 
+ */ +public class RollingBatchSuspendResumeRsAction extends Action { + private static final Logger LOG = + LoggerFactory.getLogger(RollingBatchSuspendResumeRsAction.class); + private float ratio; + private long sleepTime; + private int maxSuspendedServers; // number of maximum suspended servers at any given time. + + public RollingBatchSuspendResumeRsAction(long sleepTime, float ratio) { +this(sleepTime, ratio, 5); + } + + public RollingBatchSuspendResumeRsAction(long sleepTime, float ratio, int maxSuspendedServers) { +this.ratio = ratio; +this.sleepTime = sleepTime; +this.maxSuspendedServers = maxSuspendedServers; + } + + enum SuspendOrResume { +SUSPEND, RESUME + } + + @Override public void perform() throws Exception { +LOG.info(String.format("Performing action: Rolling batch restarting %d%% of region servers", +(int) (ratio * 100))); +List selectedServers = selectServers(); + +Queue serversToBeSuspended = new LinkedList<>(selectedServers); +Queue suspendedServers = new LinkedList<>(); + +// loop while there are servers to be suspended or suspended servers to be resumed +while ((!serversToBeSuspended.isEmpty() || !suspendedServers.isEmpty()) && !context +.isStopping()) { + SuspendOrResume action; + + if (serversToBeSuspended.isEmpty()) { // no more servers to suspend +action = SuspendOrResume.RESUME; + } else if (suspendedServers.isEmpty()) { +action = SuspendOrResume.SUSPEND; // no more servers to resume + } else if (suspendedServers.size() >= maxSuspendedServers) { +// we have too many suspended servers. Don't suspend any more +action = SuspendOrResume.RESUME; + } else { +// do a coin toss +action = RandomUtils.nextBoolean() ? 
SuspendOrResume.SUSPEND : SuspendOrResume.RESUME; + } + + ServerName server; + switch (action) { +case SUSPEND: + server = serversToBeSuspended.remove(); + try { +suspendRs(server); + } catch (Shell.ExitCodeException e) { +LOG.info("Problem suspending but presume successful; code=" + e.getExitCode(), e); Review comment: Done
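The suspend-or-resume decision in the quoted loop can be exercised in isolation. The following is a sketch only, with no HBase dependencies: server names are plain strings, `java.util.Random` replaces `RandomUtils`, and the class and method names (`RollingSuspendResumeSketch`, `choose`, `simulate`) are hypothetical. It mirrors the four branches of the patch and reports the peak number of simultaneously suspended servers, which stays at or below `maxSuspended` whenever that limit is at least 1.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;
import java.util.Random;

/** Hypothetical, dependency-free model of the rolling suspend/resume loop in the patch. */
final class RollingSuspendResumeSketch {
  enum Step { SUSPEND, RESUME }

  /** Mirrors the branch order of the quoted code. */
  static Step choose(int pending, int suspended, int maxSuspended, Random rnd) {
    if (pending == 0) {
      return Step.RESUME;              // nothing left to suspend
    } else if (suspended == 0) {
      return Step.SUSPEND;             // nothing available to resume
    } else if (suspended >= maxSuspended) {
      return Step.RESUME;              // limit reached, do not suspend more
    }
    return rnd.nextBoolean() ? Step.SUSPEND : Step.RESUME; // coin toss
  }

  /** Runs the loop to completion and returns the peak number of suspended servers. */
  static int simulate(List<String> servers, int maxSuspended, Random rnd) {
    Queue<String> pending = new ArrayDeque<>(servers);
    Queue<String> suspended = new ArrayDeque<>();
    int peak = 0;
    while (!pending.isEmpty() || !suspended.isEmpty()) {
      Step step = choose(pending.size(), suspended.size(), maxSuspended, rnd);
      if (step == Step.SUSPEND) {
        suspended.add(pending.remove()); // stands in for suspendRs(server)
      } else {
        suspended.remove();              // stands in for resumeRs(server)
      }
      peak = Math.max(peak, suspended.size());
    }
    return peak;
  }

  public static void main(String[] args) {
    List<String> servers = Arrays.asList("rs1", "rs2", "rs3", "rs4", "rs5", "rs6");
    System.out.println("peak suspended = " + simulate(servers, 3, new Random()));
  }
}
```

Every iteration either suspends one pending server or resumes one suspended server, so the loop finishes after exactly twice as many steps as there are selected servers.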
[GitHub] [hbase] BukrosSzabolcs commented on a change in pull request #592: HBASE-22982: region server suspend/resume and graceful rolling restart actions
BukrosSzabolcs commented on a change in pull request #592: HBASE-22982: region server suspend/resume and graceful rolling restart actions URL: https://github.com/apache/hbase/pull/592#discussion_r328134999 ## File path: hbase-server/src/test/java/org/apache/hadoop/hbase/MiniHBaseCluster.java ## @@ -489,6 +499,32 @@ public String abortRegionServer(int serverNumber) { return server; } + /** + * Suspend the specified region server + * @param serverNumber Used as index into a list. + * @return + */ + public JVMClusterUtil.RegionServerThread suspendRegionServer(int serverNumber) { +JVMClusterUtil.RegionServerThread server = +hbaseCluster.getRegionServers().get(serverNumber); +LOG.info("Suspending " + server.toString()); Review comment: Done This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Commented] (HBASE-22969) A new binary component comparator(BinaryComponentComparator) to perform comparison of arbitrary length and position
[ https://issues.apache.org/jira/browse/HBASE-22969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16937754#comment-16937754 ] Josh Elser commented on HBASE-22969: Does this need to be a new class (with new pb's)? You could default the offset to 0 to keep everything working without any change, add the 'offset' attribute to the existing ByteArrayComparable PB as optional (to preserve backwards compatibility). I think this would do away with a bit of the boiler-plate in BinaryComponentComparable. Also, what about adding some unit tests, exercising this new code with a Filter? Your Jira description was very descriptive/illustrative of what you were trying to do. Capturing that in an end-to-end test would be great. > A new binary component comparator(BinaryComponentComparator) to perform > comparison of arbitrary length and position > --- > > Key: HBASE-22969 > URL: https://issues.apache.org/jira/browse/HBASE-22969 > Project: HBase > Issue Type: Improvement > Components: Filters >Reporter: Udai Bhan Kashyap >Assignee: Udai Bhan Kashyap >Priority: Minor > Attachments: HBASE-22969.0003.patch, HBASE-22969.0004.patch, > HBASE-22969.0005.patch, HBASE-22969.0006.patch, HBASE-22969.0007.patch, > HBASE-22969.0008.patch, HBASE-22969.0009.patch, > HBASE-22969.HBASE-22969.0001.patch, HBASE-22969.master.0001.patch > > > Let's say you have a composite key: a+b+c+d. And for simplicity assume that > a, b, c, and d are all 4-byte integers. > Now, if you want to execute a query which is semantically the same as the following > SQL: > {{"SELECT * from table where a=1 and b > 10 and b < 20 and c > 90 and c < 100 > and d=1"}} > The only choice you have is to do client-side filtering. That could mean lots > of unwanted data going through various software components and the network. 
> Solution: > We can create a "component" comparator which takes the value of the > "component" and its relative position in the key to pass the 'Filter' > subsystem of the server: > {code} > FilterList filterList = new FilterList(FilterList.Operator.MUST_PASS_ALL); > int bOffset = 4; > byte[] b10 = Bytes.toBytes(10); > Filter b10Filter = new RowFilter(CompareFilter.CompareOp.GREATER, > new BinaryComponentComparator(b10,bOffset)); > filterList.addFilter(b10Filter); > byte[] b20 = Bytes.toBytes(20); > Filter b20Filter = new RowFilter(CompareFilter.CompareOp.LESS, > new BinaryComponentComparator(b20,bOffset)); > filterList.addFilter(b20Filter); > int cOffset = 8; > byte[] c90 = Bytes.toBytes(90); > Filter c90Filter = new RowFilter(CompareFilter.CompareOp.GREATER, > new BinaryComponentComparator(c90,cOffset)); > filterList.addFilter(c90Filter); > byte[] c100 = Bytes.toBytes(100); > Filter c100Filter = new RowFilter(CompareFilter.CompareOp.LESS, > new BinaryComponentComparator(c100,cOffset)); > filterList.addFilter(c100Filter); > int dOffset = 12; > byte[] d1 = Bytes.toBytes(1); > Filter dFilter = new RowFilter(CompareFilter.CompareOp.EQUAL, > new BinaryComponentComparator(d1,dOffset)); > filterList.addFilter(dFilter); > //build start and end key for scan > int aOffset = 0; > byte[] startKey = new byte[16]; //key size with four ints > Bytes.putInt(startKey,aOffset,1); //a=1 > Bytes.putInt(startKey,bOffset,11); //b=11, takes care of b > 10 > Bytes.putInt(startKey,cOffset,91); //c=91, > Bytes.putInt(startKey,dOffset,1); //d=1, > byte[] endKey = new byte[16]; > Bytes.putInt(endKey,aOffset,1); //a=1 > Bytes.putInt(endKey,bOffset,20); //b=20, takes care of b < 20 > Bytes.putInt(endKey,cOffset,100); //c=100, > Bytes.putInt(endKey,dOffset,1); //d=1, > //setup scan > Scan scan = new Scan(startKey,endKey); > scan.setFilter(filterList); > //The scanner below now should give only desired rows. > //No client side filtering is required. 
> ResultScanner scanner = table.getScanner(scan); > {code} > The comparator can be used with any filter which makes use of > ByteArrayComparable. Most notably it can be used with ValueFilter to filter > out KV based on partial comparison of 'values' : > {code} > byte[] partialValue = Bytes.toBytes("partial_value"); > int partialValueOffset = > Filter partialValueFilter = new > ValueFilter(CompareFilter.CompareOp.GREATER, > new BinaryComponentComparator(partialValue,partialValueOffset)); > {code} > Which in turn can be combined with RowFilter to create a powerful predicate: > {code} > RowFilter rowFilter = new RowFilter(GREATER, new > BinaryComponentComparator(Bytes.toBytes("a"),1)); > Filt
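The idea behind the component comparison in the HBASE-22969 description can be sketched without any HBase classes. This is an illustration under stated assumptions, not the patch's implementation: the class `ComponentCompareSketch`, its method names, and its sign convention (positive when the key's component is greater than the reference) are all hypothetical. It compares a fixed-width slice of a composite key, starting at a given offset, byte by byte as unsigned values, which matches numeric order for non-negative big-endian integers.

```java
import java.nio.ByteBuffer;

/** Hypothetical illustration of comparing one component of a composite row key. */
final class ComponentCompareSketch {

  /**
   * Compares key[offset, offset + component.length) with the reference component.
   * Returns a negative, zero, or positive value when the key's slice is less than,
   * equal to, or greater than the reference. Bytes are compared as unsigned, so
   * the ordering matches numeric order only for non-negative big-endian integers.
   */
  static int compareComponent(byte[] component, int offset, byte[] key) {
    if (offset < 0 || offset + component.length > key.length) {
      throw new IllegalArgumentException("component does not fit in key at offset " + offset);
    }
    for (int i = 0; i < component.length; i++) {
      int fromKey = key[offset + i] & 0xff;
      int fromRef = component[i] & 0xff;
      if (fromKey != fromRef) {
        return fromKey - fromRef;
      }
    }
    return 0;
  }

  /** Encodes one int as 4 big-endian bytes (ByteBuffer's default byte order). */
  static byte[] intBytes(int value) {
    return ByteBuffer.allocate(4).putInt(value).array();
  }

  /** Builds the 16-byte composite key a+b+c+d used in the description. */
  static byte[] key(int a, int b, int c, int d) {
    return ByteBuffer.allocate(16).putInt(a).putInt(b).putInt(c).putInt(d).array();
  }

  public static void main(String[] args) {
    byte[] row = key(1, 15, 95, 1);
    boolean matches =
        compareComponent(intBytes(10), 4, row) > 0       // b > 10
            && compareComponent(intBytes(20), 4, row) < 0    // b < 20
            && compareComponent(intBytes(90), 8, row) > 0    // c > 90
            && compareComponent(intBytes(100), 8, row) < 0   // c < 100
            && compareComponent(intBytes(1), 12, row) == 0;  // d = 1
    System.out.println("row matches predicate: " + matches);
  }
}
```

A real comparator would also have to define its behaviour for keys shorter than offset plus component length; the sketch simply rejects them.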
[GitHub] [hbase] petersomogyi commented on a change in pull request #592: HBASE-22982: region server suspend/resume and graceful rolling restart actions
petersomogyi commented on a change in pull request #592: HBASE-22982: region server suspend/resume and graceful rolling restart actions URL: https://github.com/apache/hbase/pull/592#discussion_r328139466 ## File path: hbase-it/src/test/java/org/apache/hadoop/hbase/chaos/actions/Action.java ## @@ -150,78 +151,106 @@ public void perform() throws Exception { } } protected void killMaster(ServerName server) throws IOException { -LOG.info("Killing master " + server); +LOG.info("Killing master {}", server); cluster.killMaster(server); cluster.waitForMasterToStop(server, killMasterTimeout); LOG.info("Killed master " + server); } protected void startMaster(ServerName server) throws IOException { -LOG.info("Starting master " + server.getHostname()); +LOG.info("Starting master {}", server.getHostname()); cluster.startMaster(server.getHostname(), server.getPort()); cluster.waitForActiveAndReadyMaster(startMasterTimeout); LOG.info("Started master " + server.getHostname()); } + protected void stopRs(ServerName server) throws IOException { +LOG.info("Stopping regionserver {}", server); +cluster.stopRegionServer(server); +cluster.waitForRegionServerToStop(server, killRsTimeout); Review comment: The name might be a bit confusing, but let's just leave it like this.
[jira] [Commented] (HBASE-23038) Provide consistent and clear logging about disabling chores
[ https://issues.apache.org/jira/browse/HBASE-23038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16937759#comment-16937759 ] Sean Busbey commented on HBASE-23038: - thanks for taking this up! can you remove the case in HMaster where we fetch the MobCompactionChore period and check it to log a similar message? > Provide consistent and clear logging about disabling chores > --- > > Key: HBASE-23038 > URL: https://issues.apache.org/jira/browse/HBASE-23038 > Project: HBase > Issue Type: Improvement > Components: master, regionserver >Reporter: Sean Busbey >Assignee: Sanjeet Nishad >Priority: Minor > Labels: beginner > Attachments: HBASE-23038.001.patch, HBASE-23038.002.patch > > > Right now if you want to disable any of our chores you can set the period to > be <= 0. Sometimes, if you do this you get a nice message: > {code} > 2019-09-16 22:10:16,756 INFO [master-1:16000.activeMasterManager] > master.HMaster: The period is 0 seconds, MobCompactionChore is disabled > {code} > And sometimes you get an opaque message: > {code} > 2019-09-16 22:09:45,333 INFO [master-1:16000.activeMasterManager] > hbase.ChoreService: Could not successfully schedule chore: LogsCleaner > 2019-09-16 22:09:45,340 INFO [master-1:16000.activeMasterManager] > hbase.ChoreService: Could not successfully schedule chore: HFileCleaner > {code} > This is because sometimes we just blindly submit to ChoreService which > submits to a java ScheduledExecutorService and then catches the > IllegalArgumentException. > We should remove the one-offs and make it so ChoreService checks the period > before accepting a submittal and produces a consistent "Foo is disabled" > message. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-23069) periodic dependency bump for Sep 2019
[ https://issues.apache.org/jira/browse/HBASE-23069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-23069: Summary: periodic dependency bump for Sep 2019 (was: periodic dependency bump) > periodic dependency bump for Sep 2019 > - > > Key: HBASE-23069 > URL: https://issues.apache.org/jira/browse/HBASE-23069 > Project: HBase > Issue Type: Improvement > Components: dependencies, hbase-thirdparty >Reporter: Sean Busbey >Priority: Critical > Fix For: 3.0.0, 1.5.0, 2.3.0 > > > we should do a pass to see if there are any dependencies we can bump. (also > follow-on we should automate this check) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-23075) Upgrade jackson version
[ https://issues.apache.org/jira/browse/HBASE-23075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-23075: Component/s: REST dependencies > Upgrade jackson version > --- > > Key: HBASE-23075 > URL: https://issues.apache.org/jira/browse/HBASE-23075 > Project: HBase > Issue Type: Improvement > Components: dependencies, REST >Reporter: Nicholas Jiang >Priority: Major > > A Polymorphic Typing issue was discovered in FasterXML jackson-databind > before 2.9.10. It is related to com.zaxxer.hikari.HikariDataSource. This is a > different vulnerability than CVE-2019-14540. > https://nvd.nist.gov/vuln/detail/CVE-2019-16335 > A Polymorphic Typing issue was discovered in FasterXML jackson-databind > before 2.9.10. It is related to com.zaxxer.hikari.HikariConfig. > https://nvd.nist.gov/vuln/detail/CVE-2019-14540 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HBASE-23075) Upgrade jackson version
[ https://issues.apache.org/jira/browse/HBASE-23075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16937772#comment-16937772 ] Sean Busbey commented on HBASE-23075: - thanks for working on this [~nicholasjiang]! I've added you to the contributors group, so you should be able to assign this jira to yourself now. would you mind putting up a PR to upgrade jackson in the {{hbase-connectors}} repo as well? > Upgrade jackson version > --- > > Key: HBASE-23075 > URL: https://issues.apache.org/jira/browse/HBASE-23075 > Project: HBase > Issue Type: Improvement > Components: dependencies, REST >Reporter: Nicholas Jiang >Priority: Major > > A Polymorphic Typing issue was discovered in FasterXML jackson-databind > before 2.9.10. It is related to com.zaxxer.hikari.HikariDataSource. This is a > different vulnerability than CVE-2019-14540. > https://nvd.nist.gov/vuln/detail/CVE-2019-16335 > A Polymorphic Typing issue was discovered in FasterXML jackson-databind > before 2.9.10. It is related to com.zaxxer.hikari.HikariConfig. > https://nvd.nist.gov/vuln/detail/CVE-2019-14540 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [hbase] Apache-HBase commented on issue #647: HBASE-22988 Backport HBASE-11062 "hbtop" to branch-1
Apache-HBase commented on issue #647: HBASE-22988 Backport HBASE-11062 "hbtop" to branch-1 URL: https://github.com/apache/hbase/pull/647#issuecomment-535048911 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Comment | |::|--:|:|:| | :blue_heart: | reexec | 22m 32s | Docker mode activated. | ||| _ Prechecks _ | | :green_heart: | dupname | 0m 1s | No case conflicting files found. | | :blue_heart: | shelldocs | 0m 1s | Shelldocs was not available. | | :green_heart: | hbaseanti | 0m 0s | Patch does not have any anti-patterns. | | :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. | | :green_heart: | test4tests | 0m 0s | The patch appears to include 23 new or modified test files. | ||| _ branch-1 Compile Tests _ | | :blue_heart: | mvndep | 6m 11s | Maven dependency ordering for branch | | :green_heart: | mvninstall | 3m 23s | branch-1 passed | | :green_heart: | compile | 9m 9s | branch-1 passed with JDK v1.8.0_222 | | :broken_heart: | compile | 0m 20s | root in branch-1 failed with JDK v1.7.0_232. | | :green_heart: | checkstyle | 5m 38s | branch-1 passed | | :green_heart: | shadedjars | 2m 51s | branch has no errors when building our shaded downstream artifacts. | | :green_heart: | javadoc | 2m 17s | branch-1 passed with JDK v1.8.0_222 | | :green_heart: | javadoc | 3m 54s | branch-1 passed with JDK v1.7.0_232 | | :blue_heart: | spotbugs | 10m 27s | Used deprecated FindBugs config; considering switching to SpotBugs. 
| | :blue_heart: | findbugs | 0m 34s | branch/hbase-assembly no findbugs output file (findbugsXml.xml) | ||| _ Patch Compile Tests _ | | :blue_heart: | mvndep | 0m 30s | Maven dependency ordering for patch | | :green_heart: | mvninstall | 2m 2s | the patch passed | | :green_heart: | compile | 8m 40s | the patch passed with JDK v1.8.0_222 | | :broken_heart: | javac | 8m 40s | root-jdk1.8.0_222 with JDK v1.8.0_222 generated 1 new + 804 unchanged - 0 fixed = 805 total (was 804) | | :broken_heart: | compile | 0m 18s | root in the patch failed with JDK v1.7.0_232. | | :broken_heart: | javac | 0m 18s | root in the patch failed with JDK v1.7.0_232. | | :green_heart: | checkstyle | 5m 33s | the patch passed | | :green_heart: | shellcheck | 0m 2s | There were no new shellcheck issues. | | :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. | | :broken_heart: | xml | 0m 0s | The patch has 3 ill-formed XML file(s). | | :green_heart: | shadedjars | 2m 46s | patch has no errors when building our shaded downstream artifacts. | | :green_heart: | hadoopcheck | 5m 11s | Patch does not cause any errors with Hadoop 2.8.5 2.9.2. | | :green_heart: | javadoc | 2m 38s | the patch passed with JDK v1.8.0_222 | | :green_heart: | javadoc | 4m 21s | the patch passed with JDK v1.7.0_232 | | :blue_heart: | findbugs | 0m 25s | hbase-assembly has no data from findbugs | ||| _ Other Tests _ | | :green_heart: | unit | 178m 14s | root in the patch passed. | | :green_heart: | asflicense | 1m 44s | The patch does not generate ASF License warnings. 
| | | | 295m 44s | | | Reason | Tests | |---:|:--| | XML | Parsing Error(s): | | | hbase-assembly/pom.xml | | | hbase-hbtop/pom.xml | | | pom.xml | | Subsystem | Report/Notes | |--:|:-| | Docker | Client=19.03.1 Server=19.03.1 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-647/7/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hbase/pull/647 | | Optional Tests | dupname asflicense shellcheck shelldocs javac javadoc unit shadedjars hadoopcheck xml compile spotbugs findbugs hbaseanti checkstyle | | uname | Linux 390de8bf6773 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/HBase-PreCommit-GitHub-PR_PR-647/out/precommit/personality/provided.sh | | git revision | branch-1 / 771e184 | | Default Java | 1.7.0_232 | | Multi-JDK versions | /usr/lib/jvm/zulu-8-amd64:1.8.0_222 /usr/lib/jvm/zulu-7-amd64:1.7.0_232 | | compile | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-647/7/artifact/out/branch-compile-root-jdk1.7.0_232.txt | | javac | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-647/7/artifact/out/diff-compile-javac-root-jdk1.8.0_222.txt | | compile | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-647/7/artifact/out/patch-compile-root-jdk1.7.0_232.txt | | javac | https://builds.apache.org/job/HBase-Pre
[GitHub] [hbase] brfrn169 commented on issue #647: HBASE-22988 Backport HBASE-11062 "hbtop" to branch-1
brfrn169 commented on issue #647: HBASE-22988 Backport HBASE-11062 "hbtop" to branch-1 URL: https://github.com/apache/hbase/pull/647#issuecomment-535052624
I think we can ignore the error-prone warning:
```
[WARNING] /home/jenkins/jenkins-slave/workspace/HBase-PreCommit-GitHub-PR_PR-647/src/hbase-hbtop/src/main/java/org/apache/hadoop/hbase/hbtop/mode/Mode.java:[42,30] [ImmutableEnumChecker] enums should only have immutable fields, the declaration of type 'org.apache.hadoop.hbase.hbtop.mode.ModeStrategy' is not annotated @Immutable
```
The other errors in the last QA run do not appear to be related to hbtop.
[jira] [Commented] (HBASE-23034) List all of our repos in one place (download page? Home page?)
[ https://issues.apache.org/jira/browse/HBASE-23034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16937826#comment-16937826 ] Sean Busbey commented on HBASE-23034: - the ref guide page is also missing the {{hbase-native-client}} repo, though as far as I know the work in that repo is not ready for releases. > List all of our repos in one place (download page? Home page?) > -- > > Key: HBASE-23034 > URL: https://issues.apache.org/jira/browse/HBASE-23034 > Project: HBase > Issue Type: Task >Reporter: stack >Priority: Major > > Chatting w/ a co-worker, they said "5 repos!" when I listed out all the hbase > PMC maintains. There is no list anywhere that I know of. There is the > download page which has core and connectors. That's a start. Should have > operator tools and thirdparty too... Does filesystem have a release? If so, > it should be added there too. > Should we list the repos on the home page too or somewhere on the main site? -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HBASE-23034) List all of our repos in one place
[ https://issues.apache.org/jira/browse/HBASE-23034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-23034: Summary: List all of our repos in one place (was: List all of our repos in one place (download page? Home page?)) > List all of our repos in one place > -- > > Key: HBASE-23034 > URL: https://issues.apache.org/jira/browse/HBASE-23034 > Project: HBase > Issue Type: Task >Reporter: stack >Priority: Major > > Chatting w/ a co-worker, they said "5 repos!" when I listed out all the hbase > PMC maintains. There is no list anywhere that I know of. There is the > download page which has core and connectors. That's a start. Should have > operator tools and thirdparty too... Does filesystem have a release? If so, > it should be added there too. > Should we list the repos on the home page too or somewhere on the main site?
[jira] [Created] (HBASE-23076) [HBOSS] ZKTreeLockManager shouldn't try to acquire a lock from the InterProcessMutex instance when checking if other processes hold it.
Wellington Chevreuil created HBASE-23076: Summary: [HBOSS] ZKTreeLockManager shouldn't try to acquire a lock from the InterProcessMutex instance when checking if other processes hold it. Key: HBASE-23076 URL: https://issues.apache.org/jira/browse/HBASE-23076 Project: HBase Issue Type: Improvement Reporter: Wellington Chevreuil Assignee: Wellington Chevreuil While running some internal tests, [~elserj] hit a major bottleneck when creating tables with a reasonably large number of pre-split regions: {quote}create 'josh', 'f1', {SPLITS=> (1..500).map {|i| "user#{1000+i*(-1000)/500}"}} {quote} As a result, the RSes took a long time to complete all assignments, which led to AP timeout failures from the Master's point of view. The Master then submitted further APs in a cascade, until the RSes' RPC queues were flooded and started throwing CallQueueFullException, leaving the Master with many procedures to complete and many RITs. Jstack analysis pointed to potential lock contention inside the *ZKTreeLockManager.isLocked* method.
To quote [~elserj] report: {quote}Specifically, lots of threads that look like this: {noformat} "RpcServer.priority.FPBQ.Fifo.handler=8,queue=0,port=16020" #100 daemon prio=5 os_prio=0 tid=0x7f5d6dc3a000 nid=0x6b1 waiting for monitor entry [0x7f5d3bafb000] java.lang.Thread.State: BLOCKED (on object monitor) at org.apache.hadoop.hbase.oss.thirdparty.org.apache.curator.framework.recipes.locks.LockInternals.internalLockLoop(LockInternals.java:289) - waiting to lock <0x00074ddd0d10> (a org.apache.hadoop.hbase.oss.thirdparty.org.apache.curator.framework.recipes.locks.LockInternals) at org.apache.hadoop.hbase.oss.thirdparty.org.apache.curator.framework.recipes.locks.LockInternals.attemptLock(LockInternals.java:219) at org.apache.hadoop.hbase.oss.thirdparty.org.apache.curator.framework.recipes.locks.InterProcessMutex.internalLock(InterProcessMutex.java:237) at org.apache.hadoop.hbase.oss.thirdparty.org.apache.curator.framework.recipes.locks.InterProcessMutex.acquire(InterProcessMutex.java:108) at org.apache.hadoop.hbase.oss.sync.ZKTreeLockManager.isLocked(ZKTreeLockManager.java:310) at org.apache.hadoop.hbase.oss.sync.ZKTreeLockManager.writeLockAbove(ZKTreeLockManager.java:183) at org.apache.hadoop.hbase.oss.sync.TreeLockManager.treeReadLock(TreeLockManager.java:282) at org.apache.hadoop.hbase.oss.sync.TreeLockManager.lock(TreeLockManager.java:449) at org.apache.hadoop.hbase.oss.HBaseObjectStoreSemantics.open(HBaseObjectStoreSemantics.java:181) at org.apache.hadoop.fs.FilterFileSystem.open(FilterFileSystem.java:166) at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:911) at org.apache.hadoop.hbase.util.FSTableDescriptors.readTableDescriptor(FSTableDescriptors.java:566) at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableDescriptorFromFs(FSTableDescriptors.java:559) at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableDescriptorFromFs(FSTableDescriptors.java:545) at org.apache.hadoop.hbase.util.FSTableDescriptors.get(FSTableDescriptors.java:241) at 
org.apache.hadoop.hbase.regionserver.RSRpcServices.executeOpenRegionProcedures(RSRpcServices.java:3626) at org.apache.hadoop.hbase.regionserver.RSRpcServices.lambda$executeProcedures$2(RSRpcServices.java:3694) at org.apache.hadoop.hbase.regionserver.RSRpcServices$$Lambda$107/1985163471.accept(Unknown Source) at java.util.ArrayList.forEach(ArrayList.java:1257) at java.util.Collections$UnmodifiableCollection.forEach(Collections.java:1080) at org.apache.hadoop.hbase.regionserver.RSRpcServices.executeProcedures(RSRpcServices.java:3694) at org.apache.hadoop.hbase.shaded.protobuf.generated.AdminProtos$AdminService$2.callBlockingMethod(AdminProtos.java:29774) at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:413) at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:132) at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:338) at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:318) Locked ownable synchronizers: - None {noformat} This means that we can only open Regions in a table one at a time now (across all regionservers). That's pretty bad and would explain why that part was so slow. Two thoughts already: 1) Having to grab the lock to determine if it's held is sub-optimal. That's what the top of this stacktrace is and I think we need to come up with some other approach because this doesn't scale. 2) We're all blocked in reading the TableDescriptor. Maybe the Master can include the TableDescriptor in the OpenRegionRequest so the R
[GitHub] [hbase] joshelser commented on issue #572: HBASE-22012 Space Quota: DisableTableViolationPolicy will cause cycles of enable/disable table
joshelser commented on issue #572: HBASE-22012 Space Quota: DisableTableViolationPolicy will cause cycles of enable/disable table URL: https://github.com/apache/hbase/pull/572#issuecomment-535071760 Looks ok now. Let me try to get this in.
[jira] [Updated] (HBASE-23034) List all of our repos in one place
[ https://issues.apache.org/jira/browse/HBASE-23034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-23034: Component/s: community > List all of our repos in one place > -- > > Key: HBASE-23034 > URL: https://issues.apache.org/jira/browse/HBASE-23034 > Project: HBase > Issue Type: Task > Components: community >Reporter: stack >Priority: Major > > Chatting w/ a co-worker, they said "5 repos!" when I listed out all the hbase > PMC maintains. There is no list anywhere that I know of. There is the > download page which has core and connectors. That's a start. Should have > operator tools and thirdparty too... Does filesystem have a release? If so, > it should be added there too. > Should we list the repos on the home page too or somewhere on the main site?
[GitHub] [hbase-filesystem] wchevreuil opened a new pull request #8: first attempt to improve performance by removing unnecessary calls to…
wchevreuil opened a new pull request #8: first attempt to improve performance by removing unnecessary calls to… URL: https://github.com/apache/hbase-filesystem/pull/8 … lock.acquire Change-Id: I5e75695f56f05e5481ebbfa7350734936b222aba
[jira] [Updated] (HBASE-23076) [HBOSS] ZKTreeLockManager shouldn't try to acquire a lock from the InterProcessMutex instance when checking if other processes hold it.
[ https://issues.apache.org/jira/browse/HBASE-23076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Elser updated HBASE-23076: --- Description: While running some internal tests, [~elserj] hit a major bottleneck when creating tables with a reasonably large number of pre-split regions: {noformat} create 'josh', 'f1', {SPLITS=> (1..500).map {|i| "user#{1000+i*(-1000)/500}"}} {noformat} As a result, the RSes took a long time to complete all assignments, which led to AP timeout failures from the Master's point of view. The Master then submitted further APs in a cascade, until the RSes' RPC queues were flooded and started throwing CallQueueFullException, leaving the Master with many procedures to complete and many RITs. Jstack analysis pointed to potential lock contention inside the *ZKTreeLockManager.isLocked* method. To quote [~elserj]'s report: {quote}Specifically, lots of threads that look like this: {noformat} "RpcServer.priority.FPBQ.Fifo.handler=8,queue=0,port=16020" #100 daemon prio=5 os_prio=0 tid=0x7f5d6dc3a000 nid=0x6b1 waiting for monitor entry [0x7f5d3bafb000] java.lang.Thread.State: BLOCKED (on object monitor) at org.apache.hadoop.hbase.oss.thirdparty.org.apache.curator.framework.recipes.locks.LockInternals.internalLockLoop(LockInternals.java:289) - waiting to lock <0x00074ddd0d10> (a org.apache.hadoop.hbase.oss.thirdparty.org.apache.curator.framework.recipes.locks.LockInternals) at org.apache.hadoop.hbase.oss.thirdparty.org.apache.curator.framework.recipes.locks.LockInternals.attemptLock(LockInternals.java:219) at org.apache.hadoop.hbase.oss.thirdparty.org.apache.curator.framework.recipes.locks.InterProcessMutex.internalLock(InterProcessMutex.java:237) at org.apache.hadoop.hbase.oss.thirdparty.org.apache.curator.framework.recipes.locks.InterProcessMutex.acquire(InterProcessMutex.java:108) at org.apache.hadoop.hbase.oss.sync.ZKTreeLockManager.isLocked(ZKTreeLockManager.java:310) at
org.apache.hadoop.hbase.oss.sync.ZKTreeLockManager.writeLockAbove(ZKTreeLockManager.java:183) at org.apache.hadoop.hbase.oss.sync.TreeLockManager.treeReadLock(TreeLockManager.java:282) at org.apache.hadoop.hbase.oss.sync.TreeLockManager.lock(TreeLockManager.java:449) at org.apache.hadoop.hbase.oss.HBaseObjectStoreSemantics.open(HBaseObjectStoreSemantics.java:181) at org.apache.hadoop.fs.FilterFileSystem.open(FilterFileSystem.java:166) at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:911) at org.apache.hadoop.hbase.util.FSTableDescriptors.readTableDescriptor(FSTableDescriptors.java:566) at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableDescriptorFromFs(FSTableDescriptors.java:559) at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableDescriptorFromFs(FSTableDescriptors.java:545) at org.apache.hadoop.hbase.util.FSTableDescriptors.get(FSTableDescriptors.java:241) at org.apache.hadoop.hbase.regionserver.RSRpcServices.executeOpenRegionProcedures(RSRpcServices.java:3626) at org.apache.hadoop.hbase.regionserver.RSRpcServices.lambda$executeProcedures$2(RSRpcServices.java:3694) at org.apache.hadoop.hbase.regionserver.RSRpcServices$$Lambda$107/1985163471.accept(Unknown Source) at java.util.ArrayList.forEach(ArrayList.java:1257) at java.util.Collections$UnmodifiableCollection.forEach(Collections.java:1080) at org.apache.hadoop.hbase.regionserver.RSRpcServices.executeProcedures(RSRpcServices.java:3694) at org.apache.hadoop.hbase.shaded.protobuf.generated.AdminProtos$AdminService$2.callBlockingMethod(AdminProtos.java:29774) at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:413) at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:132) at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:338) at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:318) Locked ownable synchronizers: - None {noformat} This means that we can only open Regions in a table one at a time now (across all regionservers). 
That's pretty bad and would explain why that part was so slow. Two thoughts already: 1) Having to grab the lock to determine if it's held is sub-optimal. That's what the top of this stacktrace is and I think we need to come up with some other approach because this doesn't scale. 2) We're all blocked in reading the TableDescriptor. Maybe the Master can include the TableDescriptor in the OpenRegionRequest so the RS's don't have to read it back? {quote} From [~elserj]'s suggestion above, #2 would require changes on the hbase project side, but we can still try to optimize the hboss *ZKTreeLockManager.isLocked* method as mentioned in #1. Looking at curator's *InterProcessMutex*, we can use its *getParticipantNodes()*
[GitHub] [hbase] ramkrish86 commented on issue #656: HBASE-23063 Add an option to enable multiget in parallel
ramkrish86 commented on issue #656: HBASE-23063 Add an option to enable multiget in parallel URL: https://github.com/apache/hbase/pull/656#issuecomment-535091493 > About the removal of RpcCallContext#getResponseBlockSize This is what HBASE-14978 is doing, in order to limit the number of blocks each Multi can use, but IMHO this is no longer suitable, since the bbCell may be backed by a shared ByteBuffer, no matter how many blocks it uses. Another way, restricting the number of rows to be returned, may be more suitable. I see. Let me check this part to understand you better.
[GitHub] [hbase] Apache-HBase commented on issue #660: HBASE-23075 Upgrade jackson version
Apache-HBase commented on issue #660: HBASE-23075 Upgrade jackson version URL: https://github.com/apache/hbase/pull/660#issuecomment-535091642 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Comment | |::|--:|:|:| | :blue_heart: | reexec | 0m 35s | Docker mode activated. | ||| _ Prechecks _ | | :green_heart: | dupname | 0m 0s | No case conflicting files found. | | :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. | | :yellow_heart: | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | ||| _ master Compile Tests _ | | :green_heart: | mvninstall | 5m 30s | master passed | | :green_heart: | compile | 2m 59s | master passed | | :green_heart: | shadedjars | 4m 30s | branch has no errors when building our shaded downstream artifacts. | | :green_heart: | javadoc | 2m 52s | master passed | ||| _ Patch Compile Tests _ | | :green_heart: | mvninstall | 4m 57s | the patch passed | | :green_heart: | compile | 3m 0s | the patch passed | | :green_heart: | javac | 3m 0s | the patch passed | | :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. | | :green_heart: | xml | 0m 1s | The patch has no ill-formed XML file. | | :green_heart: | shadedjars | 4m 33s | patch has no errors when building our shaded downstream artifacts. | | :green_heart: | hadoopcheck | 15m 45s | Patch does not cause any errors with Hadoop 2.8.5 2.9.2 or 3.1.2. | | :green_heart: | javadoc | 2m 51s | the patch passed | ||| _ Other Tests _ | | :broken_heart: | unit | 175m 2s | root in the patch failed. | | :green_heart: | asflicense | 0m 41s | The patch does not generate ASF License warnings. 
| | | | 229m 28s | | | Subsystem | Report/Notes | |--:|:-| | Docker | Client=19.03.1 Server=19.03.1 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-660/1/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hbase/pull/660 | | Optional Tests | dupname asflicense javac javadoc unit shadedjars hadoopcheck xml compile | | uname | Linux 78532474be7c 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/HBase-PreCommit-GitHub-PR_PR-660/out/precommit/personality/provided.sh | | git revision | master / 52f5a85bfc | | Default Java | 1.8.0_181 | | unit | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-660/1/artifact/out/patch-unit-root.txt | | Test Results | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-660/1/testReport/ | | Max. process+thread count | 4456 (vs. ulimit of 1) | | modules | C: . U: . | | Console output | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-660/1/console | | versions | git=2.11.0 maven=2018-06-17T18:33:14Z) | | Powered by | Apache Yetus 0.11.0 https://yetus.apache.org | This message was automatically generated. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [hbase-filesystem] joshelser commented on issue #8: HBASE-23076 [HBOSS] ZKTreeLockManager shouldn't try to acquire a lock from the InterProcessMutex instance when checking if other processes
joshelser commented on issue #8: HBASE-23076 [HBOSS] ZKTreeLockManager shouldn't try to acquire a lock from the InterProcessMutex instance when checking if other processes hold it. URL: https://github.com/apache/hbase-filesystem/pull/8#issuecomment-535093164 For reference, I wrote a little test that spun up 100 clients all trying to read the same file, ~5 directories beneath the root of the filesystem (analogous to reading a table descriptor when opening a region). At `HEAD`, this took about 80 seconds. With Wellington's change here, it takes about 400ms. The other concern I had was around the correctness of the change, but TreeLockManager is always checking for a conflicting lock after it grabs the new lock in ZK. As best I can think through things, that means we're just fine. Really good change, Wellington!
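To illustrate the idea discussed in this thread (asking whether a lock is held without acquiring it), here is a minimal in-JVM sketch. It is an analogy only: it uses `java.util.concurrent.locks.ReentrantLock` rather than Curator's `InterProcessMutex` or the HBOSS code, and the class and method names (`LockProbeSketch`, `isLockedByAcquiring`, `isLockedByInspection`, `probeWhileHeldElsewhere`) are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class LockProbeSketch {

  /** Probe by acquiring: every caller contends for the lock just to learn its state. */
  static boolean isLockedByAcquiring(ReentrantLock lock, long timeoutMs)
      throws InterruptedException {
    if (lock.tryLock(timeoutMs, TimeUnit.MILLISECONDS)) {
      lock.unlock();   // we only wanted to know, so give it straight back
      return false;
    }
    return true;       // could not get it within the timeout: someone else holds it
  }

  /** Probe by inspection: reads the lock state without joining the queue of waiters. */
  static boolean isLockedByInspection(ReentrantLock lock) {
    return lock.isLocked();
  }

  /** Holds the lock in another thread, runs both probes, then releases the lock. */
  static boolean[] probeWhileHeldElsewhere() throws InterruptedException {
    ReentrantLock lock = new ReentrantLock();
    CountDownLatch held = new CountDownLatch(1);
    CountDownLatch release = new CountDownLatch(1);
    Thread holder = new Thread(() -> {
      lock.lock();
      try {
        held.countDown();   // signal that the lock is now held
        release.await();    // keep holding until the probes are done
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      } finally {
        lock.unlock();
      }
    });
    holder.start();
    held.await();
    boolean byInspection = isLockedByInspection(lock);
    boolean byAcquiring = isLockedByAcquiring(lock, 50);
    release.countDown();
    holder.join();
    return new boolean[] {byInspection, byAcquiring, lock.isLocked()};
  }

  public static void main(String[] args) throws InterruptedException {
    boolean[] r = probeWhileHeldElsewhere();
    System.out.println("inspection=" + r[0] + " acquiring=" + r[1] + " afterRelease=" + r[2]);
  }
}
```

Both probes give the same answer here, but the acquiring probe blocks for up to its timeout and serializes callers, which is the kind of contention the stack trace in HBASE-23076 shows. As with any state inspection, the result can be stale by the time it is used, so a caller still needs a later conflict check of the sort described above for TreeLockManager.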
[GitHub] [hbase] ramkrish86 commented on a change in pull request #656: HBASE-23063 Add an option to enable multiget in parallel
ramkrish86 commented on a change in pull request #656: HBASE-23063 Add an option to enable multiget in parallel URL: https://github.com/apache/hbase/pull/656#discussion_r328211174 ## File path: hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java ## @@ -934,13 +939,61 @@ private Result increment(final HRegion region, final OperationQuota quota, builder.addResultOrException(resultOrExceptionBuilder.build()); } } +// do multiget in parallel +if (getCtxs != null && !getCtxs.isEmpty()) { + doParallelGet(getCtxs, cellsToReturn, builder); +} // Finish up any outstanding mutations if (!CollectionUtils.isEmpty(mutations)) { doNonAtomicBatchOp(builder, region, quota, mutations, cellScanner, spaceQuotaEnforcement); } return cellsToReturn; } + private void doParallelGet(List getCtxs, List cellsToReturn, + RegionActionResult.Builder builder) throws ServiceException { +ResultOrException.Builder resultOrExceptionBuilder = null; +CountDownLatch latch = new CountDownLatch(getCtxs.size()); +List handlers = new ArrayList<>(getCtxs.size()); +for (GetContext getCtx : getCtxs) { + GetActionHandler handler = new GetActionHandler(getCtx, latch); + this.regionServer.executorService.submit(handler); + handlers.add(handler); +} +try { + latch.await(); Review comment: My concern was that, since the handler is already submitted to the executor, countDown() could be called before await() even happens. It seems that calling countDown() before await() is not a problem, but the reverse could be. Will that happen?
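On the ordering question in this review comment, a small self-contained sketch may help. It is not the patch's code: `LatchOrderingSketch` and `awaitAfterAllCountDowns` are hypothetical names, and a plain fixed thread pool stands in for the region server's executor. It forces every `countDown()` to run before `await()` is called and shows that `await()` then returns immediately, because the latch count is already zero.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class LatchOrderingSketch {

  /**
   * Submits `tasks` workers that each count the latch down, waits for all of
   * them to finish, and only then awaits the latch.
   */
  static boolean awaitAfterAllCountDowns(int tasks) throws InterruptedException {
    CountDownLatch latch = new CountDownLatch(tasks);
    ExecutorService pool = Executors.newFixedThreadPool(4);
    try {
      for (int i = 0; i < tasks; i++) {
        pool.execute(latch::countDown);
      }
      pool.shutdown();
      // Make sure every countDown() has happened before we touch await().
      if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
        return false;
      }
      // The count is already zero, so this returns true without blocking.
      return latch.await(1, TimeUnit.SECONDS) && latch.getCount() == 0;
    } finally {
      pool.shutdownNow();
    }
  }

  public static void main(String[] args) throws InterruptedException {
    System.out.println(awaitAfterAllCountDowns(8));
  }
}
```

The case that can hang is a worker that never reaches `countDown()`, for example because it throws first. Calling `countDown()` in a `finally` block inside the handler, or using the timed `await(long, TimeUnit)` as above, guards against that.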
[jira] [Created] (HBASE-23077) move entirely to spotbugs
Sean Busbey created HBASE-23077: --- Summary: move entirely to spotbugs Key: HBASE-23077 URL: https://issues.apache.org/jira/browse/HBASE-23077 Project: HBase Issue Type: Improvement Components: build, test Reporter: Sean Busbey we've been relying on spotbugs definitions with findbugs tooling for a while now. I think spotbugs now provides its own versions of everything we need (and Yetus supports spotbugs directly). Do a pass removing findbugs tooling from all branches in favor of spotbugs.
[GitHub] [hbase] busbey commented on issue #616: HBASE-23015 : Moving from Jackson2 to shaded Gson (Backport HBASE-20587)
busbey commented on issue #616: HBASE-23015 : Moving from Jackson2 to shaded Gson (Backport HBASE-20587) URL: https://github.com/apache/hbase/pull/616#issuecomment-535120241 Okay! a new version of the hbase-thirdparty library is out now and includes a jdk7 compatible relocated GSON. I just published it on the ASF nexus, so it might take a bit for it to show up in maven central. When it does you should be able to update this PR to use the following dependency: ``` <dependency> <groupId>org.apache.hbase.thirdparty</groupId> <artifactId>hbase-shaded-gson</artifactId> <version>3.0.0</version> </dependency> ```
[GitHub] [hbase] virajjasani commented on issue #616: HBASE-23015 : Moving from Jackson2 to shaded Gson (Backport HBASE-20587)
virajjasani commented on issue #616: HBASE-23015 : Moving from Jackson2 to shaded Gson (Backport HBASE-20587) URL: https://github.com/apache/hbase/pull/616#issuecomment-535147235 That's great! Thanks @busbey. Sure, I will be on it; as soon as my local build passes, I will commit it.
[GitHub] [hbase] virajjasani commented on issue #616: HBASE-23015 : Moving from Jackson2 to shaded Gson (Backport HBASE-20587)
virajjasani commented on issue #616: HBASE-23015 : Moving from Jackson2 to shaded Gson (Backport HBASE-20587) URL: https://github.com/apache/hbase/pull/616#issuecomment-535151722 It is available: https://repository.apache.org/content/repositories/releases/org/apache/hbase/thirdparty/hbase-shaded-gson/3.0.0/