[jira] [Resolved] (HBASE-25482) Improve SimpleRegionNormalizer#getAverageRegionSizeMb
[ https://issues.apache.org/jira/browse/HBASE-25482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk resolved HBASE-25482. -- Resolution: Fixed > Improve SimpleRegionNormalizer#getAverageRegionSizeMb > - > > Key: HBASE-25482 > URL: https://issues.apache.org/jira/browse/HBASE-25482 > Project: HBase > Issue Type: Improvement > Components: Normalizer >Affects Versions: 3.0.0-alpha-1, 2.4.0 >Reporter: Baiqiang Zhao >Assignee: Baiqiang Zhao >Priority: Minor > Fix For: 3.0.0-alpha-1, 2.5.0, 2.3.6, 2.4.2 > > > If the table is set NormalizerTargetRegionSize, we take > NormalizerTargetRegionSize as avgRegionSize and return it. So the totalSizeMb > of table is not always calculated. -- This message was sent by Atlassian Jira (v8.3.4#803005)
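The behaviour described in this issue can be illustrated with a short sketch. This is not the actual patch: the class name, method signature, and the convention that a non-positive target means "not configured" are all assumptions made for illustration. It only shows the idea that when a target region size is set, the method can return it immediately and skip summing the table's region sizes.

```java
import java.util.Arrays;
import java.util.List;

public class AverageRegionSizeSketch {
    /**
     * Returns the average region size in MB.
     * Assumption of this sketch: targetRegionSizeMb <= 0 means no target is configured.
     */
    public static double averageRegionSizeMb(long targetRegionSizeMb, List<Long> regionSizesMb) {
        if (targetRegionSizeMb > 0) {
            // Early return: the per-region sizes are never summed in this branch.
            return targetRegionSizeMb;
        }
        if (regionSizesMb.isEmpty()) {
            return 0.0;
        }
        long totalSizeMb = 0;
        for (long sizeMb : regionSizesMb) {
            totalSizeMb += sizeMb;
        }
        return (double) totalSizeMb / regionSizesMb.size();
    }

    public static void main(String[] args) {
        List<Long> sizes = Arrays.asList(10L, 20L, 30L);
        System.out.println(averageRegionSizeMb(0, sizes));   // no target: computed average
        System.out.println(averageRegionSizeMb(128, sizes)); // target set: returned as-is
    }
}
```

Ordering the checks this way keeps the total-size computation off the path where its result would be discarded.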
[jira] [Reopened] (HBASE-25482) Improve SimpleRegionNormalizer#getAverageRegionSizeMb
[ https://issues.apache.org/jira/browse/HBASE-25482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk reopened HBASE-25482: -- > Improve SimpleRegionNormalizer#getAverageRegionSizeMb > - > > Key: HBASE-25482 > URL: https://issues.apache.org/jira/browse/HBASE-25482 > Project: HBase > Issue Type: Improvement > Components: Normalizer >Affects Versions: 3.0.0-alpha-1, 2.4.0 >Reporter: Baiqiang Zhao >Assignee: Baiqiang Zhao >Priority: Minor > Fix For: 3.0.0-alpha-1, 2.4.2 > > > If the table is set NormalizerTargetRegionSize, we take > NormalizerTargetRegionSize as avgRegionSize and return it. So the totalSizeMb > of table is not always calculated. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Reopened] (HBASE-25653) Add units and round off region size to 2 digits after decimal
[ https://issues.apache.org/jira/browse/HBASE-25653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk reopened HBASE-25653: -- Reopen for backports. > Add units and round off region size to 2 digits after decimal > - > > Key: HBASE-25653 > URL: https://issues.apache.org/jira/browse/HBASE-25653 > Project: HBase > Issue Type: Improvement > Components: master, Normalizer >Affects Versions: 3.0.0-alpha-1, 2.3.0, 2.4.0 >Reporter: Nick Dimiduk >Assignee: Divyesh Chandra >Priority: Major > Fix For: 3.0.0-alpha-1 > > > Normalizer logs progress and includes details regarding region count and > target size. The size values are logged without units, and floating point > numbers are logged without a specified precision, which can result in > scientific notation. Clean up the formatting of these log messages to make > them easily readable. -- This message was sent by Atlassian Jira (v8.3.4#803005)
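The formatting problem described above can be reproduced and avoided in plain Java. The sketch below is only an illustration under assumptions: the helper name `formatSizeMb` and the choice of "MB" as the unit are hypothetical and not taken from the normalizer code. It shows that concatenating a large `double` into a message yields scientific notation, while an explicit `%.2f` format with a unit suffix does not.

```java
import java.util.Locale;

public class RegionSizeFormatSketch {
    /** Formats a size with two digits after the decimal point and an explicit unit. */
    public static String formatSizeMb(double sizeMb) {
        // Locale.ROOT keeps the decimal separator stable regardless of the JVM's default locale.
        return String.format(Locale.ROOT, "%.2f MB", sizeMb);
    }

    public static void main(String[] args) {
        double avgRegionSize = 12345678.9;
        // Default double-to-string conversion switches to scientific notation for large values.
        System.out.println("average region size: " + avgRegionSize);
        // Explicit precision and unit keep the log line readable.
        System.out.println("average region size: " + formatSizeMb(avgRegionSize));
    }
}
```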
[jira] [Resolved] (HBASE-25482) Improve SimpleRegionNormalizer#getAverageRegionSizeMb
[ https://issues.apache.org/jira/browse/HBASE-25482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk resolved HBASE-25482. -- Resolution: Fixed Sorry, wrong issue. > Improve SimpleRegionNormalizer#getAverageRegionSizeMb > - > > Key: HBASE-25482 > URL: https://issues.apache.org/jira/browse/HBASE-25482 > Project: HBase > Issue Type: Improvement > Components: Normalizer >Affects Versions: 3.0.0-alpha-1, 2.4.0 >Reporter: Baiqiang Zhao >Assignee: Baiqiang Zhao >Priority: Minor > Fix For: 3.0.0-alpha-1, 2.4.2 > > > If the table is set NormalizerTargetRegionSize, we take > NormalizerTargetRegionSize as avgRegionSize and return it. So the totalSizeMb > of table is not always calculated. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Reopened] (HBASE-25482) Improve SimpleRegionNormalizer#getAverageRegionSizeMb
[ https://issues.apache.org/jira/browse/HBASE-25482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk reopened HBASE-25482: -- Reopening for backports. > Improve SimpleRegionNormalizer#getAverageRegionSizeMb > - > > Key: HBASE-25482 > URL: https://issues.apache.org/jira/browse/HBASE-25482 > Project: HBase > Issue Type: Improvement > Components: Normalizer >Affects Versions: 3.0.0-alpha-1, 2.4.0 >Reporter: Baiqiang Zhao >Assignee: Baiqiang Zhao >Priority: Minor > Fix For: 3.0.0-alpha-1, 2.4.2 > > > If the table is set NormalizerTargetRegionSize, we take > NormalizerTargetRegionSize as avgRegionSize and return it. So the totalSizeMb > of table is not always calculated. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HBASE-25769) Update default weight of cost functions
Clara Xiong created HBASE-25769: --- Summary: Update default weight of cost functions Key: HBASE-25769 URL: https://issues.apache.org/jira/browse/HBASE-25769 Project: HBase Issue Type: Sub-task Components: Balancer Reporter: Clara Xiong In production, we have seen some critical big tables that handle the majority of the load, so table skew is becoming more important. With the update of the table skew function, the balancer finally works for large-table distribution on large clusters. We should increase the weight from 35 to a level comparable to region count skew: 500. We could even go further and replace region count skew with table skew, since the latter works in the same way and accounts for region distribution per node. Another weight we found helpful to increase is the one for the store file size cost function. Ideally, if the normalizer worked perfectly, we would not need to worry about it, since region count skew would have accounted for it. But we are often in a situation where it doesn't. Store file distribution needs to be given more weight to accommodate that. We tested changing it from 5 to 200 and it works fine. -- This message was sent by Atlassian Jira (v8.3.4#803005)
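To see why the proposed weights matter, here is a rough sketch that assumes the balancer's total cost is a weighted sum of per-function costs, each scaled to [0, 1]. The class, the map keys, and the sample cost values are hypothetical and chosen only for illustration; the weights 500, 35, 5 and the proposed 500 and 200 come from the issue text.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class WeightedCostSketch {
    /** Sum of weight * cost over all functions; a missing cost counts as 0. */
    public static double totalCost(Map<String, Double> weights, Map<String, Double> costs) {
        double total = 0.0;
        for (Map.Entry<String, Double> e : weights.entrySet()) {
            total += e.getValue() * costs.getOrDefault(e.getKey(), 0.0);
        }
        return total;
    }

    /** Builds a weight map for the three cost functions discussed in the issue. */
    public static Map<String, Double> weights(double regionCount, double tableSkew, double storeFileSize) {
        Map<String, Double> w = new LinkedHashMap<>();
        w.put("regionCountSkew", regionCount);
        w.put("tableSkew", tableSkew);
        w.put("storeFileSize", storeFileSize);
        return w;
    }

    public static void main(String[] args) {
        // Hypothetical per-function costs for one cluster state, each in [0, 1].
        Map<String, Double> costs = new LinkedHashMap<>();
        costs.put("regionCountSkew", 0.125);
        costs.put("tableSkew", 0.5);
        costs.put("storeFileSize", 0.25);

        double before = totalCost(weights(500, 35, 5), costs);
        double after = totalCost(weights(500, 500, 200), costs);
        System.out.println("total cost with current weights:  " + before);
        System.out.println("total cost with proposed weights: " + after);
        System.out.println("table skew share before: " + (35 * 0.5) / before);
        System.out.println("table skew share after:  " + (500 * 0.5) / after);
    }
}
```

With the current weights, a badly skewed table contributes only a small fraction of the total, so moves that fix table skew are easily outweighed by region count skew; raising the weight changes that balance.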
Multi-dimensional Range Queries Help
Hi! Our application requires fast read queries that specify two ranges: one on timestamps and another on ids. We are currently using Apache HBase as our db, but we’re unsure how to optimally design the row keys / schemas. Currently, scanning over the row key (the ids) with a filter on time ranges is taking more time than we expect. A typical query would have about 200 rows that match the id range and about 10 rows that match both ranges, and we currently have on the order of tens of millions of rows. We’re wondering if there’s something we can do to increase throughput with HBase (e.g., is there something like composite indexing in MySQL?). Not sure if this is the best place to ask this, but if anyone could point us in the right direction, that would be great! Thank you!
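One common way to approximate a composite index in HBase is to put both dimensions into the row key. The sketch below is only an illustration under assumptions: it assumes ids and timestamps are non-negative longs, and the class and method names are hypothetical. It encodes the key as an 8-byte big-endian id followed by an 8-byte big-endian timestamp, so that for a single id a time range becomes a contiguous key range under byte-wise lexicographic ordering.

```java
import java.nio.ByteBuffer;

public class CompositeRowKeySketch {
    /** Row key = 8-byte big-endian id followed by 8-byte big-endian timestamp. */
    public static byte[] rowKey(long id, long timestampMillis) {
        // ByteBuffer writes big-endian by default, which preserves order for non-negative longs.
        return ByteBuffer.allocate(16).putLong(id).putLong(timestampMillis).array();
    }

    /** Byte-wise unsigned lexicographic comparison, the order in which row keys are sorted. */
    public static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return a.length - b.length;
    }

    /** True if key falls in [startInclusive, stopExclusive), as a scan over that range would see it. */
    public static boolean inScanRange(byte[] key, byte[] startInclusive, byte[] stopExclusive) {
        return compareUnsigned(key, startInclusive) >= 0 && compareUnsigned(key, stopExclusive) < 0;
    }

    public static void main(String[] args) {
        long id = 42L;
        byte[] start = rowKey(id, 1_000L); // start of the time range for this id
        byte[] stop = rowKey(id, 2_000L);  // exclusive end of the time range for this id
        System.out.println(inScanRange(rowKey(id, 1_500L), start, stop));  // same id, inside the time range
        System.out.println(inScanRange(rowKey(id, 2_500L), start, stop));  // same id, after the time range
        System.out.println(inScanRange(rowKey(43L, 1_500L), start, stop)); // different id
    }
}
```

With such a layout, the start and stop keys could be passed to `Scan#withStartRow` and `Scan#withStopRow`, issuing one short scan per id in the id range (or combining the ranges with `MultiRowRangeFilter`), so that only rows matching both ranges are read. Whether this beats the current design depends on how many distinct ids a query spans.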
[jira] [Resolved] (HBASE-25759) The master services field in LocalityBasedCostFunction is never used
[ https://issues.apache.org/jira/browse/HBASE-25759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Duo Zhang resolved HBASE-25759. --- Fix Version/s: 2.3.6 2.4.3 2.5.0 3.0.0-alpha-1 Hadoop Flags: Reviewed Resolution: Fixed Pushed to branch-2.3+. Thanks [~niuyulin] for reviewing. > The master services field in LocalityBasedCostFunction is never used > > > Key: HBASE-25759 > URL: https://issues.apache.org/jira/browse/HBASE-25759 > Project: HBase > Issue Type: Improvement > Components: Balancer >Reporter: Duo Zhang >Assignee: Duo Zhang >Priority: Major > Fix For: 3.0.0-alpha-1, 2.5.0, 2.4.3, 2.3.6 > > > We just use it to test whether we should do the calculation. > But in fact, it should never be null when we arrive here. Maybe it is because > in test we may skip setting the master services but I think we should just > skip balancing in the balancer, not in the cost function. -- This message was sent by Atlassian Jira (v8.3.4#803005)
how to join slack channel
Hi, admin. I'm a big data developer and very interested in HBase, so I want to join the HBase Slack channel, but I don't have an email address like xx.apache.org or xx.google.com. How could I join the channel? Please show me a way to join. Thx
Re: [ANNOUNCE] New HBase committer Geoffrey Jacoby
Congratulations Geoffrey! - Toshi > On Apr 11, 2021, at 0:12, Ankit Singhal wrote: > > Congratulations and welcome Geoffrey!! > > On Sat, Apr 10, 2021 at 5:48 PM Jan Hentschel < > jan.hentsc...@ultratendency.com> wrote: > >> Congratulations and welcome! >> >> From: Viraj Jasani >> Reply-To: "dev@hbase.apache.org" >> Date: Friday, April 9, 2021 at 1:24 PM >> To: hbase-dev , hbase-user >> Subject: [ANNOUNCE] New HBase committer Geoffrey Jacoby >> >> On behalf of the Apache HBase PMC I am pleased to announce that Geoffrey >> Jacoby has accepted the PMC's invitation to become a committer on the >> project. >> >> Thanks so much for the work you've been contributing. We look forward to >> your continued involvement. >> >> Congratulations and welcome, Geoffrey! >> >>
Re: [VOTE] The first HBase 2.2.7 release candidate (RC0) is available
+1 (binding) * Signature: ok * Checksum : ok * Rat check (1.8.0_282): ok - mvn clean apache-rat:check * Built from source (1.8.0_282): ok - mvn clean install -DskipTests * Unit tests pass (1.8.0_282): ok - mvn package -P runAllTests -Dsurefire.rerunFailingTestsCount=3 On Sun, Apr 11, 2021 at 6:05 PM Jan Hentschel < jan.hentsc...@ultratendency.com> wrote: > +1 (binding) > > * Signature: ok > * Checksum : ok > * Rat check (1.8.0_202-ea): ok > - mvn clean apache-rat:check > * Built from source (1.8.0_202-ea): ok > - mvn clean install -DskipTests > * Unit tests pass (1.8.0_202-ea): ok > - mvn package -P runSmallTests > -Dsurefire.rerunFailingTestsCount=3 > > Verified the compatibility report, as well as the changes and release > notes. > > From: Guanghao Zhang > Reply-To: "u...@hbase.apache.org" > Date: Sunday, April 11, 2021 at 2:34 PM > To: HBase Dev List , Hbase-User < > u...@hbase.apache.org> > Subject: [VOTE] The first HBase 2.2.7 release candidate (RC0) is available > > Please vote on this release candidate (RC) for Apache HBase 2.2.7. > Meanwhile, as branch-2.2 will be EOL, please don't push new commits to it. > And this will be the last one of the 2.2.x releases. Thanks. > > The VOTE will remain open for at least 72 hours. > > [ ] +1 Release this package as Apache HBase 2.2.7 > [ ] -1 Do not release this package because ... > > The tag to be voted on is 2.2.7RC0. The release files, including > signatures, digests, etc. 
can be found at: > https://dist.apache.org/repos/dist/dev/hbase/2.2.7RC0/ > > Maven artifacts are available in a staging repository at: > https://repository.apache.org/content/repositories/orgapachehbase-1440/ > > Signatures used for HBase RCs can be found in this file: > https://dist.apache.org/repos/dist/release/hbase/KEYS > > The list of bug fixes going into 2.2.7 can be found in included > CHANGES.md and RELEASENOTES.md available here: > https://dist.apache.org/repos/dist/dev/hbase/2.2.7RC0/CHANGES.md > https://dist.apache.org/repos/dist/dev/hbase/2.2.7RC0/RELEASENOTES.md > > A detailed source and binary compatibility report for this release is > available at: > > https://dist.apache.org/repos/dist/dev/hbase/2.2.7RC0/api_compare_2.2.7RC0_to_2.2.6.html > > To learn more about Apache HBase, please see http://hbase.apache.org/ > > Thanks, > Guanghao Zhang > >
[jira] [Reopened] (HBASE-25748) [Flake Test][branch-1] TestAdmin2
[ https://issues.apache.org/jira/browse/HBASE-25748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reid Chan reopened HBASE-25748: --- Spotted a bug in testCreateTableRPCTimeOut, let me try to fix it. > [Flake Test][branch-1] TestAdmin2 > - > > Key: HBASE-25748 > URL: https://issues.apache.org/jira/browse/HBASE-25748 > Project: HBase > Issue Type: Test > Components: test >Reporter: Reid Chan >Assignee: Reid Chan >Priority: Major > Fix For: 1.7.0 > > > Attempt to improve the stability. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HBASE-25768) Support a overall simple and fast balance strategy for StochasticLoadBalancer
Xiaolin Ha created HBASE-25768: -- Summary: Support a overall simple and fast balance strategy for StochasticLoadBalancer Key: HBASE-25768 URL: https://issues.apache.org/jira/browse/HBASE-25768 Project: HBase Issue Type: Improvement Components: Balancer Affects Versions: 1.4.13, 2.0.0, 3.0.0-alpha-1 Reporter: Xiaolin Ha Assignee: Xiaolin Ha When we use StochasticLoadBalancer + balanceByTable, we can face two difficulties. # For each table, its regions are distributed uniformly, but across the overall cluster there is still imbalance between RSes; # When there is a large-scale restart of RSes, or an expansion of groups or of the cluster, we want the balancer to execute as soon as possible, but the StochasticLoadBalancer may need a lot of time to compute costs. We can detect these circumstances in StochasticLoadBalancer and, before trying the normal balance steps, add a strategy that lets it balance like the SimpleLoadBalancer or use only a few light cost functions. -- This message was sent by Atlassian Jira (v8.3.4#803005)