[jira] [Resolved] (HBASE-20341) Nothing in refguide on hedgedreads; fix
[ https://issues.apache.org/jira/browse/HBASE-20341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack resolved HBASE-20341. --- Resolution: Invalid Assignee: Wei-Chiu Chuang Fix Version/s: (was: 2.0.0) It is not valid [~jojochuang]. I must have mis-searched. Hedged Read doc has been around forever, before.. commit 1a21c1684c5d68cb2d1da8ed33500993b0965f8a Author: Misty Stanley-Jones Date: Wed Jan 7 14:02:16 2015 +1000 HBASE-11533 Asciidoc Proof of Concept ... And then got updates with the likes of the commit 6ee2dcf480dd95877a20e33086a020eb1a19e41f Author: Michael Stack Date: Mon Nov 14 10:27:58 2016 -0800 HBASE-17089 Add doc on experience running with hedged reads Which is Yu Li's experience w/ hedged reads... and then below commit 86df89b01608052dad4ef75abde5a3fe79447ac0 Author: Michael Stack Date: Mon Nov 14 21:06:29 2016 -0800 HBASE-17089 Add doc on experience running with hedged reads; ADDENDUM adding in Ashu Pachauri's experience I'm not sure how I got it so wrong. Thanks [~jojochuang] Assigning you the issue because you noticed the mess-up. > Nothing in refguide on hedgedreads; fix > --- > > Key: HBASE-20341 > URL: https://issues.apache.org/jira/browse/HBASE-20341 > Project: HBase > Issue Type: Bug > Components: documentation > Reporter: stack > Assignee: Wei-Chiu Chuang > Priority: Critical > > There are even metrics from HBASE-12220 that expose counts. Talk them up and > hedged reads in refguide. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
RE: [VOTE] First release candidate for HBase 2.0.0 (RC0) is available
bq. signatures & sums - NOT OK (md5 checksums missing) This is intentional I think, check HBASE-20385. Regards, Ashish -Original Message- From: Umesh Agashe [mailto:uaga...@cloudera.com] Sent: Tuesday, April 17, 2018 4:01 AM To: dev@hbase.apache.org Subject: Re: [VOTE] First release candidate for HBase 2.0.0 (RC0) is available -1 non-binding (hbck with write operations disabled not included) download src & bin tar ball - OK signatures & sums - NOT OK (md5 checksums missing) build from source (openjdk version "1.8.0_151") - OK rat check - OK start local instance from bin & CRUD from shell - OK LTT write, read 1 million rows, 2 cols/row - OK check logs - OK On Fri, Apr 13, 2018 at 10:55 AM, Stack wrote: > On Fri, Apr 13, 2018 at 10:53 AM, Josh Elser wrote: > > > Was poking around with PE on a few nodes (I forget the exact > > circumstances, need to look back at this), and ran into a case where > > ~35 regions were left as RIT > > > > 2018-04-12 22:05:24,431 ERROR > > [master/ctr-e138-1518143905142-221855-01- 02:16000] > > procedure2.ProcedureExecutor: Corrupt pid=3580, ppid=3534, > > state=RUNNABLE:REGION_TRANSITION_QUEUE; AssignProcedure table=TestTable, > > region=71fefffe6b5b3cf1cb6d3328a5a58690 > > > > Saw entries like this (I think) for each region which was stuck. A > > simple `assign` in the shell brought them back, but I need to dig in > > some more to understand what went wrong. > > > > > Log? > > HBASE-18152? > > Thanks Josh, > S > > > > > > > > On 4/10/18 4:47 PM, Stack wrote: > > > >> The first release candidate for Apache HBase 2.0.0 is available for > >> downloading and testing.
> >> > >> Artifacts are available here: > >> > >> https://dist.apache.org/repos/dist/dev/hbase/hbase-2.0.0RC0/ > >> > >> Maven artifacts are available in the staging repository at: > >> > >> https://repository.apache.org/content/repositories/orgapachehbase-1209 > >> > >> All artifacts are signed with my signing key 8ACC93D2, which is > >> also in the project KEYS file at > >> > >> http://www.apache.org/dist/hbase/KEYS > >> > >> These artifacts were tagged 2.0.0RC0 at hash > >> 011dd2dae33456b3a2bcc2513e9fdd29de23be46 > >> > >> Please review 'Upgrading from 1.x to 2.x' in the bundled HBase > >> 2.0.0 Reference Guide before installing or upgrading for a list of > >> incompatibilities, major changes, and notable new features. Be > >> aware that > >> according to our adopted Semantic Versioning guidelines[1], we've > >> allowed ourselves to make breaking changes in this major version > >> release. For example, Coprocessors will need to be recast to fit > >> more constrained CP APIs and a rolling upgrade of an hbase-1.x > >> install to hbase-2.x without downtime is (currently) not possible. > >> That said, a bunch of effort has been expended mitigating > >> differences; an hbase-1.x client can perform DML against an hbase-2 > >> cluster. > >> > >> For the full list of ~6k issues addressed, see [2]. There are also > >> CHANGES.md and RELEASENOTES.md in the root directory of the source > >> tarball. > >> > >> Please take a few minutes to verify the release and vote on > >> releasing it: > >> > >> [ ] +1 Release this package as Apache HBase 2.0.0 [ ] +0 no opinion > >> [ ] -1 Do not release this package because... > >> > >> This VOTE will run for one week and close Tuesday, April 17, 2018 @ 13:00 > >> PST. > >> > >> Thanks to the myriad who have helped out with this release, Your > >> 2.0.0 Release Manager > >> > >> 1. http://hbase.apache.org/2.0/book.html#hbase.versioning.post10 > >> 2. https://s.apache.org/zwS9 > >> > >> >
Re: [ANNOUNCE] Please welcome Francis Liu to the HBase PMC
Thanks a lot guys! Looking forward to more 1.3 releases and beyond. Francis On Thu, Apr 12, 2018 at 4:45 PM Stack wrote: > Hot dog! > > On Wed, Apr 11, 2018 at 1:03 PM, Andrew Purtell > wrote: > > > On behalf of the Apache HBase PMC I am pleased to announce that Francis > > Liu has accepted our invitation to become a PMC member on the Apache > > HBase project. We appreciate Francis stepping up to take more > > responsibility in the HBase project. He has been an active contributor to > > HBase for many years and recently took over responsibilities as branch RM > > for branch-1.3. > > > > Please join me in welcoming Francis to the HBase PMC! > > > > -- > > Best regards, > > Andrew > > >
[jira] [Created] (HBASE-20431) Store commit transaction for filesystems that do not support an atomic rename
Andrew Purtell created HBASE-20431: -- Summary: Store commit transaction for filesystems that do not support an atomic rename Key: HBASE-20431 URL: https://issues.apache.org/jira/browse/HBASE-20431 Project: HBase Issue Type: Sub-task Reporter: Andrew Purtell HBase expects the Hadoop filesystem implementation to support an atomic rename() operation. HDFS does. The S3 backed filesystems do not. The fundamental issue is the non-atomic and eventually consistent nature of the S3 service. An S3 bucket is not a filesystem. S3 is not always immediately read-your-writes. Object metadata can be temporarily inconsistent just after new objects are stored. There can be a settling period to ride over. Renaming/moving objects from one path to another is a copy operation with O(files) complexity and O(data) time, followed by a series of deletes with O(files) complexity. Failures at any point prior to completion will leave the operation in an inconsistent state. The missing atomic rename semantic opens opportunities for corruption and data loss, which may or may not be repairable with HBCK. Handling this at the HBase level could be done with a new multi-step filesystem transaction framework. Call it StoreCommitTransaction. SplitTransaction and MergeTransaction are well established cases where even on HDFS we have non-atomic filesystem changes, and they are our implementation template for the new work. In this new StoreCommitTransaction we'd be moving flush and compaction temporaries out of the temporary directory into the region store directory. On HDFS the implementation would be easy: we can rely on the filesystem's atomic rename semantics. On S3 it would take more work: first we would build the list of objects to move, then copy each object into the destination, and then finally delete all objects at the original path. We must handle transient errors with retry strategies appropriate for the action at hand.
We must handle serious or permanent errors where the RS doesn't need to be aborted with a rollback that cleans it all up. Finally, we must handle permanent errors where the RS must be aborted with a rollback during region open/recovery. Note that after all objects have been copied and we are deleting obsolete source objects we must roll forward, not back. To support recovery after an abort we must utilize the WAL to track transaction progress. Put markers in for StoreCommitTransaction start and completion state, with details of the store file(s) involved, so it can be rolled back during region recovery at open. This will be significant work in HFile, HStore, flusher, compactor, and HRegion. Wherever we use HDFS's rename now we would substitute the running of this new multi-step filesystem transaction. We need to determine this for certain, but I believe the PUT or multipart upload of an object must complete before the object is visible, so we don't have to worry about the case where an object is visible before fully uploaded as part of normal operations. So an individual object copy will either happen entirely and the target will then become visible, or it won't and the target won't exist. S3 has an optimization, PUT COPY (https://docs.aws.amazon.com/AmazonS3/latest/API/RESTObjectCOPY.html), which the AmazonClient embedded in S3A utilizes for moves. When designing the StoreCommitTransaction be sure to allow for filesystem implementations that leverage a server side copy operation. Doing a get-then-put should be optional. (Not sure Hadoop has an interface that advertises this capability yet; we can add one if not.) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
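The copy-then-delete commit described above can be sketched in plain Java. This is a minimal illustration only: the class and method names (StoreCommitSketch, commitStoreFiles) are hypothetical, not HBase or Hadoop APIs, and java.nio.file stands in for the object store client. It shows the two-phase structure: before any delete happens the transaction can roll back; once deletes begin, recovery must roll forward.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch of a non-atomic "rename" for stores lacking
// atomic rename (e.g. S3-backed filesystems). Names are illustrative.
public class StoreCommitSketch {

    // 1) build the list of objects to move, 2) copy each to the
    // destination, 3) only after all copies succeed, delete sources.
    static void commitStoreFiles(Path tmpDir, Path storeDir) throws IOException {
        Files.createDirectories(storeDir);
        List<Path> sources;
        try (Stream<Path> s = Files.list(tmpDir)) {
            sources = s.collect(Collectors.toList());
        }
        // Phase 1: copy. A failure here is still rollback-able by
        // deleting any partial copies at the destination.
        for (Path src : sources) {
            Files.copy(src, storeDir.resolve(src.getFileName()),
                       StandardCopyOption.REPLACE_EXISTING);
        }
        // Phase 2: delete sources. A failure here must be retried
        // (roll forward), since the destination is already visible.
        for (Path src : sources) {
            Files.delete(src);
        }
    }
}
```

A real implementation would record the phase boundary durably (e.g. the WAL markers proposed above) so region open/recovery knows whether to roll back or roll forward.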
[jira] [Created] (HBASE-20430) Improve store file management for non-HDFS filesystems
Andrew Purtell created HBASE-20430: -- Summary: Improve store file management for non-HDFS filesystems Key: HBASE-20430 URL: https://issues.apache.org/jira/browse/HBASE-20430 Project: HBase Issue Type: Sub-task Reporter: Andrew Purtell HBase keeps a file open for every active store file so no additional round trips to the NameNode are needed after the initial open. HDFS internally multiplexes open files, but the Hadoop S3 filesystem implementations do not, or, at least, not as well. As the bulk of data under management increases, we observe the required number of concurrently open connections rise, and expect it will eventually exhaust a limit somewhere (the client, the OS file descriptor table or open file limits, or the S3 service). Initially we can simply introduce an option to close every store file after the reader has finished, and determine the performance impact. Use cases backed by non-HDFS filesystems will already have to cope with a different read performance profile. Based on experiments with the S3 backed Hadoop filesystems, notably S3A, even with aggressively tuned options simple reads can be very slow when there are blockcache misses: 15-20 seconds observed for a Get of a single small row, for example. We expect extensive use of the BucketCache to mitigate this in such applications already. It could be backed by offheap storage, but more likely a large number of cache files managed by the file engine on local SSD storage. If misses are already going to be super expensive, then the motivation to do more than simply open store files on demand is largely absent. Still, we could employ a predictive cache. Where frequent access to a given store file (or, at least, its store) is predicted, keep a reference to the store file open. Can keep statistics about read frequency, write it out to HFiles during compaction, and note these stats when opening the region, perhaps by reading all meta blocks of region HFiles when opening.
Otherwise, close the file after reading and open again on demand. Need to be careful not to use ARC or an equivalent as the cache replacement strategy, as it is patent-encumbered. The size of the cache can be determined at startup after detecting the underlying filesystem. E.g. setCacheSize(VERY_LARGE_CONSTANT) if (fs instanceof DistributedFileSystem), so we don't lose much when still on HDFS. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
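The open-on-demand idea above can be sketched with a bounded LRU of open readers, sized at startup according to the detected filesystem. Everything here is a hypothetical illustration, not an HBase API: ReaderCache, StoreFileReader, and the size constants are invented for the sketch, and a simple LRU stands in for the predictive cache discussed above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Illustrative sketch: keep at most maxOpen store file readers open;
// evicted (cold) readers are closed and reopened on demand.
public class ReaderCacheSketch {

    interface StoreFileReader extends AutoCloseable {
        @Override void close(); // narrow AutoCloseable: no checked exception
    }

    static class ReaderCache {
        private final Map<String, StoreFileReader> lru;

        // Size at startup based on the underlying filesystem, e.g. a very
        // large bound on HDFS, a small one on S3-backed stores.
        ReaderCache(int maxOpen) {
            // access-order LinkedHashMap gives us LRU eviction
            this.lru = new LinkedHashMap<String, StoreFileReader>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(
                        Map.Entry<String, StoreFileReader> eldest) {
                    if (size() > maxOpen) {
                        eldest.getValue().close(); // close the coldest reader
                        return true;
                    }
                    return false;
                }
            };
        }

        // Return an open reader, opening one on demand on a miss.
        StoreFileReader get(String path, Function<String, StoreFileReader> opener) {
            StoreFileReader r = lru.get(path);
            if (r == null) {
                r = opener.apply(path);
                lru.put(path, r); // may evict and close the coldest entry
            }
            return r;
        }

        int openCount() { return lru.size(); }
    }
}
```

Note the deliberate choice of plain LRU rather than ARC, per the encumbrance concern above.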
[jira] [Created] (HBASE-20429) Support for mixed or write-heavy workloads on non-HDFS filesystems
Andrew Purtell created HBASE-20429: -- Summary: Support for mixed or write-heavy workloads on non-HDFS filesystems Key: HBASE-20429 URL: https://issues.apache.org/jira/browse/HBASE-20429 Project: HBase Issue Type: Umbrella Reporter: Andrew Purtell We can reasonably well support use cases on non-HDFS filesystems, like S3, where an external writer has loaded (and continues to load) HFiles via the bulk load mechanism, and then we serve out a read-only workload at the HBase API. Mixed workloads or write-heavy workloads won't fare as well. In fact, data loss seems certain. It will depend on the specific filesystem, but all of the S3 backed Hadoop filesystems suffer from a couple of obvious problems, notably a lack of atomic rename. This umbrella will serve to collect some related ideas for consideration. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (HBASE-20428) [shell] list.first method in HBase2 shell fails with "NoMethodError: undefined method `first' for nil:NilClass"
[ https://issues.apache.org/jira/browse/HBASE-20428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Jindal resolved HBASE-20428. -- Resolution: Duplicate Fixed in HBASE-20276 > [shell] list.first method in HBase2 shell fails with "NoMethodError: > undefined method `first' for nil:NilClass" > --- > > Key: HBASE-20428 > URL: https://issues.apache.org/jira/browse/HBASE-20428 > Project: HBase > Issue Type: Bug > Components: hbase > Affects Versions: 2.0.0-beta-2 > Reporter: Arpit Jindal > Priority: Minor > > list.first in hbase shell does not list the first table > {code} > hbase(main):001:0> list.first > TABLE > IntegrationTestBigLinkedList_20180331004141 > IntegrationTestBigLinkedList_20180403004104 > IntegrationTestBigLinkedList_20180409123038 > IntegrationTestBigLinkedList_20180409172704 > IntegrationTestBigLinkedList_20180410103309 > IntegrationTestBigLinkedList_20180411151159 > IntegrationTestBigLinkedList_20180411172500 > IntegrationTestBigLinkedList_20180412095403 > 8 row(s) > Took 0.5432 seconds > NoMethodError: undefined method `first' for nil:NilClass > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-20428) [shell] list.first method in HBase2 shell fails with "NoMethodError: undefined method `first' for nil:NilClass"
Arpit Jindal created HBASE-20428: Summary: [shell] list.first method in HBase2 shell fails with "NoMethodError: undefined method `first' for nil:NilClass" Key: HBASE-20428 URL: https://issues.apache.org/jira/browse/HBASE-20428 Project: HBase Issue Type: Bug Components: hbase Affects Versions: 2.0.0-beta-2 Reporter: Arpit Jindal list.first in hbase shell does not list the first table {code} hbase(main):001:0> list.first TABLE IntegrationTestBigLinkedList_20180331004141 IntegrationTestBigLinkedList_20180403004104 IntegrationTestBigLinkedList_20180409123038 IntegrationTestBigLinkedList_20180409172704 IntegrationTestBigLinkedList_20180410103309 IntegrationTestBigLinkedList_20180411151159 IntegrationTestBigLinkedList_20180411172500 IntegrationTestBigLinkedList_20180412095403 8 row(s) Took 0.5432 seconds NoMethodError: undefined method `first' for nil:NilClass {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-20427) thrift.jsp displays "Framed transport" incorrectly
Balazs Meszaros created HBASE-20427: --- Summary: thrift.jsp displays "Framed transport" incorrectly Key: HBASE-20427 URL: https://issues.apache.org/jira/browse/HBASE-20427 Project: HBase Issue Type: Bug Components: Thrift Affects Versions: 2.0.0 Reporter: Balazs Meszaros Fix For: 3.0.0, 2.0.0 According to thrift usage text: {code} -nonblocking Use the TNonblockingServer This implies the framed transport. {code} But the web page at port 9095 indicates {{framed = false}} when I start it with {{-nonblocking}}. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-20426) Give up replicating anything in S state
Duo Zhang created HBASE-20426: - Summary: Give up replicating anything in S state Key: HBASE-20426 URL: https://issues.apache.org/jira/browse/HBASE-20426 Project: HBase Issue Type: Sub-task Reporter: Duo Zhang When we transit the remote S cluster to DA, and then transit the old A cluster to S, it is possible that we still have some entries which have not been replicated yet for the old A cluster, and then the async replication will be blocked. This may also lead to data inconsistency after we transit it back to DA later, as these entries will be replicated again, but the new data which are replicated from the remote cluster will not be replicated back, which introduces a hole in the replication. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-20425) Do not write the cluster id of the current active cluster when writing remote WAL
Duo Zhang created HBASE-20425: - Summary: Do not write the cluster id of the current active cluster when writing remote WAL Key: HBASE-20425 URL: https://issues.apache.org/jira/browse/HBASE-20425 Project: HBase Issue Type: Sub-task Reporter: Duo Zhang The WAL entries which are replayed when converting a cluster from S to DA need to be replicated back to the old A cluster. If we write the cluster id of A into the WAL entries, we will skip replicating them... -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-20424) Allow writing WAL to local and remote cluster concurrently
Duo Zhang created HBASE-20424: - Summary: Allow writing WAL to local and remote cluster concurrently Key: HBASE-20424 URL: https://issues.apache.org/jira/browse/HBASE-20424 Project: HBase Issue Type: Sub-task Reporter: Duo Zhang For better performance. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-20423) Allow config stuffs other than table for sync replication
Duo Zhang created HBASE-20423: - Summary: Allow config stuffs other than table for sync replication Key: HBASE-20423 URL: https://issues.apache.org/jira/browse/HBASE-20423 Project: HBase Issue Type: Sub-task Components: Replication Reporter: Duo Zhang In HBASE-19064, to simplify the problem, we only allowed table replication for sync replication. Now we should add back the others, such as namespace, exclude namespace/table, etc. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-20422) Synchronous replication for HBase phase II
Duo Zhang created HBASE-20422: - Summary: Synchronous replication for HBase phase II Key: HBASE-20422 URL: https://issues.apache.org/jira/browse/HBASE-20422 Project: HBase Issue Type: Umbrella Components: Replication Reporter: Duo Zhang Fix For: 3.0.0 Address the remaining problems of HBASE-19064. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-20421) HBaseContext creates a connection but does not close it
Yu Wang created HBASE-20421: --- Summary: HBaseContext creates a connection but does not close it Key: HBASE-20421 URL: https://issues.apache.org/jira/browse/HBASE-20421 Project: HBase Issue Type: Bug Components: hbase Affects Versions: 1.2.6 Reporter: Yu Wang Fix For: 1.2.6 HBaseContext creates a connection but does not close it -- This message was sent by Atlassian JIRA (v7.6.3#76005)