[jira] [Created] (HBASE-21195) Support Log storage similar to FB LogDevice
jagan created HBASE-21195: - Summary: Support Log storage similar to FB LogDevice Key: HBASE-21195 URL: https://issues.apache.org/jira/browse/HBASE-21195 Project: HBase Issue Type: New Feature Reporter: jagan Log storage, which is write-once, sequential data, can be optimized in the following ways: 1. Keys generated should be incremental. 2. The HFile key index can be a range and need not use a BloomFilter. 3. Instead of compaction, periodic deletion of old files based on TTL can be supported. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
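For context, items 2 and 3 can already be approximated today with per-family settings in the HBase shell; a minimal sketch (the table and family names are hypothetical, TTL is in seconds):

```
# Hypothetical append-only log table: skip the bloom filter, let rows
# expire after 7 days instead of relying on compaction to reclaim them.
create 'app_logs', {NAME => 'f', BLOOMFILTER => 'NONE', TTL => 604800}
```

The proposal goes further (range-based key index, dropping whole files on TTL rather than cell-level expiry), but these knobs show the direction.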
[jira] [Created] (HBASE-21194) Add TestCopyTable which exercises MOB feature
Ted Yu created HBASE-21194: -- Summary: Add TestCopyTable which exercises MOB feature Key: HBASE-21194 URL: https://issues.apache.org/jira/browse/HBASE-21194 Project: HBase Issue Type: Test Reporter: Ted Yu Currently TestCopyTable doesn't cover table(s) with the MOB feature enabled. We should add a variant that enables MOB on the table being copied and verify that the MOB content is copied correctly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21193) Retrying Callable doesn't take max retries from current context; uses defaults instead
stack created HBASE-21193: - Summary: Retrying Callable doesn't take max retries from current context; uses defaults instead Key: HBASE-21193 URL: https://issues.apache.org/jira/browse/HBASE-21193 Project: HBase Issue Type: Bug Reporter: stack This makes it hard to change the retry count on a read of meta, for instance. I noticed this when trying to change the defaults for a meta read. I made a custom Connection inside the master with a new Configuration that had RPC retries and timings upped radically. My reads nonetheless finished at the usual retry point (31 tries after 60 seconds or so) because it looked like the Retrying Callable that does the read was taking max retries from the defaults rather than from the passed-in Configuration. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
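For context, the overrides being ignored would typically be expressed through standard client Configuration keys such as these (values are illustrative only; the property names are the usual client retry knobs, and the bug described is that the retrying callable falls back to defaults instead of reading them):

```
<!-- Illustrative hbase-site.xml / Configuration overrides -->
<property>
  <name>hbase.client.retries.number</name>
  <value>100</value>
</property>
<property>
  <name>hbase.client.pause</name>
  <value>500</value>
</property>
```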
[jira] [Created] (HBASE-21192) Add HOW-TO repair damaged AMv2.
stack created HBASE-21192: - Summary: Add HOW-TO repair damaged AMv2. Key: HBASE-21192 URL: https://issues.apache.org/jira/browse/HBASE-21192 Project: HBase Issue Type: Sub-task Components: amv2 Reporter: stack Assignee: stack Need a page or two on how to do various fixups. Will include doc on how to identify particular circumstance, how to run a repair, as well as caveats (e.g. if no log recovery, then region may be missing edits). Add pointer to log messages, especially those that explicitly ask for operator intervention; e.g. Master#inMeta. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21191) Add a holding-pattern if no assign for meta or namespace (Can happen if masterprocwals have been cleared).
stack created HBASE-21191: - Summary: Add a holding-pattern if no assign for meta or namespace (Can happen if masterprocwals have been cleared). Key: HBASE-21191 URL: https://issues.apache.org/jira/browse/HBASE-21191 Project: HBase Issue Type: Sub-task Components: amv2 Reporter: stack Assignee: stack If the masterprocwals have been removed -- operator error, hdfs dataloss, or because we have gotten ourselves into a pathological state where we have hundreds of masterprocwals to process and it is taking too long so we just want to start over -- then master startup has a dilemma. Master startup needs hbase:meta to be online. If the masterprocwals have been removed, there may be no outstanding assign or ServerCrashProcedure with coverage for hbase:meta (I ran into this issue repeatedly in internal testing, purging masterprocwals on a large test cluster). Worse, when master startup cannot find an online hbase:meta, it exits after exhausting the RPC retries. So we need a holding-pattern for master startup when hbase:meta is not online, if only so an operator can schedule an assign for meta, or schedule fixup procedures (HBASE-20786 has discussion of why we cannot just auto-schedule an assign of meta). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (HBASE-21190) Log files and count of entries in each as we load from the MasterProcWAL store
[ https://issues.apache.org/jira/browse/HBASE-21190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack resolved HBASE-21190. --- Resolution: Fixed Fix Version/s: 2.2.0 3.0.0 I pushed this two-line change to branch-2.0+ > Log files and count of entries in each as we load from the MasterProcWAL store > -- > > Key: HBASE-21190 > URL: https://issues.apache.org/jira/browse/HBASE-21190 > Project: HBase > Issue Type: Sub-task > Components: amv2 > Reporter: stack > Assignee: stack > Priority: Major > Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3 > > Attachments: HBASE-21190.branch-2.1.001.patch > > > Sometimes this can take a while especially if loads of files. Emit counts of > entries so operator gets sense of scale of procedures being processed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21190) Log files and count of entries in each as we load from the MasterProcWAL store
stack created HBASE-21190: - Summary: Log files and count of entries in each as we load from the MasterProcWAL store Key: HBASE-21190 URL: https://issues.apache.org/jira/browse/HBASE-21190 Project: HBase Issue Type: Sub-task Components: amv2 Reporter: stack Assignee: stack Fix For: 2.1.1, 2.0.3 Sometimes this can take a while, especially if there are loads of files. Emit counts of entries so the operator gets a sense of the scale of the procedures being processed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
HBaseCon 2016 and Archives
Hi there! In HBaseCon 2016 West (San Francisco) I presented, together with David Pope, "Containerizing Apache HBase Clusters". I think there was some issue with the recording; we were told that due to past issues recording Facebook talks, they didn't publish the video this time. That was a pity. However, I've now checked the archive of HBaseCon and we are completely absent from it. There were other presentations by Facebookers in there and they are listed (Matt Mullins on the panel, for example), and in HBaseCon 2016 East, Mikhail Antonov's talk too, both with slides and recording. I'd like to know if you can:

- Fix this, by adding the talk to HBaseCon 2016 East. Here are the slides, so you can at least link that: https://speakerdeck.com/jjmaestro/hbasecon2016-containerizing-apache-hbase-clusters

- Find out if the recording took place.

Thanks a ton for all your help! Regards, Javier
Re: HBaseCon 2016 and Archives
Sorry, the final link for the slides is https://speakerdeck.com/jjmaestro/hbasecon-2016-west-containerizing-apache-hbase-clusters I've just updated it to reflect that the conference was the West conference. Thanks, Javier On Wed, Sep 12, 2018 at 2:04 PM J. Javier Maestro wrote: > > Hi there! > > In HBaseCon 2016 West (San Francisco) I presented, together with David > Pope "Containerizing Apache HBase Clusters". I think there was some > issue with the recording, we were told that due to past issues > recording Facebook talks, they didn't publish the video this time. > That was a pity. > > However, I've now checked the archive of HBaseCon and we are > completely absent from there. There were other presentations by > Facebookers in there and they are listed (Matt Mullins on the panel, > for example) and in HBaseCon 2016 East, Mikhail Antonov's talk too, > both with slides and recording. > > I'd like to know if you can: > > - Fix this, by adding the talk to HBaseCon 2016 East. Here are the > slides, so you can at least link that: > https://speakerdeck.com/jjmaestro/hbasecon2016-containerizing-apache-hbase-clusters > > - Find out if the recording took place. > > Thanks a ton for all your help! > > Regards, > > Javier -- J. Javier Maestro
[jira] [Created] (HBASE-21189) flaky job should gather machine stats
Sean Busbey created HBASE-21189: --- Summary: flaky job should gather machine stats Key: HBASE-21189 URL: https://issues.apache.org/jira/browse/HBASE-21189 Project: HBase Issue Type: Sub-task Components: test Reporter: Sean Busbey Assignee: Sean Busbey The flaky test job should gather the same environment information as our normal nightly tests do. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21188) Print heap and GC information in our junit ResourceChecker
Duo Zhang created HBASE-21188: - Summary: Print heap and GC information in our junit ResourceChecker Key: HBASE-21188 URL: https://issues.apache.org/jira/browse/HBASE-21188 Project: HBase Issue Type: Sub-task Reporter: Duo Zhang Assignee: Duo Zhang Fix For: 3.0.0, 2.2.0 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
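For reference, a minimal self-contained sketch of what such a heap/GC printer could look like using only the JDK's management beans (the class name and output format here are my own, not the actual patch):

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

public class GcInfoPrinter {
    public static void main(String[] args) {
        // Current heap occupancy for the running JVM.
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        System.out.println("heap used=" + heap.getUsed()
                + " committed=" + heap.getCommitted()
                + " max=" + heap.getMax());
        // Cumulative collection count and time per collector (e.g. young/old gen).
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println("gc " + gc.getName()
                    + " count=" + gc.getCollectionCount()
                    + " timeMs=" + gc.getCollectionTime());
        }
    }
}
```

Dumping this from the ResourceChecker before and after each test would make GC pauses like the ones suspected in HBASE-21187 visible in the test output.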
[jira] [Created] (HBASE-21187) The HBase UTs are extremely slow on some Jenkins nodes
Duo Zhang created HBASE-21187: - Summary: The HBase UTs are extremely slow on some Jenkins nodes Key: HBASE-21187 URL: https://issues.apache.org/jira/browse/HBASE-21187 Project: HBase Issue Type: Bug Components: test Reporter: Duo Zhang

Looking at the flaky dashboard for the master branch, the top several UTs are likely to fail at the same time. One thing the failed flaky-test jobs have in common is that their execution time is more than one hour, while the successful executions usually take only about half an hour. I have also compared the output for TestRestoreSnapshotFromClientWithRegionReplicas: in a successful run the DisableTableProcedure can finish within one second, while in a failed run it can take more than half a minute.

Not sure what the real problem is, but it seems that in the failed runs there are time holes in the output, i.e., there is no log output for several seconds. Like this:

{noformat}
2018-09-11 21:08:08,152 INFO [PEWorker-4] procedure2.ProcedureExecutor(1500): Finished pid=490, state=SUCCESS, hasLock=false; CreateTableProcedure table=testRestoreSnapshotAfterTruncate in 12.9380sec
2018-09-11 21:08:15,590 DEBUG [RpcServer.default.FPBQ.Fifo.handler=1,queue=0,port=33663] master.MasterRpcServices(1174): Checking to see if procedure is done pid=490
{noformat}

No log output for about 7 seconds. And for a successful run, at the same place:

{noformat}
2018-09-12 07:47:32,488 INFO [PEWorker-7] procedure2.ProcedureExecutor(1500): Finished pid=490, state=SUCCESS, hasLock=false; CreateTableProcedure table=testRestoreSnapshotAfterTruncate in 1.2220sec
2018-09-12 07:47:32,881 DEBUG [RpcServer.default.FPBQ.Fifo.handler=3,queue=0,port=59079] master.MasterRpcServices(1174): Checking to see if procedure is done pid=490
{noformat}

There is no such hole. Maybe there is a big GC pause? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HBASE-21186) Document hbase.regionserver.executor.openregion.threads in MTTR section
Sahil Aggarwal created HBASE-21186: -- Summary: Document hbase.regionserver.executor.openregion.threads in MTTR section Key: HBASE-21186 URL: https://issues.apache.org/jira/browse/HBASE-21186 Project: HBase Issue Type: Improvement Components: documentation Reporter: Sahil Aggarwal Assignee: Sahil Aggarwal hbase.regionserver.executor.openregion.threads helps improve MTTR by increasing the rate at which a RegionServer can process assign RPCs from the HMaster, but it is not documented. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
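An operator would set it in hbase-site.xml; the property name is the real one, but the value below is purely illustrative (the right count depends on region count and hardware):

```
<!-- Handler threads a RegionServer uses to open regions on assignment;
     raising it can speed up mass reassignment after a restart or crash.
     Value shown is illustrative only. -->
<property>
  <name>hbase.regionserver.executor.openregion.threads</name>
  <value>10</value>
</property>
```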
Re: [VOTE] First release candidate for HBase 1.2.7 is available
+1 - signatures, checksums: OK - rat check: OK - built from source: OK - unit tests: OK - ltt with 1M rows: OK - web UI: OK - shell and basic commands: OK Best regards, Balazs On Wed, Sep 12, 2018 at 1:04 AM Zach York wrote: > +1 (non-binding) > > > Checked sums and sigs: OK > RAT check: OK > Built from src: OK (8u92) > Unit tests pass: OK (8u92) > > Thanks, > Zach > > On Tue, Sep 11, 2018 at 9:06 AM Andrew Purtell > wrote: > > > +1 > > > > Checked sums and signatures: ok > > Checked compat report: ok, removed methods in Base64 are allowed by > > consensus exception > > RAT check: ok > > Built from source: ok (7u80) > > Unit tests pass: ok (8u172), note TestMutateRowsRecovery and > > TestCompactionWithThroughputController failed when running the suite but > > passed by themselves > > 1M row LTT: ok (8u172) > > > > > > On Fri, Sep 7, 2018 at 8:54 PM Sean Busbey wrote: > > > > > The first release candidate for HBase 1.2.7 is available for download: > > > > > > https://dist.apache.org/repos/dist/dev/hbase/hbase-1.2.7RC0/ > > > > > > Maven artifacts are also available in a staging repository at: > > > > > > > https://repository.apache.org/content/repositories/orgapachehbase-1232/ > > > > > > Artifacts are signed with my key (0D80DB7C) published in our KEYS > > > file at http://www.apache.org/dist/hbase/KEYS > > > > > > The RC corresponds to the signed tag 1.2.7RC0, which currently points > > > to commit ref > > > > > > ac57c51f7ad25e312b4275665d62b34a5945422f > > > > > > HBase 1.2.7 is the seventh maintenance release in the HBase 1.2 line, > > > continuing on the theme of bringing a stable, reliable database to > > > the Hadoop and NoSQL communities. This release includes over 250 > > > bug fixes done in the 15 months since 1.2.6. 
> > > > > > This release candidate contains the following incompatible changes, > > > details in the release notes for the specific issue: > > > > > > * HBASE-20884 Replace usage of our Base64 implementation with java's > > > * HBASE-18577 shaded client should not include non-relocated third > party > > > dependencies > > > * HBASE-18142 delete in HBase shell should not delete previous versions > > > of a cell > > > * HBASE-18731 some protected methods of QuotaSettings have been marked > > > IA.Private and deprecated > > > * HBASE-16459 HBase shell no longer recognizes the --format option > > > > > > The detailed source and binary compatibility report vs 1.2.6 has been > > > published for your review, at: > > > > > > https://s.apache.org/hbase-1.2.7-rc0-compat-report > > > > > > The report shows some expected incompatibilities and one false > positive. > > > Details on HBASE-18276. > > > > > > Critical fixes include: > > > > > > * HBASE-18036 Data locality is not maintained after cluster restart > > > * HBASE-18233 We shouldn't wait for readlock in doMiniBatchMutation in > > > case of deadlock > > > * HBASE-9393 Region Server fails to properly close socket resulting in > > > many CLOSE_WAIT to Data Nodes > > > * HBASE-19924 RPC throttling does not work for multi() with request > > > count rater. 
> > > * HBASE-18192 Replication drops recovered queues on Region Server > > > shutdown > > > * HBASE-18282 ReplicationLogCleaner can delete WALs not yet replicated > > > in case of a KeeperException > > > * HBASE-19796 ReplicationSynUp tool is not replicating data if a WAL is > > > moved to splitting directory > > > * HBASE-17648 HBase table-level synchronization fails between two > > > secured(kerberized) clusters > > > * HBASE-18137 Replication gets stuck for empty WALs > > > * HBASE-18577 shaded client should not include non-relocated third > party > > > dependencies > > > * HBASE-19900 Region level exceptions destroy the result of batch > client > > > operations > > > * HBASE-21007 Memory leak in HBase REST server > > > > > > The full list of fixes included in this release is available at: > > > > > > https://s.apache.org/hbase-1.2.7-jira-release-notes > > > > > > and in the CHANGES.txt file included in the distribution. > > > > > > Please try out this candidate and vote +1/-1 on whether we should > > > release these artifacts as HBase 1.2.7. > > > > > > The VOTE will remain open for at least 72 hours. Given sufficient votes > > > I would like to close it on September 12th, 2018. > > > > > > thanks! > > > > > > -busbey > > > > > > as of this email the posted artifacts have the following SHA512: > > > > > > hbase-1.2.7-bin.tar.gz: > > > 00FC806A 335DFBDD 30720411 9FC0DD19 01AAC2E4 C8220B90 > > > D54263F4 4F49D49A 111C30D0 6E4CC6D3 249C4F5A 7A66064B > > > 9EF97A92 B8E559F9 11705137 C3B652F2 > > > > > > hbase-1.2.7-src.tar.gz: > > > F142B8E2 4F615D32 2B3816B1 71A61D9A 618FDD99 1EE69772 > > > E3D52226 D30E34B8 0F9A469E 5AA00F6F 12AB77D6 C7FC6ADE > > > 8CAB7254 8B2AF8E6 6D00E9EC D0E93AB4 > > > > > >