[jira] [Created] (HBASE-21195) Support Log storage similar to FB LogDevice

2018-09-12 Thread jagan (JIRA)
jagan created HBASE-21195:

 Summary: Support Log storage similar to FB LogDevice
 Key: HBASE-21195
 URL: https://issues.apache.org/jira/browse/HBASE-21195
 Project: HBase
  Issue Type: New Feature
Reporter: jagan


Log storage, which is write-once and sequential, can be optimized in the 
following ways:

1. Keys generated should be monotonically increasing.

2. The HFile key index can be a simple range index; a BloomFilter is not needed.

3. Instead of compaction, periodic deletion of old files based on TTL can be 
supported.
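The range-index point above holds because, with monotonically increasing keys, a sorted block index can answer lookups by binary search alone; a minimal sketch (keys, offsets, and block layout are all hypothetical):

```python
import bisect

# Sketch: with monotonically increasing keys, a block index can answer
# point and range lookups with binary search alone, so a Bloom filter
# buys little. 'index_keys' holds the first key stored in each block;
# 'index_offsets' the (hypothetical) byte offset of each block.
index_keys = [100, 200, 300, 400]
index_offsets = [0, 4096, 8192, 12288]

def block_for_key(key):
    """Return the offset of the only block that can contain `key`."""
    i = bisect.bisect_right(index_keys, key) - 1
    if i < 0:
        return None  # key precedes the first block
    return index_offsets[i]
```

A lookup lands in exactly one candidate block, so the negative-lookup savings a BloomFilter normally provides mostly disappear.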



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-21194) Add TestCopyTable which exercises MOB feature

2018-09-12 Thread Ted Yu (JIRA)
Ted Yu created HBASE-21194:

 Summary: Add TestCopyTable which exercises MOB feature
 Key: HBASE-21194
 URL: https://issues.apache.org/jira/browse/HBASE-21194
 Project: HBase
  Issue Type: Test
Reporter: Ted Yu


Currently TestCopyTable doesn't cover table(s) with the MOB feature enabled.

We should add a variant that enables MOB on the table being copied and verifies 
that the MOB content is copied correctly.





[jira] [Created] (HBASE-21193) Retrying Callable doesn't take max retries from current context; uses defaults instead

2018-09-12 Thread stack (JIRA)
stack created HBASE-21193:

 Summary: Retrying Callable doesn't take max retries from current 
context; uses defaults instead
 Key: HBASE-21193
 URL: https://issues.apache.org/jira/browse/HBASE-21193
 Project: HBase
  Issue Type: Bug
Reporter: stack


This makes it hard to change the retry count on a read of meta, for instance.

I noticed this when trying to change the defaults for a meta read. I made a 
custom Connection inside the master with a new Configuration that had rpc 
retries and timings upped radically. My reads nonetheless were finishing at the 
usual retry point (31 tries after 60 seconds or so) because the 
Retrying Callable that does the read was taking max retries from the defaults 
rather than reading the passed-in Configuration.
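For context, these are the stock client-side retry settings such a custom Configuration would override; a hedged hbase-site.xml sketch (the values are illustrative only, not recommendations):

```xml
<!-- Hypothetical overrides; property names are the standard HBase
     client settings, values are made up for illustration. -->
<property>
  <name>hbase.client.retries.number</name>
  <value>10</value>
</property>
<property>
  <name>hbase.client.pause</name>
  <value>250</value>
</property>
<property>
  <name>hbase.rpc.timeout</name>
  <value>60000</value>
</property>
```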





[jira] [Created] (HBASE-21192) Add HOW-TO repair damaged AMv2.

2018-09-12 Thread stack (JIRA)
stack created HBASE-21192:

 Summary: Add HOW-TO repair damaged AMv2.
 Key: HBASE-21192
 URL: https://issues.apache.org/jira/browse/HBASE-21192
 Project: HBase
  Issue Type: Sub-task
  Components: amv2
Reporter: stack
Assignee: stack


Need a page or two on how to do various fixups. Will include doc on how to 
identify a particular circumstance, how to run a repair, as well as caveats (e.g. 
if there is no log recovery, then a region may be missing edits).

Add pointers to log messages, especially those that explicitly ask for operator 
intervention; e.g. Master#inMeta.





[jira] [Created] (HBASE-21191) Add a holding-pattern if no assign for meta or namespace (Can happen if masterprocwals have been cleared).

2018-09-12 Thread stack (JIRA)
stack created HBASE-21191:

 Summary: Add a holding-pattern if no assign for meta or namespace 
(Can happen if masterprocwals have been cleared).
 Key: HBASE-21191
 URL: https://issues.apache.org/jira/browse/HBASE-21191
 Project: HBase
  Issue Type: Sub-task
  Components: amv2
Reporter: stack
Assignee: stack


If the masterprocwals have been removed -- operator error, hdfs data loss, or 
because we have gotten ourselves into a pathological state where we have 
hundreds of masterprocwals to process and it is taking too long so we just 
want to start over -- then master startup has a dilemma. Master startup 
needs hbase:meta to be online. If the masterprocwals have been removed, there 
may be no outstanding assign or servercrashprocedure with coverage for 
hbase:meta (I ran into this issue repeatedly in internal testing, purging 
masterprocwals on a large test cluster). Worse, when master startup cannot find 
an online hbase:meta, it exits after exhausting the RPC retries.

So, we need a holding-pattern for master startup if hbase:meta is not online, if 
only so an operator can schedule an assign for meta or schedule fixup 
procedures (HBASE-20786 has discussion on why we cannot just auto-schedule an 
assign of meta).





[jira] [Resolved] (HBASE-21190) Log files and count of entries in each as we load from the MasterProcWAL store

2018-09-12 Thread stack (JIRA)


 [ https://issues.apache.org/jira/browse/HBASE-21190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack resolved HBASE-21190.
   Resolution: Fixed
Fix Version/s: 2.2.0
               3.0.0

I pushed this two-line change to branch-2.0+.

> Log files and count of entries in each as we load from the MasterProcWAL store
>
> Key: HBASE-21190
> URL: https://issues.apache.org/jira/browse/HBASE-21190
> Project: HBase
>  Issue Type: Sub-task
>  Components: amv2
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-21190.branch-2.1.001.patch
>
>
> Sometimes this can take a while, especially if there are loads of files. Emit 
> counts of entries so the operator gets a sense of the scale of procedures 
> being processed.





[jira] [Created] (HBASE-21190) Log files and count of entries in each as we load from the MasterProcWAL store

2018-09-12 Thread stack (JIRA)
stack created HBASE-21190:

 Summary: Log files and count of entries in each as we load from 
the MasterProcWAL store
 Key: HBASE-21190
 URL: https://issues.apache.org/jira/browse/HBASE-21190
 Project: HBase
  Issue Type: Sub-task
  Components: amv2
Reporter: stack
Assignee: stack
 Fix For: 2.1.1, 2.0.3


Sometimes this can take a while, especially if there are loads of files. Emit 
counts of entries so the operator gets a sense of the scale of procedures being 
processed.
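The requested logging amounts to one line per WAL file with its entry count as the store is replayed; a language-neutral sketch in Python (the file layout and function names are hypothetical, for illustration only):

```python
import logging

logging.basicConfig(format="%(message)s", level=logging.INFO)
LOG = logging.getLogger("wal-load")

def load_store(wal_files):
    """Replay (name, entries) pairs, logging a per-file entry count so an
    operator can gauge the scale of the procedures being processed."""
    total = 0
    for name, entries in wal_files:
        total += len(entries)
        LOG.info("Loaded %d entries from %s (%d total so far)",
                 len(entries), name, total)
    return total
```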





HBaseCon 2016 and Archives

2018-09-12 Thread J. Javier Maestro
Hi there!

In HBaseCon 2016 West (San Francisco) I presented, together with David
Pope, "Containerizing Apache HBase Clusters". I think there was some
issue with the recording; we were told that, due to past issues
recording Facebook talks, they didn't publish the video this time.
That was a pity.

However, I've now checked the HBaseCon archive and we are completely
absent from it. Other presentations by Facebookers are listed there
(Matt Mullins on the panel, for example), as is Mikhail Antonov's talk
from HBaseCon 2016 East, both with slides and recordings.

I'd like to know if you can:

- Fix this, by adding the talk to HBaseCon 2016 East. Here are the
slides, so you can at least link that:
https://speakerdeck.com/jjmaestro/hbasecon2016-containerizing-apache-hbase-clusters

- Find out if the recording took place.

Thanks a ton for all your help!

Regards,

Javier


Re: HBaseCon 2016 and Archives

2018-09-12 Thread J. Javier Maestro
Sorry, the final link for the slides is

https://speakerdeck.com/jjmaestro/hbasecon-2016-west-containerizing-apache-hbase-clusters

I've just updated it to reflect that the conference was the West conference.

Thanks,

Javier



-- 
J. Javier Maestro 


[jira] [Created] (HBASE-21189) flaky job should gather machine stats

2018-09-12 Thread Sean Busbey (JIRA)
Sean Busbey created HBASE-21189:

 Summary: flaky job should gather machine stats
 Key: HBASE-21189
 URL: https://issues.apache.org/jira/browse/HBASE-21189
 Project: HBase
  Issue Type: Sub-task
  Components: test
Reporter: Sean Busbey
Assignee: Sean Busbey


The flaky test job should gather all the same environment information as our 
normal nightly tests.





[jira] [Created] (HBASE-21188) Print heap and GC information in our junit ResourceChecker

2018-09-12 Thread Duo Zhang (JIRA)
Duo Zhang created HBASE-21188:

 Summary: Print heap and GC information in our junit ResourceChecker
 Key: HBASE-21188
 URL: https://issues.apache.org/jira/browse/HBASE-21188
 Project: HBase
  Issue Type: Sub-task
Reporter: Duo Zhang
Assignee: Duo Zhang
 Fix For: 3.0.0, 2.2.0








[jira] [Created] (HBASE-21187) The HBase UTs are extremely slow on some jenkins node

2018-09-12 Thread Duo Zhang (JIRA)
Duo Zhang created HBASE-21187:

 Summary: The HBase UTs are extremely slow on some jenkins node
 Key: HBASE-21187
 URL: https://issues.apache.org/jira/browse/HBASE-21187
 Project: HBase
  Issue Type: Bug
  Components: test
Reporter: Duo Zhang


Looking at the flaky dashboard for the master branch, the top several UTs are 
likely to fail at the same time. One thing the failed flaky test jobs have in 
common is that the execution time is more than one hour, while successful 
executions usually take only about half an hour.

And I have compared the output of 
TestRestoreSnapshotFromClientWithRegionReplicas: in a successful run, the 
DisableTableProcedure can finish within one second, while in a failed run it 
can take more than half a minute.

Not sure what the real problem is, but it seems that in the failed runs there 
are time holes in the output, i.e., there is no log output for several 
seconds. Like this:
{noformat}
2018-09-11 21:08:08,152 INFO  [PEWorker-4] procedure2.ProcedureExecutor(1500): 
Finished pid=490, state=SUCCESS, hasLock=false; CreateTableProcedure 
table=testRestoreSnapshotAfterTruncate in 12.9380sec
2018-09-11 21:08:15,590 DEBUG 
[RpcServer.default.FPBQ.Fifo.handler=1,queue=0,port=33663] 
master.MasterRpcServices(1174): Checking to see if procedure is done pid=490
{noformat}

No log output for about 7 seconds.

And for a successful run, at the same place:
{noformat}
2018-09-12 07:47:32,488 INFO  [PEWorker-7] procedure2.ProcedureExecutor(1500): 
Finished pid=490, state=SUCCESS, hasLock=false; CreateTableProcedure 
table=testRestoreSnapshotAfterTruncate in 1.2220sec
2018-09-12 07:47:32,881 DEBUG 
[RpcServer.default.FPBQ.Fifo.handler=3,queue=0,port=59079] 
master.MasterRpcServices(1174): Checking to see if procedure is done pid=490
{noformat}

There is no such hole.

Maybe there is a big GC pause?
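Such holes can be located mechanically by diffing consecutive log timestamps; a small sketch (timestamp layout assumed to match the log4j lines quoted above):

```python
from datetime import datetime

# Sketch: report stretches of a test log with no output at all -- the
# "time holes" described above. The first 23 characters of a line are
# assumed to be a log4j timestamp like "2018-09-11 21:08:08,152".
FMT = "%Y-%m-%d %H:%M:%S,%f"

def find_gaps(lines, threshold_secs=5.0):
    """Yield (gap_seconds, line) for each line whose timestamp is more
    than threshold_secs after the previous timestamped line."""
    prev = None
    for line in lines:
        try:
            ts = datetime.strptime(line[:23], FMT)
        except ValueError:
            continue  # not a timestamped line
        if prev is not None:
            gap = (ts - prev).total_seconds()
            if gap > threshold_secs:
                yield gap, line
        prev = ts
```

Run over the two runs' surefire output, this would flag the 7-second hole in the failed run and nothing in the successful one.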





[jira] [Created] (HBASE-21186) Document hbase.regionserver.executor.openregion.threads in MTTR section

2018-09-12 Thread Sahil Aggarwal (JIRA)
Sahil Aggarwal created HBASE-21186:

 Summary: Document hbase.regionserver.executor.openregion.threads 
in MTTR section
 Key: HBASE-21186
 URL: https://issues.apache.org/jira/browse/HBASE-21186
 Project: HBase
  Issue Type: Improvement
  Components: documentation
Reporter: Sahil Aggarwal
Assignee: Sahil Aggarwal


hbase.regionserver.executor.openregion.threads helps improve MTTR by increasing 
the rate at which a RegionServer can process assign RPCs from the HMaster, but 
it is not documented.
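For illustration, the setting would be raised in hbase-site.xml like so (the value here is made up; choose one based on region count and hardware):

```xml
<!-- Illustrative value only: number of handler threads a RegionServer
     uses to process open-region requests during (re)assignment. -->
<property>
  <name>hbase.regionserver.executor.openregion.threads</name>
  <value>20</value>
</property>
```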





Re: [VOTE] First release candidate for HBase 1.2.7 is available

2018-09-12 Thread Balazs Meszaros
+1

- signatures, checksums: OK
- rat check: OK
- built from source: OK
- unit tests: OK
- ltt with 1M rows: OK
- web UI: OK
- shell and basic commands: OK

Best regards,
Balazs

On Wed, Sep 12, 2018 at 1:04 AM Zach York wrote:

> +1 (non-binding)
>
>
> Checked sums and sigs: OK
> RAT check: OK
> Built from src: OK (8u92)
> Unit tests pass: OK (8u92)
>
> Thanks,
> Zach
>
> On Tue, Sep 11, 2018 at 9:06 AM Andrew Purtell wrote:
>
> > +1
> >
> > Checked sums and signatures: ok
> > Checked compat report: ok, removed methods in Base64 are allowed by
> > consensus exception
> > RAT check: ok
> > Built from source: ok (7u80)
> > Unit tests pass: ok (8u172), note TestMutateRowsRecovery and
> > TestCompactionWithThroughputController failed when running the suite but
> > passed by themselves
> > 1M row LTT: ok (8u172)
> >
> >
> > On Fri, Sep 7, 2018 at 8:54 PM Sean Busbey  wrote:
> >
> > > The first release candidate for HBase 1.2.7 is available for download:
> > >
> > > https://dist.apache.org/repos/dist/dev/hbase/hbase-1.2.7RC0/
> > >
> > > Maven artifacts are also available in a staging repository at:
> > >
> > >
> > > https://repository.apache.org/content/repositories/orgapachehbase-1232/
> > >
> > > Artifacts are signed with my key (0D80DB7C) published in our KEYS
> > > file at http://www.apache.org/dist/hbase/KEYS
> > >
> > > The RC corresponds to the signed tag 1.2.7RC0, which currently points
> > >  to commit ref
> > >
> > > ac57c51f7ad25e312b4275665d62b34a5945422f
> > >
> > > HBase 1.2.7 is the seventh maintenance release in the HBase 1.2 line,
> > > continuing on the theme of bringing a stable, reliable database to
> > > the Hadoop and NoSQL communities. This release includes over 250
> > > bug fixes done in the 15 months since 1.2.6.
> > >
> > > This release candidate contains the following incompatible changes,
> > > details in the release notes for the specific issue:
> > >
> > > * HBASE-20884 Replace usage of our Base64 implementation with java's
> > > * HBASE-18577 shaded client should not include non-relocated third-party
> > >   dependencies
> > > * HBASE-18142 delete in HBase shell should not delete previous versions
> > >   of a cell
> > > * HBASE-18731 some protected methods of QuotaSettings have been marked
> > >   IA.Private and deprecated
> > > * HBASE-16459 HBase shell no longer recognizes the --format option
> > >
> > > The detailed source and binary compatibility report vs 1.2.6 has been
> > > published for your review, at:
> > >
> > > https://s.apache.org/hbase-1.2.7-rc0-compat-report
> > >
> > > The report shows some expected incompatibilities and one false
> > > positive.
> > > Details on HBASE-18276.
> > >
> > > Critical fixes include:
> > >
> > > * HBASE-18036 Data locality is not maintained after cluster restart
> > > * HBASE-18233 We shouldn't wait for readlock in doMiniBatchMutation in
> > >   case of deadlock
> > > * HBASE-9393  Region Server fails to properly close socket resulting in
> > >   many CLOSE_WAIT to Data Nodes
> > > * HBASE-19924 RPC throttling does not work for multi() with request
> > >   count rater.
> > > * HBASE-18192 Replication drops recovered queues on Region Server
> > >   shutdown
> > > * HBASE-18282 ReplicationLogCleaner can delete WALs not yet replicated
> > >   in case of a KeeperException
> > > * HBASE-19796 ReplicationSynUp tool is not replicating data if a WAL is
> > >   moved to splitting directory
> > > * HBASE-17648 HBase table-level synchronization fails between two
> > >   secured(kerberized) clusters
> > > * HBASE-18137 Replication gets stuck for empty WALs
> > > * HBASE-18577 shaded client should not include non-relocated third-party
> > >   dependencies
> > > * HBASE-19900 Region level exceptions destroy the result of batch client
> > >   operations
> > > * HBASE-21007 Memory leak in HBase REST server
> > >
> > > The full list of fixes included in this release is available at:
> > >
> > > https://s.apache.org/hbase-1.2.7-jira-release-notes
> > >
> > > and in the CHANGES.txt file included in the distribution.
> > >
> > > Please try out this candidate and vote +1/-1 on whether we should
> > > release these artifacts as HBase 1.2.7.
> > >
> > > The VOTE will remain open for at least 72 hours. Given sufficient votes
> > > I would like to close it on September 12th, 2018.
> > >
> > > thanks!
> > >
> > > -busbey
> > >
> > > as of this email the posted artifacts have the following SHA512:
> > >
> > > hbase-1.2.7-bin.tar.gz:
> > > 00FC806A 335DFBDD 30720411 9FC0DD19 01AAC2E4 C8220B90
> > > D54263F4 4F49D49A 111C30D0 6E4CC6D3 249C4F5A 7A66064B
> > > 9EF97A92 B8E559F9 11705137 C3B652F2
> > >
> > > hbase-1.2.7-src.tar.gz:
> > > F142B8E2 4F615D32 2B3816B1 71A61D9A 618FDD99 1EE69772
> > > E3D52226 D30E34B8 0F9A469E 5AA00F6F 12AB77D6 C7FC6ADE
> > > 8CAB7254 8B2AF8E6 6D00E9EC D0E93AB4
> > >
> >
>
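As a side note, the SHA512 strings quoted above can be checked against a downloaded artifact with `gpg --verify` plus `sha512sum`, or with a short script; a minimal Python sketch (the path and expected digest are whatever you downloaded and the value from the email):

```python
import hashlib

# Sketch: verify a downloaded artifact against a published SHA512 string,
# ignoring the whitespace grouping used in release emails.
def sha512_matches(path, expected_hex):
    h = hashlib.sha512()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest() == "".join(expected_hex.lower().split())
```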