Re: avoiding hot spot for timestamp prefix key

2015-05-22 Thread Ted Yu
The custom split policy needs to respect the fact that timestamp is the leading part of the rowkey. This would avoid the overlap you mentioned. Cheers On May 21, 2015, at 11:55 PM, Shushant Arora shushantaror...@gmail.com wrote: guid change with every key, patterns is 2015-05-22

Re: avoiding hot spot for timestamp prefix key

2015-05-22 Thread Shushant Arora
The guid changes with every key; the pattern is 2015-05-22 00:02:01#AB12EC945 2015-05-22 00:02:02#CD9870001234AB457 When we specify a custom split algorithm, can keys in the same sort-order range, say (1-7), lie in region R1 as well as in region R2? Then how will the .META. table make

Some questions about memstore flush

2015-05-22 Thread charlse_Li
Dear all, I have some confusion about memstore flush: how does the parameter hbase.hregion.memstore.flush.size affect the time at which FlushHandler executes a memstore flush? Thanks 2015-05-22 charlse_Li

Re: avoiding hot spot for timestamp prefix key

2015-05-22 Thread Shushant Arora
Since the custom split policy is based on the second part, i.e. the guid, how will it be identified which region holds a key whose first part is 2015-05-22 00:01:02? On Fri, May 22, 2015 at 1:12 PM, Ted Yu yuzhih...@gmail.com wrote: The custom split policy needs to respect the fact that timestamp is

Many rows vs many columns

2015-05-22 Thread Dominik Hübner
Is it better to aggregate more data in a single row with more columns, or to aim for many rather short rows? I remember having read somewhere that it doesn’t really matter, as the physical layout would be the same, but I cannot find this reference anymore.
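
As a rough illustration of the "physical layout would be the same" remark: HBase stores every cell as its own key-value entry keyed by row, column family, qualifier, and timestamp, so one wide row and many short rows both end up as a sorted run of cells. The sketch below is plain Python, not HBase code; `Cell`, `wide_layout`, `tall_layout`, and the sample readings are hypothetical names and data chosen only for illustration, and timestamps are left out for brevity.

```python
from typing import NamedTuple


class Cell(NamedTuple):
    # Simplified stand-in for a stored cell; real cells also carry a timestamp.
    row: str
    family: str
    qualifier: str
    value: str


# Hypothetical sample data: three readings for one sensor.
readings = {"00:01": "20.5", "00:02": "20.7", "00:03": "21.0"}


def wide_layout(sensor: str, data: dict) -> list:
    # One row per sensor, one column (qualifier) per reading.
    return sorted(Cell(sensor, "d", minute, v) for minute, v in data.items())


def tall_layout(sensor: str, data: dict) -> list:
    # One row per reading, a single fixed column.
    return sorted(Cell(f"{sensor}#{minute}", "d", "v", v) for minute, v in data.items())


wide = wide_layout("sensor1", readings)
tall = tall_layout("sensor1", readings)

# Both designs produce the same number of stored cells, in the same order.
print(len(wide), len(tall))
```

What does differ between the two designs is where the identifying information sits (qualifier versus row key), which matters for access patterns: a row is never split across regions, and a single-row read or write is atomic.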

Re: Optimizing compactions on super-low-cost HW

2015-05-22 Thread Serega Sheypak
What version of hbase are you on? We are on CDH 5.2.1, HBase 0.98. These hfiles are created on same cluster with MR? (i.e. they are using up i/os) The same cluster :) They are created during the night, and we get IO degradation when no MR runs. I understand that MR also puts significant pressure on IO.

Re: DNS mismatch between master and regionserver causes doubly registered regionservers

2015-05-22 Thread Esteban Gutierrez
Correct, but setting hbase.regionserver.hostname should be enough if I remember correctly; you also need to define hbase.master.hostname if you are using HBase 1.1. cheers, esteban. -- Cloudera, Inc. On Fri, May 22, 2015 at 12:55 PM, Bryan Beaudreault bbeaudrea...@hubspot.com wrote: Thanks

Re: avoiding hot spot for timestamp prefix key

2015-05-22 Thread Vladimir Rodionov
RegionSplitPolicy only allows you to customize the split point (row key). All rows above this split point will go to the first daughter region, below - to the second. The answer to the original question is: No, you cannot base your custom policy on the second part of a key. -Vlad On Fri, May 22,
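
To make the point about split points concrete: each region covers one contiguous range of sorted row keys, and a split only chooses a boundary key inside that range. The following is a minimal sketch in plain Python, not the HBase API; `region_for` and the sample split points are hypothetical, and the row keys follow the timestamp#guid pattern quoted earlier in this thread.

```python
from bisect import bisect_right


def region_for(row_key: str, split_points: list) -> int:
    """Return the index of the region that holds row_key.

    split_points must be sorted. Region i holds keys k with
    split_points[i-1] <= k < split_points[i], so a key equal to a
    split point lands in the later region.
    """
    return bisect_right(split_points, row_key)


# Hypothetical boundaries: three regions split on the timestamp prefix.
split_points = ["2015-05-22 00:02:00", "2015-05-22 00:03:00"]

keys = [
    "2015-05-22 00:01:02#AB12EC945",
    "2015-05-22 00:02:01#AB12EC945",
    "2015-05-22 00:02:02#CD9870001234AB457",
    "2015-05-22 00:03:30#0F00",
]

for key in keys:
    print(region_for(key, split_points), key)
```

Because keys are compared as whole strings from the left, every key maps to exactly one region and ranges cannot overlap. It also shows why the hot spot remains: newly written keys carry the largest timestamps, so they all sort into the last region no matter where earlier split points were placed.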

Re: DNS mismatch between master and regionserver causes doubly registered regionservers

2015-05-22 Thread Ted Yu
Bryan: HBASE-12954 introduced config for region server hostname. The following added config for master hostname: HBASE-13481 Master should respect master (old) DNS/bind related configurations I will link the above JIRA to HBASE-12954 Cheers On Fri, May 22, 2015 at 12:55 PM, Bryan Beaudreault

Re: DNS mismatch between master and regionserver causes doubly registered regionservers

2015-05-22 Thread Sean Busbey
On Fri, May 22, 2015 at 2:34 PM, Ted Yu yuzhih...@gmail.com wrote: bq. hbase-1.1.0.1 To my knowledge, latest release was 1.1.0. The release before that was 1.0.1 Can you clarify ? Thanks The 1.1.0.1 release votes all passed. I don't think the announcement has gone out yet because we

Re: DNS mismatch between master and regionserver causes doubly registered regionservers

2015-05-22 Thread Bryan Beaudreault
Thank you guys for the help. I'm reading through the comments now to try to get a handle on why this changed. Looking forward to seeing HBASE-12954 in CDH5.4.3 On Fri, May 22, 2015 at 4:02 PM, Esteban Gutierrez este...@cloudera.com wrote: Correct, but settings hbase.regionserver.hostname

Re: DNS mismatch between master and regionserver causes doubly registered regionservers

2015-05-22 Thread Andrew Purtell
There's always some delay between when release artifacts are sent onward to the mirrors and when the announcements go out, for various reasons. We made three patch releases this week. I have it on good authority the announcements for the current crop of releases will go out this coming Monday.

Re: Can TableSnapshotInputFormat support multiple snapshots as the MR input?

2015-05-22 Thread Shi, Shaofeng
Hi Andrew, this is what we need, thank you! In which version will this feature be released? Our HBase is v0.98; is it possible to just apply this patch to get the feature? On 5/22/15, 6:06 PM, Andrew Mains andrew.ma...@kontagent.com wrote: In the latest release, no; however I've filed a ticket here

Re: DNS mismatch between master and regionserver causes doubly registered regionservers

2015-05-22 Thread Bryan Beaudreault
HBASE-12954 looks like it would solve my issue, but is not in cdh5.4.0. I also don't think it fixes what I think the real bug is -- it's more of a workaround. In terms of the actual bug, I think one of at least two possible solutions should be considered: 1. Remove the support for

Re: DNS mismatch between master and regionserver causes doubly registered regionservers

2015-05-22 Thread Bryan Beaudreault
Thanks Esteban. So the idea is you set hbase.master.dns.* on the master side, and hbase.regionserver.hostname to a value matching what the master DNS server would return on the regionserver side? On Fri, May 22, 2015 at 3:51 PM, Esteban Gutierrez este...@cloudera.com wrote: Hi Bryan, The

Re: DNS mismatch between master and regionserver causes doubly registered regionservers

2015-05-22 Thread Stack
On Fri, May 22, 2015 at 10:17 AM, Bryan Beaudreault bbeaudrea...@hubspot.com wrote: In our system each server has 2 dns associated with it, one always points to a private address and the other to public or private depending on the context. This issue did not show up in 0.94.x, but is

Re: DNS mismatch between master and regionserver causes doubly registered regionservers

2015-05-22 Thread Stack
On Fri, May 22, 2015 at 10:12 PM, Stack st...@duboce.net wrote: On Fri, May 22, 2015 at 10:17 AM, Bryan Beaudreault bbeaudrea...@hubspot.com wrote: In our system each server has 2 dns associated with it, one always points to a private address and the other to public or private depending on

Re: avoiding hot spot for timestamp prefix key

2015-05-22 Thread Michael Segel
This is why I created HBASE-12853. So you don’t have to specify a custom split policy. Of course the simple solutions are often passed over because of NIH. ;-) To be blunt… You encapsulate the bucketing code so that you have a single API into HBase regardless of the type of storage
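
The bucketing idea mentioned here can be sketched roughly as follows. This is not the HBASE-12853 design, whose details are not shown in this excerpt; it is a generic key-salting sketch in plain Python, and `NUM_BUCKETS`, `bucketed_key`, and `scan_prefixes` are hypothetical names.

```python
import hashlib

NUM_BUCKETS = 8  # assumption: a small, fixed number of buckets


def bucketed_key(row_key: str, num_buckets: int = NUM_BUCKETS) -> str:
    # Derive the bucket from a stable hash of the key, so the same key
    # always gets the same prefix and can be read back directly.
    digest = hashlib.md5(row_key.encode("utf-8")).digest()
    bucket = digest[0] % num_buckets
    return f"{bucket:02d}|{row_key}"


def scan_prefixes(time_prefix: str, num_buckets: int = NUM_BUCKETS) -> list:
    # A time-range read must now fan out: one prefix scan per bucket.
    return [f"{b:02d}|{time_prefix}" for b in range(num_buckets)]


key = bucketed_key("2015-05-22 00:02:01#AB12EC945")
print(key)
print(scan_prefixes("2015-05-22 00:02"))
```

With a prefix like this, consecutive timestamps are spread over several key ranges instead of all landing at the end of the table. The cost is that a time-range read becomes one scan per bucket, which is the kind of detail that wrapping the bucketing code behind a single API is meant to hide from callers.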

Re: Optimizing compactions on super-low-cost HW

2015-05-22 Thread Michael Segel
Look, to be blunt, you’re screwed. If I read your cluster spec.. it sounds like you have a single i7 (quad core) cpu. That’s 4 cores or 8 threads. Mirroring the OS is common practice. Using the same drives for Hadoop… not so good, but once the server boots up… not so much I/O. It’s not good,

Can TableSnapshotInputFormat support multiple snapshots as the MR input?

2015-05-22 Thread Shi, Shaofeng
Hello, We have a scenario which needs to merge multiple HBase tables into one table periodically. To gain better performance and minimize the impact on the HBase server, we are evaluating the method of using TableSnapshotInputFormat (http://www.slideshare.net/enissoz/mapreduce-over-snapshots); But from

Re: Optimizing compactions on super-low-cost HW

2015-05-22 Thread Serega Sheypak
We don't have money, these nodes are the cheapest. I totally agree that we need 4-6 HDDs, but unfortunately there is no chance of getting them. Okay, I'll try to apply Stack's suggestions. 2015-05-22 13:00 GMT+03:00 Michael Segel michael_se...@hotmail.com: Look, to be blunt, you’re screwed. If I read

Re: Coprocessor accessibility

2015-05-22 Thread Ted Yu
In hbase shell, please use: help 'get' You will see how a custom attribute can be passed. Cheers On May 22, 2015, at 3:07 AM, Navdeep Agrawal navdeep_agra...@symantec.com wrote: Nice I was looking for something like this . thank you Ted just for curiosity can we set attribute through

Re: Many rows vs many columns

2015-05-22 Thread Ted Yu
Have you looked at opentsdb ? http://opentsdb.net/docs/build/html/user_guide/backends/hbase.html What access pattern to your data are you expecting ? Cheers On Fri, May 22, 2015 at 12:50 AM, Dominik Hübner cont...@dhuebner.com wrote: Is it better to aggregate more data in a single row with

RE: Coprocessor accessibility

2015-05-22 Thread Navdeep Agrawal
Nice, I was looking for something like this. Thank you, Ted. Just out of curiosity: can we set an attribute through the shell, or are no attributes set by default through the shell's get? -Original Message- From: Ted Yu [mailto:yuzhih...@gmail.com] Sent: Thursday, May 21, 2015 9:29 PM To:

Re: Can TableSnapshotInputFormat support multiple snapshots as the MR input?

2015-05-22 Thread Andrew Mains
In the latest release, no; however I've filed a ticket here https://issues.apache.org/jira/browse/HBASE-13356 for this feature, and uploaded a patch for review. The patch provides a MultiTableSnapshotInputFormat which can run a list of scans over multiple snapshots. Jobs can be initialized

DNS mismatch between master and regionserver causes doubly registered regionservers

2015-05-22 Thread Bryan Beaudreault
In our system each server has 2 dns associated with it, one always points to a private address and the other to public or private depending on the context. This issue did not show up in 0.94.x, but is showing up on my new 1.x cluster. Basically it goes like this: 1. Regionserver starts up,

Re: DNS mismatch between master and regionserver causes doubly registered regionservers

2015-05-22 Thread Esteban Gutierrez
Hi Bryan, could you please be more specific about the 1.x version that you are using? we have HBASE-13481 and HBASE-12954 so it depends on which version of 1.x you are using. Regarding your account issue, I have created an INFRA JIRA on your behalf to look into your account problem. thanks,

Re: DNS mismatch between master and regionserver causes doubly registered regionservers

2015-05-22 Thread Bryan Beaudreault
Thank you Esteban. I checked two different versions: - hbase-1.0.0-cdh5.4.0 (this is the version I use) - hbase-1.1.0.1 (just wanted to check the latest release) On Fri, May 22, 2015 at 3:13 PM, Esteban Gutierrez este...@cloudera.com wrote: Hi Bryan, could you please be more specific about

Re: DNS mismatch between master and regionserver causes doubly registered regionservers

2015-05-22 Thread Ted Yu
bq. hbase-1.1.0.1 To my knowledge, latest release was 1.1.0. The release before that was 1.0.1 Can you clarify ? Thanks On Fri, May 22, 2015 at 12:23 PM, Bryan Beaudreault bbeaudrea...@hubspot.com wrote: Thank you Esteban. I checked two different versions: - hbase-1.0.0-cdh5.4.0 (this is