LeaseException

2012-07-13 Thread Oleg Ruchovets
Hi, running a scan job against HBase I got the following exception: org.apache.hadoop.hbase.regionserver.LeaseException: org.apache.hadoop.hbase.regionserver.LeaseException: lease '4926582861878965506' does not exist at org.apache.hadoop.hbase.regionserver.Leases.removeLease(Leases.java:230)

Re: Maximum number of tables ?

2012-07-13 Thread N Keywal
Hi, There are no real limits as far as I know. As you will have one region per table (at least :-), the number of regions will be something to monitor carefully if you need thousands of tables. See http://hbase.apache.org/book.html#arch.regions.size. Don't forget that you can add as many columns as

Re: HBase Java Client -- ZookeeperWrapper connects to remote server, but then reconnects to localhost?

2012-07-13 Thread Quinton
Hi Paul, I also ran into this problem. Have you solved it? How did you work it out?

Re: HDFS + HBASE process high cpu usage

2012-07-13 Thread Asaf Mesika
Thanks a lot! That must have been it. Unfortunately I couldn't really test this command, since the guys from ops rebooted the entire computer room during maintenance, and it fixed the issue. (This room is a lab room of course) Asaf On Jul 13, 2012, at 4:27 AM, Esteban Gutierrez wrote: date

HBase Fault tolerance

2012-07-13 Thread Sever Fundatureanu
Hello, I would like to understand more in-depth how fault tolerance is handled in HBase: 1. So for each put operation an RS first writes to an HLog file and then to the Memstore. If the RS crashes the HLog file is replayed by other servers, correct? My question is how is this HLog file different

Re: Embedded table data model

2012-07-13 Thread Guxiaobo
Hi Ian, What is your suggestion then? Sent from my iPad On 2012-7-13, at 12:55 PM, Ian Varley ivar...@salesforce.com wrote: Yes, that's what I mean. It is not the only way to model this, but your question was, Can we embed the transactions inside the customer table in HBase. On

Re: HBase Fault tolerance

2012-07-13 Thread Cristofer Weber
Hi Sever, Coprocessors are still new to me, so I don't have a good answer for your second question. But for your first, (as far as I understand) remember that you can send Puts/Deletes in any order, and Memstore is responsible for keeping your data sorted before flushing to a StoreFile, and

Re: hbase improve random read/write performance

2012-07-13 Thread Adrien Mogenet
You might have a look at the reference guide, or this article ( http://blog.newitfarmer.com/nosql/hbase-nosql/4385/repost-hbasearchitecture) or the Lars one ( http://www.larsgeorge.com/2009/10/hbase-architecture-101-storage.html) which explain HBase Storage architecture. Take care of block size,

Re: LeaseException

2012-07-13 Thread Daniel Iancu
Hi Oleg, Normally, if your client does not process rows fast enough you should get an UnknownScannerException. A higher cache value means a longer time between two next() operations (which renew the lease), so a higher chance of getting a USE. Try to optimize the row processing in the client and lower
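
A rough way to reason about the point above: the gap between two next() calls is about the scanner caching value times the client's per-row processing time, and the lease is at risk once that gap exceeds the lease period. The sketch below is plain arithmetic, not HBase API code; the class and method names are hypothetical, and the 60-second lease period and 100 ms per-row cost are assumed values for illustration only.

```java
// Sketch only: estimates whether one cached batch is processed before the scanner lease expires.
// All names here are hypothetical; the numbers in main() are assumptions, not cluster defaults.
final class ScannerLeaseSketch {

    /** Approximate gap between two next() RPCs: the client works through one cached batch. */
    static long millisBetweenNextCalls(int cachingRows, long perRowMillis) {
        return (long) cachingRows * perRowMillis;
    }

    /** True when the gap exceeds the lease period, so the lease may expire before renewal. */
    static boolean leaseAtRisk(int cachingRows, long perRowMillis, long leasePeriodMillis) {
        return millisBetweenNextCalls(cachingRows, perRowMillis) > leasePeriodMillis;
    }

    /** Largest caching value whose batch still fits within the lease period. */
    static int maxSafeCaching(long perRowMillis, long leasePeriodMillis) {
        if (perRowMillis <= 0) {
            return Integer.MAX_VALUE; // processing cost is negligible, no bound from the lease
        }
        return (int) Math.min(Integer.MAX_VALUE, leasePeriodMillis / perRowMillis);
    }

    public static void main(String[] args) {
        long leasePeriodMillis = 60_000L; // assumed lease period
        long perRowMillis = 100L;         // assumed client-side cost per row
        System.out.println("caching=1000 at risk: " + leaseAtRisk(1000, perRowMillis, leasePeriodMillis));
        System.out.println("caching=500 at risk:  " + leaseAtRisk(500, perRowMillis, leasePeriodMillis));
        System.out.println("max safe caching:     " + maxSafeCaching(perRowMillis, leasePeriodMillis));
    }
}
```

Under these assumed numbers, a caching value of 1000 would leave 100 seconds between next() calls, which is longer than the assumed lease, while 500 would fit. The same reasoning suggests either lowering the caching value or speeding up per-row work.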

Re: Embedded table data model

2012-07-13 Thread Michael Segel
First, a caveat... Schema design in HBase is one of the hardest things to teach/learn because it's so open. There is more than one correct answer when it comes to creating a good design... Ian's presentation kind of tries to relate HBase schema design to relational modeling. From past

Re: Maximum number of tables ?

2012-07-13 Thread Michael Segel
Currently there is a hardcoded limit on the number of regions that a region server can manage. It's 1500. Note that if the number of regions gets to around 1000 regions per region server, you end up with a performance hit. (YMMV) So if you have 1 region per table, there's a real limit of 1500

Re: Maximum number of tables ?

2012-07-13 Thread Amandeep Khurana
I have come across clusters with 100s of tables but that typically is due to a suboptimal table design. The question here is - why do you need to distribute your data over lots of tables? What's your access pattern and what kind of data are you putting in? Or is this just a theoretical question?

Re: Maximum number of tables ?

2012-07-13 Thread Kevin O'dell
Mike, I just saw a system with 2500 regions per RS (crazy I know, we are fixing that). I did not think there was a hardcoded limit... On Fri, Jul 13, 2012 at 11:50 AM, Amandeep Khurana ama...@gmail.com wrote: I have come across clusters with 100s of tables but that typically is due to a

Re: Maximum number of tables ?

2012-07-13 Thread Michael Segel
I'm going from memory. There was a hardcoded number. I'd have to go back and try to find it. From a practical standpoint, going over 1000 regions per RS will put you on thin ice. Too many regions can kill your system. On Jul 13, 2012, at 12:36 PM, Kevin O'dell wrote: Mike, I just saw

Re: Maximum number of tables ?

2012-07-13 Thread Lars George
It is basically unset: this.regionSplitLimit = conf.getInt("hbase.regionserver.regionSplitLimit", Integer.MAX_VALUE); (from CompactSplitThread.java). The number of regions is OK until you dilute the available heap share too much. So you can have 1000 regions (given the block index,
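
To make the "heap share" remark above concrete, here is a small arithmetic sketch of how the memstore budget per region shrinks as the region count grows. It is not taken from the thread: the class and method names are hypothetical, and the 8 GB heap and the 0.4 memstore fraction are assumed values chosen only for illustration.

```java
// Sketch only: average memstore heap available per region on one region server.
// Names are hypothetical; heap size and memstore fraction in main() are assumptions.
import java.util.Locale;

final class RegionHeapShareSketch {

    /** Heap reserved for memstores, divided evenly across the regions on one server. */
    static double memstoreMegabytesPerRegion(double heapMegabytes, double memstoreFraction, int regions) {
        if (regions <= 0) {
            throw new IllegalArgumentException("regions must be positive");
        }
        return heapMegabytes * memstoreFraction / regions;
    }

    public static void main(String[] args) {
        double heapMegabytes = 8192.0;  // assumed region server heap
        double memstoreFraction = 0.4;  // assumed share of heap usable by memstores
        for (int regions : new int[] {100, 1000, 2500}) {
            double share = memstoreMegabytesPerRegion(heapMegabytes, memstoreFraction, regions);
            System.out.println(String.format(Locale.ROOT, "%d regions: %.1f MB each", regions, share));
        }
    }
}
```

With these assumed inputs, each tenfold increase in region count cuts the per-region memstore share tenfold, which is one way to see why very high region counts force small, frequent flushes.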

Too many regions

2012-07-13 Thread Rob Roland
Hi all, The HBase instance I'm managing has grown to the point that it has way too many regions per server - 5 region servers with 1010 regions each on HBase 0.90.4-cdh3u2. I want to bring this region count under control. The cluster is currently running with the default region size of 256 MB,
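
For a back-of-the-envelope view of the numbers in this message, the sketch below estimates how many regions per server would remain if the same data were held in larger regions. It assumes, pessimistically, that every existing region is full at 256 MB; the 1024 MB target size is a hypothetical value, not something proposed in the thread, and the class and method names are invented for illustration.

```java
// Sketch only: upper-bound estimate of region counts after raising the maximum region size.
// Assumes every current region is full; the target size in main() is a hypothetical value.
final class RegionCountSketch {

    /** Ceiling division for positive longs. */
    static long ceilDiv(long numerator, long denominator) {
        return (numerator + denominator - 1) / denominator;
    }

    /** Upper bound on stored data if every region has reached the maximum region size. */
    static long upperBoundDataMegabytes(int servers, int regionsPerServer, long regionMegabytes) {
        return (long) servers * regionsPerServer * regionMegabytes;
    }

    /** Regions per server needed to hold dataMegabytes at the given target region size. */
    static long regionsPerServerAt(long dataMegabytes, long targetRegionMegabytes, int servers) {
        long totalRegions = ceilDiv(dataMegabytes, targetRegionMegabytes);
        return ceilDiv(totalRegions, servers);
    }

    public static void main(String[] args) {
        int servers = 5;                // from the message
        int regionsPerServer = 1010;    // from the message
        long currentRegionMb = 256L;    // from the message
        long targetRegionMb = 1024L;    // hypothetical target size

        long dataMb = upperBoundDataMegabytes(servers, regionsPerServer, currentRegionMb);
        System.out.println("Upper bound on data (MB): " + dataMb);
        System.out.println("Regions per server at target size: "
                + regionsPerServerAt(dataMb, targetRegionMb, servers));
    }
}
```

Because many regions are likely smaller than the maximum, the real figure after merging would probably be lower than this estimate.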

Re: Too many regions

2012-07-13 Thread Adrien Mogenet
It can be reasonable to turn off automatic region splitting if you know your rowkey distribution well and you're able to ensure good parallelism among your regionservers easily (i.e., manually or through the HBase API). Sometimes it's even the best solution to ensure the minimum number of regions

Re: Too many regions

2012-07-13 Thread Rob Roland
In almost every table, the rowkey is either a SHA hash, or a SHA hash and a timestamp, so we have a fairly even distribution of rowkeys now. Is there a best practice for number of regions of a table per server? Meaning, with 5 region servers, 10 regions per table, so 170 regions per region

Re: Too many regions

2012-07-13 Thread Adrien Mogenet
Everyone will tell you that handling fewer regions is always better. Depending on your setup, data size and number of records, I would say that 1 to 5 regions per table and server is acceptable. In some setups (one big table for example) you can see up to 100/200 regions per server, which is the

Re: Too many regions

2012-07-13 Thread Bryan Beaudreault
Tables are like a loose organizational structure to allow you to have more granular per-table configurations or just for your own logical separation of data. There aren't any best practices with regard to regions per table. What is more important is regions per region server and regions per