Thanks for these answers; it was a theoretical question. Actually, a
common pattern in other solutions for batch deletion is to organize data in,
for instance, one table per day and remove the oldest one, day after day.
That is more efficient than finding old rows and then deleting them (due to
lock
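
To make the one-table-per-day idea concrete, here is a minimal sketch in plain Java. The naming scheme (`<prefix>_yyyyMMdd`), the class name and the helper names are all assumptions for illustration; nothing here calls HBase, it only works out which daily tables have fallen outside a retention window and could be dropped.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.ArrayList;
import java.util.List;

class DailyTables {
    // Hypothetical naming scheme: "<prefix>_yyyyMMdd", e.g. "events_20120713".
    private static final DateTimeFormatter DAY = DateTimeFormatter.BASIC_ISO_DATE;

    // Name of the table that holds the data for the given day.
    static String tableFor(String prefix, LocalDate day) {
        return prefix + "_" + day.format(DAY);
    }

    // Returns the daily tables whose day is strictly older than
    // (today - retentionDays). Names that do not match the scheme are ignored.
    static List<String> expired(List<String> tables, String prefix,
                                LocalDate today, int retentionDays) {
        LocalDate cutoff = today.minusDays(retentionDays);
        List<String> result = new ArrayList<>();
        for (String name : tables) {
            if (!name.startsWith(prefix + "_")) {
                continue; // belongs to something else
            }
            try {
                LocalDate day = LocalDate.parse(name.substring(prefix.length() + 1), DAY);
                if (day.isBefore(cutoff)) {
                    result.add(name);
                }
            } catch (DateTimeParseException e) {
                // Suffix is not a date: not one of our daily tables, skip it.
            }
        }
        return result;
    }
}
```

A daily job would call `expired(...)` with the current list of table names and then disable and drop each returned table through whatever admin interface is in use.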
Hi,
There is no real limit as far as I know. As you will have one region
per table (at least :-), the number of regions will be something to
monitor carefully if you need thousands of tables. See
http://hbase.apache.org/book.html#arch.regions.size.
Don't forget that you can add as many columns as
Currently there is a hardcoded limit on the number of regions that a region
server can manage.
It's 1500.
Note that if the number of regions gets to around 1000 regions per region
server, you end up with a performance hit. (YMMV)
So if you have 1 region per table, there's a real limit of 1500
I have come across clusters with 100s of tables but that typically is
due to a suboptimal table design.
The question here is - why do you need to distribute your data over
lots of tables? What's your access pattern and what kind of data are
you putting in? Or is this just a theoretical question?
Mike,
I just saw a system with 2500 regions per RS (crazy, I know; we are fixing
that). I did not think there was a hardcoded limit...
On Fri, Jul 13, 2012 at 11:50 AM, Amandeep Khurana ama...@gmail.com wrote:
I have come across clusters with 100s of tables but that typically is
due to a
I'm going from memory. There was a hardcoded number. I'd have to go back and
try to find it.
From a practical standpoint, going over 1000 regions per RS will put you on
thin ice.
Too many regions can kill your system.
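
As a rough back-of-the-envelope check, the arithmetic can be sketched like this. The 1000 figure is only the rule of thumb mentioned in this thread, not an HBase constant, and the class and method names are made up for illustration. It also assumes regions are balanced evenly across region servers.

```java
class RegionBudget {
    // Rule of thumb from this thread (assumption, not an HBase setting).
    static final int SOFT_LIMIT_PER_SERVER = 1000;

    // Average number of regions per region server, rounded up,
    // assuming the balancer spreads regions evenly.
    static long regionsPerServer(long tables, long regionsPerTable, long servers) {
        if (servers <= 0) {
            throw new IllegalArgumentException("servers must be positive");
        }
        long total = Math.multiplyExact(tables, regionsPerTable);
        return (total + servers - 1) / servers;
    }

    // True when the estimate goes over the soft limit.
    static boolean onThinIce(long tables, long regionsPerTable, long servers) {
        return regionsPerServer(tables, regionsPerTable, servers) > SOFT_LIMIT_PER_SERVER;
    }
}
```

For example, 3000 single-region tables on 2 region servers gives 1500 regions per server, which is over the soft limit, while 100 tables of 5 regions each on 4 servers gives 125.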
On Jul 13, 2012, at 12:36 PM, Kevin O'dell wrote:
Mike,
I just saw
It is basically unset:
this.regionSplitLimit = conf.getInt("hbase.regionserver.regionSplitLimit",
    Integer.MAX_VALUE);
(from CompactSplitThread.java).
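
To show why that default means "basically unset", here is a small self-contained sketch. It uses a plain `Map` as a stand-in for Hadoop's `Configuration` (an assumption, so that it runs without HBase on the classpath), and `mayRequestSplit` is a hypothetical helper, not the actual HBase method.

```java
import java.util.HashMap;
import java.util.Map;

class SplitLimitDemo {
    static final String KEY = "hbase.regionserver.regionSplitLimit";

    // Stand-in for Configuration.getInt(name, defaultValue):
    // the parsed value if the key is set, otherwise the default.
    static int getInt(Map<String, String> conf, String name, int defaultValue) {
        String raw = conf.get(name);
        return raw == null ? defaultValue : Integer.parseInt(raw.trim());
    }

    // Hypothetical check: a split is allowed only while the number of
    // online regions is below the configured limit.
    static boolean mayRequestSplit(int onlineRegions, int regionSplitLimit) {
        return onlineRegions < regionSplitLimit;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        int unset = getInt(conf, KEY, Integer.MAX_VALUE);
        System.out.println(mayRequestSplit(2500, unset));   // limit never reached in practice

        conf.put(KEY, "1500");
        int capped = getInt(conf, KEY, Integer.MAX_VALUE);
        System.out.println(mayRequestSplit(2500, capped));  // splits stop once the cap is hit
    }
}
```

With the key left unset, the limit is `Integer.MAX_VALUE`, so a region server with 2500 regions is still under it; only an explicit value in the configuration would cap splitting.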
The number of regions is OK until you dilute the available heap share too much.
So you can have 1000 regions (given the block index,