Hi,
Running a scan job against HBase, I got exceptions like this:
org.apache.hadoop.hbase.regionserver.LeaseException:
org.apache.hadoop.hbase.regionserver.LeaseException: lease
'4926582861878965506' does not exist
at
org.apache.hadoop.hbase.regionserver.Leases.removeLease(Leases.java:230)
Hi,
There are no real limits as far as I know. Since you will have at least one
region per table :-), the number of regions will be something to
monitor carefully if you need thousands of tables. See
http://hbase.apache.org/book.html#arch.regions.size.
Don't forget that you can add as many columns as
Hi Paul, I ran into the same problem. Have you solved it? How did you work it
out?
Thanks a lot!
That must have been it.
Unfortunately I couldn't really test this command, since the guys from ops
rebooted the entire computer room during maintenance, and it fixed the issue.
(This room is a lab room of course)
Asaf
On Jul 13, 2012, at 4:27 AM, Esteban Gutierrez wrote:
Hello,
I would like to understand more in-depth how fault tolerance is handled in
HBase:
1. So for each put operation an RS first writes to an HLog file and then to
the Memstore. If the RS crashes, the HLog file is replayed by other servers,
correct?
My question is how is this HLog file different
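The log-first write path described in question 1 can be sketched as a toy model. This is not HBase code: the class, fields, and method names below are invented purely to illustrate why replaying the log rebuilds the lost Memstore.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Toy model of the WAL write path: every edit is appended to the log
// BEFORE it is applied to the in-memory store, so after a crash the
// surviving log is sufficient to rebuild the store on another server.
public class WalDemo {
    final List<String[]> wal = new ArrayList<>();        // append-only log
    final TreeMap<String, String> memstore = new TreeMap<>();

    void put(String rowKey, String value) {
        wal.add(new String[] {rowKey, value});           // 1. log first
        memstore.put(rowKey, value);                     // 2. then memstore
    }

    // Simulate recovery on another server: replay the surviving log.
    static TreeMap<String, String> replay(List<String[]> wal) {
        TreeMap<String, String> rebuilt = new TreeMap<>();
        for (String[] edit : wal) rebuilt.put(edit[0], edit[1]);
        return rebuilt;
    }

    public static void main(String[] args) {
        WalDemo rs = new WalDemo();
        rs.put("row2", "b");
        rs.put("row1", "a");
        // "Crash": the memstore is lost, but the log survives.
        TreeMap<String, String> recovered = replay(rs.wal);
        System.out.println(recovered.equals(rs.memstore)); // prints "true"
    }
}
```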
Hi Ian,
What is your suggestion then?
Sent from my iPad
On 2012-7-13, at 12:55 PM, Ian Varley ivar...@salesforce.com wrote:
Yes, that's what I mean.
It is not the only way to model this, but your question was: can we embed
the transactions inside the customer table in HBase?
On
Hi Sever
Coprocessors are still new for me, so I don't have a good answer for your
second question.
But for your first, (as far as I understand) remember that you can send
Puts/Deletes in any order, and Memstore is responsible for keeping your data
sorted before flushing to a StoreFile, and
You might have a look at the reference guide, or this article (
http://blog.newitfarmer.com/nosql/hbase-nosql/4385/repost-hbasearchitecture)
or the Lars one (
http://www.larsgeorge.com/2009/10/hbase-architecture-101-storage.html)
which explain HBase Storage architecture.
Take care of block size,
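To see why the order of Puts/Deletes doesn't matter, here is a tiny sketch using only the JDK: as far as I understand, the real MemStore is backed by a sorted concurrent map, so iteration (and hence the flushed StoreFile) is always in key order regardless of arrival order.

```java
import java.util.concurrent.ConcurrentSkipListMap;

// Sketch: a sorted concurrent map keeps entries in key order no matter
// what order they were inserted in, which is why the MemStore can accept
// writes in any order and still flush a sorted StoreFile.
public class MemstoreOrderDemo {
    public static void main(String[] args) {
        ConcurrentSkipListMap<String, String> memstore = new ConcurrentSkipListMap<>();
        memstore.put("row-c", "3");   // arrives first
        memstore.put("row-a", "1");   // arrives later, but sorts earlier
        memstore.put("row-b", "2");
        // Iteration is in key order:
        System.out.println(String.join(",", memstore.keySet())); // prints "row-a,row-b,row-c"
    }
}
```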
Hi Oleg
Normally, if your client does not process rows fast enough you should
get an UnknownScannerException. A higher caching value means a longer time
between two next() calls (each of which renews the lease), so a higher chance
of getting a USE. Try to optimize the row processing in the client and lower
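For reference, on the 0.90.x line the lease timeout is a single setting in hbase-site.xml (the value below is only an example; later versions split scanner and lock leases into separate properties):

```xml
<!-- hbase-site.xml: lease timeout for scanners/row locks (hbase 0.90.x) -->
<property>
  <name>hbase.regionserver.lease.period</name>
  <value>120000</value> <!-- 2 minutes instead of the default 60 s -->
</property>
```

Raising the lease only buys time; fixing slow row processing or an oversized caching value is the real cure.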
First,
A caveat... Schema design in HBase is one of the hardest things to teach/learn
because it's so open. There is more than one correct answer when it comes to
creating a good design...
Ian's presentation kind of tries to relate HBase schema design to relational
modeling.
From past
Currently there is a hardcoded limit on the number of regions that a region
server can manage.
It's 1500.
Note that if the number of regions gets to around 1000 regions per region
server, you end up with a performance hit. (YMMV)
So if you have 1 region per table, there's a real limit of 1500
I have come across clusters with 100s of tables but that typically is
due to a sub-optimal table design.
The question here is - why do you need to distribute your data over
lots of tables? What's your access pattern and what kind of data are
you putting in? Or is this just a theoretical question?
Mike,
I just saw a system with 2500 regions per RS (crazy, I know; we are fixing
that). I did not think there was a hardcoded limit...
On Fri, Jul 13, 2012 at 11:50 AM, Amandeep Khurana ama...@gmail.com wrote:
I have come across clusters with 100s of tables but that typically is
due to a
I'm going from memory. There was a hardcoded number. I'd have to go back and
try to find it.
From a practical standpoint, going over 1000 regions per RS will put you on
thin ice.
Too many regions can kill your system.
On Jul 13, 2012, at 12:36 PM, Kevin O'dell wrote:
Mike,
I just saw
It is basically unset:
this.regionSplitLimit = conf.getInt("hbase.regionserver.regionSplitLimit",
Integer.MAX_VALUE);
(from CompactSplitThread.java).
The number of regions is OK until you dilute the available heap share too much.
So you can have 1000 regions (given the block index,
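So if you do want a hard cap rather than the Integer.MAX_VALUE default, that same property can be set explicitly in hbase-site.xml (1000 here is only an example, picked to match the rule of thumb mentioned earlier in the thread):

```xml
<!-- hbase-site.xml: stop splitting once a region server manages this many regions -->
<property>
  <name>hbase.regionserver.regionSplitLimit</name>
  <value>1000</value>
</property>
```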
Hi all,
The HBase instance I'm managing has grown to the point that it has way too
many regions per server - 5 region servers with 1010 regions each on HBase
0.90.4-cdh3u2. I want to bring this region count under control. The
cluster is currently running with the default region size of 256 MB,
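One common lever for this situation is raising the maximum region size in hbase-site.xml so regions stop splitting so eagerly; note this only prevents new splits, and shrinking the existing count still requires merging. The value below is just an example:

```xml
<!-- hbase-site.xml: split a region only once it exceeds 4 GB
     (example value; the 0.90.x default is 256 MB) -->
<property>
  <name>hbase.hregion.max.filesize</name>
  <value>4294967296</value>
</property>
```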
It can be reasonable to turn off automatic region splitting if you know
your rowkey distribution well and you're able to easily ensure good parallelism
among your region servers (i.e., manually or through the HBase API).
Sometimes it's even the best solution to ensure the minimum number of
regions
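For the manual route, a table can be pre-split at creation time from the HBase shell; the table name, column family, and split points below are made up for illustration:

```
hbase> create 'mytable', 'cf', {SPLITS => ['1000', '2000', '3000']}
```

With evenly distributed keys (e.g. hashed rowkeys) this gives you a fixed, predictable region count per table from day one.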
In almost every table, the rowkey is either a SHA hash, or a SHA hash and a
timestamp, so we have a fairly even distribution of rowkeys now.
Is there a best practice for number of regions of a table per server?
Meaning, with 5 region servers, 10 regions per table, so 170 regions per
region
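A minimal sketch of the rowkey scheme described above (a SHA hash, optionally followed by a timestamp); the helper name and the hex-plus-suffix format are my own, not anything from HBase:

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Illustrative rowkey builder: SHA-1 of a natural key gives a roughly
// uniform distribution across regions; the timestamp suffix keeps
// per-entity rows adjacent and distinguishable.
public class HashedRowKey {
    static String rowKey(String naturalKey, long timestamp) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-1")
                    .digest(naturalKey.getBytes(StandardCharsets.UTF_8));
            String hex = String.format("%040x", new BigInteger(1, digest));
            return hex + "-" + timestamp;
        } catch (NoSuchAlgorithmException e) {
            throw new AssertionError("SHA-1 is always available", e);
        }
    }

    public static void main(String[] args) {
        // 40 hex chars of hash, then the timestamp.
        System.out.println(rowKey("customer-42", 1342180020000L));
    }
}
```

Because the hash prefix is uniform, the split points for pre-splitting such a table can simply be evenly spaced hex prefixes.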
Everyone will tell you that handling fewer regions is always better.
Depending on your setup, data-size and number of records, I would say that
1 to 5 regions per table and server is acceptable. In some setup (one big
table for example) you can see up to 100/200 regions per server, which is
the
Tables are like a loose organizational structure to allow you to have more
granular per-table configurations or just for your own logical separation of
data. There aren't any best practices with regards to regions per table. What
is more important is regions per region server and regions per