The lookup in .META. uses the cache by default.
Take a look at HConnectionManager.getCachedLocation().

Region locations are kept in a local cache, which is queried to find which
region server serves the requested row.
I actually fixed a bug on Friday to deal with evicted cache entries, because
we use SoftValue. This can happen when the client heap runs low.
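
Conceptually, the client-side cache is a sorted map from region start key to a
softly-referenced location, so a floor lookup finds the region that would
contain any given row. A minimal sketch of that idea (not HBase's actual code;
the class shape here is illustrative):

```java
import java.lang.ref.SoftReference;
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch only -- not HBase's actual implementation.
// Locations are held through soft references, so the JVM may clear
// entries when the client heap runs low (the bug mentioned above).
class RegionLocationCache {
    // Sorted map: region start key -> soft reference to its server location.
    private final TreeMap<String, SoftReference<String>> cache = new TreeMap<>();

    void put(String regionStartKey, String serverLocation) {
        cache.put(regionStartKey, new SoftReference<>(serverLocation));
    }

    // Analogous in spirit to HConnectionManager.getCachedLocation():
    // the region containing `row` is the one with the greatest start
    // key <= row, found with a floor lookup.
    String getCachedLocation(String row) {
        Map.Entry<String, SoftReference<String>> entry = cache.floorEntry(row);
        if (entry == null) {
            return null; // nothing cached at or below this row: go to .META.
        }
        String location = entry.getValue().get();
        if (location == null) {
            // Soft reference cleared under memory pressure; drop the
            // stale entry and force a fresh .META. lookup.
            cache.remove(entry.getKey());
            return null;
        }
        return location;
    }
}
```

Note that the floor lookup also means a row the client never asked for before
can still hit the cache, as long as some previously seen row fell in the same
region.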

As I said before, the current balancer doesn't use the number of requests as
a balancing criterion. That would be in the next version of the balancer.
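
For a feel of what a request-count-aware balancer might do, here is a
hypothetical greedy sketch. This is NOT the current HBase balancer (which
balances region counts, not request counts), and all names are illustrative:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch only -- not the current HBase balancer.
// Greedy strategy: assign the busiest regions first, each to the
// server with the lowest accumulated request load so far.
class RequestCountBalancer {
    static Map<String, List<String>> balance(
            Map<String, Long> regionRequestCounts, List<String> servers) {
        Map<String, List<String>> assignment = new HashMap<>();
        Map<String, Long> load = new HashMap<>();
        for (String s : servers) {
            assignment.put(s, new ArrayList<>());
            load.put(s, 0L);
        }
        // Sort regions by request count, busiest first.
        List<Map.Entry<String, Long>> regions =
                new ArrayList<>(regionRequestCounts.entrySet());
        regions.sort((a, b) -> Long.compare(b.getValue(), a.getValue()));
        for (Map.Entry<String, Long> region : regions) {
            // Place this region on the currently least-loaded server.
            String target = Collections.min(
                    load.entrySet(), Map.Entry.comparingByValue()).getKey();
            assignment.get(target).add(region.getKey());
            load.put(target, load.get(target) + region.getValue());
        }
        return assignment;
    }
}
```

With per-region read/write request counts exposed (as in HBASE-3507 and
HBASE-3647 mentioned below), a scheme like this could spread hot regions
across servers instead of only equalizing region counts.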

Cheers

On May 14, 2011, at 9:26 AM, Thibault Dory <[email protected]> wrote:

> Thank you Ted, I'll try this when I have more time to do new tests. I'm
> currently writing my master's thesis and therefore those are likely to be
> postponed for a little while.
> 
> What do you think about my last hypothesis to explain the big number of
> requests on a single region server?
> 
> Regards
> 
> On Sat, May 14, 2011 at 6:21 PM, Ted Yu <[email protected]> wrote:
> 
>> Thibault:
>> I have done some work, namely HBASE-3507 and HBASE-3647, trying to show
>> read/write request counts per region.
>> They're all in HBase trunk.
>> You may want to load HBase trunk (which would be version 0.92) so that you
>> can observe read/write request counts.
>> 
>> Cheers
>> 
>> On Sat, May 14, 2011 at 9:02 AM, Thibault Dory <[email protected]
>>> wrote:
>> 
>>> On Sat, May 14, 2011 at 5:16 PM, Jean-Daniel Cryans <[email protected]
>>>> wrote:
>>> 
>>>> On Sat, May 14, 2011 at 6:40 AM, Thibault Dory <
>> [email protected]>
>>>> wrote:
>>>>> I'm wondering what the possible bottlenecks of an HBase cluster are.
>>>>> Even if there are cache mechanisms, the fact that some data is
>>>>> centralized could lead to a bottleneck (even if it's quite theoretical
>>>>> given the load needed to achieve it).
>>>> 
>>>> Isn't that what your paper is about?
>>>> 
>>> 
>>> Yes, that is part of the things that could be observed, but it looks like
>>> a much bigger budget would be needed to get clusters big enough to
>>> observe it for HBase. Anyway, the main thing we were interested in is
>>> elasticity.
>>> 
>>> 
>>>> 
>>>>> Would it be right to say the following?
>>>>> 
>>>>>  - The namenode is storing all the metadata and must scale vertically
>>>>> if the cluster becomes very big
>>>> 
>>>> The fact that there's only 1 namenode is bad in multiple ways,
>>>> generally people will be more bothered by the fact that it's a single
>>>> point of failure. Larger companies do hit the limits of that single
>>>> machine so Y! worked on "Federated Namenodes" as a way to circumvent
>>>> that. See
>>>> http://www.slideshare.net/huguk/hdfs-federation-hadoop-summit2011
>>>> 
>>>> This work is already available in hadoop's svn trunk.
>>>> 
>>> 
>>> Thanks, I did not know about "Federated Namenodes", this is interesting.
>>> 
>>> 
>>>> 
>>>>>  - There is only one node storing the -ROOT- table and only one node
>>>>> storing the .META. table. If I'm doing a lot of random accesses and
>>>>> my dataset is VERY large, could I overload those nodes?
>>>> 
>>>> Again, I believe this is the subject of your paper right?
>>> 
>>> 
>>> Indeed this is part of it, but that does not mean that I'm an HBase
>>> specialist. This is why I'm asking you here, as you may have more
>>> experience with big clusters or a good knowledge of the internals of
>>> HBase. Unfortunately I did not have the time to do this before the
>>> deadline of the paper, but I'll be granted additional time if it is
>>> accepted, so it's not too late.
>>> 
>>> 
>>>> Anyways, in general -ROOT- has 1 row, and that row is cached. Even if you
>>>> have thousands of clients that need to update their .META. location
>>>> (this would only happen at the beginning of a MR job or if .META.
>>>> moves), serving from memory is fast.
>>> 
>>> 
>>>> Next you have .META., again the clients cache their region locations
>>>> so once they have it they don't need to talk to .META. until a region
>>>> moves or gets split. Also .META. isn't that big and is usually served
>>>> directly from memory.
>>> 
>>> 
>>>> The BT paper mentions they allow the splitting of .META. when it grows
>>>> a bit too much and this is something we've blocked for the moment in
>>>> HBase.
>>>> 
>>>> J-D
>>>> 
>>> 
>>> Going back to my original problem (the fact that one region server was
>>> always overloaded with requests while the others were only serving a few
>>> requests, despite my requests being generated using a uniform
>>> distribution), I would like to know what you think about Ted Yu's idea
>>> that it may be related to the fact that the overloaded region server
>>> could be the one storing the .META. table.
>>> 
>>> At that point in the tests, the cluster was made of 24 nodes and was
>>> storing 40 million rows in HBase. As my requests are fully random, there
>>> is a high probability, given the total number of entries, that a lot of
>>> requests issued by a client are for entries it did not request before,
>>> leading to a lookup in the .META. table for almost each request.
>>> Of course this is valid only if the client does not know that an entry it
>>> never asked for is in a region it has already accessed before. Is that
>>> the case? For example, if a client asks for row 10 and sees that it is in
>>> region 2, will it know that row 15 is also in region 2 without making a
>>> new lookup into the .META. table?
>>> 
>> 
