Hotspotting was the first thing that came to my mind with the proposed 
balancer. The tservers don't keep all the K/V in memory. You are balancing 
query and live ingest across your resources.




-------- Original message --------
From: Eric Newton <[email protected]> 
Date: 07/29/2015  8:46 PM  (GMT-05:00) 
To: [email protected] 
Subject: Re: Entry-based TableBalancer 

To my knowledge, nobody has written such a balancer.

In the history of the project, we started by writing advanced, complicated 
balancers that moved tablets around much too quickly, which degraded 
performance. After that, we wrote much simpler balancers to avoid the chaos. 
We're moving back to more complex balancers, but mostly just to ensure that we 
aren't hotspotting, based on known ingest patterns (date-related, for example).

If you write a new balancer, make it slow to move tablets, and very simple.
Avoid over-optimizing tablet placement.

-Eric

On Wed, Jul 29, 2015 at 8:20 PM, Konstantin Pelykh <[email protected]> wrote:
Hi, 

I'm looking for a tablet balancer which operates based on a number of entries 
per tablet as opposed to a number of tablets per tablet server. My goal is to 
get even distribution of entries across the cluster. 

As an example: 

tablet #1  15M entries
tablet #2   5M entries
tablet #3   8M entries

After balancing tablets I would want to get:

Server 1 hosts: tablet1 
Server 2 hosts: tablet2, tablet3
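
A minimal Python sketch of that placement, assuming a greedy largest-first
heuristic (place the biggest tablet first, always onto the server with the
fewest entries so far). The function name assign_by_entries and the server
names are hypothetical, and this is not Accumulo's balancer API. It ignores
where tablets currently live and what it costs to move them.

```python
import heapq

def assign_by_entries(tablet_entries, servers):
    """Greedy largest-first assignment of tablets to servers by entry count.

    Hypothetical helper for illustration only; not part of Accumulo.
    """
    # Min-heap of (total entries hosted so far, server name).
    heap = [(0, server) for server in servers]
    heapq.heapify(heap)
    assignment = {server: [] for server in servers}

    # Visit tablets from largest to smallest entry count.
    by_size = sorted(tablet_entries.items(), key=lambda kv: kv[1], reverse=True)
    for tablet, entries in by_size:
        # Pop the least-loaded server, give it the tablet, push it back.
        load, server = heapq.heappop(heap)
        assignment[server].append(tablet)
        heapq.heappush(heap, (load + entries, server))
    return assignment

# Entry counts from the example above.
tablets = {"tablet1": 15_000_000, "tablet2": 5_000_000, "tablet3": 8_000_000}
result = assign_by_entries(tablets, ["server1", "server2"])
print(result)
```

With the three tablets above, server1 ends up with tablet1 alone and server2
with tablet2 and tablet3. A real balancer would also have to limit how many
tablets it moves per pass.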

The idea is pretty simple and I believe such a balancer has already been 
developed, so I decided to check before reinventing the wheel. 

Thanks!
Konstantin

--------
Big Data / Lucene and Solr Consultant
LinkedIn: linkedin.com/in/kpelykh
Website: www.kpelykh.com


