Grant Henke created KUDU-3147:
---------------------------------

             Summary: Balance tablets based on range hash buckets
                 Key: KUDU-3147
                 URL: https://issues.apache.org/jira/browse/KUDU-3147
             Project: Kudu
          Issue Type: Improvement
          Components: master, perf
    Affects Versions: 1.12.0
            Reporter: Grant Henke


When a user defines a schema that uses range + hash partitioning, it is often 
the case that the tablets in the latest range, based on time or any 
semi-sequential data, are the only tablets that receive writes. Even when the 
latest range is not the target, it is common for a single range to receive a 
burst of writes when backloading data.

This is so common that the default Kudu balancing scheme should consider 
placing/rebalancing the tablets for the hash buckets within each range on as 
many servers as possible in order to support the maximum write throughput. In 
that case, `min(#buckets, #total-cluster-tservers)` tservers will be used to 
handle the writes if the cluster is perfectly balanced. Today, even if the 
cluster is perfectly balanced by tablet count, it is possible for all the hash 
buckets of a single range to be on a single tserver.
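
To make the gap concrete, here is a minimal Python sketch (not Kudu code) that 
compares two placements with identical per-tserver tablet counts. Everything in 
it is an assumption for illustration: the function names are hypothetical, each 
tablet is modeled as a single replica, and the round-robin rule is just one 
simple way to reach `min(#buckets, #total-cluster-tservers)` tservers per range.

```python
from collections import Counter


def place_range_oblivious(num_ranges, num_buckets, tservers):
    """Balanced by tablet count only: every bucket of a range lands on one tserver."""
    placement = {}
    for r in range(num_ranges):
        for b in range(num_buckets):
            placement[(r, b)] = tservers[r % len(tservers)]
    return placement


def place_range_aware(num_ranges, num_buckets, tservers):
    """Round-robin the buckets of each range across tservers (hypothetical rule)."""
    placement = {}
    for r in range(num_ranges):
        for b in range(num_buckets):
            # Consecutive buckets of one range map to consecutive tservers, and
            # the running index keeps the overall tablet counts even as well.
            placement[(r, b)] = tservers[(r * num_buckets + b) % len(tservers)]
    return placement


def servers_for_range(placement, r):
    """Set of tservers hosting at least one hash bucket of range r."""
    return {ts for (rng, _bucket), ts in placement.items() if rng == r}


def tablets_per_server(placement):
    """Tablet count per tserver, the metric a count-based balancer looks at."""
    return Counter(placement.values())


tservers = ["ts0", "ts1", "ts2", "ts3"]
oblivious = place_range_oblivious(4, 4, tservers)
aware = place_range_aware(4, 4, tservers)

# Both placements look identical to a count-based balancer...
same_counts = tablets_per_server(oblivious) == tablets_per_server(aware)
# ...but they differ in how many tservers absorb writes to the latest range.
hot_oblivious = len(servers_for_range(oblivious, 3))
hot_aware = len(servers_for_range(aware, 3))
```

With 4 ranges, 4 hash buckets, and 4 tservers, both placements put 4 tablets on 
each tserver, yet writes to the latest range touch one tserver in the first 
placement and all four in the second.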




--
This message was sent by Atlassian Jira
(v8.3.4#803005)
