This is an automated email from the ASF dual-hosted git repository.
awong pushed a commit to branch branch-1.9.x
in repository https://gitbox.apache.org/repos/asf/kudu.git
The following commit(s) were added to refs/heads/branch-1.9.x by this push:
new 16f8fc8 [known_issues] the scalability of location awareness
16f8fc8 is described below
commit 16f8fc8cccd8bc962eb731d8ae8a0baa0d1369cd
Author: Alexey Serbin <[email protected]>
AuthorDate: Fri Mar 8 15:13:05 2019 -0800
[known_issues] the scalability of location awareness
Added information about poor scalability of the location
awareness implementation in 1.9.0 in terms of number of
concurrent clients connecting to cluster.
Change-Id: I04dad488a377bf4cd36534d648a69d2fb2444fea
Reviewed-on: http://gerrit.cloudera.org:8080/12706
Tested-by: Kudu Jenkins
Reviewed-by: Adar Dembo <[email protected]>
Reviewed-by: Andrew Wong <[email protected]>
---
docs/known_issues.adoc | 20 ++++++++++++++++++--
1 file changed, 18 insertions(+), 2 deletions(-)
diff --git a/docs/known_issues.adoc b/docs/known_issues.adoc
index 83890be..05ac621 100644
--- a/docs/known_issues.adoc
+++ b/docs/known_issues.adoc
@@ -104,8 +104,6 @@
== Cluster management
-* Rack awareness is not supported.
-
* Multi-datacenter is not supported.
* Rolling restart is not supported.
@@ -142,6 +140,24 @@
* Maximum number of tablets per table for each tablet server is 60,
post-replication (assuming the default replication factor of 3), at
table-creation time.
+* When enabled, location awareness in its current implementation doesn't scale
+ with the number of clients connecting to a Kudu cluster simultaneously.
+ If the rate of new clients connecting is kept high (e.g., 100 requests/second)
+ for a long period of time or there is a short period of time when a huge
+ number of such requests arrive at Kudu masters simultaneously (e.g., 10000
+ requests arrive within one second), Kudu masters might experience RPC queue
+ overflows and overall slowness. The slowness becomes more prominent with
+ the increasing size of the master process in memory, where the major
+ contributing factor is the total number of tablet replicas ever created in
+ the cluster. Eventually, the issue may manifest as write and scan operations
+ timing out. If that happens, it's recommended to use the following
+ workaround:
+** Disable assignment of locations to clients by adding `--enable_unsafe_flags`
+ and `--master_client_location_assignment_enabled=false` to the list of
+ runtime flags for Kudu masters. This retains the benefits of location
+ awareness for initial placement of tablet replicas and re-replication, but
+ clients will not be able to use location information to choose
+ the closest tablet server for scan operations.
+
== Replication and Backup Limitations
* Kudu does not currently include any built-in features for backup and restore.