Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12706 )

Change subject: [known_issues] the scalability of location awareness
......................................................................


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/12706/3/docs/known_issues.adoc
File docs/known_issues.adoc:

http://gerrit.cloudera.org:8080/#/c/12706/3/docs/known_issues.adoc@151
PS3, Line 151: It's not expected to happen during regular usage of small and 
moderate-sized
             :   Kudu clusters
> Yeah, and this now reads as if the number of _live_ replicas is what matter
Right, but as it turned out, the size of the address space is the main
factor here.  See below for more information.


http://gerrit.cloudera.org:8080/#/c/12706/3/docs/known_issues.adoc@151
PS3, Line 151: It's not expected to happen during regular usage of small and 
moderate-sized
             :   Kudu clusters
> > I found that just around 10K replicas total *and not so huge size of mast
| Do you mean that the 10K replicas were not contributing to a larger master
address space?

No, I don't mean that.  I meant that even with a small number of tablet
replicas (about 11K total), it's possible to get that behavior even if the
master's address space is less than 700MB.  For comparison, originally the
master took about 500MB of memory with fewer than 1K replicas.  In that case
the fork was fast enough, and the overall execution time of location
assignment was less than 30 ms (75th percentile).  However, with just about
11K replicas spread across 100 tablet servers, at the same request rate it
eventually snowballed into many timed-out requests.
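
To illustrate the snowballing effect, here is a toy model (not Kudu code): a
single worker serving location-assignment requests in FIFO order.  The
function name, the single-worker assumption, the 50 ms arrival interval, the
60 ms "slow fork" service time, and the 1000 ms timeout are all hypothetical;
only the roughly 30 ms figure comes from the measurement mentioned above.

```python
def count_timeouts(num_requests, interval_ms, service_ms, timeout_ms):
    """Toy single-worker FIFO queue with evenly spaced arrivals.

    Returns how many requests spend more than timeout_ms waiting in the
    queue plus being served.  All parameters are illustrative only.
    """
    timed_out = 0
    worker_free_at = 0.0  # time at which the worker finishes its current request
    for i in range(num_requests):
        arrival = i * interval_ms
        # A request starts when it has arrived AND the worker is idle.
        start = max(arrival, worker_free_at)
        worker_free_at = start + service_ms
        if worker_free_at - arrival > timeout_ms:
            timed_out += 1
    return timed_out


# Service time below the arrival interval: the queue never builds up.
fast = count_timeouts(1000, interval_ms=50, service_ms=30, timeout_ms=1000)

# Service time slightly above the arrival interval: each request waits
# longer than the previous one, and eventually every request times out.
slow = count_timeouts(1000, interval_ms=50, service_ms=60, timeout_ms=1000)
```

In this model nothing about the request rate changes between the two runs;
only the per-request cost does, and once it exceeds the arrival interval the
backlog grows without bound.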

| when the number of replicas per tablet server is within the node density
  limits ...

That isn't true -- even with fewer than 50K replicas (counting in the
replication factor) spread among 100 tablet servers, the issue manifested
itself with a pretty moderate rate of requests from clients.

Ideally, I don't want to throw out any numbers: 20, 10+, whatever.  YMMV,
because hardware might be different.



--
To view, visit http://gerrit.cloudera.org:8080/12706
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: branch-1.9.x
Gerrit-MessageType: comment
Gerrit-Change-Id: I04dad488a377bf4cd36534d648a69d2fb2444fea
Gerrit-Change-Number: 12706
Gerrit-PatchSet: 3
Gerrit-Owner: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Greg Solovyev <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Will Berkeley <[email protected]>
Gerrit-Comment-Date: Tue, 12 Mar 2019 03:00:51 +0000
Gerrit-HasComments: Yes
