Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/12706 )
Change subject: [known_issues] the scalability of location awareness
......................................................................

Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/12706/3/docs/known_issues.adoc
File docs/known_issues.adoc:

http://gerrit.cloudera.org:8080/#/c/12706/3/docs/known_issues.adoc@151
PS3, Line 151: It's not expected to happen during regular usage of small and moderate-sized
             : Kudu clusters
> Yeah, and this now reads as if the number of _live_ replicas is what matter

Right, but as it turned out, the size of the address space is the main factor here. See below for more information.

http://gerrit.cloudera.org:8080/#/c/12706/3/docs/known_issues.adoc@151
PS3, Line 151: It's not expected to happen during regular usage of small and moderate-sized
             : Kudu clusters
> > I found that just around 10K replicas total *and not so huge size of mast

| Do you mean that the 10K replicas were not contributing to a larger master address space?

No, I don't mean that. I meant that even with a small number of tablet replicas (about 11K total), it's possible to get that behavior even if the master's address space is less than 700 MB.

So, for comparison: originally the master took about 500 MB of memory with fewer than 1K replicas. In that case the fork was fast enough, and the overall execution time of location assignment was less than 30 ms (75th percentile). However, with just about 11K replicas spread across 100 tablet servers and the same request rate, it eventually snowballed into many timed-out requests.

| when the number of replicas per tablet server is within the node density limits ...

That isn't true -- even with fewer than 50K replicas (counting in the replication factor) spread among 100 tablet servers, the issue manifested itself at a pretty moderate rate of requests from clients.

Ideally, I don't want to throw out any numbers: 20 or 10+, whatever. YMMV, because hardware might be different.
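To illustrate the snowballing described above, here is a minimal sketch of a single-worker FIFO queue with a fixed per-request service time. It is a toy model, not Kudu code: the function name `simulate_backlog`, the single-worker assumption, and the concrete numbers (50 ms arrival interval, 60 ms slow service time, 1000 ms timeout) are all hypothetical. Only the "about 30 ms" fast case loosely echoes the figure quoted in the comment.

```python
def simulate_backlog(service_ms, arrival_interval_ms, n_requests, timeout_ms):
    """Count requests whose queueing + service latency exceeds timeout_ms.

    Toy model (assumption): one worker, requests arrive at a fixed
    interval, and every request takes the same service_ms to handle.
    """
    worker_free_at = 0.0  # time at which the worker finishes its current request
    timed_out = 0
    for i in range(n_requests):
        arrival = i * arrival_interval_ms
        # A request starts when it has arrived and the worker is free.
        start = max(arrival, worker_free_at)
        worker_free_at = start + service_ms
        latency = worker_free_at - arrival
        if latency > timeout_ms:
            timed_out += 1
    return timed_out


if __name__ == "__main__":
    # Service time below the arrival interval: the queue never builds up.
    print(simulate_backlog(30, 50, 1000, 1000))
    # Service time slightly above the arrival interval: the backlog grows
    # with every request until latencies cross the timeout.
    print(simulate_backlog(60, 50, 1000, 1000))
```

The point of the sketch is only that once the per-request cost (for example, a slower fork in a larger address space) exceeds the interval between requests, latency grows without bound at an unchanged request rate. Where that threshold sits depends on the hardware.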
-- 
To view, visit http://gerrit.cloudera.org:8080/12706
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: branch-1.9.x
Gerrit-MessageType: comment
Gerrit-Change-Id: I04dad488a377bf4cd36534d648a69d2fb2444fea
Gerrit-Change-Number: 12706
Gerrit-PatchSet: 3
Gerrit-Owner: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Greg Solovyev <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Will Berkeley <[email protected]>
Gerrit-Comment-Date: Tue, 12 Mar 2019 03:00:51 +0000
Gerrit-HasComments: Yes
