Alexey Serbin created KUDU-3242:
-----------------------------------
Summary: Investigate performance of GetTableSchema when authz
tokens enabled
Key: KUDU-3242
URL: https://issues.apache.org/jira/browse/KUDU-3242
Project: Kudu
Issue Type: Task
Affects Versions: 1.13.0, 1.11.1, 1.12.0, 1.11.0, 1.10.1, 1.10.0, 1.14.0
Reporter: Alexey Serbin
As shown by benchmarks (see the {{ConcurrentGetTableSchemaTest.Rpc}} and
{{ConcurrentGetTableSchemaTest.DirectMethodCall}} test scenarios), processing
a {{GetTableSchema()}} RPC takes much more CPU when the master generates authz
tokens. Token generation is controlled by the
{{--master_support_authz_tokens}} flag, which is set to {{true}} by default.
Measured as the maximum rate of requests that kudu-master is able to process
on a particular node, the difference ranges from 5 to 15 times, depending on
hardware (CPU features, etc.).
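
The kind of comparison described above can be sketched as a small harness that
counts how many calls several threads complete in a fixed time window, once per
configuration, and then takes the ratio of the two rates. This is only an
illustration, not the code of the {{ConcurrentGetTableSchemaTest}} scenarios
(those are C++ tests in the Kudu source tree). The names {{measure_rate}},
{{slowdown_factor}}, {{call_with_tokens}} and {{call_without_tokens}} are
hypothetical, and the two stand-in callables do not talk to a Kudu master.

```python
import threading
import time


def measure_rate(call, num_threads=4, duration_s=0.2):
    """Return the number of completed calls per second across all threads.

    `call` is a hypothetical stand-in for whatever issues one request,
    e.g. one GetTableSchema() round trip against a test master.
    """
    counts = [0] * num_threads
    stop = threading.Event()

    def worker(idx):
        # Each worker only touches its own slot, so no lock is needed.
        while not stop.is_set():
            call()
            counts[idx] += 1

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(num_threads)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    time.sleep(duration_s)
    stop.set()
    for t in threads:
        t.join()
    elapsed = time.perf_counter() - start
    return sum(counts) / elapsed


def slowdown_factor(rate_without_tokens, rate_with_tokens):
    """Return how many times lower the request rate is with tokens enabled."""
    return rate_without_tokens / rate_with_tokens


if __name__ == "__main__":
    # Assumption: placeholder workloads, not real RPCs. Replace them with
    # calls against masters started with and without authz token generation.
    def call_without_tokens():
        sum(range(100))

    def call_with_tokens():
        sum(range(1000))

    fast = measure_rate(call_without_tokens)
    slow = measure_rate(call_with_tokens)
    print("slowdown: %.1fx" % slowdown_factor(fast, slow))
```

Because of the Python global interpreter lock, this harness is only meaningful
when {{call}} spends most of its time waiting on the server. Measuring CPU cost
inside the master itself still requires the C++ benchmarks named in this
report.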
Given that authz tokens are generated even when they are not used or needed
(i.e. no fine-grained authz support via Sentry/Ranger is enabled), this might
cause unwelcome surprises when upgrading from an earlier version to 1.11 or
later.
As a stop-gap, we can disable the generation of authz tokens by default and
require it to be explicitly enabled together with fine-grained authz support.
It's necessary to investigate the issue and find a way to address it as part
of improving the scalability of Kudu clusters.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)