Alexey Serbin created KUDU-3212:
-----------------------------------
Summary: Location assignment improvements
Key: KUDU-3212
URL: https://issues.apache.org/jira/browse/KUDU-3212
Project: Kudu
Issue Type: Improvement
Components: client, master, tserver
Affects Versions: 1.13.0
Reporter: Alexey Serbin
The current implementation of location assignment has room for improvement.
As of now, the following is understood:
# Implementation-wise, Kudu masters could use the newly introduced
[Subprocess|https://github.com/apache/kudu/tree/master/src/kudu/subprocess]
functionality to run the location assignment script. That would be more robust
than the current fork/exec approach, especially for larger deployments where
Kudu masters might be handling a high request rate (many active threads
running, a lot of memory allocated, etc.).
# Conceptually, Kudu tablet servers could have all the necessary information
regarding their location at startup, and that information isn't going to change
while the tablet server is running. The server/machine they are running on is
provisioned to be in some rack, availability zone, data center, etc., and that
assignment doesn't change while the server is up and running. So, a Kudu
tablet server can be provided with information about its location upon startup;
there is no need to consult the Kudu master about this.
# Conceptually, Kudu clients might be aware of their location as well.
To address item 1, it's necessary to update the current implementation of
location assignment so that the script is run by a dedicated subprocess forked
off early during the master's startup. Ideally, to make it more robust, the
subprocess server can run the location assignment script as a small server that
takes an IP address or DNS name as input and provides a location label as
output, maybe line-by-line. The latter assumes changing the requirements for a
location assignment script, and we should probably introduce a separate flag to
specify the path to a script that is capable of running in such a mode. However,
even with the current location assignment approach, where it's necessary to run
a script per location assignment request, using the {{Subprocess}} functionality
would benefit larger deployments where forking a kudu-master process might be
slow and inefficient.
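As a rough illustration of the line-by-line mode described above, here is a minimal sketch of such a script in Python. Everything in it is an assumption made for illustration: the subnet table, the {{/unknown}} fallback label, and the one-request-per-line protocol are hypothetical, and DNS names are not resolved in this sketch.

```python
import io
import ipaddress

# Hypothetical subnet-to-location table; a real script would consult
# whatever inventory the deployment actually uses.
SUBNET_LOCATIONS = {
    ipaddress.ip_network("10.1.0.0/16"): "/dc0/rack0",
    ipaddress.ip_network("10.2.0.0/16"): "/dc0/rack1",
}
DEFAULT_LOCATION = "/unknown"  # assumed fallback label


def resolve_location(host: str) -> str:
    """Map an IP address to a location label (DNS names fall back to the default)."""
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        return DEFAULT_LOCATION
    for subnet, location in SUBNET_LOCATIONS.items():
        if addr in subnet:
            return location
    return DEFAULT_LOCATION


def serve(requests, responses) -> None:
    """Answer each non-empty input line with one location label per output line."""
    for line in requests:
        host = line.strip()
        if not host:
            continue
        responses.write(resolve_location(host) + "\n")
        responses.flush()  # flush so the caller never waits on a buffered reply


# Demonstration with in-memory streams; a long-running script would pass
# sys.stdin and sys.stdout instead.
out = io.StringIO()
serve(io.StringIO("10.1.2.3\n10.2.0.7\nhost.example.com\n"), out)
print(out.getvalue(), end="")
```

Because the process stays alive between requests, the per-request cost is one line written and one line read rather than a fork/exec of the script.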
To address item 2, it's necessary to introduce a new tablet server flag that
is set to the assigned location for the tablet server. The systemd/init.d
startup script for kudu-tserver should populate the flag with the proper value.
It's also necessary to introduce a new field in the {{TSHeartbeatRequestPB}}
message to pass the location from the tablet server to the master. If the
master sees that field populated, it should not run the location assignment
script, even if one is specified. This way it would be possible to perform
rolling upgrades from older versions, which use a centrally managed location
assignment script, to the version that implements the new approach.
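The precedence rule on the master side can be sketched as follows. This is a Python stand-in for illustration only, not Kudu code: the {{Heartbeat}} class merely models {{TSHeartbeatRequestPB}}, and the {{location}} field name and the {{assign_location}} helper are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Heartbeat:
    """Stand-in for TSHeartbeatRequestPB; `location` is the hypothetical new field."""
    ts_host: str
    location: Optional[str] = None


def assign_location(
    hb: Heartbeat,
    script: Optional[Callable[[str], str]],
) -> Optional[str]:
    """Prefer the location reported by the tablet server over the script."""
    if hb.location:
        # Upgraded tablet server: trust its self-reported location and
        # skip the script even if one is configured.
        return hb.location
    if script is not None:
        # Older tablet server: fall back to the centrally managed script.
        return script(hb.ts_host)
    return None  # no location information available


def legacy_script(host: str) -> str:
    """Hypothetical stand-in for the existing location assignment script."""
    return "/rack-from-script"


print(assign_location(Heartbeat("ts1", "/dc0/rack0"), legacy_script))
print(assign_location(Heartbeat("ts2"), legacy_script))
```

During a rolling upgrade, tablet servers that already set the field and those that do not can coexist, since the master falls back to the script only for the latter.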
To address item 3, it's necessary to find a means to specify the location for a
Kudu client. An environment variable could probably be used for that. The
{{ConnectToMasterRequestPB}} can be extended to include an optional
{{client_location}} field. In addition, if
{{\-\-master_client_location_assignment_enabled}} is set to {{true}}, the master
could run the location assignment script to assign a location to a client that
doesn't populate the newly introduced
{{ConnectToMasterRequestPB::client_location}} field.
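A minimal sketch of how the client-side and master-side pieces might fit together, written in Python for illustration only. The environment variable name {{KUDU_CLIENT_LOCATION}} and both helper functions are hypothetical; only the {{client_location}} field and the {{\-\-master_client_location_assignment_enabled}} flag come from the proposal above.

```python
import os
from typing import Callable, Mapping, Optional

# Hypothetical environment variable name; the proposal does not fix one.
CLIENT_LOCATION_ENV = "KUDU_CLIENT_LOCATION"


def client_location_from_env(
    env: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Client side: value to put into the optional client_location field."""
    value = env.get(CLIENT_LOCATION_ENV, "").strip()
    return value or None


def master_resolve_client_location(
    client_location: Optional[str],
    client_host: str,
    assignment_enabled: bool,
    script: Optional[Callable[[str], str]],
) -> Optional[str]:
    """Master side: honor the client's field, else optionally run the script."""
    if client_location:
        return client_location
    # `assignment_enabled` models --master_client_location_assignment_enabled.
    if assignment_enabled and script is not None:
        return script(client_host)
    return None


reported = client_location_from_env({CLIENT_LOCATION_ENV: "/dc0/rack1"})
print(master_resolve_client_location(reported, "10.0.0.5", True, lambda h: "/from-script"))
```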