[
https://issues.apache.org/jira/browse/KUDU-3212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alexey Serbin updated KUDU-3212:
--------------------------------
Affects Version/s: 1.10.1, 1.12.0, 1.11.1
> Location assignment improvements
> --------------------------------
>
> Key: KUDU-3212
> URL: https://issues.apache.org/jira/browse/KUDU-3212
> Project: Kudu
> Issue Type: Improvement
> Components: client, master, tserver
> Affects Versions: 1.10.1, 1.12.0, 1.11.1, 1.13.0
> Reporter: Alexey Serbin
> Priority: Major
> Labels: performance, scalability
>
> The current implementation of location assignment has some room for improvement.
> As of now, the following is understood:
> # Implementation-wise, Kudu masters could use the newly introduced
> [Subprocess|https://github.com/apache/kudu/tree/master/src/kudu/subprocess]
> functionality to run the location assignment script. That would be more robust
> than the current fork/exec approach to running the script, especially for
> larger deployments where Kudu masters might see a high request-per-second
> rate (many active threads running, a lot of memory allocated, etc.).
> # Conceptually, Kudu tablet servers could have all the necessary information
> regarding their location at startup, and that information isn't going to
> change while the tablet server is running. The server/machine they run on
> is provisioned in some rack, availability zone, data center, etc., and
> that assignment doesn't change while the server is up and running. So, a
> Kudu tablet server can be provided with information about its location upon
> startup; there is no need to consult the Kudu master about this.
> # Conceptually, Kudu clients might be aware of their location as well.
> To address item 1, it's necessary to update the current implementation of
> location assignment so that the script is run by a dedicated subprocess
> forked off early during the master's startup. Ideally, to make it more robust,
> the subprocess server can run the location assignment script as a small
> server that takes an IP address or DNS name as input and provides a location
> label as output, maybe line-by-line. The latter assumes changing the requirements
> for a location assignment script, and we should probably introduce a separate
> flag to specify the path to a script that runs in such a mode. However,
> even with the current location assignment approach, where it's necessary to run a
> script for every location assignment request, using the {{Subprocess}}
> functionality would benefit larger deployments where the fork/exec sequence for a
> {{kudu-master}} process is slow and inefficient.
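The "small server" mode for item 1 could look roughly like the sketch below: a long-running script that reads one IP address or DNS name per line and writes one location label per line. This is only an illustration of the line-by-line idea, not an existing Kudu interface; the {{LOCATIONS}} table, the {{DEFAULT_LOCATION}} fallback, and the {{serve}} function are all hypothetical names chosen for the example.

```python
import io

# Hypothetical static mapping from host (IP or DNS name) to location label.
# A real script would more likely consult an inventory or provisioning system.
LOCATIONS = {
    "10.0.1.15": "/dc0/rack1",
    "tserver-2.example.com": "/dc0/rack2",
}
DEFAULT_LOCATION = "/unknown"  # assumed fallback label for unlisted hosts


def serve(inp, out):
    """Write exactly one location label for every input line, until EOF."""
    for line in inp:
        host = line.strip()
        out.write(LOCATIONS.get(host, DEFAULT_LOCATION) + "\n")
        # The caller waits for each answer, so never leave it in a buffer.
        out.flush()


# Demo with in-memory streams standing in for the pipes to the subprocess.
demo_out = io.StringIO()
serve(io.StringIO("10.0.1.15\nunlisted-host\n"), demo_out)
print(demo_out.getvalue(), end="")
```

Run as a standalone script, the same function would be called as {{serve(sys.stdin, sys.stdout)}}, so the master-side subprocess pays the process startup cost once rather than once per request.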
> To address item 2, it's necessary to introduce a new tablet server flag
> that is set to the assigned location for the tablet server. The
> systemd/init.d startup script for kudu-tserver should populate the flag with
> the proper value. It's also necessary to introduce a new field in the
> {{TSHeartbeatRequestPB}} message to pass the location from the tablet server to
> the master. If the master sees the field populated, it should not run the location
> assignment script, even if the location assignment script is specified
> (i.e. the {{\-\-location_mapping_cmd}} flag is set). This way it would be
> possible to perform rolling upgrades from older versions, which use a centrally
> managed location assignment script, to the version that implements the new
> approach.
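The precedence rule for item 2 can be sketched as follows. This is a simplified model under stated assumptions, not Kudu's master code: {{reported_location}} stands in for the proposed new heartbeat field, {{mapping_cmd}} stands in for whatever runs the script configured via {{--location_mapping_cmd}}, and {{resolve_tserver_location}} is a hypothetical helper name.

```python
from typing import Callable, Optional


def resolve_tserver_location(
    reported_location: Optional[str],
    mapping_cmd: Optional[Callable[[str], str]],
    host: str,
) -> Optional[str]:
    """Pick a tablet server's location on the master side.

    A location reported by the tablet server always wins, so the mapping
    script is not run for servers that already know where they are.
    """
    if reported_location:
        return reported_location
    if mapping_cmd is not None:
        # Older tablet servers do not populate the field: fall back to the
        # centrally managed script, which keeps rolling upgrades working.
        return mapping_cmd(host)
    return None  # no location information available


# An upgraded tserver reports its own location; an older one does not.
script = lambda host: "/dc0/rack-from-script"
print(resolve_tserver_location("/dc0/rack7", script, "ts1.example.com"))
print(resolve_tserver_location(None, script, "ts2.example.com"))
```

During a rolling upgrade both branches are exercised at once: upgraded tablet servers take the first branch, while not-yet-upgraded ones still go through the script.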
> To address item 3, it's necessary to find a means to specify the location for a
> Kudu client. An environment variable could probably be used for that. The
> {{ConnectToMasterRequestPB}} can be extended to include an optional
> {{client_location}} field. In addition, if
> {{\-\-master_client_location_assignment_enabled}} is set to {{true}}, the master
> could run the location assignment script to assign a location to a client that
> doesn't populate the newly introduced
> {{ConnectToMasterRequestPB::client_location}} field.
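A rough sketch of how item 3 might fit together on both sides is shown below. Everything here is an assumption made for illustration: the environment variable name {{KUDU_CLIENT_LOCATION}} is invented (the description only says an environment variable could be used), and both functions are hypothetical helpers rather than parts of the Kudu client or master.

```python
import os
from typing import Callable, Mapping, Optional

# Hypothetical variable name; the proposal does not name one.
CLIENT_LOCATION_ENV = "KUDU_CLIENT_LOCATION"


def client_location_from_env(
    env: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Client side: the value to put into the proposed client_location field."""
    value = env.get(CLIENT_LOCATION_ENV, "").strip()
    return value or None  # unset or empty means "field not populated"


def master_assign_client_location(
    client_location: Optional[str],
    assignment_enabled: bool,
    mapping_cmd: Optional[Callable[[str], str]],
    client_host: str,
) -> Optional[str]:
    """Master side: honor the client's field, else optionally run the script."""
    if client_location:
        return client_location
    # assignment_enabled models --master_client_location_assignment_enabled.
    if assignment_enabled and mapping_cmd is not None:
        return mapping_cmd(client_host)
    return None


loc = client_location_from_env({CLIENT_LOCATION_ENV: "/dc1/rack3"})
print(master_assign_client_location(loc, True, lambda h: "/by/script", "c1"))
```

Passing the environment as a mapping keeps the client-side lookup easy to exercise without modifying the process environment.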