Alexey Serbin created KUDU-3212:
-----------------------------------

             Summary: Location assignment improvements
                 Key: KUDU-3212
                 URL: https://issues.apache.org/jira/browse/KUDU-3212
             Project: Kudu
          Issue Type: Improvement
          Components: client, master, tserver
    Affects Versions: 1.13.0
            Reporter: Alexey Serbin


The current implementation of location assignment has some room for improvement.  
As of now, the following is understood:

# Implementation-wise, Kudu masters could use the newly introduced 
[Subprocess|https://github.com/apache/kudu/tree/master/src/kudu/subprocess] 
functionality to run the location assignment script.  That would be more robust 
than the current fork/exec approach, especially for larger deployments where 
Kudu masters might handle a high rate of requests per second (many active 
threads running, a lot of memory allocated, etc.).
# Conceptually, Kudu tablet servers could have all the necessary information 
regarding their location at startup, and that information isn't going to change 
while the tablet server is running.  The server/machine they are running on is 
provisioned to be in some rack, availability zone, data center, etc., and that 
assignment doesn't change while the server is up and running.  So, a Kudu 
tablet server can be provided with information about its location upon startup; 
there is no need to consult the Kudu master about this.
# Conceptually, Kudu clients might be aware of their location as well.

To address item 1, it's necessary to update the current implementation of 
location assignment so that the script is run by a dedicated subprocess forked 
off early during the master's startup.  Ideally, to make it more robust, the 
subprocess server can run the location assignment script as a small server that 
takes an IP address or DNS name as input and provides a location label as 
output, maybe line-by-line.  The latter assumes changing the requirements for a 
location assignment script, and we should probably introduce a separate flag to 
specify the path to a script that is capable of running in such a mode.  
However, even with the current approach, where it's necessary to run a script 
per location assignment request, using the {{Subprocess}} functionality would 
benefit larger deployments where forking a kudu-master process might be slow 
and inefficient.
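
As an illustration of the line-by-line mode described above, here is a minimal 
sketch of what such a script could look like.  Everything in it is an 
assumption made for the example: the prefix-to-location table, the 
{{/unknown}} fallback label, and the function names are hypothetical, and the 
exact protocol between the master and the script is not defined in this issue.

```python
import sys

# Hypothetical mapping from address prefix to location label; a real script
# would more likely consult an inventory or provisioning system.
LOCATIONS = {
    "10.1.": "/dc0/rack1",
    "10.2.": "/dc0/rack2",
}
DEFAULT_LOCATION = "/unknown"  # assumed fallback label, not defined by Kudu


def assign_location(host: str) -> str:
    """Return the location label for an IP address or DNS name."""
    for prefix, location in LOCATIONS.items():
        if host.startswith(prefix):
            return location
    return DEFAULT_LOCATION


def serve(inp=sys.stdin, out=sys.stdout) -> None:
    """Answer one request per input line until EOF is reached."""
    for line in inp:
        host = line.strip()
        if not host:
            continue  # ignore blank lines
        out.write(assign_location(host) + "\n")
        out.flush()  # the caller is waiting for each answer
```

Calling {{serve()}} with its defaults turns the script into a long-running 
process that reads from stdin and writes to stdout, so the master would pay 
the cost of starting the script only once.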

To address item 2, it's necessary to introduce a new tablet server flag that 
is set to the assigned location for the tablet server.  The systemd/init.d 
startup script for kudu-tserver should populate the flag with the proper value.  
It's also necessary to introduce a new field in the {{TSHeartbeatRequestPB}} 
message to pass the location from the tablet server to the master.  If the 
master sees that field populated, it should not run the location assignment 
script, even if one is specified.  This way it would be possible to perform 
rolling upgrades from older versions, which use a centrally managed location 
assignment script, to the version that implements the new approach.
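
The precedence rule for item 2 could be sketched roughly as follows.  This is 
only a model of the decision, not Kudu code: the {{Heartbeat}} class stands in 
for {{TSHeartbeatRequestPB}}, its {{location}} attribute stands in for the 
proposed new field, and {{resolve_location}} and {{run_script}} are 
hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Heartbeat:
    """Stand-in for TSHeartbeatRequestPB (hypothetical simplification)."""
    host: str
    location: Optional[str] = None  # the proposed new field


def resolve_location(
    hb: Heartbeat,
    run_script: Optional[Callable[[str], str]],
) -> Optional[str]:
    """Pick the tablet server's location on the master side."""
    if hb.location:
        # A self-reported location wins; the script is not run at all.
        return hb.location
    if run_script is not None:
        # Older tablet servers don't report a location: fall back to the
        # centrally managed location assignment script.
        return run_script(hb.host)
    return None  # no location information available
```

With this ordering, upgraded and not-yet-upgraded tablet servers can coexist 
during a rolling upgrade, since each heartbeat is resolved independently.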

To address item 3, it's necessary to find a means to specify the location for a 
Kudu client.  An environment variable could probably be used for that.  The 
{{ConnectToMasterRequestPB}} can be extended to include an optional 
{{client_location}} field.  In addition, if 
{{\-\-master_client_location_assignment_enabled}} is set to {{true}}, the master 
could run the location assignment script to assign a location to a client that 
doesn't populate the newly introduced 
{{ConnectToMasterRequestPB::client_location}} field.
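
A rough sketch of how the two sides of item 3 might fit together is below.  
The environment variable name {{KUDU_CLIENT_LOCATION}} is invented for the 
example (this issue does not propose a name), and both functions are 
hypothetical models rather than actual client or master code.

```python
import os
from typing import Callable, Mapping, Optional

# Hypothetical variable name; the issue only suggests "an environment variable".
CLIENT_LOCATION_ENV = "KUDU_CLIENT_LOCATION"


def client_location_from_env(
    env: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Client side: value for the proposed client_location field, if any."""
    value = env.get(CLIENT_LOCATION_ENV, "").strip()
    return value or None


def master_assign_client_location(
    client_location: Optional[str],
    client_host: str,
    assignment_enabled: bool,
    run_script: Optional[Callable[[str], str]],
) -> Optional[str]:
    """Master side: prefer the client-supplied location, else use the script."""
    if client_location:
        return client_location
    # Models --master_client_location_assignment_enabled=true.
    if assignment_enabled and run_script is not None:
        return run_script(client_host)
    return None
```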



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
