Grant Henke created KUDU-3135:
---------------------------------
Summary: Add Client Metadata Tokens
Key: KUDU-3135
URL: https://issues.apache.org/jira/browse/KUDU-3135
Project: Kudu
Issue Type: Improvement
Components: client
Affects Versions: 1.12.0
Reporter: Grant Henke
Currently when a distributed task is done using the Kudu client, the
driver/coordinator client needs to open the table to request its current
metadata and locations. Then it can distribute the work to tasks/executors on
remote nodes. In the case of reading data often ScanTokens are used to
distribute the work, and in the case of writing data perhaps just the table
name is required.
The problem is that each parallel task then also needs to open the table to
request the metadata for the table. Using Spark as an example, this happens
when deserializing the scan tokens in KuduRDD
([here|https://github.com/apache/kudu/blob/master/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduRDD.scala#L107-L108])
or when writing rows using the KuduContext
([here|https://github.com/apache/kudu/blob/master/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala#L466]).
This results in a large burst of metadata requests to the leader Kudu master
all at once. Given the Kudu master is only a single server and requests can't
be served from the follower masters, this effectively limits the amount of
parallel tasks that can run in a large Kudu deployment. Even if the follower
masters could service the requests, that still limits scalability in very large
clusters given most deployments would only have 3-5 masters.
Adding a metadata token, similar to a scan token, would be a useful way to
allow the single driver to fetch all the metadata required for the parallel
tasks. The tokens can be serialized and then passed to each task in a similar
fashion to scan tokens.
Of course in a pessimistic case, something may change between generation of the
token and the start of the task. In that case a request would need to be sent
to get the updated metadata. However, that scenario should be rare and likely
would not result in all of the requests happening at the same time.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)