Thanks for starting a discussion on this topic.
> There've been a few discussions recently regarding changes to Kudu's
> metadata storage. There are a number of areas that could benefit from
> improving this layer, and I've been coalescing some of these ideas to lock
> down what changes make sense in the near future. Here's a list of a few
> considerations:
>
> https://docs.google.com/document/d/1jXFqIZvLwkkmSjLC0wy-mDA0l0GmVuZJUMUB_7q5QUE/edit?usp=sharing
I left a few comments in the gdoc.
> Empirically, scalability is not as big of an adoption-bottleneck as it was
> before. Additionally, it's not clear that the listed scalability issues are
> the biggest bottlenecks to larger data volumes. Moving forward, we should
> keep track of user stories that would benefit from such improvements.
I don't really view scalability as an adoption bottleneck; I think
it's more of a barrier to growing existing clusters. As such, I think
it's important that we constantly iterate on scalability, so that
every release is a little bit more scalable than the one before. That
way Kudu users are assured that their deployments can keep growing as
Kudu continues to mature.
As far as specific bottlenecks are concerned, the spreading of LBM
(log block manager) metadata across multiple files is a major
contributor to our long startup times, and AFAIK that's one of the
biggest scalability bottlenecks.
> With these points in mind, it seems the reasonable path forward is go with
> 1. and introduce a flag for users to colocate WALs and metadata.
Makes sense. Will colocation be enabled by default for new clusters?
How about using the flag to define an explicit metadata directory,
with the default being empty ("use the WAL directory")? That'd make it
similar to fs_data_dirs, which is nice. If the metadata directory
can't be found in the directory specified by this gflag (or in the WAL
directory, if blank), we can fall back to looking in the first data
directory.
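
To make the proposed lookup order concrete, here is a rough sketch in
Python (not Kudu code). The flag value `metadata_dir_flag`, the helper
name `resolve_metadata_dir`, and the "tablet-meta" subdirectory name
are all assumptions for illustration only; the real gflag name and
on-disk layout would be whatever we settle on.

```python
import os
from typing import List, Optional


def resolve_metadata_dir(metadata_dir_flag: str,
                         wal_dir: str,
                         data_dirs: List[str],
                         subdir: str = "tablet-meta") -> Optional[str]:
    """Return the existing metadata directory, or None if none is found.

    Hypothetical lookup order, following the proposal:
      1. The directory named by the metadata flag, or the WAL directory
         if the flag is empty.
      2. The first data directory, as a fallback.
    """
    # An empty flag means "use the WAL directory".
    preferred_root = metadata_dir_flag or wal_dir

    candidates = [preferred_root]
    # Fall back to the first data directory if it is a different location.
    if data_dirs and data_dirs[0] != preferred_root:
        candidates.append(data_dirs[0])

    for root in candidates:
        path = os.path.join(root, subdir)
        if os.path.isdir(path):
            return path
    return None
```

With a blank flag, a server whose metadata already lives in the first
data directory would keep finding it there through the fallback, and a
fresh server would end up with metadata alongside the WAL.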