It sounds to me like you're mostly looking for general distributed systems principles? There's a lot of space to explore here, and frankly it isn't really specific to Flight. A survey text like Designing Data-Intensive Applications may be helpful if you aren't familiar with the domain (sorry, I'm not quite sure what level of detail you're looking for or what you're familiar with already).
I don't think there's a Flight example application that does exactly what you're looking for. Most likely you will want to use other components alongside Flight. For instance, a (distributed) database or key-value store (either a traditional database like PostgreSQL, with caveats on scale, or something like etcd) could hold the state shared between the metadata and data services. If the clients and servers need especially complex coordination (e.g. negotiating which servers to upload data to), one option is to implement your own gRPC service for that and use Flight only for the actual data transfer.

On Wed, Jan 11, 2023, at 15:18, Philip Carinhas wrote:
> Trouble we’re having is:
> 1. understanding how to implement the distributed aspects of Flight,
> 2. how to communicate to the data (DoGet) servers what they should be
> storing, and
> 3. how to communicate to the metadata servers what the data servers have.
> As you say, we have to do this all ourselves, but if someone has done this
> before, we’d appreciate any references to examples.
>
> No direct access: I just mean we don’t want to put static data on the data
> servers, on disk.
>
> *From: *David Li <[email protected]>
> *Date: *Wednesday, January 11, 2023 at 1:51 PM
> *To: *dl <[email protected]>
> *Subject: *Re: Sharded Flight Server
> Hi Philip,
>
> What exactly are you having trouble with? Flight is a protocol, so you'd be
> implementing your own metadata and data servers. (Flight doesn't provide you
> server implementations, just the means to build them and suggested
> conventions to follow.) The 'metadata' server would implement GetFlightInfo,
> and would need some way of knowing about the data servers, their locations,
> and the available datasets (Flight doesn't implement this for you). The
> 'data' server would implement DoGet.
>
> What do you mean by 'no direct access'? It sounds like the clients do have
> access to the server in this scheme. There's also not a defined convention
> for clients to distribute writes across servers.
>
> -David
>
> On Wed, Jan 11, 2023, at 14:38, Philip Carinhas wrote:
>> I’d like to setup a sharded Flight server with one metadata server, and
>> several data servers. I’m not finding documentation on how to do this. In
>> particular we want to:
>>
>> 1. Setup a single flight metadata
>> 2. Setup several sharded data Flight servers
>>
>> 3. Setup clients that distribute data to sharded data servers:
>> a. All data is pushed to servers by clients, no direct access. This
>> assumes that data is placed in memory on the data servers?
>
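
To make the "common state" suggestion at the top of this message a bit more concrete, here is a rough sketch in plain Python. It is not something Flight provides and not a recommended design, only one way the bookkeeping could look. Everything in it is an assumption for illustration: the ShardRegistry class, its method names, the server addresses, and the choice of rendezvous hashing to pick an upload target. The in-memory dict stands in for whatever shared store you pick (etcd, PostgreSQL, ...). The actual uploads and downloads would still be Flight DoPut/DoGet calls, which are left out here.

```python
import hashlib


class ShardRegistry:
    """Hypothetical in-memory stand-in for a shared store (e.g. etcd).

    Nothing here is Flight API; it only tracks which data server
    holds which partition of which dataset.
    """

    def __init__(self, data_servers):
        # Locations of the data servers (addresses are made up).
        self.data_servers = sorted(data_servers)
        # dataset name -> {partition id -> server location}
        self.placements = {}

    def choose_server(self, dataset, partition):
        """Pick the server a client should upload this partition to.

        Rendezvous (highest-random-weight) hashing: every client that
        sees the same server list picks the same server for a key.
        """
        def weight(server):
            key = f"{server}|{dataset}|{partition}".encode()
            return hashlib.sha256(key).hexdigest()

        return max(self.data_servers, key=weight)

    def record_upload(self, dataset, partition, server):
        """Record a finished upload (e.g. after a DoPut to `server`)."""
        self.placements.setdefault(dataset, {})[partition] = server

    def endpoints(self, dataset):
        """(partition, location) pairs that a metadata server could
        translate into the endpoints of a GetFlightInfo response."""
        return sorted(self.placements.get(dataset, {}).items())


# Example: a client spreads four partitions over two data servers.
registry = ShardRegistry(["grpc://data-0:8815", "grpc://data-1:8815"])
for part in range(4):
    server = registry.choose_server("trades", part)
    # A real client would upload the partition to `server` here.
    registry.record_upload("trades", part, server)

for part, location in registry.endpoints("trades"):
    print(part, location)
```

In this sketch the data servers learn what to store simply by receiving uploads, and the metadata server learns what they hold by reading the registry. If placement needs real negotiation, that logic would move into a separate coordination service as described above.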
