Without any replication, you essentially just need something that decides how to distribute data across your data servers. Whatever information the metadata server stores in your tickets should allow a data server to identify its storage objects. The storage medium on the data servers can be independent (nothing else needs to know about it), and your metadata server is essentially going to participate in every data access, so you want that interaction to be minimal or very fast.
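As a concrete illustration of that point, here is a minimal sketch of what "just enough info in the ticket" could mean. The field names ("shard", "object_key") and helper functions are assumptions for illustration, not part of Arrow Flight, which treats tickets as opaque bytes:

```python
import json

def make_ticket(shard: str, object_key: str) -> bytes:
    """Metadata server: pack just enough info for one data-server lookup.
    Arrow Flight treats the result as opaque bytes."""
    return json.dumps({"shard": shard, "object_key": object_key}).encode("utf-8")

def resolve_ticket(ticket: bytes) -> tuple[str, str]:
    """Data server: decode the ticket and locate the storage object
    (e.g. a file name or key name, depending on your storage model)."""
    info = json.loads(ticket.decode("utf-8"))
    return info["shard"], info["object_key"]

ticket = make_ticket("shard-03", "measurements/2023-01.parquet")
print(resolve_ticket(ticket))  # ('shard-03', 'measurements/2023-01.parquet')
```

Since only the data server ever interprets the ticket, the metadata server's lookup can stay a fast key-to-ticket mapping.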
Otherwise, I suspect the overall flow will look very much like the diagrams in the Arrow Flight documentation.

On Tue, Jan 17, 2023 at 11:39 Philip Carinhas <[email protected]> wrote:

> Aldrin,
>
> Thanks for the detailed answer. I'm familiar with many of the concepts
> of distributed storage, but we don't intend to do any data replication.
> At most, we will have a single metadata server that stores information
> about data sources. Regardless, we'll take a look at the links you provided.
>
> *From: *Aldrin <[email protected]>
> *Date: *Friday, January 13, 2023 at 1:44 PM
> *To: *[email protected] <[email protected]>
> *Subject: *Re: Sharded Flight Server: Continued
>
> 1. Set up a single Flight metadata server
>
> Implement the subset of the Flight interface that supports metadata
> operations (such as GetFlightInfo() mentioned by David).
>
> 2. Set up several sharded data Flight servers
>
> Implement the subset of the Flight interface that supports data
> operations (such as DoGet() mentioned by David).
>
> 3. Set up clients that distribute data to sharded data servers
>
> Even looking at the original thread, I am mildly confused. Between a and
> b, I am not sure what you're asserting or what you're trying to clarify.
>
> a. All data is pushed to servers by clients, no direct access. This
> assumes that data is placed in memory on the data servers?
>
> When writing new data (and often when sending updates), the client sends
> the data to the server. This does not mean that the data needs to be
> in-memory only. From the original thread, though, it seems like that is a
> requirement you have? Also, this usage of "direct access" is confusing
> even after reading your clarification in the original thread.
>
> b. How does data get distributed to the various data servers?
>
> In this case, David's advice is particularly true: you just want to see
> the various ways other distributed systems handle this.
> The brief summary of the fundamental approaches is:
>
> A) At a client, partition data, then send each data partition to the
> appropriate data server.
>
> B) At a client, send data to a particular data server. At the data
> server, partition data, then send each data partition to the appropriate
> data server in a peer-to-peer fashion.
>
> C) A hybrid of A and B, where your client knows some coarse-grained
> data distribution and each data server is aware of some fine-grained data
> distribution (this is most relevant for data that is partitioned and
> replicated).
>
> c. Is this a valid use case?
>
> This is a fundamental approach for distributed storage systems (many data
> servers, few metadata servers). Plus, you're trying to use Arrow, so it
> sounds especially valid.
>
> 4. If anyone can point us in the right direction with some code examples,
> we would very much appreciate it.
>
> The way to think about it is that Flight defines an interface or protocol
> to facilitate communication. The diagram in [1] shows exactly how the
> client and servers would interact for a GET request (downloading data),
> which should also contextualize David's responses if you haven't seen the
> diagram.
>
> One thing you'll need to do is figure out what to put in your "ticket"
> that facilitates how your Flight servers identify data; this should
> depend on your storage model (i.e. file names or key names). How your
> data servers store data, and how your metadata server stores metadata, is
> totally up to you, but whatever info you get from the metadata server
> should be usable by the data server to actually retrieve data.
>
> I agree with David that there is a lot of content and your questions
> don't make it clear how much of that info you need clarification on. In
> case you have little experience, here's some quick info to help orient
> you when looking through resources:
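Approach A above (client-side partitioning) can be sketched in a few lines. This is a hypothetical example, not from the thread: the server addresses, the `id` key field, and the helpers are assumptions, and each resulting partition would become one upload (e.g. one DoPut) to its server:

```python
import hashlib

# Illustrative addresses; in practice these would come from your metadata server.
DATA_SERVERS = ["grpc://data-0:8815", "grpc://data-1:8815", "grpc://data-2:8815"]

def server_for_key(key: str, n_servers: int) -> int:
    """Pick a server deterministically. A cryptographic hash is used because
    Python's built-in hash() is salted per process, so it is not stable
    across clients."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_servers

def partition_rows(rows: list[dict], key_field: str) -> dict[int, list[dict]]:
    """Group rows by destination server index (approach A: the client
    partitions, then sends each partition to its data server)."""
    partitions: dict[int, list[dict]] = {}
    for row in rows:
        idx = server_for_key(row[key_field], len(DATA_SERVERS))
        partitions.setdefault(idx, []).append(row)
    return partitions

rows = [{"id": f"user-{i}", "value": i} for i in range(6)]
for idx, part in sorted(partition_rows(rows, "id").items()):
    print(DATA_SERVERS[idx], len(part))
```

Because the hash is stable, every client independently routes the same key to the same server, which is what makes approach A work without peer-to-peer forwarding.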
> 1. Distributed storage systems use the architecture you're mentioning
> (a metadata server and data servers).
>
> 2. Key-value stores and distributed databases frequently just have many
> database servers, where each server has both metadata and data.
>
> 3. If you are just distributing data in disjoint partitions (data values
> on one data server do not exist on other data servers), then the design
> is much simpler.
>
> 4. If you are doing replication, then things become much more complex and
> consistency becomes something that your system needs to accommodate.
>
> 5. Points 1-4 are orthogonal to Arrow Flight, which just provides the
> foundation for how clients and servers communicate.
>
> 6. Points 1-4 require distributed system design principles to be applied
> to what is stored in your metadata and how that metadata is propagated
> (or accessed) by each client and server participating in some series of
> requests.
>
> 7. The best approaches to start understanding consistency are quorum
> consistency and vector clocks.
>
> A. At a glance, [2] seems to concisely explain quorum consistency, which
> is useful for a server or client that can send requests to all database
> servers, and is usually used with a small amount of replication (a dozen
> or so servers).
>
> B. The wiki on vector clocks [3] decently illustrates how to determine
> that some events provably come after other events. This approach is
> necessary when some peer-to-peer communication happens; otherwise a
> simple version number should suffice.
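To make the vector-clock idea from [3] concrete, here is a small stdlib-only sketch (the server names are illustrative assumptions): each participant keeps a counter per peer, clocks are merged on message receipt, and one event provably happens before another only when its clock is component-wise less-or-equal and strictly less somewhere.

```python
def merge(a: dict[str, int], b: dict[str, int]) -> dict[str, int]:
    """On receiving a message, take the component-wise maximum of the clocks."""
    return {k: max(a.get(k, 0), b.get(k, 0)) for k in a.keys() | b.keys()}

def happens_before(a: dict[str, int], b: dict[str, int]) -> bool:
    """a provably precedes b: every component <=, at least one strictly <."""
    keys = a.keys() | b.keys()
    return (all(a.get(k, 0) <= b.get(k, 0) for k in keys)
            and any(a.get(k, 0) < b.get(k, 0) for k in keys))

def concurrent(a: dict[str, int], b: dict[str, int]) -> bool:
    """Neither ordering holds: the writes conflict and need reconciliation."""
    return not happens_before(a, b) and not happens_before(b, a)

v1 = {"data-0": 2, "data-1": 1}    # a write observed at data-0
v2 = merge(v1, {"data-1": 2})      # a later write that saw v1
v3 = {"data-0": 1, "data-1": 3}    # an independent write
print(happens_before(v1, v2))      # True: v2 causally follows v1
print(concurrent(v2, v3))          # True: no causal order; conflict
```

This is exactly the case point 7B describes: with peer-to-peer propagation, a single version number cannot distinguish "v3 is newer" from "v2 and v3 are conflicting", while vector clocks can.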
> > > > > > [1]: https://arrow.apache.org/docs/format/Flight.html#downloading-data > <https://nam10.safelinks.protection.outlook.com/?url=https%3A%2F%2Farrow.apache.org%2Fdocs%2Fformat%2FFlight.html%23downloading-data&data=05%7C01%7Cphilip.carinhas%40zapatacomputing.com%7C8ec28d954dd24d672d3a08daf59e86dd%7C47c84d2a037549a0aea39a4db4172570%7C1%7C0%7C638092358487659454%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=TxkZHII9Gmt8MpwnevNUdnjGoygxmnVtOEMkuHhwZIY%3D&reserved=0> > > [2]: https://tangarts.github.io/consistency-levels-and-quorums.html > <https://nam10.safelinks.protection.outlook.com/?url=https%3A%2F%2Ftangarts.github.io%2Fconsistency-levels-and-quorums.html&data=05%7C01%7Cphilip.carinhas%40zapatacomputing.com%7C8ec28d954dd24d672d3a08daf59e86dd%7C47c84d2a037549a0aea39a4db4172570%7C1%7C0%7C638092358487659454%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=j7t2IGB56HyBOPx%2Fojd7M88m9PgHTUWBv0Gmv1zyhC8%3D&reserved=0> > > [3]: https://en.wikipedia.org/wiki/Vector_clock > <https://nam10.safelinks.protection.outlook.com/?url=https%3A%2F%2Fen.wikipedia.org%2Fwiki%2FVector_clock&data=05%7C01%7Cphilip.carinhas%40zapatacomputing.com%7C8ec28d954dd24d672d3a08daf59e86dd%7C47c84d2a037549a0aea39a4db4172570%7C1%7C0%7C638092358487659454%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=mpgd7zl6WMpaIuRSaFEWHOxJZ6iUpyF5LXHbwjPG%2Byo%3D&reserved=0> > > > > > Aldrin Montana > > Computer Science PhD Student > > UC Santa Cruz > > > > -- Aldrin Montana Computer Science PhD Student UC Santa Cruz
