This sounds pretty darn nifty! I don't have much of value to offer, but the idea sounds like a great one to me =)
On Sat, Jul 23, 2022 at 5:18 PM Tornike Gurgenidze <togur...@freeuni.edu.ge> wrote:

> David, thank you for the reply.
>
> I recently managed to find the time to get back to the repo. I thought I
> would post a status update for anyone interested.
>
> The project started out as just a FlightSql implementation, but I ended up
> splitting it into smaller components:
>
> 1. SparkFlightManager - a lower-level utility class that enables easier
> development of Spark-backed FlightServers. It is supposed to take care of
> FlightServer cluster management, distribution of Spark query results to
> the FlightServer nodes, service discovery and so on, permitting a
> developer to focus on just expressing the intended business logic in
> Spark. There's a reference FlightServer implementation
> (https://github.com/tokoko/SparkFlightSql/blob/main/src/main/scala/com/tokoko/spark/flight/example/SparkParquetFlightProducer.scala)
> that illustrates how a simple parquet reader server can be implemented
> using SparkFlightManager.
>
> 2. SparkFlightSql - a SparkFlightSqlProducer class that relies on
> SparkFlightManager for most of the technical stuff and focuses on simply
> mapping Spark Catalog API metadata to the FlightSql specification.
>
> 3. FlightSql DataSourceV2 - pretty self-explanatory; there's now also the
> beginnings of a DataSourceV2 implementation supporting BATCH_READ.
>
> Once again, if anyone's interested enough to contribute or maybe has a use
> case for SparkFlightManager, please feel free to reach out.
> --
> Tornike
>
> On Sun, May 29, 2022 at 5:26 AM David Li <lidav...@apache.org> wrote:
>
> > Hi Tornike,
> >
> > I'll have to take a closer look later when I can get back in front of a
> > real computer, but I just want to say that this is super awesome, and
> > thank you for sharing!
> >
> > I think we've kicked around the idea of "contrib" projects in the past.
> > Maybe this can be the impetus to take up that idea? Regardless, I want
> > to say that if you have any questions or feedback about Arrow and
> > Flight SQL, please feel free to post them here.
> >
> > -David
> >
> > On Sat, May 28, 2022, at 18:48, Tornike Gurgenidze wrote:
> > > Hi,
> > >
> > > I'm not sure this is the right place to be posting this, so I
> > > apologize in advance.
> > >
> > > Recently I started a PoC for an Arrow Flight SQL Server with a Spark
> > > backend (https://github.com/tokoko/SparkFlightSql). The main goal is
> > > to create a SparkThriftServer alternative that will benefit from the
> > > FlightSql protocol and will also be distributed in nature, i.e. query
> > > results won't have to pass through a single server.
> > >
> > > I thought it might be interesting for those of you who are also
> > > familiar with Spark. I don't have much experience with Arrow, so I
> > > would appreciate any sort of involvement from the Arrow community.
> > >
> > > Regards,
> > > Tornike