I've ran some POCs for this use case.. For the flight ticket I just used a JSON string which included the SQL query. The flight server executes the SQL query and formats the results into an Arrow table. The flight client receives the arrow table.
This was used to benchmark a client calling SQL using pyodbc which receives rows back from the sql server.. vs The client sending the SQL to a flight server co-located with the sql server and receiving columns back from the flight server. I think for 1 million rows with 1000 miles between the client and the datacenter, the flight detour was 10x faster.. -----Original Message----- From: Ankush Jain <anku...@andrew.cmu.edu> Sent: Thursday, April 10, 2025 1:26 PM To: user@arrow.apache.org Cc: user-...@arrow.apache.org; user-h...@arrow.apache.org Subject: Re: Arrow Flight Server Querying using SQL Commands External Email: Use caution with links and attachments Hi, To add to what Aldrin said, > query specific columns of batches/tables sitting on an arrow flight > server Regardless of what transport you use (gRPC or Arrow Flight or FlightSQL), you need a way to execute a SQL query against the data on your Flight server. Unless you are planning on doing your query parsing/planning/execution, you still need a database on your server for this. One option is to ingest that data into duckdb, and expose it via a small subset of the FlightSql API. The sqlflite codebase in reference [1] is an excellent starting point and wraps a DuckDB instance. Just a heads up that that implementation is not thread-safe for concurrent queries, but easy to adapt if you are familiar with cpp. But you will need to ingest your data into some database on the server, and either use its native FlightSql interface or implement your own. Ankush On Thu, Apr 10, 2025 at 4:02 PM Aldrin <octalene....@pm.me> wrote: > > Hello! > > Arrow flight sql is normally used with databases it seems like, but can you > use it with an arrow flight server? > > > Yes and no. Flight SQL is used to communicate with databases, but I don't > think it bypasses a flight server, it's just that the target database is, or > uses, a flight server to communicate with the client. I think any server that > accepts the flight protocol should be considered a flight server. > > That being said, I am not sure which direction I should point you in. If you > don't mind setting up a database with your flight server, it seems like [1] > or [2] will be good references. If you don't want to do actual database > things on your server, then I feel like flight RPC itself should be good > enough for your use case (or at least, I'm not sure how you plan to use > flight SQL without a database). > > If you want database-like interaction, but you don't want an actual database, > I think you want to try and go the route of interfacing with Acero from your > flight server. I can't find a prototype of this off hand, but building it > yourself incrementally shouldn't be extremely difficult, depending on what > features you need. > > [1]: > https://urldefense.com/v3/__https://github.com/voltrondata/SQLFlite__; > !!KSjYCgUGsB4!aOGtuDmm12K8dIOMRwrjdtbTJgqeLdRbqTBFndEpvubwLdeB7ld14CvQ > Mdy76ke1bBQMC6DkLiz-ztIZde9WEafBIA$ > [2]: > https://urldefense.com/v3/__https://github.com/Query-farm/duckdb-airpo > rt-extension/tree/main__;!!KSjYCgUGsB4!aOGtuDmm12K8dIOMRwrjdtbTJgqeLdR > bqTBFndEpvubwLdeB7ld14CvQMdy76ke1bBQMC6DkLiz-ztIZde9m0DkP5g$ > > > # ------------------------------ > # Aldrin > > https://urldefense.com/v3/__https://github.com/drin/__;!!KSjYCgUGsB4!a > OGtuDmm12K8dIOMRwrjdtbTJgqeLdRbqTBFndEpvubwLdeB7ld14CvQMdy76ke1bBQMC6D > kLiz-ztIZde_cnI3CBw$ > https://urldefense.com/v3/__https://gitlab.com/octalene__;!!KSjYCgUGsB > 4!aOGtuDmm12K8dIOMRwrjdtbTJgqeLdRbqTBFndEpvubwLdeB7ld14CvQMdy76ke1bBQM > C6DkLiz-ztIZde_XG1Nu1Q$ > https://urldefense.com/v3/__https://keybase.io/octalene__;!!KSjYCgUGsB > 4!aOGtuDmm12K8dIOMRwrjdtbTJgqeLdRbqTBFndEpvubwLdeB7ld14CvQMdy76ke1bBQM > C6DkLiz-ztIZde_N7MBtaA$ > > On Thursday, April 10th, 2025 at 12:36, Amanda Weirich > <acweiric...@gmail.com> wrote: > > Hi all, > > Has anybody created an Arrow Flight Server capable of responding to arrow > flight sql commands? > > Arrow flight sql is normally used with databases it seems like, but can you > use it with an arrow flight server? > > For reference, we would like to use an arrow flight client to query specific > columns of batches/tables sitting on an arrow flight server - no database > integration involved. Does something like this exist? Does anyone have any > recommendations for doing this? > > Thanks! > > --Amanda > > This message may contain information that is confidential or privileged. If you are not the intended recipient, please advise the sender immediately and delete this message. See http://www.blackrock.com/corporate/compliance/email-disclaimers for further information. Please refer to http://www.blackrock.com/corporate/compliance/privacy-policy for more information about BlackRock’s Privacy Policy. For a list of BlackRock's office addresses worldwide, see http://www.blackrock.com/corporate/about-us/contacts-locations. © 2025 BlackRock, Inc. All rights reserved.