Got it, thank you so much. With that in mind, I think I will do some experiments using the Derby demo FlightSQL server and the JDBC driver in the PR to wrap standard JDBC database connections.
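
For a first rough pass at such an experiment, before anything rigorous, a plain-Java timing harness along these lines might be enough. This is only a sketch: the class name `QueryTimingSketch` and the two placeholder tasks are hypothetical, and each task would need to be replaced by code that opens a `Connection` through the relevant driver, runs the same query, and returns the row count (wrapping any `SQLException`). It has none of JMH's safeguards, so the numbers would be indicative at best.

```java
import java.util.Arrays;
import java.util.function.IntSupplier;

// Rough timing sketch only. Unlike JMH there is no forking, no dead-code
// protection, and only a crude warm-up phase.
final class QueryTimingSketch {

    // Runs the task `warmup` times untimed, then `runs` times timed,
    // and returns the median elapsed time in nanoseconds.
    static long medianNanos(IntSupplier task, int warmup, int runs) {
        if (runs <= 0) {
            throw new IllegalArgumentException("runs must be positive");
        }
        for (int i = 0; i < warmup; i++) {
            task.getAsInt();
        }
        long[] samples = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.getAsInt();
            samples[i] = System.nanoTime() - start;
        }
        return median(samples);
    }

    // Median of the samples; for an even count, the mean of the two middle values.
    static long median(long[] samples) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        return sorted.length % 2 == 1
                ? sorted[mid]
                : (sorted[mid - 1] + sorted[mid]) / 2;
    }

    public static void main(String[] args) {
        // Placeholders (assumptions): each would run the same SQL query,
        // one through the database's own JDBC driver and one through the
        // FlightSQL JDBC driver, and return the number of rows read.
        IntSupplier plainJdbc = () -> 0;
        IntSupplier flightSqlJdbc = () -> 0;

        System.out.println("plain JDBC, median ns:    " + medianNanos(plainJdbc, 5, 20));
        System.out.println("FlightSQL JDBC, median ns: " + medianNanos(flightSqlJdbc, 5, 20));
    }
}
```

The median is used rather than the mean so that a single slow run (GC pause, connection setup) does not dominate the result.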
If I do this and anyone is interested, I could post some JMH benchmarks
comparing queries over plain JDBC and over the FlightSQL JDBC driver on H2/HSQL.

On Fri, Feb 25, 2022 at 8:09 PM James Duong wrote:

> On Fri, Feb 25, 2022 at 4:55 PM Gavin Ray wrote:
>
>>> to build a FlightSQL producer that delegates to other databases (possibly
>>> using JDBC?).
>>> Then building a JSON-based API or service on top of the FlightSqlClient.
>>
>> Yes, exactly =D
>>
>> Thank you for the detailed breakdown, that makes things very clear.
>> In this case, the JSON API and FlightSQL server would likely be the same
>> process, so I think your scenario #1 would be my answer.
>>
>>> You also have the option of using the Arrow JDBC driver to get the
>>> benefit of the Arrow Flight protocol but still write your code using JDBC.
>>
>> I have been following that PR on GitHub, it's very exciting.
>> One thing I am not sure I understand though -- with the JDBC driver for
>> Arrow, there still needs to be a FlightSQL Server talking to the database,
>> right?
>
> You're correct. You need a FlightSQL server fronting whatever database
> you'd like to expose to use it with the Arrow JDBC driver. Ideally you
> could co-locate the FlightSQL Server with the database (which would
> effectively make remote calls implemented using Arrow Flight rather than
> the database's proprietary protocol).
>
>> So if I understand it, with the FlightSQL JDBC driver, it takes the place
>> of the FlightSQL Client and the request flow would be something like:
>>
>> Client <--> JSON API <--> FlightSQL JDBC Driver <--> FlightSQL Server
>> <--> Database
>>
>> And to connect the FlightSQL Server to the Database would require
>> wrapping it in JDBC (unless some native/direct wire protocol
>> implementation was written for each DB), right?
>
> Yes
>
>> Is this more performant than querying through regular JDBC? (Maybe
>> because of Arrow's format?)
>
> Potentially more performant due to the Arrow Flight protocol.
>
>> On Fri, Feb 25, 2022 at 7:33 PM James Duong wrote:
>>
>>> Hi Gavin,
>>>
>>> If I'm understanding correctly, what you're thinking of is to build a
>>> FlightSQL producer that delegates to other databases (possibly using JDBC?).
>>> Then building a JSON-based API or service on top of the FlightSqlClient.
>>>
>>> There are two potentially remote calls happening here:
>>> JSON API (FlightSqlClient) -> FlightSqlServer
>>> FlightSqlServer -> each database being federated to.
>>>
>>> In this scenario, the benefits of Flight vary depending on your network
>>> topology:
>>> 1. If the JSON API and FlightSqlServer are co-located there is little
>>> time being spent on the network sending data through Flight, so it is not
>>> very beneficial.
>>> 2. If the JSON API is remote and FlightSqlServer is co-located with
>>> federated databases, the majority of the network transmission will be using
>>> Flight so you should see some performance benefits.
>>> 3. If the JSON API, FlightSqlServer, and databases are all remote from
>>> each other, you _might_ see benefits to Flight.
>>>
>>> If you instead built your JSON app to federate queries directly to each
>>> database you have one remote call happening
>>> JSON API (JDBC) -> federated database
>>> This might be faster or slower depending on the network conditions
>>> between the JSON API and the federated database.
>>> The above would use the JDBC driver's protocol which may not perform as
>>> well as Flight.
>>>
>>> You also have the option of using the Arrow JDBC driver to get the
>>> benefit of the Arrow Flight protocol but still write your code using JDBC.
>>>
>>> On Fri, Feb 25, 2022 at 3:59 PM Gavin Ray wrote:
>>>
>>>> Excuse me if this question seems a bit silly, but I'm wondering whether
>>>> it would make sense to use FlightSQL to power a JSON API that talks to
>>>> multiple databases.
>>>>
>>>> I know that the docs around Flight/FlightSQL say that there is a marked
>>>> performance improvement over ODBC/JDBC, but I assume this is only if
>>>> the service sending the data over Flight isn't using one of these to
>>>> interact with the datasource, right?
>>>>
>>>> If I have a JVM application using JDBC and sending the responses as
>>>> JSON, would it still make sense to look towards implementing a FlightSQL
>>>> Server because of the ability to distribute operations across multiple
>>>> instances and it being language-agnostic? Or would I be losing most of
>>>> the benefits here?
>>>>
>>>> Not familiar with the Arrow format and project as a whole, so still
>>>> trying to wrap my head around things, sorry!
>>>>
>>>> Thank you =)
>>>> Gavin Ray
>>>
>>> --
>>>
>>> *James Duong*
>>> Lead Software Developer
>>> Bit Quill Technologies Inc.
>>> Direct: +1.604.562.6082
>>> https://www.bitquilltech.com
>>>
>>> This email message is for the sole use of the intended recipient(s) and
>>> may contain confidential and privileged information. Any unauthorized
>>> review, use, disclosure, or distribution is prohibited. If you are not the
>>> intended recipient, please contact the sender by reply email and destroy
>>> all copies of the original message. Thank you.
