cocoa-xu opened a new pull request, #1717: URL: https://github.com/apache/arrow-adbc/pull/1717
Hi, this PR aims to add support for Google BigQuery.

Before I go further with this C/C++ implementation, I'd like to hear suggestions and advice from experts on this driver. I'm happy to make any changes, including switching to Go if needed (related issue: https://github.com/apache/arrow-adbc/issues/168).

Currently it implements the query functionality as a proof of concept. Users can

- set supported options in `AdbcStatement`
- send queries and read the result table in Arrow format

*It uses a precompiled google-cloud-cpp SDK for now because we have to patch the BigQuery C++ REST API; the reason for the patch is discussed [here](https://github.com/cocoa-xu/bigquery-rest-cpp?tab=readme-ov-file#why-you-made-this-repo). Of course, I'll be happy to submit a PR to google-cloud-cpp so that we don't need to patch these files before compiling, and then we can do the whole build in the CMakeLists.txt in this repo if we want.*

Here are some preliminary results from using this driver in Elixir with [elixir-explorer/adbc](http://github.com/elixir-explorer/adbc).
```elixir
# bigquery.exs
Mix.install([{:adbc, "~> 0.3.2-dev", github: "elixir-explorer/adbc"}])

defmodule BigqueryTest do
  def test do
    children = [
      {Adbc.Database,
       project_id: "bigquery-poc-418913",
       driver: "libadbc_driver_bigquery.dylib",
       process_options: [name: MyApp.DB]},
      {Adbc.Connection,
       database: MyApp.DB,
       process_options: [name: MyApp.Conn],
       write_disposition: "WRITE_TRUNCATE"}
    ]

    Supervisor.start_link(children, strategy: :one_for_one)
    dbg(Adbc.Connection.query(MyApp.Conn, "SELECT * FROM google_trends.small_top_terms LIMIT 7"))
  end
end

BigqueryTest.test()
```

```
$ elixir bigquery.exs
Adbc.Connection.query(MyApp.Conn, "SELECT * FROM google_trends.small_top_terms LIMIT 7") #=> {:ok,
 %Adbc.Result{
   num_rows: nil,
   data: %{
     "dma_id" => [546, 546, 546, 546, 546, 546, 546],
     "dma_name" => ["Columbia SC", "Columbia SC", "Columbia SC", "Columbia SC",
      "Columbia SC", "Columbia SC", "Columbia SC"],
     "rank" => [15, 15, 15, 15, 15, 15, 15],
     "refresh_date" => [~D[2024-03-14], ~D[2024-03-14], ~D[2024-03-14],
      ~D[2024-03-14], ~D[2024-03-14], ~D[2024-03-14], ~D[2024-03-14]],
     "score" => [nil, nil, nil, nil, nil, nil, nil],
     "term" => ["Nex Benedict", "Nex Benedict", "Nex Benedict", "Nex Benedict",
      "Nex Benedict", "Nex Benedict", "Nex Benedict"],
     "week" => [~D[2020-12-13], ~D[2020-12-20], ~D[2021-02-21], ~D[2021-02-28],
      ~D[2021-03-07], ~D[2021-03-14], ~D[2021-04-04]]
   }
 }}
```

Of course, there are still a few things to be done if we decide to implement it in C/C++:

- [ ] set credentials when initialising `AdbcDatabase`; currently the Google Cloud SDK automatically finds and uses credentials saved on local storage (generated by `gcloud auth application-default login`)
- [ ] implement `GetInfo`, `GetTableSchema` and other functions for BigQuery's `AdbcConnection` and `AdbcStatement`
- [ ] add tests for this driver
