cocoa-xu commented on PR #1903: URL: https://github.com/apache/arrow-adbc/pull/1903#issuecomment-2171583966
After some experiments, I think there are 3 potential options for this ClickHouse driver:

1. use [`clickhouse-go`](https://github.com/ClickHouse/clickhouse-go)
2. use the low-level Go client [`ch-go`](https://github.com/ClickHouse/ch-go) with `chpool`
3. just use ClickHouse's HTTP/HTTPS protocol with raw ArrowStream

Below are the main pros and cons of each. I'm not quite sure which direction I should go in, but the second option, using `ch-go`, seems to be a better fit here than the other two.

#### 1. clickhouse-go

##### Pros

- Rich features out of the box, like setting maximum execution time, TLS, compression method and connection strategies (round-robin, random and in-order).
- Supports load-balancing and failover.

More key features are listed in their [README.md#key-features](https://github.com/ClickHouse/clickhouse-go?tab=readme-ov-file#key-features).

##### Cons

- Only has a row-oriented API for reading, so we have to use reflection on every single value to translate the query results into Arrow format, which can cost some performance.

#### 2. ch-go

`ch-go` is a low-level Go client for ClickHouse.

##### Pros

- Provides columnar read and write interfaces using ClickHouse's native format.
- Fast data block streaming with low network, CPU and memory overhead.

More key features here: https://github.com/ClickHouse/ch-go?tab=readme-ov-file#features

##### Cons

- Some types are not supported yet, for example nested types: https://github.com/ClickHouse/ch-go?tab=readme-ov-file#todo
- According to their to-do list, reads may sometimes block forever. But issues like this can be addressed in their future versions.

#### 3. HTTP/HTTPS protocol with raw ArrowStream

This means implementing a simple wrapper for API calls like the following bash example with curl:

```bash
curl --user 'default:<password>' \
  --data-binary 'SOME QUERY Format ArrowStream' \
  https://CLICKHOUSE-ID.clickhouse.cloud:8443
```

##### Pros

- Reads and writes ArrowStream directly, no reflection, no casting.

##### Cons

- No load-balancing or failover. No fancy features are available out of the box.
- Requires the ClickHouse instance(s) to be configured with the HTTP/HTTPS protocol enabled.

/cc @josevalim
