Hi Airflow community,

I'd like to propose adding a new community provider for ClickHouse.

Integration overview

ClickHouse is an open-source column-oriented database management system
designed for online analytical processing (OLAP). It is widely used for
real-time analytics, AI workloads, and observability (o11y), and is
particularly popular in data engineering pipelines where large volumes of
data need to be queried quickly.
Many Airflow users already use ClickHouse as a destination or source in
their data pipelines, and a first-class provider would make that
integration simpler, more consistent, and easier to maintain.

The provider exposes a ClickHouseHook that extends DbApiHook and connects
through the clickhouse-connect HTTP client library, which makes it
compatible with SQLExecuteQueryOperator and the rest of the Airflow SQL
ecosystem out of the box.

System tests

The provider integrates with a live ClickHouse service. System tests are
included in the PR (tests/system/clickhouse/example_clickhouse.py) and
cover the full lifecycle: create table, insert, read rows, and drop table.
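
To make that lifecycle concrete, here is a minimal sketch of how the four
steps could be chained with SQLExecuteQueryOperator. It is not the content of
the system test in the PR. The connection id "clickhouse_default", the table
name, the DAG id, and the build_dag helper are all assumptions made up for
illustration.

```python
from datetime import datetime

# Hypothetical connection id; the real default name is defined by the provider.
CONN_ID = "clickhouse_default"

# One statement per lifecycle step, in execution order.
# The table name and columns are illustrative only.
LIFECYCLE_SQL = {
    "create_table": (
        "CREATE TABLE IF NOT EXISTS airflow_sketch "
        "(id UInt32, name String) ENGINE = MergeTree ORDER BY id"
    ),
    "insert_rows": "INSERT INTO airflow_sketch (id, name) VALUES (1, 'a'), (2, 'b')",
    "read_rows": "SELECT id, name FROM airflow_sketch ORDER BY id",
    "drop_table": "DROP TABLE IF EXISTS airflow_sketch",
}


def build_dag():
    """Build a DAG that runs the lifecycle steps one after another."""
    # Imported lazily so the statements above can be inspected without Airflow.
    from airflow import DAG
    from airflow.providers.common.sql.operators.sql import SQLExecuteQueryOperator

    with DAG(
        dag_id="clickhouse_lifecycle_sketch",
        start_date=datetime(2025, 1, 1),
        schedule=None,
        catchup=False,
    ) as dag:
        previous = None
        for task_id, sql in LIFECYCLE_SQL.items():
            task = SQLExecuteQueryOperator(task_id=task_id, conn_id=CONN_ID, sql=sql)
            if previous is not None:
                previous >> task  # run steps strictly in order
            previous = task
    return dag
```

In a DAG file, build_dag() would be called at module level so that the
scheduler can discover the DAG.
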

Proposed stewards

Bentsi Leviav, @BentsiLeviav
<https://github.com/BentsiLeviav>, [email protected]

committer(s)

@eladkal <https://github.com/eladkal>

Working implementation

#67080 <https://github.com/apache/airflow/pull/67080>

Incubation commitment

We commit to:

   - Maintaining the provider and responding to issues within a reasonable
     time
   - Meeting the incubation health metrics within 6 months
   - Participating in quarterly governance updates


Best regards,
Bentsi Leviav, EM for Connectors & Data Integrations at ClickHouse
