hemantkrsh opened a new pull request, #2891:
URL: https://github.com/apache/iggy/pull/2891

   ## Which issue does this PR close?
   Addresses the ClickHouse sink connector tracked in #2539, implementing end-to-end message delivery from iggy topics into ClickHouse tables, with support for JSON and RowBinary insert formats, authentication, TLS/mTLS, field mappings, metadata injection, batching, and retries.
   
   ## Rationale
   
   ClickHouse is one of the most widely used columnar databases for real-time 
analytics. Having a native sink connector allows iggy users to stream messages 
directly into ClickHouse tables without any intermediate tooling.
   
   ## What changed?
   
   - Messages consumed from iggy topics are now insertable into ClickHouse tables in two formats: JSON (JSONEachRow) for structured data with field-level access, and RowBinary for high-throughput payload storage
   - JSON inserts support field mappings — specific nested JSON paths can be 
projected to named ClickHouse columns (e.g. address.city → city)
   - Iggy message metadata (stream, topic, partition, offset, checksum, 
timestamps) can be optionally injected as iggy_-prefixed columns alongside the 
payload
   - Batches are chunked to a configurable max_batch_size and retried with 
exponential backoff on transient network/timeout errors; permanent errors abort 
immediately
   - The connector authenticates via username/password credentials, JWT token, 
or no auth; TLS server verification and mTLS client certificates are supported 
without requiring OS trust store changes
   - LZ4 compression is enabled by default to reduce network overhead on insert 
traffic
   - Five integration tests run against a live ClickHouse container 
(testcontainers) and validate end-to-end row insertion for each supported 
format and metadata combination
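   
   To make the options above concrete, a hypothetical sink configuration could look like the sketch below. Apart from `max_batch_size`, which is named in this PR, the key names (`format`, `include_metadata`, `field_mappings`, etc.) are illustrative assumptions and may not match the connector's actual configuration schema:
   
   ```toml
   # Illustrative only — key names are assumptions, not the connector's actual schema.
   [sink.clickhouse]
   url = "https://localhost:8123"
   database = "analytics"
   table = "events"
   format = "json"            # "json" (JSONEachRow) or "rowbinary"
   include_metadata = true    # inject iggy_-prefixed columns (stream, topic, offset, ...)
   max_batch_size = 1000      # chunk inserts into batches of this size
   compression = "lz4"        # enabled by default to reduce insert traffic
   
   [sink.clickhouse.auth]
   username = "default"       # alternatively a JWT token, or no auth
   password = "secret"
   
   [sink.clickhouse.field_mappings]
   "address.city" = "city"    # project a nested JSON path to a named column
   ```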
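   
   The retry behavior described above (exponential backoff on transient network/timeout errors, immediate abort on permanent errors) can be sketched in Rust roughly as follows; the type and function names here are illustrative, not the connector's actual API:
   
   ```rust
   use std::time::Duration;
   
   // Hypothetical error classification; names are illustrative, not the connector's API.
   #[derive(Debug, PartialEq)]
   enum SinkError {
       Transient, // network/timeout: retry with backoff
       Permanent, // e.g. schema mismatch: abort immediately
   }
   
   // Compute the delay before the nth retry: base * 2^attempt, capped at max_ms.
   fn backoff_delay(base_ms: u64, attempt: u32, max_ms: u64) -> Duration {
       let delay = base_ms.saturating_mul(1u64 << attempt.min(16));
       Duration::from_millis(delay.min(max_ms))
   }
   
   // Retry a fallible insert up to `max_retries` times on transient errors only.
   fn insert_with_retry<F>(mut insert: F, max_retries: u32) -> Result<(), SinkError>
   where
       F: FnMut() -> Result<(), SinkError>,
   {
       let mut attempt = 0;
       loop {
           match insert() {
               Ok(()) => return Ok(()),
               Err(SinkError::Permanent) => return Err(SinkError::Permanent),
               Err(SinkError::Transient) if attempt < max_retries => {
                   let _delay = backoff_delay(100, attempt, 5_000);
                   // The real connector would sleep for `_delay` here before retrying.
                   attempt += 1;
               }
               Err(e) => return Err(e), // transient, but retries exhausted
           }
       }
   }
   
   fn main() {
       // A flaky insert that fails twice with a transient error, then succeeds.
       let mut calls = 0;
       let result = insert_with_retry(
           || {
               calls += 1;
               if calls < 3 { Err(SinkError::Transient) } else { Ok(()) }
           },
           5,
       );
       assert_eq!(result, Ok(()));
       println!("succeeded after {calls} attempts");
   }
   ```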
   
   ## Local Execution
   All unit and integration tests passed locally.
   
   All quality checks were executed:
   ```
   cargo fmt --all
   cargo clippy --all-targets --all-features -- -D warnings
   cargo build
   cargo test
   cargo machete
   cargo sort --workspace
   ```
   Integration tests run against a live ClickHouse 25.3-alpine container via 
testcontainers:
   ```
   cargo test -p integration -- connectors::clickhouse::clickhouse_sink --nocapture
   ```
   
   All 5 integration test variants passed:
   
   - `test_json_sink` — JSON insert without metadata
   - `test_json_sink_with_metadata` — JSON insert with iggy_ metadata columns
   - `test_json_sink_with_field_mappings` — JSON insert with field-to-column mappings
   - `test_rowbinary_sink` — RowBinary insert without metadata
   - `test_rowbinary_sink_with_metadata` — RowBinary insert with iggy_ metadata columns
   
   <img width="1389" height="635" alt="Screenshot 2026-03-08 at 2 34 27 PM" src="https://github.com/user-attachments/assets/e586b51f-df58-4448-9aba-d4e9ea84c588" />
   <img width="1406" height="643" alt="Screenshot 2026-03-08 at 2 34 47 PM" src="https://github.com/user-attachments/assets/1cb509e7-2167-4b65-8422-4a6cfc39d694" />
   <img width="1394" height="452" alt="Screenshot 2026-03-08 at 2 35 00 PM" src="https://github.com/user-attachments/assets/becbe219-952f-412f-b739-cb8e56797d79" />
   
   ## AI Usage
   
   If AI tools were used, please answer:
   1. Which tools?
   Copilot (Claude Sonnet 4.6 / Opus 4.6)
   
   2. Scope of usage?
   Autocomplete and function generation, mainly in integration tests and documentation.
   
   3. How did you verify the generated code works correctly?
   All unit and integration tests pass, and the inserted data was verified to be present in ClickHouse (local Docker).
   
   4. Can you explain every line of the code if asked?
   Yes


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
