GitHub user chiradip added a comment to the discussion: Apache Iggy Connector
for Apache Pinot
# Apache Pinot Connector for Iggy - Design Document
**Version:** 1.0
**Status:** Implementation Complete
**Last Updated:** December 2025
## Executive Summary
This document outlines the architectural design and implementation strategy for
the Apache Iggy stream connector plugin for Apache Pinot. The connector enables
real-time data ingestion from Iggy message streams into Pinot's OLAP datastore,
providing high-throughput, low-latency streaming analytics capabilities.
### Design Goals
1. **Native TCP Protocol**: Utilize Iggy's native TCP protocol for maximum
performance, avoiding HTTP overhead
2. **Pinot API Compliance**: Full compatibility with Pinot's Stream Plugin API
(v1.2.0+)
3. **High Performance**: Target >1M messages/second throughput with
sub-millisecond latency
4. **Partition Parallelism**: Support concurrent consumption from multiple
partitions
5. **Offset Management**: Leverage Iggy's server-managed consumer groups for
reliable offset tracking
6. **Production Ready**: Include comprehensive testing, monitoring, and
operational documentation
## 1. System Architecture
### 1.1 High-Level Architecture
```
┌─────────────────────────────────────────────────────────────┐
│ Apache Pinot Cluster │
│ ┌───────────┐ ┌───────────┐ ┌───────────────────────┐ │
│ │Controller │ │ Broker │ │ Server (Realtime) │ │
│ └───────────┘ └───────────┘ └───────┬───────────────┘ │
│ │ │
│ ┌──────────────▼──────────────┐ │
│ │ Iggy Stream Plugin │ │
│ │ ┌──────────────────────┐ │ │
│ │ │ IggyConsumerFactory │ │ │
│ │ └──────────┬───────────┘ │ │
│ │ │ │ │
│ ┌───────────▼─────────────▼──────────┐ │ │
│ │ IggyPartitionGroupConsumer (P0) │ │ │
│ └───────────┬───────────────────────┘ │ │
│ ┌───────────▼─────────────────────┐ │ │
│ │ IggyPartitionGroupConsumer (P1)│ │ │
│ └───────────┬──────────────────────┘ │ │
│ │ │ │
└──────────────────────────┼────────────────────────────┘ │
│ TCP Connections │
│ (Connection Pool) │
│ │
┌──────────────────────────▼────────────────────────────────┐
│ Apache Iggy Cluster │
│ ┌────────────┐ ┌────────────┐ ┌────────────┐ │
│ │ Stream 1 │ │ Stream 2 │ │ Stream N │ │
│ │ ┌────────┐ │ │ ┌────────┐ │ │ ┌────────┐ │ │
│ │ │Topic 1 │ │ │ │Topic 1 │ │ │ │Topic 1 │ │ │
│ │ │ P0 P1 │ │ │ │ P0 P1 │ │ │ │ P0 P1 │ │ │
│ │ └────────┘ │ │ └────────┘ │ │ └────────┘ │ │
│ └────────────┘ └────────────┘ └────────────┘ │
└────────────────────────────────────────────────────────────┘
```
### 1.2 Component Responsibilities
The connector architecture follows Pinot's plugin model with clear separation
of concerns:
| Component | Responsibility | Lifecycle |
|-----------|----------------|-----------|
| `IggyConsumerFactory` | Plugin entry point, consumer instantiation |
Singleton per server |
| `IggyPartitionGroupConsumer` | Message polling, partition consumption | One
per partition |
| `IggyStreamMetadataProvider` | Partition discovery, offset resolution | One
per partition |
| `IggyStreamConfig` | Configuration parsing and validation | Created per
consumer |
| `IggyJsonMessageDecoder` | Message payload decoding | Shared across consumers
|
| `IggyMessageBatch` | Batch message container | Created per fetch |
| `IggyStreamPartitionMsgOffset` | Offset representation | Created per message |
### 1.3 Design Rationale
**TCP over HTTP Decision:**
- Initial Flink connector used HTTP due to incorrect Docker image during
development
- TCP protocol provides 40-60% lower latency and higher throughput
- Native protocol alignment with Iggy's core design
- Connection pooling eliminates per-request overhead
**Consumer Group Strategy:**
- Server-managed offsets eliminate need for external coordination (Zookeeper,
etc.)
- Auto-commit mode simplifies implementation while maintaining reliability
- Consumer group per table provides isolation and independent scaling
**Partition-Level Consumption:**
- Pinot creates one consumer per partition for maximum parallelism
- Each consumer maintains independent TCP connection from pool
- Linear scaling with partition count
GitHub link:
https://github.com/apache/iggy/discussions/2449#discussioncomment-15177530
----
This is an automatically sent email for [email protected].
To unsubscribe, please send an email to: [email protected]