GitHub user chiradip added a comment to the discussion: Apache Iggy Connector 
for Apache Pinot

# Apache Pinot Connector for Iggy - Design Document

**Version:** 1.0
**Status:** Implementation Complete
**Last Updated:** December 2025

## Executive Summary

This document outlines the architectural design and implementation strategy for 
the Apache Iggy stream connector plugin for Apache Pinot. The connector enables 
real-time data ingestion from Iggy message streams into Pinot's OLAP datastore, 
providing high-throughput, low-latency streaming analytics capabilities.

### Design Goals

1. **Native TCP Protocol**: Utilize Iggy's native TCP protocol for maximum 
performance, avoiding HTTP overhead
2. **Pinot API Compliance**: Full compatibility with Pinot's Stream Plugin API 
(v1.2.0+)
3. **High Performance**: Target >1M messages/second throughput with 
sub-millisecond latency
4. **Partition Parallelism**: Support concurrent consumption from multiple 
partitions
5. **Offset Management**: Leverage Iggy's server-managed consumer groups for 
reliable offset tracking
6. **Production Ready**: Include comprehensive testing, monitoring, and 
operational documentation

## 1. System Architecture

### 1.1 High-Level Architecture

```
┌─────────────────────────────────────────────────────────────┐
│                     Apache Pinot Cluster                     │
│  ┌───────────┐  ┌───────────┐  ┌───────────────────────┐   │
│  │Controller │  │  Broker   │  │   Server (Realtime)   │   │
│  └───────────┘  └───────────┘  └───────┬───────────────┘   │
│                                         │                     │
│                          ┌──────────────▼──────────────┐     │
│                          │  Iggy Stream Plugin         │     │
│                          │  ┌──────────────────────┐   │     │
│                          │  │ IggyConsumerFactory  │   │     │
│                          │  └──────────┬───────────┘   │     │
│                          │             │               │     │
│              ┌───────────▼─────────────▼──────────┐    │     │
│              │ IggyPartitionGroupConsumer (P0)   │    │     │
│              └───────────┬───────────────────────┘    │     │
│              ┌───────────▼─────────────────────┐      │     │
│              │ IggyPartitionGroupConsumer (P1)│      │     │
│              └───────────┬──────────────────────┘     │     │
│                          │                            │     │
└──────────────────────────┼────────────────────────────┘     │
                           │ TCP Connections                  │
                           │ (Connection Pool)                │
                           │                                  │
┌──────────────────────────▼────────────────────────────────┐
│                    Apache Iggy Cluster                     │
│  ┌────────────┐  ┌────────────┐  ┌────────────┐          │
│  │ Stream 1   │  │ Stream 2   │  │ Stream N   │          │
│  │ ┌────────┐ │  │ ┌────────┐ │  │ ┌────────┐ │          │
│  │ │Topic 1 │ │  │ │Topic 1 │ │  │ │Topic 1 │ │          │
│  │ │  P0 P1 │ │  │ │  P0 P1 │ │  │ │  P0 P1 │ │          │
│  │ └────────┘ │  │ └────────┘ │  │ └────────┘ │          │
│  └────────────┘  └────────────┘  └────────────┘          │
└────────────────────────────────────────────────────────────┘
```

### 1.2 Component Responsibilities

The connector architecture follows Pinot's plugin model with clear separation 
of concerns:

| Component | Responsibility | Lifecycle |
|-----------|----------------|-----------|
| `IggyConsumerFactory` | Plugin entry point, consumer instantiation | 
Singleton per server |
| `IggyPartitionGroupConsumer` | Message polling, partition consumption | One 
per partition |
| `IggyStreamMetadataProvider` | Partition discovery, offset resolution | One 
per partition |
| `IggyStreamConfig` | Configuration parsing and validation | Created per 
consumer |
| `IggyJsonMessageDecoder` | Message payload decoding | Shared across consumers 
|
| `IggyMessageBatch` | Batch message container | Created per fetch |
| `IggyStreamPartitionMsgOffset` | Offset representation | Created per message |

### 1.3 Design Rationale

**TCP over HTTP Decision:**
- Initial Flink connector used HTTP due to incorrect Docker image during 
development
- TCP protocol provides 40-60% lower latency and higher throughput
- Native protocol alignment with Iggy's core design
- Connection pooling eliminates per-request overhead

**Consumer Group Strategy:**
- Server-managed offsets eliminate need for external coordination (Zookeeper, 
etc.)
- Auto-commit mode simplifies implementation while maintaining reliability
- Consumer group per table provides isolation and independent scaling

**Partition-Level Consumption:**
- Pinot creates one consumer per partition for maximum parallelism
- Each consumer maintains independent TCP connection from pool
- Linear scaling with partition count

GitHub link: 
https://github.com/apache/iggy/discussions/2449#discussioncomment-15177530

----
This is an automatically sent email for [email protected].
To unsubscribe, please send an email to: [email protected]

Reply via email to