GitHub user Xiao-zhen-Liu added a comment to the discussion: Task ideas for the 
dkNet-AI · Apache Texera Agent Hackathon

# Streaming Texera: Real-Time Workflows on Amber

## Problem
Texera's Amber engine is micro-batch and bounded-input today: every workflow 
assumes its sources finish. Live data — sockets, WebSockets, sensor feeds, chat 
streams, agent event logs — cannot be expressed as a Texera workflow without 
polling hacks.

## Idea
Add a first-class **streaming mode** to Texera workflows. Sources can produce 
data forever; downstream operators react to it in real time; the workflow runs 
indefinitely until the user stops it.

## What users get
- **Live source operators** — connect a workflow to a TCP socket or WebSocket 
feed and watch results update as events arrive.
- **Event-time windows** — tumbling, sliding, and session windows for 
aggregating streams (counts, sums, sessionization).
- **Watermarks** — principled handling of out-of-order events, so windows fire 
correctly under network jitter.
- **Continuous workflows** — a new "running indefinitely" workflow state with 
explicit Stop & Flush vs. Kill controls.
- **Python streaming UDFs** — users yield tuples (and watermarks) forever from 
Python, same SDK ergonomics as today's UDFs.
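To make the interaction between event-time windows and watermarks concrete, here is a minimal pure-Python sketch. It is not Texera's or Amber's API: `Event`, `Watermark`, `with_watermarks`, and `tumbling_sum` are hypothetical names, and the bounded-delay watermark strategy is just one common choice, assumed here for illustration. A window fires only once a watermark passes its end, and events that arrive after their window has fired are dropped.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, Tuple, Union


@dataclass(frozen=True)
class Event:  # hypothetical tuple type, not a Texera class
    timestamp: float  # event time, in seconds
    key: str
    value: float


@dataclass(frozen=True)
class Watermark:  # hypothetical marker: no earlier-timestamped event is expected
    timestamp: float


StreamItem = Union[Event, Watermark]


def with_watermarks(events: Iterable[Event], max_delay: float) -> Iterator[StreamItem]:
    """Emit a watermark trailing the largest event time seen so far by max_delay."""
    max_seen = float("-inf")
    for ev in events:
        yield ev
        if ev.timestamp > max_seen:
            max_seen = ev.timestamp
            yield Watermark(max_seen - max_delay)


def tumbling_sum(stream: Iterable[StreamItem], size: float) -> Iterator[Tuple[float, str, float]]:
    """Yield (window_start, key, sum) once a watermark reaches the window's end."""
    open_windows: Dict[float, Dict[str, float]] = defaultdict(lambda: defaultdict(float))
    current_wm = float("-inf")
    for item in stream:
        if isinstance(item, Event):
            start = (item.timestamp // size) * size
            if start + size <= current_wm:
                continue  # the window already fired, so this late event is dropped
            open_windows[start][item.key] += item.value
        else:
            current_wm = max(current_wm, item.timestamp)
            ready = sorted(s for s in open_windows if s + size <= current_wm)
            for start in ready:
                for key, total in sorted(open_windows.pop(start).items()):
                    yield (start, key, total)
```

Because both functions are generators, the same code works on an unbounded source: results appear as watermarks advance rather than when the input ends. A window that no watermark has passed yet stays open, which is the state a "Stop & Flush" control would need to drain.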

## Why it matters
- Unlocks live dashboards, alerting pipelines, and real-time AI-agent 
observability — directly aligned with the Agent Hackathon theme.
- Keeps Texera's visual-workflow model intact; no new UI paradigm to learn.
- Bounded workflows are unchanged — streaming is opt-in.

## Demo
A WebSocket feed of agent tool-call events → tumbling 1-minute window → live 
"tool usage by agent" chart in the Texera UI, updating as events arrive.
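The aggregation behind this demo could look roughly like the sketch below. It is an illustration under stated assumptions, not the proposed implementation: the JSON field names (`ts`, `agent`, `tool`), the function `tool_usage_by_agent`, and the `flush_on_stop` flag are all hypothetical, a list of strings stands in for the WebSocket feed, and events are assumed to arrive in timestamp order (out-of-order input would need watermarks).

```python
import json
from collections import Counter
from typing import Iterable, Iterator, Optional, Tuple

WINDOW_SECONDS = 60  # tumbling 1-minute window

Row = Tuple[int, str, str, int]  # (window_start, agent, tool, count)


def tool_usage_by_agent(lines: Iterable[str], flush_on_stop: bool = True) -> Iterator[Row]:
    """Count tool calls per (agent, tool) in each 1-minute window.

    Assumes in-order timestamps. A window's rows are emitted when the first
    event of a later window arrives; the last, still-open window is emitted
    only if flush_on_stop is True (mimicking "Stop & Flush" rather than "Kill").
    """
    current_start: Optional[int] = None
    counts: Counter = Counter()
    for line in lines:
        event = json.loads(line)
        start = int(event["ts"]) // WINDOW_SECONDS * WINDOW_SECONDS
        if current_start is None:
            current_start = start
        elif start > current_start:
            # The minute rolled over: emit the finished window and reset.
            for (agent, tool), n in sorted(counts.items()):
                yield (current_start, agent, tool, n)
            counts.clear()
            current_start = start
        counts[(event["agent"], event["tool"])] += 1
    if flush_on_stop and current_start is not None:
        for (agent, tool), n in sorted(counts.items()):
            yield (current_start, agent, tool, n)


# Hypothetical sample feed standing in for live WebSocket messages.
feed = [
    '{"ts": 5, "agent": "planner", "tool": "search"}',
    '{"ts": 20, "agent": "planner", "tool": "search"}',
    '{"ts": 45, "agent": "coder", "tool": "run_tests"}',
    '{"ts": 61, "agent": "coder", "tool": "run_tests"}',
]

for row in tool_usage_by_agent(feed):
    print(row)
```

Each emitted row maps directly onto one bar of a "tool usage by agent" chart for that minute. Passing `flush_on_stop=False` discards the partially filled final window, which is the difference between the two stop controls as sketched here.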

GitHub link: 
https://github.com/apache/texera/discussions/5059#discussioncomment-16924319
