This is an automated email from the ASF dual-hosted git repository.
piotr pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/iggy-website.git
The following commit(s) were added to refs/heads/main by this push:
new 108828171 add connectors runtime blog post
108828171 is described below
commit 1088281716196a8230c2500c588b6c31cd497c70
Author: spetz <[email protected]>
AuthorDate: Fri Jun 6 08:40:48 2025 +0200
add connectors runtime blog post
---
...9-one-year-of-building-the-message-streaming.md | 2 +-
blog/2025-06-06-connectors-runtime.md | 113 +++++++++++++++++++++
docs/connectors/sink.md | 19 ++--
docs/connectors/source.md | 4 +-
4 files changed, 128 insertions(+), 10 deletions(-)
diff --git a/blog/2024-05-29-one-year-of-building-the-message-streaming.md
b/blog/2024-05-29-one-year-of-building-the-message-streaming.md
index 6e04edaa2..7c680b350 100644
--- a/blog/2024-05-29-one-year-of-building-the-message-streaming.md
+++ b/blog/2024-05-29-one-year-of-building-the-message-streaming.md
@@ -57,7 +57,7 @@ And at the same time, we've been experimenting a lot with
some fancy stuff, whic
## Tooling
-The core message streaming server and multiple SDKs might sound as the most
important parts of the whole ecosystem, but let's not forget about the
**management tools**. How to quickly connect to the the server, create new
topics, validate if the messages are being sent correctly, change the user
permissions or check the node statistics?
+The core message streaming server and multiple SDKs might sound like the most important parts of the whole ecosystem, but let's not forget about the **management tools**. How do you quickly connect to the server, create new topics, validate whether messages are being sent correctly, change user permissions, or check the node statistics?
This is where our **[CLI](https://github.com/apache/iggy/tree/master/core/cli) and [Web UI](https://github.com/iggy-rs/iggy-web-ui) come in handy**. If you're a fan of working with the terminal and appreciate a great developer experience, you'll find our CLI a joy to work with.
diff --git a/blog/2025-06-06-connectors-runtime.md
b/blog/2025-06-06-connectors-runtime.md
new file mode 100644
index 000000000..f6e777d2b
--- /dev/null
+++ b/blog/2025-06-06-connectors-runtime.md
@@ -0,0 +1,113 @@
+---
+title: Connectors runtime
+authors:
+ - name: Piotr Gankiewicz
+ title: Apache Iggy founder
+ url: https://github.com/spetz
+ image_url: https://github.com/spetz.png
+tags: []
+hide_table_of_contents: false
+date: 2025-06-06
+---
+## Extending Apache Iggy's capabilities
+
+In the world of message streaming, connectors quite often play a crucial role
in facilitating data exchange between different systems. While Apache Iggy
remains the core messaging infrastructure, **focusing on extreme efficiency**
(high throughput, low latency, and minimal resource consumption), it could also
benefit from a more extensible architecture. This is where connectors come into
play.
+
+<!--truncate-->
+
+## What are connectors?
+
+If you've ever used e.g. Apache Kafka or one of the Kafka-compatible solutions, such as Redpanda, you might've already encountered the concept of connectors. For example, there's a dedicated [Connect API](https://kafka.apache.org/documentation/#connectapi) that allows you to create custom connectors (plugins) for various data sources.
+
+Typically, connectors are designed to handle data ingestion and transformation
tasks. They can be used to read data from external sources, transform it, and
then write it to the message streaming system (**sources**), or the other way
around (**sinks**) - fetch data from the message streaming system and write it
to external services (e.g. databases, file systems, etc.).
+
+Let's say that we would like to get real-time changes from a database (Postgres CDC, for example) and send them to Apache Iggy, while performing some optional data transformation, such as filtering or enriching the data. Or, we might want to fetch data from Apache Iggy (e.g. produced by our custom services), transform it, and then push it further to an external indexer such as Elastic or Quickwit.
+
+**Instead of building all these pipelines from scratch as custom applications,
we can leverage the power of connectors to simplify the process**. Simply
download one of the existing plugins, configure it using the provided
configuration files, and start using it - no need to write any code!
+
+Here's an example of a configuration file for the sink connector named
`stdout`:
+
+```toml
+# Required configuration for a sink connector
+[sinks.stdout]
+enabled = true
+name = "Stdout sink"
+path = "connectors/libiggy_connector_stdout_sink"
+
+# Collection of the streams from which messages are consumed
+[[sinks.stdout.streams]]
+stream = "example_stream"
+topics = ["example_topic"]
+schema = "json"
+batch_size = 1000
+poll_interval = "5ms"
+consumer_group = "stdout_sink_connector"
+
+# Custom configuration for the sink connector, deserialized to type T from the `config` field
+[sinks.stdout.config]
+print_payload = true
+
+# Optional data transformation(s) to be applied after consuming messages from the stream
+[sinks.stdout.transforms.add_fields]
+enabled = true
+
+# Collection of the field transforms to be applied after consuming messages from the stream
+[[sinks.stdout.transforms.add_fields.fields]]
+key = "message"
+value.static = "hello"
+```
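+
+To give a rough idea of how such a section could be consumed, here's a sketch (not the actual runtime code) of the custom `[sinks.stdout.config]` section mapping to a plain struct deserialized with `serde` - the `StdoutSinkConfig` name and the direct use of the `toml` crate are hypothetical:
+
+```rust
+use serde::Deserialize;
+
+// Hypothetical shape of the type T that the custom `config` section
+// above would deserialize into.
+#[derive(Debug, Deserialize)]
+struct StdoutSinkConfig {
+    print_payload: bool,
+}
+
+fn main() {
+    // The runtime would hand the raw `config` table to the plugin;
+    // here we just parse the equivalent TOML snippet directly.
+    let config: StdoutSinkConfig =
+        toml::from_str("print_payload = true").expect("valid config");
+    assert!(config.print_payload);
+}
+```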
+
+## Rust-based plugins
+
+Since Apache Iggy is implemented in Rust, it was an easy choice to implement the connectors in Rust as well. **This allows us to take advantage of Rust's powerful type system and memory safety features**, ensuring that the connectors are reliable and efficient. Internally, we use the **[dlopen2](https://github.com/OpenByteDev/dlopen2)** library to load the plugins during the runtime initialization - feel free to check how **[Arroyo](https://www.arroyo.dev/blog/rust-plugin-systems/)** use [...]
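+
+For illustration, here's a minimal sketch of the general dlopen2 loading pattern - the `ConnectorApi` struct and its exported `open` symbol are hypothetical, not the actual symbols exported by Iggy connectors:
+
+```rust
+use dlopen2::wrapper::{Container, WrapperApi};
+
+// Hypothetical plugin API; real connectors export different symbols.
+#[derive(WrapperApi)]
+struct ConnectorApi {
+    open: extern "C" fn() -> i32,
+}
+
+fn main() {
+    // Load the shared library at runtime; the path corresponds to the
+    // `path` field from the runtime configuration.
+    let plugin: Container<ConnectorApi> =
+        unsafe { Container::load("connectors/libiggy_connector_stdout_sink") }
+            .expect("failed to load connector plugin");
+    println!("plugin open() returned {}", plugin.open());
+}
+```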
+
+Thanks to this approach, just like Iggy itself, the connector runtime (which is a separate process) is very lightweight and easy to deploy. **The runtime uses just a few MBs of memory on its own, while consuming minimal CPU resources.** Behind the scenes, we use the shared **[Tokio runtime](https://tokio.rs)** to manage the asynchronous tasks and events across all connectors, as well as the **[tracing](https://docs.rs/tracing/latest/tracing/)** crate for logging and tracing purposes.
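+
+As a simplified, hypothetical sketch of that pattern (not the actual runtime internals), a shared Tokio runtime driving per-connector tasks could look like this:
+
+```rust
+use tracing::info;
+
+#[tokio::main]
+async fn main() {
+    // Logging/tracing via the `tracing` ecosystem.
+    tracing_subscriber::fmt::init();
+
+    // One shared Tokio runtime drives all connector tasks; here we just
+    // spawn a placeholder task per hypothetical connector name.
+    let handles: Vec<_> = ["stdout", "quickwit"]
+        .into_iter()
+        .map(|name| {
+            tokio::spawn(async move {
+                info!("connector task started: {name}");
+            })
+        })
+        .collect();
+
+    for handle in handles {
+        handle.await.expect("connector task failed");
+    }
+}
+```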
+
+When running a simple benchmark based on the custom **[Quickwit sink](https://github.com/apache/iggy/tree/master/core/connectors/sinks/quickwit_sink)** connector, which pulls data in real time from an Iggy stream, performs a basic data transformation (using the JSON payload format), and then pushes the data further to the Quickwit HTTP API, we observed that this plugin **can easily handle hundreds of thousands of messages per second, while using ~40MB of memory**. And keep in mind, that it' [...]
+
+Last but not least, it's quite easy to create your own connectors - simply implement either the `Sink` or `Source` trait, compile the library, and configure it in the runtime. Please check the **[repository](https://github.com/apache/iggy/tree/master/core/connectors/)** or the **[documentation](/docs/connectors/introduction)** for more details.
+
+```rust
+#[async_trait]
+pub trait Sink: Send + Sync {
+    /// Invoked when the sink is initialized, allowing it to perform any necessary setup.
+    async fn open(&mut self) -> Result<(), Error>;
+
+    /// Invoked every time a batch of messages is received from the configured stream(s) and topic(s).
+    async fn consume(
+        &self,
+        topic_metadata: &TopicMetadata,
+        messages_metadata: MessagesMetadata,
+        messages: Vec<ConsumedMessage>,
+    ) -> Result<(), Error>;
+
+    /// Invoked when the sink is closed, allowing it to perform any necessary cleanup.
+    async fn close(&mut self) -> Result<(), Error>;
+}
+```
+
+```rust
+#[async_trait]
+pub trait Source: Send + Sync {
+    /// Invoked when the source is initialized, allowing it to perform any necessary setup.
+    async fn open(&mut self) -> Result<(), Error>;
+
+    /// Invoked every time a batch of messages is produced to the configured stream and topic.
+    async fn poll(&self) -> Result<ProducedMessages, Error>;
+
+    /// Invoked when the source is closed, allowing it to perform any necessary cleanup.
+    async fn close(&mut self) -> Result<(), Error>;
+}
+```
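+
+As an example, a minimal custom sink could look like the sketch below - the `LoggingSink` type is hypothetical, and the surrounding types (`TopicMetadata`, `MessagesMetadata`, `ConsumedMessage`, `Error`) are assumed to come from the connector SDK, as in the trait definition above:
+
+```rust
+use async_trait::async_trait;
+
+// Hypothetical sink that only logs how many messages it received.
+pub struct LoggingSink;
+
+#[async_trait]
+impl Sink for LoggingSink {
+    async fn open(&mut self) -> Result<(), Error> {
+        println!("Logging sink opened");
+        Ok(())
+    }
+
+    async fn consume(
+        &self,
+        _topic_metadata: &TopicMetadata,
+        _messages_metadata: MessagesMetadata,
+        messages: Vec<ConsumedMessage>,
+    ) -> Result<(), Error> {
+        println!("Consumed {} message(s)", messages.len());
+        Ok(())
+    }
+
+    async fn close(&mut self) -> Result<(), Error> {
+        println!("Logging sink closed");
+        Ok(())
+    }
+}
+
+// Expose the FFI entry point so the runtime can load the plugin
+// (the macro is described in the connectors documentation).
+sink_connector!(LoggingSink);
+```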
+
+## What's next?
+
+Since this is a very early release meant to showcase what's possible with the connector runtime, we plan to focus on improving the performance and stability of the runtime itself.
+
+Moreover, we plan to build more connectors, data transformations, and schema encoders/decoders to support seamless transitions between different data formats and protocols.
+
+It's also worth mentioning that the runtime uses one of the existing network protocols available via the Rust SDK to connect to the Iggy server (the so-called distributed mode). And while this might be the case for most of the deployments out there, we might also support e.g. UDS or some other form of IPC to allow deploying connectors on the same machine, next to the streaming server.
+
+Finally, we would love to hear your feedback and suggestions on how to improve
the runtime and connectors. Please feel free to open an
**[issue](https://github.com/apache/iggy/issues)**, **[pull
request](https://github.com/apache/iggy/pulls)** or start a
**[discussion](https://github.com/apache/iggy/discussions)**.
+
+As always, you are more than welcome to join our **[Discord
community](https://discord.gg/C5Sux5NcRa)**!
diff --git a/docs/connectors/sink.md b/docs/connectors/sink.md
index f409ca652..7b146d0dd 100644
--- a/docs/connectors/sink.md
+++ b/docs/connectors/sink.md
@@ -9,18 +9,23 @@ sidebar_position: 4
Sink connectors are responsible for writing data from Iggy streams to external
systems or destinations. They provide a way to integrate Apache Iggy with
various data sources and destinations, enabling seamless data flow and
processing.
-The sink is represented by the single `Sink` trait, which defines the basic
interface for all source connectors. It provides methods for initializing the
sink, writing data to external destonation, and closing the sink.
+The sink is represented by the single `Sink` trait, which defines the basic
interface for all sink connectors. It provides methods for initializing the
sink, writing data to external destination, and closing the sink.
```rust
#[async_trait]
-pub trait Source: Send + Sync {
-    /// Invoked when the source is initialized, allowing it to perform any necessary setup.
+pub trait Sink: Send + Sync {
+    /// Invoked when the sink is initialized, allowing it to perform any necessary setup.
     async fn open(&mut self) -> Result<(), Error>;
-    /// Invoked every time a batch of messages is produced to the configured stream and topic.
-    async fn poll(&self) -> Result<ProducedMessages, Error>;
+    /// Invoked every time a batch of messages is received from the configured stream(s) and topic(s).
+    async fn consume(
+        &self,
+        topic_metadata: &TopicMetadata,
+        messages_metadata: MessagesMetadata,
+        messages: Vec<ConsumedMessage>,
+    ) -> Result<(), Error>;
-    /// Invoked when the source is closed, allowing it to perform any necessary cleanup.
+    /// Invoked when the sink is closed, allowing it to perform any necessary cleanup.
     async fn close(&mut self) -> Result<(), Error>;
}
```
@@ -120,7 +125,7 @@ impl StdoutSink {
}
```
-And we can invoke the expected macro to expose the FFI interface and allow the
connector runtime to register the sink within the runtime.
+We can invoke the expected macro to expose the FFI interface and allow the
connector runtime to register the sink within the runtime.
```rust
sink_connector!(StdoutSink);
diff --git a/docs/connectors/source.md b/docs/connectors/source.md
index 9c5818f20..dcc89e596 100644
--- a/docs/connectors/source.md
+++ b/docs/connectors/source.md
@@ -122,7 +122,7 @@ impl RandomSource {
}
```
-And we can invoke the expected macro to expose the FFI interface and allow the
connector runtime to register the source within the runtime.
+We can invoke the expected macro to expose the FFI interface and allow the connector runtime to register the source within the runtime.
```rust
source_connector!(TestSource);
@@ -211,7 +211,7 @@ As you can see, the `ProducedMessage` can be customized to
fit your needs, as al
It's also important to note that the supported format(s) might vary depending on the connector implementation. For example, you might use `JSON` as the payload format, which can then be easily parsed and processed by downstream components such as data transforms, but at the same time, you could support other formats and let the user decide which one to use.
-While the final schema of messages (that will be appended to the Iggy stream),
can be controlled with the the built-in configuration (the particular
`StreamEncoder` will be used), keep in mind, that it might be sometimes
difficult/impossible e.g. to transform one format to another e.g. JSON to SBE
or so, and in such a case, the produced messages will be ignored.
+While the final schema of messages (that will be appended to the Iggy stream) can be controlled with the built-in configuration (the particular `StreamEncoder` will be used), keep in mind that it might sometimes be difficult or impossible to transform one format to another (e.g. JSON to SBE), and in such a case, the produced messages will be ignored.
Finally, compile the source code and update the runtime configuration file using the example config above (`config.toml` by default, unless you prefer the `yaml` or `json` format instead - just make sure that `path` points to the existing plugin).