Hi VGalaxies,

Thank you so much for introducing the new data subscription client feature to the Apache IoTDB community. Your detailed explanation gives a clear picture of how this feature enhances IoTDB's capabilities, especially in terms of real-time data processing and system integration efficiency.
Looking forward to more updates and to engaging with this new feature! :)

Best regards,
Sho

On April 8, 2024, at 11:06, VGalaxies <vgalax...@apache.org> wrote:

Hello everyone,

I am VGalaxies, a new contributor to Apache IoTDB. I am excited to share with you a new feature that I have been working on for the past few months.

The data subscription client is a new way to access data within IoTDB, distinct from the traditional method of querying data using SQL-like syntax. In scenarios that require real-time data, quick responses to data changes, or highly event-driven systems, data subscription has clear advantages over data querying. For example, in the following two scenarios:

1. Replace extensive polling queries for large amounts of data: This avoids significant impact on the performance of existing systems when queries are frequent or when there are many data points. It also avoids the problem of determining the query scope and ensures that downstream systems receive accurate and complete data.

2. Facilitate downstream system integration: It becomes easier to integrate with components such as Flink, Spark, Kafka/DataX, Camel/MySQL, etc. There is no need to customize IoTDB's change data capture logic for each big data component separately, which simplifies the design of integration components and makes them easier for users to adopt.

The IoTDB subscription client borrows some concepts from message queue products such as Kafka. It is built on three core concepts: Topic, Consumer, and Consumer Group.

- Topic is a logical concept used by the IoTDB subscription client to classify data, serving as a channel for data publication. Producers publish data to specific topics, while consumers subscribe to these topics to receive related data. In the IoTDB subscription client, topics describe the sequence characteristics, time characteristics, presentation forms, and optional custom processing logic of subscribed data.
- Consumer is an application or service in the IoTDB subscription client responsible for receiving and processing data published to specific topics. Consumers retrieve data from the queue and process it accordingly. The IoTDB subscription client provides two types of consumers: pull consumers and push consumers.

- Consumer Group is a collection of consumers. When different consumers in the same consumer group subscribe to the same topic, they share the processing progress of the data under this topic. Each piece of data under this topic can be processed by only one consumer within the group, ensuring that data is not processed repeatedly.

Based on these concepts, the IoTDB subscription client provides a series of SDKs for creating topics, creating consumers, subscribing to topics, consuming data, committing consumption progress, and obtaining subscription relationships. Here's a comprehensive example[1] demonstrating how to use the subscription client Java SDK to consume data from IoTDB.

Technically, the data subscription client will rely on IoTDB's existing stream processing framework (Pipe). Each subscription corresponds to a user-invisible pipe task. Subscription relationships and other metadata are persisted through the config node.

Basic functionality has been developed on the master branch[2], and further iterations will continue to improve it.

I hope you are interested in this feature and would like to participate in its development and testing. You can also leave your comments and suggestions in this thread. Any suggestions, feedback, and contributions are appreciated.

Thank you for your attention and support.

Best regards,
VGalaxies

References:
1. https://github.com/apache/iotdb/blob/master/example/session/src/main/java/org/apache/iotdb/SubscriptionSessionExample.java
2. https://github.com/apache/iotdb/tree/master
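
To make the consumer group semantics described above concrete, here is a minimal in-memory sketch in plain Java. It is not the IoTDB subscription SDK: the class ConsumerGroupSketch, the nested Topic class, and the methods publish, poll, and drainWithTwoConsumers are all hypothetical names chosen for illustration. It also simplifies by advancing the group's progress as soon as a message is polled, rather than through a separate commit step. For the real API, see the SubscriptionSessionExample linked in reference [1].

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** In-memory model of topic and consumer-group semantics; NOT the IoTDB SDK. */
public class ConsumerGroupSketch {

    /** A topic modeled as an append-only list of messages (hypothetical). */
    static final class Topic {
        private final List<String> messages = new ArrayList<>();
        // Shared consumption progress: one offset per consumer group.
        private final Map<String, Integer> groupOffsets = new HashMap<>();

        synchronized void publish(String message) {
            messages.add(message);
        }

        /**
         * Pull-style poll: returns the next unconsumed message for the group,
         * or null when the group has caught up. Advancing the shared offset
         * keeps two consumers of one group from receiving the same message.
         */
        synchronized String poll(String groupId) {
            int offset = groupOffsets.getOrDefault(groupId, 0);
            if (offset >= messages.size()) {
                return null;
            }
            groupOffsets.put(groupId, offset + 1);
            return messages.get(offset);
        }
    }

    /** Drains the topic by alternating between two consumers of one group. */
    static List<List<String>> drainWithTwoConsumers(Topic topic, String groupId) {
        List<String> consumerA = new ArrayList<>();
        List<String> consumerB = new ArrayList<>();
        boolean turnA = true;
        String message;
        while ((message = topic.poll(groupId)) != null) {
            (turnA ? consumerA : consumerB).add(message);
            turnA = !turnA;
        }
        return List.of(consumerA, consumerB);
    }

    public static void main(String[] args) {
        Topic topic = new Topic();
        for (int i = 1; i <= 4; i++) {
            topic.publish("point-" + i);
        }
        // Within one group, each message goes to exactly one consumer.
        System.out.println(drainWithTwoConsumers(topic, "group-1"));
        // A second group tracks its own progress and sees all messages again.
        System.out.println(drainWithTwoConsumers(topic, "group-2"));
    }
}
```

Running main prints [[point-1, point-3], [point-2, point-4]] twice: the two consumers of a group split the four messages between them without overlap, and the second group receives the full set again because progress is tracked per group.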