Hi Zihan,

Thank you for your detailed and insightful email. Your understanding of the current ThingsBoard IoTDB integration and the challenges of migrating to the 2.x Table Model is spot on. The work plan you outlined is clear and feasible, and your contribution experience across multiple Apache projects gives me confidence in your ability to succeed.
Here are my answers to your three specific questions.

1. Single generic table vs. per-device-profile tables

Recommendation: Start with a single generic table

* Preserves the flexibility of adding arbitrary telemetry keys at runtime without DDL.
* Simple mapping, consistent with the Cassandra/PostgreSQL integrations, and easy for users to understand.
* IoTDB's table model handles high-cardinality tags (like key) efficiently, and partitioning can further optimize performance.
* If bottlenecks arise later, we can consider a dedicated latest-value table or per-entity tables as optimizations.

2. Aggregation queries: direct SQL translation vs. adapter

Recommendation: Prioritize direct use of IoTDB's built-in SQL functions

* ThingsBoard's aggregation semantics (MIN/MAX/AVG/SUM/COUNT) align with SQL standards and can be mapped directly.
* Use time window functions like date_bin to reduce code maintenance.
* During testing, focus on interval boundaries and null handling; add a lightweight adapter only if discrepancies arise.

3. Retention and latest-value reads

Recommendation: Leverage IoTDB's native features first

* Table TTL directly meets retention requirements, avoiding re-implementation.
* Use ORDER BY time DESC LIMIT 1 for latest-value queries: simple and efficient.
* Implement natively and benchmark performance; if findAllLatest becomes a bottleneck, consider caching or a dedicated latest-value table later.

If you have any further questions, feel free to continue the discussion on the mailing list or chat channel.

Looking forward to seeing your formal proposal!

Best regards,
Xuan Wang

From: Zh D <[email protected]>
Date: Thursday, March 19, 2026 15:25
To: [email protected] <[email protected]>
Subject: [GSoC 2026] Interest in GSOC-304: ThingsBoard Integration with IoTDB 2.x Table Model

Hi all,

My name is Zihan Dai, a CS student at the University of Melbourne. I'm writing about GSOC-304.
I've been reading the public ThingsBoard persistence layer around TimeseriesDao, TimeseriesLatestDao, TimeseriesService, and TsKvEntry/BasicTsKvEntry, along with the IoTDB ThingsBoard docs for the current adapted build (DATABASE_TS_TYPE=iotdb, DATABASE_TS_LATEST_TYPE=iotdb, IoTDB_DATABASE=root.thingsboard). That setup is clearly tree-oriented: arbitrary telemetry keys fit naturally as path segments under root.thingsboard, while ThingsBoard expects storage and retrieval in terms of timestamped TsKvEntry values, latest lookups (findLatest, findAllLatest, saveLatest), range reads (findAllAsync over ReadTsKvQuery), and dashboard aggregations (NONE/MIN/MAX/AVG/SUM/COUNT). The wider model still includes devices, attributes, relations, and alarms, but the main storage pressure point is telemetry/latest.

The 2.x migration looks interesting because it is not just replacing insertRecord(deviceId, time, measurements, values) with another write call. Table mode uses ITableSession / ITableSessionPool (TableSessionBuilder, TableSessionPoolBuilder), CREATE TABLE ... TAG / ATTRIBUTE / FIELD, and session.insert(tablet) backed by insertRelationalTablet(), with SQL queries over built-in time and TAG columns. The main design problem is mapping ThingsBoard's dynamic telemetry keys and mixed value types (BOOLEAN, STRING, LONG, DOUBLE, JSON, with JSON likely serialized as STRING/TEXT) onto fixed table schemas. A single generic table keyed by tenant_id, entity_type, entity_id, and key keeps TsKvEntry mapping simple and avoids DDL churn, but it creates sparse typed columns and high-cardinality key tags. Per-device-profile or per-key tables improve locality and query shape, but they work against ThingsBoard's runtime key flexibility.

Here's how I'd phase the work over 12 weeks:

Weeks 1-3, Design & Prototype: Study the existing ThingsBoard IoTDB adapter code and the TimeseriesDao/TimeseriesLatestDao interfaces in depth. Build a minimal prototype connecting ITableSession to ThingsBoard's saveTsKvEntity() and findLatest() paths. Settle the schema design question (single generic table vs per-profile tables) with mentor input. Deliver a design doc and a working write+read PoC.

Weeks 4-7, Core DAO Implementation: Implement the full TimeseriesDao interface against Table Model: save(), saveLatest(), findAllAsync() over ReadTsKvQuery, findLatest()/findAllLatest(), and deleteTs(). Handle type mapping (BOOLEAN/STRING/LONG/DOUBLE/JSON -> Table Model column types). Add unit tests against an embedded or dockerized IoTDB 2.x instance.

Weeks 8-10, Aggregation & Retention: Implement dashboard aggregation queries (MIN/MAX/AVG/SUM/COUNT) using Table Model SQL. Implement retention management (TTL or explicit cleanup()/savePartition() equivalents). Integration tests with ThingsBoard's telemetry subscription and dashboard rendering paths.

Weeks 11-12, Polish & Migration: Write a migration guide for users upgrading from Tree Mode. Performance benchmarking (write throughput, query latency) against the existing Tree Mode adapter. Final code review, documentation, and blog post.

On my side, I have two open IoTDB PRs (#17180 and #17212) on logging and resource management. Across the broader Apache ecosystem and beyond, I have 5 merged PRs in Apache Beam (resource leak fixes in KafkaIO, serialization improvements, API changes), 2 merged in Apache ShardingSphere (resource leak and configuration fixes, both merged same-day by PMC), and 1 merged in OpenCV (#28502, documentation fix).

A few concrete questions:

- For telemetry, do you prefer a single generic table keyed by tenant/entity/key, or separate tables per device profile / schema domain?
- For ReadTsKvQuery aggregations used by dashboards, would you expect direct SQL translation using date_bin / date_bin_gapfill, or an adapter that first preserves current ThingsBoard aggregation semantics?
- For retention and latest-value reads, should the table-model backend rely on IoTDB table TTL and SQL latest queries, or keep explicit equivalents of ThingsBoard's savePartition() / cleanup() and a dedicated latest-value structure?

Thanks,
Zihan Dai
GitHub: https://github.com/PDGGK
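
To make the interval-boundary point from the aggregation discussion concrete, here is a minimal, dependency-free sketch of how two bucketing schemes can disagree. It rests on two assumptions that should be checked against the real code and the IoTDB docs: that date_bin floor-aligns each timestamp to fixed-width bins counted from an origin (taken here as epoch 0 unless specified), and that the existing ThingsBoard aggregation path anchors its buckets at the query's start timestamp. The class and method names are hypothetical and are not part of either project.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for comparing bucket boundaries; not ThingsBoard or IoTDB code.
class BucketAlignment {

    // Assumed date_bin-style semantics: floor-align ts to bins of width
    // intervalMs counted from originMs (works for timestamps before the origin too).
    static long dateBinStart(long ts, long intervalMs, long originMs) {
        return originMs + Math.floorDiv(ts - originMs, intervalMs) * intervalMs;
    }

    // Assumed query-anchored semantics: bucket starts covering [startTs, endTs),
    // with the first bucket beginning exactly at startTs.
    static List<Long> queryAnchoredStarts(long startTs, long endTs, long intervalMs) {
        List<Long> starts = new ArrayList<>();
        for (long s = startTs; s < endTs; s += intervalMs) {
            starts.add(s);
        }
        return starts;
    }

    public static void main(String[] args) {
        long intervalMs = 60_000L;   // 1-minute dashboard interval
        long startTs = 5_000L;       // query start that is not minute-aligned
        long endTs = 185_000L;
        long sampleTs = 125_000L;    // one telemetry timestamp inside the range

        // Bucket the sample falls into when bins are anchored at epoch 0.
        System.out.println("epoch-anchored bin start: "
                + dateBinStart(sampleTs, intervalMs, 0L));

        // Bucket the sample falls into when the query start is used as origin.
        System.out.println("query-anchored bin start: "
                + dateBinStart(sampleTs, intervalMs, startTs));

        // All bucket starts a query-anchored scheme would report for the range.
        System.out.println("query-anchored buckets: "
                + queryAnchoredStarts(startTs, endTs, intervalMs));
    }
}
```

If the two schemes do diverge for a given dashboard query, supplying the query start as the origin would be one way to line them up, provided the date_bin variant in the IoTDB version under test accepts an origin argument. A comparison like this could serve as the first integration test before deciding whether an adapter is needed.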
