Hi all,

My name is Zihan Dai, a CS student at the University of Melbourne. I'm writing about GSOC-304.

I've been reading the public ThingsBoard persistence layer around TimeseriesDao, TimeseriesLatestDao, TimeseriesService, and TsKvEntry/BasicTsKvEntry, along with the IoTDB ThingsBoard docs for the current adapted build (DATABASE_TS_TYPE=iotdb, DATABASE_TS_LATEST_TYPE=iotdb, IoTDB_DATABASE=root.thingsboard). That setup is clearly tree-oriented: arbitrary telemetry keys fit naturally as path segments under root.thingsboard, while ThingsBoard expects storage and retrieval in terms of timestamped TsKvEntry values, latest lookups (findLatest, findAllLatest, saveLatest), range reads (findAllAsync over ReadTsKvQuery), and dashboard aggregations (NONE/MIN/MAX/AVG/SUM/COUNT). The wider model still includes devices, attributes, relations, and alarms, but the main storage pressure point is telemetry/latest.

The 2.x migration looks interesting because it is not just replacing insertRecord(deviceId, time, measurements, values) with another write call. Table mode uses ITableSession / ITableSessionPool (TableSessionBuilder, TableSessionPoolBuilder), CREATE TABLE ... TAG / ATTRIBUTE / FIELD, and session.insert(tablet) backed by insertRelationalTablet(), with SQL queries over built-in time and TAG columns.

The main design problem is mapping ThingsBoard's dynamic telemetry keys and mixed value types (BOOLEAN, STRING, LONG, DOUBLE, JSON, with JSON likely serialized as STRING/TEXT) onto fixed table schemas. A single generic table keyed by tenant_id, entity_type, entity_id, and key keeps TsKvEntry mapping simple and avoids DDL churn, but it creates sparse typed columns and high-cardinality key tags. Per-device-profile or per-key tables improve locality and query shape, but they work against ThingsBoard's runtime key flexibility.

Here's how I'd phase the work over 12 weeks:

Weeks 1-3, Design & Prototype: Study the existing ThingsBoard IoTDB adapter code and the TimeseriesDao/TimeseriesLatestDao interfaces in depth.
Build a minimal prototype connecting ITableSession to ThingsBoard's saveTsKvEntity() and findLatest() paths. Settle the schema design question (single generic table vs per-profile tables) with mentor input. Deliver a design doc and a working write+read PoC.

Weeks 4-7, Core DAO Implementation: Implement the full TimeseriesDao and TimeseriesLatestDao interfaces against the Table Model: save(), saveLatest(), findAllAsync() over ReadTsKvQuery, findLatest()/findAllLatest(), and deleteTs(). Handle type mapping (BOOLEAN/STRING/LONG/DOUBLE/JSON -> Table Model column types). Add unit tests against an embedded or dockerized IoTDB 2.x instance.

Weeks 8-10, Aggregation & Retention: Implement dashboard aggregation queries (MIN/MAX/AVG/SUM/COUNT) using Table Model SQL. Implement retention management (TTL or explicit cleanup()/savePartition() equivalents). Add integration tests covering ThingsBoard's telemetry subscription and dashboard rendering paths.

Weeks 11-12, Polish & Migration: Write a migration guide for users upgrading from Tree Mode. Benchmark write throughput and query latency against the existing Tree Mode adapter. Finish with a final code review, documentation, and a blog post.

On my side, I have two open IoTDB PRs (#17180 and #17212) on logging and resource management. Across the broader Apache ecosystem and beyond, I have 5 merged PRs in Apache Beam (resource leak fixes in KafkaIO, serialization improvements, API changes), 2 merged in Apache ShardingSphere (resource leak and configuration fixes, both merged the same day by the PMC), and 1 merged in OpenCV (#28502, documentation fix).

A few concrete questions:

- For telemetry, do you prefer a single generic table keyed by tenant/entity/key, or separate tables per device profile / schema domain?
- For ReadTsKvQuery aggregations used by dashboards, would you expect direct SQL translation using date_bin / date_bin_gapfill, or an adapter that first preserves current ThingsBoard aggregation semantics?
- For retention and latest-value reads, should the table-model backend rely on IoTDB table TTL and SQL latest queries, or keep explicit equivalents of ThingsBoard's savePartition() / cleanup() and a dedicated latest-value structure?

Thanks,
Zihan Dai
GitHub: https://github.com/PDGGK
