Hi Zihan,
Thanks for getting this out before community bonding wraps. The 16-page doc, the cardinality math, and the per-DAO Spring activation matrix in §6.0 are exactly the level of detail I was hoping for at this stage. Below is my mentor-side position on the six questions. Please treat these as one input alongside whatever the broader community contributes; community feedback wins where it disagrees with me.

Q1 - Attribute table schema: Option B (scope-as-TAG)

Concur with Option B. The load-bearing argument for me is alignment with ThingsBoard's own attribute_kv schema, where attribute_type (scope) and attribute_key are already separate dimensions. Mirroring that on the IoTDB side keeps the mental model identical for any ThingsBoard contributor and shortens onboarding time for future maintainers. The cardinality math is a fine secondary tiebreaker, but the schema-convention argument is what should carry it.

On the §4.10 sub-questions:
* scope-constant naming: prefer the literal ThingsBoard enum values (CLIENT_SCOPE, SERVER_SCOPE, SHARED_SCOPE) rather than shortened forms. Round-trip equality with ThingsBoard's existing identifiers removes a translation layer and makes log/SQL inspection less ambiguous.
* TTL: see Q3.
* TAG ordering: see Q4.

Q2 - Label handling: keep out of IoTDB in Phase 1

Concur. HasLabel in current ThingsBoard is a single private String label field on Device/Asset; there is no Phase-1 user-facing requirement that needs a per-label time-series or a multi-valued tag set on the IoTDB side. Leaving it where it already lives (entity DB) is correct and keeps the IoTDB side focused on the actual time-series + attributes story.

Keep §5.4 (entity_labels with current-state contract, no tombstones) in the doc as a v2 sketch. If ThingsBoard upstream ever evolves labels into a multi-tag feature, we don't want to redesign from scratch.

Q3 - TTL='INF' default for entity_attributes

Strongly recommend TTL='INF'.
Attributes are latest-state configuration, not telemetry. A finite default TTL would silently drop tenant configuration after the window expires; that is a correctness bug from the operator's perspective, not a tunable trade-off. If a specific deployment ever needs finite retention on an attribute namespace, that should be an explicit per-deployment override rather than the default.

Q4 - TAG column ordering

I'd start with (tenant_id, entity_type, entity_id, attribute_scope, key) as the baseline: tenant_id first matches the multi-tenant predicate that hits every read path, and entity_id is the next most selective dimension under any realistic load.

That said, I don't want to overcommit on Table Mode internals without input from the IoTDB core side. Flagging this one explicitly for IoTDB Table Mode committers on this list: does the 2.x storage engine's predicate-pushdown layer prefer a different ordering, or is the "most selective first" heuristic still the right one here? EXPLAIN ANALYZE benchmarks in W6 are a sensible safety net regardless, but starting from an informed baseline beats burning W6 cycles on combinatorial ordering experiments.

Q5 - Benchmark scope

Keep the full 5 (IoTDB Table / IoTDB Tree / Cassandra / PostgreSQL / TimescaleDB):
* IoTDB Tree is the migration baseline: required.
* Cassandra and PostgreSQL are ThingsBoard's actual production backends: required.
* TimescaleDB is the time-series PostgreSQL that any external write-up will be compared against: also required, but if W11-12 timeline pressure forces a cut, this is the first to move into the appendix as "deferred" rather than dropped from scope upfront.

Don't trim early. Cut at the end if W11-12 is tight.

Q6 - AttributesDao activation (for ThingsBoard maintainers)

Mentor preference order: (1) > (3) > (2).
* Option (1) (new database.attributes.type property mirroring ts.type) is the cleanest separation and the one ThingsBoard maintainers are most likely to accept because it reuses an existing, well-understood pattern. The upstream patch is small and self-contained.
* Option (3) (defer AttributesDao to stretch, attributes stay in entity DB Phase 1) is a perfectly acceptable fallback if maintainer feedback on (1) is slow or skeptical. It keeps Phase 1 focused on the value-dense time-series path; AttributesDao can ship as Phase 2 once the upstream patch for (1) has landed.
* Option (2) (Spring profile composition without upstream change) is the one I'd push back on. It works in theory but creates a configuration surface no ThingsBoard operator has seen before; it externalises complexity onto deployers in exchange for avoiding a 30-line upstream patch. Not worth the long-term cost.

On cross-community engagement: agree with your plan to cross-post a focused summary to ThingsBoard Discussion #15296 once this dev@ thread settles, and to reach the ThingsBoard dev channel on Q6 specifically. I do not personally have warm contacts on the ThingsBoard DAO reviewer team. Opening that to this list as well: if anyone here has working relationships with ThingsBoard DAO reviewers, please reach out off-list.

Decision timeline

Even if community input is light, we should not block coding on perfect consensus. Proposed cadence:
* By 2026-05-22 (Fri) EOD UTC: any objector on Q1, Q2, Q3 raises it on this thread.
* 2026-05-23 (Sat): Zihan posts an updated v1.1 design doc with resolved decisions.
* 2026-05-25 (Mon): coding starts on the basis of v1.1.

Q4 can be revisited via EXPLAIN ANALYZE benchmarks in W6 if the IoTDB Table Mode committers do not surface a definitive answer on this thread. Q6 follows the ThingsBoard-side conversation on its own clock; if it does not resolve by end of W4, we default to Option (3) and AttributesDao moves to stretch, as already flagged in your Wk 5 scope note.
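
For reference, the Q3/Q4 baseline above (TTL='INF', tenant_id-first TAG ordering, literal ThingsBoard scope names) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the PoC's actual DDL: the `CREATE TABLE ... TAG ... WITH (TTL=...)` shape is my recollection of IoTDB Table Mode syntax and should be checked against the 2.x docs, and the `value` FIELD column and both helper names are hypothetical.

```python
# Sketch only. Assumptions: the DDL dialect, the single `value` FIELD column,
# and the helper names are illustrative and not taken from the design doc.

# Baseline TAG ordering proposed in Q4.
TAG_ORDER = ("tenant_id", "entity_type", "entity_id", "attribute_scope", "key")

# Literal ThingsBoard scope names recommended in Q1.
SCOPES = ("CLIENT_SCOPE", "SERVER_SCOPE", "SHARED_SCOPE")


def build_entity_attributes_ddl(tag_order=TAG_ORDER, ttl="INF"):
    """Render a CREATE TABLE statement with TAG columns in the given order."""
    columns = [f"{name} STRING TAG" for name in tag_order]
    columns.append("value STRING FIELD")  # hypothetical value column
    body = ",\n  ".join(columns)
    return f"CREATE TABLE entity_attributes (\n  {body}\n) WITH (TTL='{ttl}')"


def check_scope(scope):
    """Reject anything that is not one of the literal ThingsBoard scope names."""
    if scope not in SCOPES:
        raise ValueError(f"unknown attribute scope: {scope!r}")
    return scope


ddl = build_entity_attributes_ddl()
print(ddl)
```

Passing a different `tag_order` tuple renders the alternative orderings for the W6 EXPLAIN ANALYZE comparison without hand-editing the DDL. Whether `key` needs quoting as an identifier in the target dialect is not checked here.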

Thanks again, Zihan. This is a really solid foundation for the coding period.

Best regards,
Xuan Wang
Apache IoTDB

From: Zh D <[email protected]>
Date: Monday, May 18, 2026 13:46
To: [email protected] <[email protected]>
Subject: [GSoC 2026] GSOC-304 design doc - feedback requested

Hi all,

Following up on my May 8 self-introduction, I've completed the refined design document for GSOC-304 (Enhancing ThingsBoard Integration with IoTDB 2.x Table Mode). Posting it now, before community-bonding wraps on May 24, so feedback can land before coding starts on May 25.

Full document (16 pages, 12 sections + 3 appendices): https://drive.google.com/file/d/1jXMCwF_HVvCR5lHDIT_1pv1DiIZt8j5O/view?usp=sharing

== TL;DR ==

The doc proposes concrete recommendations for the two open design questions called out by my mentor (Xuan Wang), with evidence from ThingsBoard schema/API/Rule-Engine and IoTDB Table Mode primitives:

1. Attribute table schema → recommend Option B (single `entity_attributes` table with `attribute_scope` as TAG). Backs the PoC's existing choice with cardinality math (Options A/B/C all represent ~150K logical attribute identities under typical multi-tenant deployment) + ThingsBoard's own SQL schema convention (scope and key are separate dimensions in `attribute_kv`) + native scope filtering.
2. Label handling → recommend Phase 1 does not mirror labels into IoTDB.
Labels in current ThingsBoard are a singular optional entity field on Device/Asset (`HasLabel`, `private String label`), not a many-tag feature; they fit the existing entity DB. A contingent design (separate `entity_labels` table with current-state contract, no tombstones) is sketched in §5.4 in case the community wants a label index on the IoTDB side.

The doc also includes a per-DAO Spring activation matrix in §6.0, the 12-week implementation plan aligned to mentor's phasing, a 5-test-case benchmark plan vs. IoTDB Tree / Cassandra / PostgreSQL / TimescaleDB, migration approach, and risks.

Note on scope: the Wk 5 plan assumes the AttributesDao activation question (Q6 below) resolves to option (1) or (2). If maintainers prefer (3), AttributesDao moves to stretch and Wk 5 shifts to telemetry/latest polish.

== Asks for community feedback (full context in §12 of the doc) ==

1. Do you concur with Option B (scope-as-TAG) for the attribute table? See §4.10 for sub-questions on scope-constant naming, TTL, and TAG ordering.
2. Is the recommendation to keep labels out of IoTDB in Phase 1 acceptable, or should the design include `entity_labels` from day one? See §5.5.
3. Is `TTL='INF'` the right default for `entity_attributes` (latest-state metadata), or is there a real use case for a finite default TTL?
4. Any guidance on optimal TAG column ordering for IoTDB 2.x predicate pruning, `(tenant_id, entity_type, entity_id, attribute_scope, key)` or scope/key-first, or should I rely on `EXPLAIN ANALYZE` benchmarks during Wk 6?
5. Is the 5-backend benchmark scope (IoTDB Table / IoTDB Tree / Cassandra / PostgreSQL / TimescaleDB) the right set, or should TimescaleDB be optional?
6.
(For ThingsBoard maintainers in particular) AttributesDao activation: do you prefer (1) a new `database.attributes.type` property mirroring the `ts.type` pattern (requires a small upstream patch), (2) Spring profile composition with the existing entity-DB switch, or (3) keep attributes in the entity DB for Phase 1 and treat AttributesDao as a stretch goal? See §6.0.

Communication status on Q6: I have not yet engaged ThingsBoard maintainers directly on this question. Once this dev-list thread settles, I plan to cross-post a focused summary of Q6 to ThingsBoard Discussion #15296 and reach out via the ThingsBoard dev channel for maintainer-side input. If anyone here has working contacts with the ThingsBoard DAO reviewer team, introductions would be very welcome.

I'd really value input on any of the above before May 25. Specific replies on questions 1, 2, and 6 are most time-critical, since they shape Phase 1 scope and Wk 5 deliverables.

Thanks,
Zihan Dai
GitHub: https://github.com/PDGGK
