Hi Bryan, Sorry for the late reply, this issue has been assigned to another contributor.
Best regards, ---------------------- Yuan Tian On Thu, Jun 4, 2026 at 3:59 PM Bryan Yang <[email protected]> wrote: > *Hi Apache IoTDB community,* > > *issue: *https://github.com/apache/iotdb/issues/17799 > > I would like to propose implementing* #17799 *in phases. > > This first PR will address only the *UNPIVOT part of #17799*. *PIVOT and > advanced UNPIVOT* features will be handled in follow-up PRs *after the > Phase 1 PR has been reviewed and merged*. > > The first *PR* should be marked as: *Part of #17799* > > It should not use Fixes #17799 or Closes #17799, *because this PR will not > complete the whole issue.* > Feature Definition > > UNPIVOT converts multiple measurement columns in the IoTDB table model from > a wide-table format into a long-table format. > > Before UNPIVOT: > time, device_id, region, temperature, humidity > > After UNPIVOT: > time, device_id, region, measurement, value > > Here: > > - measurement stores the original source column name, such as > temperature > or humidity. > - value stores the corresponding source column value. > > This is useful for IoT scenarios where measurements are stored as separate > columns, but downstream analytics, export, or visualization prefers a > normalized (measurement, value) format. > Syntax > > UNPIVOT v1 will support a subset of DuckDB simplified UNPIVOT syntax, > with INCLUDE > NULLS added to satisfy the NULL handling requirement in #17799. > UNPIVOT [INCLUDE NULLS] source_relation ON value_column [, value_column > ...] > INTO NAME name_column VALUE value_column_alias; > > Example: > UNPIVOT device_metrics ON temperature, humidity INTO NAME measurement VALUE > value; > > It can also be used in subqueries: > SELECT * FROM ( UNPIVOT device_metrics ON temperature, humidity INTO NAME > measurement VALUE value ); > > And with CTEs: > WITH long_metrics AS ( UNPIVOT device_metrics ON temperature, humidity INTO > NAME measurement VALUE value ) SELECT * FROM long_metrics; > Syntax ComponentsUNPIVOT > > Performs the wide-to-long transformation. > INCLUDE NULLS > > Optional. > > By default, rows whose unpivoted value is NULL are dropped. If INCLUDE > NULLS is > specified, those rows are kept. > source_relation > > The input relation. > > In v1, I propose to support: > > - normal table names > - CTE names > - parenthesized subqueries > > For more complex inputs such as joins, users can wrap them in a subquery or > CTE first. > ON value_column [, ...] > > Specifies the source columns to unpivot. > > In v1, only explicit column names are supported. The following are not > included in this phase: > > - expressions > - CAST > - wildcards > - dynamic column selection > - COLUMNS(* EXCLUDE (...)) > > INTO NAME name_column > > Specifies the generated column that stores the original source column name. > > The generated name column should use the existing text type in the IoTDB > table model, either STRING or TEXT depending on the current internal type > system. > VALUE value_column_alias > > Specifies the generated column that stores the original source column > value. > > Its type is the same as the strictly identical type of all columns listed > in the ON clause. > Output Columns > > The output columns are: > > 1. visible columns from source_relation that are not listed in ON > 2. name_column > 3. value_column_alias > > For example, given source columns: > time, device_id, region, temperature, humidity > > The query: > UNPIVOT device_metrics ON temperature, humidity INTO NAME measurement VALUE > value; > > Produces: > time, device_id, region, measurement, value > > temperature and humidity are no longer output as separate columns. > NULL Semantics > > By default, UNPIVOT drops NULL values. > UNPIVOT device_metrics ON temperature, humidity INTO NAME measurement VALUE > value; > > Is semantically equivalent to: > SELECT time, device_id, region, 'temperature' AS measurement, temperature > AS > value FROM device_metrics WHERE temperature IS NOT NULL UNION ALL SELECT > time, device_id, region, 'humidity' AS measurement, humidity AS value FROM > device_metrics WHERE humidity IS NOT NULL; > > With INCLUDE NULLS, the IS NOT NULL filters are not applied. > > UNPIVOT should not guarantee output row order. Users should use ORDER BY if > they need stable ordering. Tests should also use explicit ordering. > Type Rules > > In v1, all columns listed in the ON clause must have exactly the same type. > > Supported types should follow the existing IoTDB table model scalar types, > such as: > BOOLEAN INT32 INT64 FLOAT DOUBLE TEXT / STRING TIMESTAMP DATE BLOB > > There should be no automatic type coercion in v1. For example: > > - INT32 + INT64 should not be automatically coerced. > - FLOAT + DOUBLE should not be automatically coerced. > - TEXT + STRING should follow the existing IoTDB type system. > > This strict type check should be performed in the UNPIVOT analyzer before > planning. We should not rely only on existing UNION ALL analysis, because > set operations may find a common super type, which would conflict with the > strict v1 semantics. > Validation Rules > > UNPIVOT v1 should validate that: > > - ON columns exist > - ON columns are not duplicated > - ON columns are visible columns > - name_column does not conflict with retained output columns > - value_column_alias does not conflict with retained output columns > - name_column and value_column_alias do not conflict with each other > - quoted identifier and case-insensitive conflicts follow existing IoTDB > identifier rules > > Implementation Plan > > I propose not to rewrite UNPIVOT into UNION ALL in the pure parser or > pre-analyzer rewrite phase, because source visible columns, hidden columns, > types, and CTE/subquery schemas are only known after analysis. > > A safer implementation flow is: > > 1. Add UNPIVOT as a queryPrimary / query body alternative in the > relational SQL grammar. > 2. Parser builds an Unpivot AST node. > 3. Analyzer analyzes source_relation first, then validates ON columns, > visibility, duplicates, output-name conflicts, and strict same-type > rules. > 4. Analyzer determines the static output schema for the Unpivot node. > 5. Planner expands Unpivot into an equivalent UNION ALL plan, or reuses > existing set operation planning where possible. > > Placing UNPIVOT at the queryPrimary / query body level makes it naturally > support top-level usage, subqueries, and CTEs, which matches the issue > requirement and DuckDB simplified syntax. > Test Plan > > The first phase should cover at least: > > - top-level UNPIVOT > - CTE input > - subquery input > - default NULL dropping > - INCLUDE NULLS > - missing ON column > - duplicate ON column > - name / value output column conflict > - mismatched value column types > - INT32 + INT64 not automatically coerced > - quoted identifier / case-insensitive conflicts > > Phased Plan > > I plan to implement this issue in phases. > > Phase 1 will focus only on UNPIVOT v1. It will support: > > - explicit value columns > - static output schema > - default NULL dropping > - optional INCLUDE NULLS > - strict same-type validation > - usage as a top-level query, subquery, or CTE > > After the Phase 1 PR is reviewed and merged, I will start Phase 2 in > follow-up PRs. > > Phase 2 will cover PIVOT and advanced UNPIVOT features, such as: > > - dynamic column selection > - COLUMNS(* EXCLUDE (...)) > - expressions > - CAST > - automatic type coercion > - multiple VALUE columns > > *The reason for waiting until Phase 1 is merged before starting Phase 2 is > that UNPIVOT v1 introduces the basic grammar, AST, analyzer, planner, and > test structure for this family of syntax. It is better to first get > agreement from reviewers on these foundations before adding more complex > features on top of them.* > > This split also keeps the first PR small and reviewable. UNPIVOT v1 has a > static output schema, so it can be analyzed and planned safely. PIVOT, > especially dynamic PIVOT without an explicit IN list, may require runtime > discovery of distinct values, which makes output schema determination more > complex. > > Doing PIVOT before the UNPIVOT foundation is reviewed could increase > implementation risk and make the PR harder to review. > > Best regards, > > Bryan Yang(杨易达) >
