Hi all, I’d like to let you know that a new feature was merged into IoTDB last month (https://github.com/apache/iotdb/pull/15685). This feature introduces “INSERT INTO SELECT” statements based on the table data model, which allow you to read data from existing source tables and insert it into another existing target table.
1. Grammar The SQL syntax is: `INSERT INTO <table_name> [(column1, column2, ...)] <query>` Suppose there are two tables, t1 and t2, with the same schema. Here are some usage examples: * `INSERT INTO t2 SELECT * FROM t1` * `INSERT INTO t2 TABLE t1` * `INSERT INTO t2 (SELECT * FROM t1 WHERE x > 10)` 2. Restrictions There are several syntax checks: * The source and target tables should already exist. The user must have SELECT privilege on the source tables and WRITE privilege on the target table. * The data types of the columns in the query must match those of the target table. * If specific columns of the target table are designated, the column list must include the timestamp column and at least one field column. For example: `INSERT INTO t2(time, x) SELECT time, x FROM t1` 3. Usages “INSERT INTO SELECT” statements can be used in various scenarios, such as: * Data migration: Moving data from one table to another. * Data transformation: Modifying data while inserting it into the target table. * Data aggregation: Inserting the results of complex queries (such as JOINs or aggregations) into a summary table. In addition, “INSERT INTO SELECT” supports inserting data into the same table from which it is queried. This makes it easy to generate or duplicate data for testing purposes. For example: `INSERT INTO t1(time, x) SELECT time + 86400000, x from t1` 4. Implementation The implementation of “INSERT INTO SELECT” statements involves several key components: * Query parsing: Parses and validates the syntax of “INSERT INTO SELECT” statements, and incorporates the query portion into an Insert statement. * Query planning: Constructs an IntoNode in the logical plan and sets the query plan as its child node. * Execution: Transforms the IntoNode into an IntoOperator, which is responsible for pulling data and inserting it into the target table. When multiple data regions are present, the IntoNode can be pushed below the CollectNode, resulting in the following distributed execution plan: +-----------+ | OutputNode| +-----------+ | +-----------------+ | AggregationNode | +-----------------+ | +-------------+ | CollectNode | +-------------+ / \ / \ +----------+ +----------+ | IntoNode | | IntoNode | +----------+ +----------+ | | +---------------------+ +---------------------+ | DeviceTableScanNode | | DeviceTableScanNode | +---------------------+ +---------------------+