Hi all,

I’d like to let you know that a new feature was merged into IoTDB last
month (https://github.com/apache/iotdb/pull/15685). This feature introduces
“INSERT INTO SELECT” statements based on the table data model, which allow
you to read data from existing source tables and insert it into another
existing target table.

1. Grammar
The SQL syntax is:
`INSERT INTO <table_name> [(column1, column2, ...)] <query>`

Suppose there are two tables, t1 and t2, with the same schema. Here are
some usage examples:
* `INSERT INTO t2 SELECT * FROM t1`
* `INSERT INTO t2 TABLE t1`
* `INSERT INTO t2 (SELECT * FROM t1 WHERE x > 10)`

2. Restrictions
There are several syntax checks:
* The source and target tables should already exist. The user must have
SELECT privilege on the source tables and WRITE privilege on the target
table.
* The data types of the columns in the query must match those of the target
table.
* If specific columns of the target table are designated, the column list
must include the timestamp column and at least one field column. For
example: `INSERT INTO t2(time, x) SELECT time, x FROM t1`


3. Usages
“INSERT INTO SELECT” statements can be used in various scenarios, such as:
* Data migration: Moving data from one table to another.
* Data transformation: Modifying data while inserting it into the target
table.
* Data aggregation: Inserting the results of complex queries (such as JOINs
or aggregations) into a summary table.

In addition, “INSERT INTO SELECT” supports inserting data into the same
table from which it is queried. This makes it easy to generate or duplicate
data for testing purposes. For example: `INSERT INTO t1(time, x) SELECT
time + 86400000, x from t1`

4. Implementation
The implementation of “INSERT INTO SELECT” statements involves several key
components:
* Query parsing: Parses and validates the syntax of “INSERT INTO SELECT”
statements, and incorporates the query portion into an Insert statement.
* Query planning: Constructs an IntoNode in the logical plan and sets the
query plan as its child node.
* Execution: Transforms the IntoNode into an IntoOperator, which is
responsible for pulling data and inserting it into the target table.

When multiple data regions are present, the IntoNode can be pushed below
the CollectNode, resulting in the following distributed execution plan:
            +-----------+
            | OutputNode|
            +-----------+
                |
        +-----------------+
        | AggregationNode |
        +-----------------+
                |
            +-------------+
            | CollectNode |
            +-------------+
            /               \
           /                 \
    +----------+              +----------+
    | IntoNode |              | IntoNode |
    +----------+              +----------+
        |                            |
+---------------------+             +---------------------+
| DeviceTableScanNode |   | DeviceTableScanNode |
+---------------------+             +---------------------+

Reply via email to