yangzhg opened a new issue #5270:
URL: https://github.com/apache/incubator-doris/issues/5270


   ## Background
   
   The current Doris table does not support the update operation, but there are 
many scenarios where the data needs to be updated. Due to the batch delete 
function we implemented before, it paved the way for the update.
   
   ## Syntax
   
   We only support single table update
   
   ```
   UPDATE table_name
       SET assignment_list
       [WHERE where_condition]
       [ORDER BY...]
       [LIMIT row_count]
   
   value:
       {expr | DEFAULT}
   
   assignment:
       col_name = value
   
   assignment_list:
       assignment [, assignment].
   ```
   
   ## Design
   
   There are two kinds of updates involved here, one that only contains the 
value column, and the other that contains the key column.
   
   * Include and update of value column
   
     This method is relatively simple. You only need to plan the update 
statement into a query plan like `select * from table insert into table` during 
query planning, and you need to add an update node to modify the data to the 
updated value, or plan Into a statement similar to `select a, b,'xx','xx1' from 
table insert into table` and hand it directly to the tableSink node
   
   * Contains the update of the key column
   
     This kind of relatively complicated, we need to divide the query into two 
parts, and need to use the MERGE semantics in the batch delete, first plan it 
into a query plan similar to `select * from table insert into table`, and then 
generate it first for the modification of the key column One piece of data that 
needs to be deleted, and then replace the value of the key column to generate 
new data that needs to be added, mainly generating two pieces of imported data 
from one piece of original data. Need update node to have the ability to repeat 
data
   
   ## Subtasks
   
   1. Add index to unique table value column to speed up data scanning
   2. FE supports update statements and generates corresponding query plans. Be 
implements update node, which can be divided into two parts here
      1. Support updates that only contain the value column
      2. Support the update of the key column
   
   ## 背景
   
   当前 Doris 表不支持 update 操作,但是有很多场景需要对数据进行更新,由于我们之前实现的批量删除功能,为实现update 铺平了道路。
   
   ## 语法
   
   我们只支持单表更新
   
   ```
   UPDATE table_name
       SET assignment_list
       [WHERE where_condition]
       [ORDER BY ...]
       [LIMIT row_count]
   
   value:
       {expr | DEFAULT}
   
   assignment:
       col_name = value
   
   assignment_list:
       assignment [, assignment] .
   ```
   
   ## 设计方案
   
   这里涉及两种更新,一种使只包含value 列的更新, 一种是包含key列的更新
   
   * 包含及value 列的更新
   
     这种方式比较简单,只需要查询规划时将update 语句规划成类似 `select * from table insert into table`  
的查询规划,并且需新增一个update node,用于修改数据为更新后的值, 或者 规划成 类似 `select a, b, 'xx', 'xx1' from 
table insert into table` 的语句,直接交给 tableSink 节点
   
   * 包含key列的更新
   
     这种相对复杂一点,我们需要将查询分成两部,并且需要用到 批量删除里的MERGE 语义,先规划成类似 `select * from table 
insert into table`  查询规划, 
之后针对key列的修改,先生成一条需要删除的数据,再替换key列值后生成需要新增的数据,主要将一条原始数据生成两条导入数据。需要update node 
有数据重复的能力
   
   ## 子任务
   
   1. unique 表 value 列加索引,加快数据扫描速度
   2. FE 支持 update 语句,并生成相应的查询规划,be实现update node, 这里可以分为两部
      1. 支持只包含value 列的更新
      2. 支持包含key列的更新


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to