imay opened a new issue #2117: [PROPOSAL] Supporting aggregate method of 
REPLACE_IF_NOT_NULL
URL: https://github.com/apache/incubator-doris/issues/2117
 
 
   ## Motivation
   
   Many users have proposed a usage scenario to Doris. There are multiple data 
sources in a table. When a data source needs to be imported, it is often unable 
to obtain all the columns' value. However Doris requires that all columns 
should be included in one import, in this way, in order to import data into 
Doris, users can only join the load data with already existed data in Doris to 
generate all columns data, and then load it into Doris. Or user should keep a 
copy of data other place, and generate data with all columns when ETL stage. 
This makes user very cumbersome and inefficient.
   
   So in order to facilitate data import for users in this case, I intend to 
support a new aggregate method REPLACE_IF_NOT_NULL. In this way, users can 
change some columns' value without changing the other columns' value.
   
   ## Design
   
   > original value: value of column already in Doris
   > added value: newly added value from load data
   > new value: after apply added value, the value of column in Doris
   
   REPLACE_IF_NOT_NULL: the semantics of this aggregation method is to replace 
the origin value if and only if the added value is a non-NULL value.
   
   We currently support the aggregate method of REPLACE. The semantics of 
REPLACE is that no matter what the added value is, the original value will be 
replaced by the added value. Then REPLACE_IF_NOT_NULL is to replace the 
original value if and only if the added value is not NULL. If the added value 
is NULL, the original value will be maintained with no change. When the user 
imports data, he can use NULL to represent the column with no value, then the 
original value can be still retained.
   
   The following two tables compare the difference between REPLACE and 
REPLACE_IF_NOT_NULL behavior
   
   REPLACE_IF_NOT_NULL: 
   
   | original value | added value | new value |
   | --- | --- | --- |
   | NULL | NULL | null |
   | NULL | 1 | 1 |
   | 1 | NULL | 1 |
   | 1 | 2 | 2 |
   
   REPLACE:
   
   | original value | added value | new value |
   | --- | --- | --- |
   | NULL | NULL | NULL |
   | NULL | 1 | 1 |
   | 1 | NULL | NULL |
   | 1 | 2 | 2 |
   
   
   The work does not require too much change.
   
   1. Add a new aggregation method in Frontend. Columns of this aggregation 
type can be created when tables are created.
   2. Support the corresponding operators of this aggregation method on the 
engine side.
   
   ## Defect
   
   Besides both the orignal value and added value are NULL, there is no way to 
make a value of column NULL. So once a user sets a column value to a non-NULL 
value, there is no way to change the value to NULL. 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to