imay opened a new issue #2117: [PROPOSAL] Supporting aggregate method of REPLACE_IF_NOT_NULL URL: https://github.com/apache/incubator-doris/issues/2117 ## Motivation Many users have proposed a usage scenario to Doris. There are multiple data sources in a table. When a data source needs to be imported, it is often unable to obtain all the columns' value. However Doris requires that all columns should be included in one import, in this way, in order to import data into Doris, users can only join the load data with already existed data in Doris to generate all columns data, and then load it into Doris. Or user should keep a copy of data other place, and generate data with all columns when ETL stage. This makes user very cumbersome and inefficient. So in order to facilitate data import for users in this case, I intend to support a new aggregate method REPLACE_IF_NOT_NULL. In this way, users can change some columns' value without changing the other columns' value. ## Design > original value: value of column already in Doris > added value: newly added value from load data > new value: after apply added value, the value of column in Doris REPLACE_IF_NOT_NULL: the semantics of this aggregation method is to replace the origin value if and only if the added value is a non-NULL value. We currently support the aggregate method of REPLACE. The semantics of REPLACE is that no matter what the added value is, the original value will be replaced by the added value. Then REPLACE_IF_NOT_NULL is to replace the original value if and only if the added value is not NULL. If the added value is NULL, the original value will be maintained with no change. When the user imports data, he can use NULL to represent the column with no value, then the original value can be still retained. The following two tables compare the difference between REPLACE and REPLACE_IF_NOT_NULL behavior REPLACE_IF_NOT_NULL: | original value | added value | new value | | --- | --- | --- | | NULL | NULL | null | | NULL | 1 | 1 | | 1 | NULL | 1 | | 1 | 2 | 2 | REPLACE: | original value | added value | new value | | --- | --- | --- | | NULL | NULL | NULL | | NULL | 1 | 1 | | 1 | NULL | NULL | | 1 | 2 | 2 | The work does not require too much change. 1. Add a new aggregation method in Frontend. Columns of this aggregation type can be created when tables are created. 2. Support the corresponding operators of this aggregation method on the engine side. ## Defect Besides both the orignal value and added value are NULL, there is no way to make a value of column NULL. So once a user sets a column value to a non-NULL value, there is no way to change the value to NULL.
---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected] With regards, Apache Git Services --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
