paul-rogers commented on issue #11929: URL: https://github.com/apache/druid/issues/11929#issuecomment-1026436568
Couple of thoughts on syntax. In Druid, there appears to be only one "partition" column and that must be some valid unit of time. The other "cluster" columns are like secondary partitions. So, without ambiguity, we could consider: `PARTITION BY DAY, b, c`. In this case, `DAY` is an implicit alias for the corresponding `TIME_FLOOR` expression. If the time dimension is omitted, then it would be `PARTITION BY ALL, b, c`. while `ALL` is an alias for a time slice of (near) infinite size. If the `PARTITION ` statement is omitted, then it means one big partition over all time with all values. Fine for a test table, maybe not so good for a table at scale. Then, since we would not need the `CLUSTER` keyword, and since `PARTITION` is already reserved, the expression can go back to the original proposed `PARTITION BY` to be consistent with other uses of `PARTITION`. The user can also control sort order within a partition. For that, `ORDER BY` is the most logical choice. Implicitly, the partition columns go before the `ORDER BY` columns. Thus: ```sql INSERT INTO table SELECT ... PARTITION BY DAY, client, region ORDER BY order_id, line_item ``` In the above, the total sort order is `(__time, client, region, order_id, line_item)`. The first key could also be a valid expression of `__time` which, at present, is only `TIME_FLOOR`. This allows special-case partitions such as 2 days or 8 hours without needing keywords for each of these cases. (The analyzer can reject any unit which Druid might not support.) For the replace, we might want to handle the general case, which is an `UPDATE` or `DELETE` statement. To reflect a company name change: ```sql UPDATE table SET company = 'XYZ Corp.' WHERE company = 'ABC LLC' ``` To remove all of Bob's data due to a GPDR request: ```sql DELETE FROM table WHERE userId = 'bob' ``` Druid might not support these yet, but they seem reasonable (if expensive) operations. Given that, we might observe that the standard form of SQL is verb + details. So, for the replace partition, maybe go with the Spark syntax: ```sql INSERT OVERWRITE table PARTITION _time = <time const> SELECT ... PARTITION BY ... ``` The parser should handle the dual use of `PARTITION` because one is always followed by `BY` the other never is. Or, to be more general: ```sql INSERT OVERWRITE table WHERE _time = <time const> SELECT ... PARTITION BY ... ``` The above is logically equivalent to a `DELETE` statement followed by an `INSERT`. Since Druid does not support compound statements, we glom both operations together. The above allows replacing any set of keys, with the time partition as a special case. If the present version only supports replacing a time partition, we simply require that the `WHERE` class have exactly one equality condition on only the `__time` clause. That is, we design the syntax for the general case, but use semantic analysis to restrict usage to the one case supported today. Or, if we really want to call out that we can only replace the "primary" date partition, and none of the other keys: ```sql INSERT OVERWRITE table WHERE PARTITION = <time const> SELECT ... PARTITION BY ... ``` But, the above seems overkill. If we want a new verb, replace `INSERT OVERWRITE` with `REPLACE WITHIN` (`REPLACE ` is a MySQL reserved word while `WITHIN` is a [SQL-2016 reserved word](https://en.wikipedia.org/wiki/SQL_reserved_words)): ```sql REPLACE WITHIN table WHERE _time = <time const> SELECT ... PARTITION BY ... ``` One final issue is that we have the user specify the partition in every statement. In most cases, there is only one correct value: that used yesterday, and the day before and... Today we have to repeat ourselves because there is no other source of information. Suppose we stored the partitioning as part of the schema, as in Hive. Then we get: ```sql INSERT INTO table SELECT ... ``` With partitioning and sorting being that defined in the schema. The above is just plain SQL, of the kind that any tool can produce, which is a nice win. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
