Currently we are not working on CONSTRAINTs AFAICT. But feel free to open a Jira for it, someone might pick it up. You can also work on implementing it if you want.
During writes, Impala would just verify the constraints, i.e. "col_day = DAY(col_time)". Your example also includes query optimization based on constraints, i.e. adding extra predicates on columns with constraints. This seems to be the trickier part, and in some cases might require that the rewrite rule knows the functions in the CONSTRAINT. But I'd also like to point out that the above use case works already with Iceberg tables. E.g.: CREATE TABLE ice_t (ts TIMESTAMP, d INT) PARTITIONED BY SPEC (DAY(ts)) STORED AS ICEBERG; INSERT INTO ice_t VALUES (now(), 42); SELECT * FROM ice_t where ts > '2018-07-18 19:36:42'; During insertion the partition value is automatically calculated, and during the planning of the SELECT statement the Iceberg table is smart enough to apply the ts > '2018-07-18 19:36:42' predicate (rewritten to "day(ts) >= '2018-07-18'") on the partition column. Cheers, Zoltan On Mon, Jun 27, 2022 at 2:14 PM Xiaoqing Gao <285814...@qq.com> wrote: > Hi, are you considering adding the feature of CONSTRAINT syntax? > > when we add syntax CONSTRAINT c1 CHECK (col1_day = DAY(col1_time)) on ddl, > impala identify CONSTRAINT, then convert "col_time between $time_lower and > $time_upper" to "col_day between $day_lower and $day_upper and col_time > between $time_lower and $time_upper". > > E.g. > CREATE TABLE rawdata.event_p1 ( day INT NOT NULL ENCODING BIT_SHUFFLE > COMPRESSION DEFAULT_COMPRESSION, id BIGINT NOT NULL ENCODING BIT_SHUFFLE > COMPRESSION DEFAULT_COMPRESSION, time TIMESTAMP NOT NULL ENCODING > BIT_SHUFFLE COMPRESSION DEFAULT_COMPRESSION, p__ip STRING NULL ENCODING > DICT_ENCODING COMPRESSION DEFAULT_COMPRESSION, CONSTRAINT c1 CHECK (day = > DAY(time)) ) PARTITIONED BY ( day INT ) STORED AS PARQUET > > explain select * from event_p1 where time > '2018-07-18 19:36:42' > > 01:EXCHANGE [UNPARTITIONED] | 00:SCAN HDFS [rawdata.event_ros_p7] > partition predicates: `day` > 17730 and time > '2018-07-18 19:36:42' > partitions=425/439 files=3420 size=1.71TB > > > Thanks > > gaoxiaoqing