xiangmingtao opened a new issue, #60153: URL: https://github.com/apache/doris/issues/60153
### Search before asking - [x] I had searched in the [issues](https://github.com/apache/doris/issues?q=is%3Aissue) and found no similar issues. ### Version I have two scripts for creating materialized views below. One can be created successfully, while the other fails. The following is the failed script. I would like to ask why it fails to create and why the table jck.gs_jbxx is not recognized as the base table. # Script that successfully creates the materialized view DROP MATERIALIZED VIEW IF EXISTS jck.mv_gs_jbxx_base; CREATE MATERIALIZED VIEW jck.mv_gs_jbxx_base BUILD IMMEDIATE REFRESH AUTO ON SCHEDULE EVERY 6 HOUR PARTITION BY (clrq) DISTRIBUTED BY HASH(gsgsuuid) BUCKETS 8 PROPERTIES ( "replication_allocation" = "tag.location.default: 1", "storage_medium" = "ssd", "compression" = "LZ4", "disable_auto_compaction" = "false", "enable_single_replica_compaction" = "false", "group_commit_interval_ms" = "5000", "group_commit_data_bytes" = "134217728", "enable_nondeterministic_function" = "true" ) AS SELECT t1.gsgsuuid,t1.gsmc,t1.clrq FROM jck.gs_jbxx t1 left join ( select t.gsgsuuid, group_concat(distinct t.kjzzlxmc order by t.zxsx) jxzzms from jck.gs_jxzz t group by t.gsgsuuid ) t2 on t1.gsgsuuid = t2.gsgsuuid limit 100 ; # Script that fails to create the materialized view DROP MATERIALIZED VIEW IF EXISTS jck.mv_gs_jbxx_base > OK > Time: 0.002s CREATE MATERIALIZED VIEW jck.mv_gs_jbxx_base BUILD IMMEDIATE REFRESH AUTO ON SCHEDULE EVERY 6 HOUR PARTITION BY (clrq) DISTRIBUTED BY HASH(gsgsuuid) BUCKETS 8 PROPERTIES ( "replication_allocation" = "tag.location.default: 1", "storage_medium" = "ssd", "compression" = "LZ4", "disable_auto_compaction" = "false", "enable_single_replica_compaction" = "false", "group_commit_interval_ms" = "5000", "group_commit_data_bytes" = "134217728", "enable_nondeterministic_function" = "true" ) AS SELECT t1.gsgsuuid,t1.gsmc,t1.clrq,t2.jxzzms FROM jck.gs_jbxx t1 left join ( select t.gsgsuuid, group_concat(distinct t.kjzzlxmc order by t.zxsx) jxzzms from jck.gs_jxzz t group by t.gsgsuuid ) t2 on t1.gsgsuuid = t2.gsgsuuid limit 100 > 1105 - errCode = 2, detailMessage = Unable to find a suitable base table for partitioning, the fail reason is partition column is not in group by or window partition by, [] > Time: 0.011s ### What's Wrong? 1. Behavior Difference: - When creating the materialized view `jck.mv_gs_jbxx_base` with only `t1.gsgsuuid, t1.gsmc, t1.clrq` in the `SELECT` clause, the creation succeeds without any errors. - When adding `t2.jxzzms` to the `SELECT` clause (the only change), the creation fails immediately with error code 2. 2. Core Error & Confusion: - The error message shows: `Unable to find a suitable base table for partitioning, the fail reason is partition column is not in group by or window partition by`. - I cannot understand why the system fails to recognize `jck.gs_jbxx` as the base table for partitioning (the `clrq` column is directly from `jck.gs_jbxx t1` and is used as the partition column). - The only difference between the successful and failed scripts is adding `t2.jxzzms` to the SELECT list, which should not affect the recognition of the base table or the partition column `clrq`. 3. Key Question: - Why does adding `t2.jxzzms` to the SELECT clause trigger the partitioning/base table recognition error? - Why is the `jck.gs_jbxx` table not identified as the base table for the partition column `clrq` in the failed script? ### What You Expected? 1. Core Expectation: - The materialized view `jck.mv_gs_jbxx_base` should be created successfully even when `t2.jxzzms` is added to the `SELECT` clause (the only change from the working script). - The system should recognize `jck.gs_jbxx` as the base table for partitioning, since the partition column `clrq` is directly retrieved from `jck.gs_jbxx t1` (not from the joined subquery `t2`). 2. Reason for the Expectation: - The partition column `clrq` is explicitly from the main table `jck.gs_jbxx t1`, and adding a column (`t2.jxzzms`) from a left-joined subquery should not invalidate the base table/partition column association. - The subquery `t2` is grouped by `gsgsuuid` (matching the join key with `t1`), and there is no logical reason for this to break the partitioning logic tied to `jck.gs_jbxx`. 3. Expected Outcome: - No error (errCode = 2) related to "unable to find a suitable base table for partitioning" should be thrown. - The materialized view should be built immediately (per `BUILD IMMEDIATE`) with the same partitioning/distribution rules as the successful script, including the `clrq` partition column and `gsgsuuid` hash distribution. ### How to Reproduce? _No response_ ### Anything Else? _No response_ ### Are you willing to submit PR? - [x] Yes I am willing to submit a PR! ### Code of Conduct - [x] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
