dh-cloud opened a new issue, #57707: URL: https://github.com/apache/doris/issues/57707
### Search before asking

- [x] I had searched in the [issues](https://github.com/apache/doris/issues?q=is%3Aissue) and found no similar issues.

### Version

4.0

### What's Wrong?

We observed inconsistent behavior between `SELECT DISTINCT` and `COUNT(DISTINCT)` when operating on multi-column combinations.

### What You Expected?

`SELECT DISTINCT` returns x rows (correct). `COUNT(DISTINCT)` should also return x.

### How to Reproduce?

Create a table with nullable columns and duplicate combinations:

    CREATE TABLE test_distinct (
        col1 VARCHAR(10),
        col2 VARCHAR(10)
    );

    INSERT INTO test_distinct VALUES
        ('A', 'X'),
        ('A', 'X'),   -- Duplicate
        ('B', NULL),
        ('B', NULL),  -- NULL duplicates
        ('C', 'Y'),
        (NULL, 'Z'),
        (NULL, NULL);

Execute the following queries:

    -- Returns 5 rows (all distinct combinations, including NULLs)
    SELECT DISTINCT col1, col2 FROM test_distinct;

    -- Returns 2 (incorrect, should match the 5 unique combinations)
    SELECT COUNT(DISTINCT col1, col2) FROM test_distinct;

Expected vs. actual behavior:

- Expected: both queries should agree on the number of distinct combinations (5 in this case).
- Actual: `SELECT DISTINCT` returns 5 rows (correct). `COUNT(DISTINCT)` returns 2 (incorrect, likely due to NULL handling or optimization bugs).

### Anything Else?

_No response_

### Are you willing to submit PR?

- [ ] Yes I am willing to submit a PR!

### Code of Conduct

- [x] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
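The report attributes the count of 2 to NULL handling. This would happen if `COUNT(DISTINCT col1, col2)` skips every row in which either column is NULL, as MySQL documents for its own `COUNT(DISTINCT ...)`. Whether Doris 4.0 intends the same semantics is an assumption here, not something this report confirms. The sketch below uses Python's built-in `sqlite3` purely as a stand-in engine (it is not Doris) to show that the two numbers in the report match the two interpretations. The helper name `distinct_counts` is hypothetical.

```python
import sqlite3

# Same rows as the INSERT statement in the report; None stands for SQL NULL.
ROWS = [
    ("A", "X"),
    ("A", "X"),
    ("B", None),
    ("B", None),
    ("C", "Y"),
    (None, "Z"),
    (None, None),
]


def distinct_counts(rows):
    """Return (count including NULL combinations, count excluding them)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE test_distinct (col1 VARCHAR(10), col2 VARCHAR(10))"
    )
    conn.executemany("INSERT INTO test_distinct VALUES (?, ?)", rows)

    # Interpretation 1: count the rows that SELECT DISTINCT would return.
    with_nulls = conn.execute(
        "SELECT COUNT(*) FROM "
        "(SELECT DISTINCT col1, col2 FROM test_distinct) AS t"
    ).fetchone()[0]

    # Interpretation 2 (assumed MySQL-style): drop any row containing a NULL
    # before counting distinct combinations.
    without_nulls = conn.execute(
        "SELECT COUNT(*) FROM "
        "(SELECT DISTINCT col1, col2 FROM test_distinct "
        " WHERE col1 IS NOT NULL AND col2 IS NOT NULL) AS t"
    ).fetchone()[0]

    conn.close()
    return with_nulls, without_nulls


if __name__ == "__main__":
    print(distinct_counts(ROWS))
```

If the NULL-skipping behavior turns out to be intended, the first query form, `SELECT COUNT(*) FROM (SELECT DISTINCT col1, col2 FROM test_distinct) AS t`, is a standard-SQL way to count combinations with NULLs included.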
