jayzhan211 commented on PR #15851: URL: https://github.com/apache/datafusion/pull/15851#issuecomment-2831689650
> But I want to improve it in an new pr, for not blocking #15591, is it ok?

Sure.

> We should ensure all ordering situations can be covered. At least 1 full ordering + 1 partial ordering + 1 no ordering. Like dataset sorted by a,b, can ensure at least following three cases:

1. We need to easily generate random queries that cover all the cases, AND
2. We need to easily generate a random query for a specific case too.

As an example of a specific query, I may want full ordering, like `select a, b, c from t group by a, b, c` where all three columns are ordered. Or I may want partial ordering, where c is not ordered, or where both c and b are not ordered.

I think the issue with the current implementation is that the ordering is defined once, at the dataset level, so every generated query gets its ordering information from the dataset.

Dataset1: a, b ordered, c not ordered.

Every generated query has the same ordering as the dataset defines:
```
SELECT xxx from xxx GROUP BY a,b
SELECT xxx from xxx GROUP BY a,c
SELECT xxx from xxx GROUP BY b
```

But I want the ordering to be specific to each query. Assume I want the first column ordered and the second column unordered.

Dataset2: a,b,c,d,e ....

These are randomly generated queries with a fixed number of columns (2):
```
SELECT xxx from xxx GROUP BY a,b (a ordered, b unordered)
SELECT xxx from xxx GROUP BY a,c (a ordered, c unordered)
SELECT xxx from xxx GROUP BY c,d (c ordered, d unordered)
SELECT xxx from xxx GROUP BY d,e (d ordered, e unordered)
```

I expect it to be **index-specific** ordering, not ordering based on the **column name**.
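A minimal sketch of the index-specific idea, under stated assumptions: the names `ColumnOrder`, `QueryCase`, and `build_cases` are hypothetical and are not part of DataFusion's fuzz-test code, and consecutive column windows stand in for random column selection to keep the example deterministic. The point it illustrates is that the ordering pattern is attached to the position in the GROUP BY list, so each generated query carries its own sort keys.

```rust
/// Whether the GROUP BY column at a given position should be sorted in the input.
/// (Hypothetical type, for illustration only.)
#[derive(Clone, Copy, Debug, PartialEq)]
enum ColumnOrder {
    Ordered,
    Unordered,
}

/// A generated query together with the sort keys its input should have.
#[derive(Debug, PartialEq)]
struct QueryCase {
    sql: String,
    sort_keys: Vec<String>,
}

/// Builds one case per consecutive window of `pattern.len()` columns.
/// The pattern applies by position, never by column name.
fn build_cases(table: &str, columns: &[&str], pattern: &[ColumnOrder]) -> Vec<QueryCase> {
    // `windows(0)` would panic, so an empty pattern yields no cases.
    if pattern.is_empty() {
        return Vec::new();
    }
    columns
        .windows(pattern.len())
        .map(|group| {
            // Keep only the columns whose position is marked as ordered.
            let sort_keys = group
                .iter()
                .zip(pattern)
                .filter(|(_, order)| **order == ColumnOrder::Ordered)
                .map(|(col, _)| col.to_string())
                .collect();
            QueryCase {
                sql: format!(
                    "SELECT {cols} FROM {table} GROUP BY {cols}",
                    cols = group.join(", ")
                ),
                sort_keys,
            }
        })
        .collect()
}
```

With columns `a, b, c` and the pattern `[Ordered, Unordered]`, this would give `GROUP BY a, b` with sort key `a` and `GROUP BY b, c` with sort key `b`, so `b` is unordered in one query and ordered in the next.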