antoniopetrole opened a new issue, #928: URL: https://github.com/apache/cloudberry/issues/928
### Apache Cloudberry version Cloudberry 1.6.0 (pre apache release) ### What happened One of our users ran into a query with our data that has some really odd behavior. Whenever the query is ran, the order by clause returns data non-deterministically. For example, the value for col1 should return the value 9217 for the first 36 rows. However on Cloudberry everytime you run it, it seems to return out of order in different ways each time you run it. Sometimes it will only be the first 3 rows that have col1 = 9217, the next run it might be 1 or 10. I made a test case data set you with the query that you can test this against and have included it attached to this issue. I also tested this on Postgres and on Postgres it behaves exactly the same and as expected every time. I tried testing this on my docker multinode bootcamp cluster to no avail, however I was able to recreate it both in our production and dev environments using this test dataset. I imagine the # of segments has some influence on this in some way. [orderby_query.zip](https://github.com/user-attachments/files/18792612/orderby_query.zip) ### What you think should happen instead Results should be deterministic and follow the ORDER BY declaration like in Postgres ### How to reproduce To recreate, just run `psql -f problem_ddl_cloudberry.sql` and then `psql -f problem_query.sql` and on each run you'll see different results. I also included the Postgres DDL version of this as well. A few notes I've taken that might be helpful to know * All 36 rows that we're looking for appear in the results even when out of order * Putting the row number as the first column seems to fix it * Adding/removing columns from the subquery where the UNION operator is seems to randomly fix or keep the bug. You can remove existing columns or add columns like gp_segment_id * If you persist the results of the subquery to a table beforehand and then just include a SELECT * FROM <persisted_table>, this completely fixes the bug as well. So I suspect that the issue lies somewhere within the subquery but I don't know that for sure. ### Examples Runs On Cloudberry (focus on how many rows represent col1 = 9217) ``` col1 | col2 | col3 | col4 | col8 | unid | island_start | rn -------+-------+--------+------+----------------------------+-------------------------+--------------+------- 9217 | 32408 | 172087 | 1 | 2024-11-24 13:07:31.508746 | 1|30|12|259200|0|0 | 1 | 1 21358 | 52711 | 214706 | 1 | 2024-11-24 16:21:59.720578 | 1|2443|4|86400|0|0 | 1 | 2 21358 | 52711 | 214706 | 1 | 2024-11-24 17:20:01.451113 | 1|2443|4|86400|0|0 | 0 | 3 col1 | col2 | col3 | col4 | col8 | unid | island_start | rn -------+-------+--------+------+----------------------------+-------------------------+--------------+------- 9217 | 32408 | 172083 | 1 | 2024-11-24 15:07:28.490719 | 1|2406|4|86400|0|0 | 1 | 1 9217 | 32408 | 172087 | 1 | 2024-11-24 13:07:31.508746 | 1|30|12|259200|0|0 | 1 | 2 21358 | 52711 | 214707 | 1 | 2024-11-24 17:20:01.525726 | 1|253|12|259200|0|0 | 1 | 3 col1 | col2 | col3 | col4 | col8 | unid | island_start | rn -------+-------+--------+------+----------------------------+-------------------------+--------------+------- 9217 | 32408 | 172083 | 1 | 2024-11-24 15:07:28.490719 | 1|2406|4|86400|0|0 | 1 | 1 9217 | 32408 | 172087 | 1 | 2024-11-24 13:07:31.508746 | 1|30|12|259200|0|0 | 1 | 2 9217 | 32408 | 172088 | 1 | 2024-11-24 15:07:28.722085 | 1|30|6|1296000|0|0 | 1 | 3 21358 | 52711 | 214706 | 1 | 2024-11-24 14:19:23.533454 | 1|2443|4|86400|0|0 | 1 | 4 col1 | col2 | col3 | col4 | col8 | unid | island_start | rn -------+-------+--------+------+----------------------------+-------------------------+--------------+------- 21358 | 52711 | 214706 | 1 | 2024-11-24 14:19:23.533454 | 1|2443|4|86400|0|0 | 1 | 1 21358 | 52711 | 214708 | 1 | 2024-11-24 11:36:34.576488 | 1|294|6|1296000|0|0 | 1 | 2 21358 | 52711 | 214708 | 1 | 2024-11-24 14:19:23.757234 | 1|294|6|1296000|0|0 | 0 | 3 ``` ### Operating System Rocky Linux 8.10 (Green Obsidian) ### Anything else _No response_ ### Are you willing to submit PR? - [ ] Yes, I am willing to submit a PR! ### Code of Conduct - [x] I agree to follow this project's [Code of Conduct](https://github.com/apache/cloudberry/blob/main/CODE_OF_CONDUCT.md). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
