armitage420 commented on PR #6075: URL: https://github.com/apache/hive/pull/6075#issuecomment-3307189021
@thomasrebele Thank you for your input! You are correct: the lexicographical sorting is done on unmasked values. A better (and more accurate) fix would therefore be to apply masking before sorting the results. Currently, [sorting](https://github.com/apache/hive/blob/96d1635f98d0cfd5b0cd01115bb738e1c6053b94/cli/src/java/org/apache/hadoop/hive/cli/CliDriver.java#L286) is performed for every single query, whereas [masking](https://github.com/apache/hive/blob/96d1635f98d0cfd5b0cd01115bb738e1c6053b94/itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestUtil.java#L972) is only applied at the end, once all query results for the entire qfile have been collected. To implement this fix, we would need to change the test architecture so that masking is done per query, followed by sorting. I'm not sure whether this approach would be agreed upon, but if it is, I can implement it! @deniskuzZ Do let me know what you think!
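
To illustrate why the order of the two steps matters, here is a minimal, self-contained sketch. It is not the actual `CliDriver` or `QTestUtil` code: the class name, the digit-run pattern, and the `MASKED` placeholder are all hypothetical stand-ins for whatever volatile values the real masking rules cover.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration only; none of these names exist in Hive.
public class MaskThenSortSketch {
    // Assumption: volatile values are runs of digits (e.g. ids or timestamps).
    static final Pattern VOLATILE = Pattern.compile("\\d+");
    // Assumption: placeholder text; the real mask string may differ.
    static final String MASK = "MASKED";

    static String mask(String row) {
        return VOLATILE.matcher(row).replaceAll(MASK);
    }

    // Order discussed as the current behaviour: sort raw rows, mask afterwards.
    static List<String> sortThenMask(List<String> rows) {
        List<String> sorted = new ArrayList<>(rows);
        Collections.sort(sorted);
        List<String> out = new ArrayList<>();
        for (String row : sorted) {
            out.add(mask(row));
        }
        return out;
    }

    // Proposed order: mask each row first, then sort the masked rows.
    static List<String> maskThenSort(List<String> rows) {
        List<String> out = new ArrayList<>();
        for (String row : rows) {
            out.add(mask(row));
        }
        Collections.sort(out);
        return out;
    }

    public static void main(String[] args) {
        // Two runs of the same query whose volatile prefixes differ.
        List<String> run1 = List.of("1 zebra", "2 apple");
        List<String> run2 = List.of("2 zebra", "1 apple");

        // Sorting on unmasked values lets the volatile prefix decide the order.
        System.out.println(sortThenMask(run1).equals(sortThenMask(run2)));
        // Masking first makes the order depend only on the stable part.
        System.out.println(maskThenSort(run1).equals(maskThenSort(run2)));
    }
}
```

With sort-then-mask, the two runs produce the same masked rows in different orders, so the expected output file cannot match both. With mask-then-sort, both runs yield the same list.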
