okumin commented on PR #5719: URL: https://github.com/apache/hive/pull/5719#issuecomment-2786679837
I also double-checked the content in the PR description. As you say, GroupByOperator and VectorGroupByOperator have some inconsistencies.

- The source of the maximum memory for the operators; the non-vectorized version uses getMaxMemoryAvailable both for LLAP and Tez, and the vectorized version uses getMaxMemoryAvailable for LLAP and max JVM heap for Tez
- `hive.map.aggr.hash.force.flush.memory.threshold` vs `hive.map.aggr.hash.percentmemory` for hash table memory (numEntries * width)
- The metrics of total memory pressure; the non-vectorized version uses used JVM heap and the vectorized version uses a SoftReference
- Only the vectorized version has a threshold for simple count

This PR addresses the second problem. I think we can also standardize the first one. I have no idea about the third and fourth ones, as they seem to be intentional.
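For readers unfamiliar with the signals compared in the list, here is a minimal Java sketch of the general techniques: a hash table estimate of numEntries * width checked against a fraction of the maximum memory, used JVM heap as a pressure metric, and a SoftReference as a pressure signal. This is not the code of GroupByOperator or VectorGroupByOperator. The class `MemoryPressureSketch`, its method names, and the `SoftCanary` helper are hypothetical, and the fraction parameter only stands in for whichever configuration value an operator reads.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.lang.ref.SoftReference;

// Hypothetical illustration only; none of these names exist in Hive.
final class MemoryPressureSketch {

    // Hash table estimate from the comment: numEntries * width (bytes per entry).
    public static long estimatedHashTableBytes(long numEntries, long entryWidthBytes) {
        return numEntries * entryWidthBytes;
    }

    // True when the estimate exceeds the given fraction of the maximum memory.
    // The fraction is an assumed stand-in for a configured threshold.
    public static boolean exceedsFraction(long estimatedBytes, long maxMemoryBytes, double fraction) {
        return estimatedBytes > (long) (maxMemoryBytes * fraction);
    }

    // Pressure signal based on used JVM heap, as a fraction of the max heap.
    public static double usedHeapFraction() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        long max = heap.getMax();
        if (max <= 0) {
            // getMax() returns -1 when the maximum is undefined; fall back to Runtime.
            max = Runtime.getRuntime().maxMemory();
        }
        return (double) heap.getUsed() / max;
    }

    // Pressure signal based on a SoftReference: the JVM may clear softly
    // reachable objects when memory runs low, so a cleared reference hints
    // at memory pressure without measuring the heap directly.
    public static final class SoftCanary {
        private SoftReference<Object> ref = new SoftReference<>(new Object());

        public boolean wasCleared() {
            return ref.get() == null;
        }

        public void reset() {
            ref = new SoftReference<>(new Object());
        }
    }
}
```

Under these assumptions, a caller would flush when `exceedsFraction(estimatedHashTableBytes(n, w), max, f)` is true. Which value is passed as `max` is exactly the first inconsistency in the list.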