Panagiotis Garefalakis created HIVE-23166:

             Summary: Protect VGB from flushing too often
                 Key: HIVE-23166
             Project: Hive
          Issue Type: Improvement
          Components: llap
    Affects Versions: 4.0.0
            Reporter: Panagiotis Garefalakis
            Assignee: Panagiotis Garefalakis

The existing flush logic in our VectorGroupByOperator is completely static.
It depends on two settings: the number of hash table entries (HtEntries, 
capped by *hive.vectorized.groupby.maxentries*) and the maximum memory 
threshold (by default 90% of available memory).
Assuming that we are not memory constrained, the flush frequency is currently 
dictated by the static entry limit (1M by default), which can also be 
misconfigured to a very low value.

I propose that, along with maxHtEntries, the flush decision also take current 
memory usage into account, to avoid flushing too often, which can hurt 
operator throughput for particular workloads.
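
To make the idea concrete, here is a minimal sketch of one way such a check 
could look. It is not the actual VectorGroupByOperator code: the class name, 
the method name and the minFlushFraction parameter are all hypothetical, and 
the rule of skipping an entry-count flush while memory usage is low is only 
one possible reading of the proposal.

```java
/**
 * Hypothetical flush policy sketch. Not Hive's VectorGroupByOperator code;
 * every name and the minFlushFraction knob are assumptions for illustration.
 */
final class FlushPolicySketch {
    private final long maxHtEntries;         // static entry limit, e.g. 1_000_000
    private final double maxMemoryFraction;  // hard memory threshold, e.g. 0.9
    private final double minFlushFraction;   // assumed: below this usage, entry-count flushes are skipped

    FlushPolicySketch(long maxHtEntries, double maxMemoryFraction, double minFlushFraction) {
        this.maxHtEntries = maxHtEntries;
        this.maxMemoryFraction = maxMemoryFraction;
        this.minFlushFraction = minFlushFraction;
    }

    /** Decides whether the hash table should be flushed right now. */
    boolean shouldFlush(long htEntries, long usedBytes, long availableBytes) {
        if (availableBytes <= 0) {
            throw new IllegalArgumentException("availableBytes must be positive");
        }
        double usage = (double) usedBytes / (double) availableBytes;

        // Memory pressure always forces a flush, regardless of entry count.
        if (usage >= maxMemoryFraction) {
            return true;
        }
        // Below the entry limit and below the memory threshold: keep aggregating.
        if (htEntries < maxHtEntries) {
            return false;
        }
        // Entry limit reached: flush only if memory usage is actually significant,
        // so a very low maxHtEntries value does not cause constant flushing.
        return usage >= minFlushFraction;
    }
}
```

With a limit of 1000 entries, a 0.9 memory threshold and an assumed 0.5 lower 
bound, a table holding 2000 entries at 10% memory usage would not be flushed, 
while the same table at 60% usage would be.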

This message was sent by Atlassian Jira
