David Mollitor created ORC-848:
----------------------------------
Summary: Recycle Internal Buffer in StringHashTableDictionary
Key: ORC-848
URL: https://issues.apache.org/jira/browse/ORC-848
Project: ORC
Issue Type: Improvement
Reporter: David Mollitor
Assignee: David Mollitor
{code:java|title=StringHashTableDictionary.java}
private void initHashBuckets(int capacity) {
  DynamicIntArray[] buckets = new DynamicIntArray[capacity];
  for (int i = 0; i < capacity; i++) {
    // We don't need large bucket: If we have more than a handful of collisions,
    // then the table is too small or the function isn't good.
    buckets[i] = createBucket();
  }
  hashBuckets = buckets;
}
{code}
This code was highlighted for me in a JMH run of the perf test. The
{{Dictionary}} is regularly cleared and reset to its default state.
I'm sure most of the time is spent creating the {{capacity}} buckets (buffers),
but we can save one buffer initialization by allocating the {{buckets}} array
only when the requested capacity differs from the current one (which is not the
case with a {{clear()}}).
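
A minimal sketch of the idea, not the actual patch. {{IntBucket}} is a hypothetical stand-in for {{DynamicIntArray}}, and the class name {{RecyclingBuckets}} and the {{arrayAllocations}} counter are assumptions added only to make the effect observable. The outer array is reused when the capacity is unchanged; the individual buckets are still recreated, as in the current code.

```java
import java.util.Arrays;

// Hypothetical stand-in for the dictionary; not the real StringHashTableDictionary.
class RecyclingBuckets {

  // Hypothetical stand-in for DynamicIntArray: a small growable int buffer.
  static final class IntBucket {
    private int[] data = new int[4];
    private int size;

    void add(int value) {
      if (size == data.length) {
        data = Arrays.copyOf(data, size * 2);
      }
      data[size++] = value;
    }

    int size() {
      return size;
    }
  }

  IntBucket[] hashBuckets;
  // Counts outer-array allocations (for illustration only).
  int arrayAllocations = 0;

  void initHashBuckets(int capacity) {
    // Allocate the outer array only when none exists yet or the capacity changed.
    if (hashBuckets == null || hashBuckets.length != capacity) {
      hashBuckets = new IntBucket[capacity];
      arrayAllocations++;
    }
    // Each slot still gets a fresh, empty bucket, so old entries are dropped.
    for (int i = 0; i < capacity; i++) {
      hashBuckets[i] = new IntBucket();
    }
  }

  // Mirrors a clear(): reset to the same capacity, which reuses the outer array.
  void clear() {
    initHashBuckets(hashBuckets.length);
  }
}
```

With this shape, a {{clear()}} following an initial {{initHashBuckets(n)}} would keep the same outer array and only replace its contents.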
--
This message was sent by Atlassian Jira
(v8.3.4#803005)