https://bz.apache.org/bugzilla/show_bug.cgi?id=69820
Bug ID: 69820
Summary: Trivial performance improvement to ParameterMap.<init>
Product: Tomcat 9
Version: 9.0.98
Hardware: All
OS: All
Status: NEW
Severity: normal
Priority: P2
Component: Catalina
Assignee: [email protected]
Reporter: [email protected]
Target Milestone: -----
Created attachment 40102
--> https://bz.apache.org/bugzilla/attachment.cgi?id=40102&action=edit
JMH test showing impact of HashMap.<init>(Map)
Our large, latency-sensitive application shows that ParameterMap.<init>
accounts for 0.25% of our CPU (pretty big) and is on the latency-critical
path. Previous efforts to optimize it had some success, particularly the
creation of a specialized ParameterMap.<init>(ParameterMap), which is distinct
from the typical ParameterMap.<init>(Map). Both constructors rely on the core
function "new LinkedHashMap<>(Map)".
Profiling on a different application revealed that HashMap.<init>(Map) and
similar constructors (including LinkedHashMap's) are slow because their
internal call sites are polymorphic, and the JIT cannot optimize away the
virtual method calls within: Map.entrySet(), Set.iterator(),
Iterator.hasNext(), and Iterator.next(). This can be easily worked around on a
case-by-case basis by unrolling the loop inside the caller method, in this
case the two ParameterMap constructors. Note that correct sizing is important.
Example:
Before:

    public ParameterMap(Map<K,V> map) {
        delegatedMap = new LinkedHashMap<>(map);
        unmodifiableDelegatedMap = Collections.unmodifiableMap(delegatedMap);
    }

After:

    public ParameterMap(ParameterMap<K,V> map) {
        int mapSize = map.size();
        delegatedMap = new LinkedHashMap<>((int) (mapSize * 1.5));
        for (Map.Entry<K, V> entry : map.entrySet()) {
            delegatedMap.put(entry.getKey(), entry.getValue());
        }
        unmodifiableDelegatedMap = Collections.unmodifiableMap(delegatedMap);
    }
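
For anyone who wants to try the pattern outside Tomcat, here is a minimal
standalone sketch. The class PresizedCopySketch and the helper capacityFor are
hypothetical names, not part of Tomcat. The helper assumes the default 0.75
load factor and is only one possible reading of "correct sizing"; the snippet
above uses a simpler "* 1.5" factor instead.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PresizedCopySketch {

    // Hypothetical helper: smallest initial capacity that holds `size`
    // entries without a resize, assuming the default 0.75 load factor.
    static int capacityFor(int size) {
        return (int) Math.ceil(size / 0.75);
    }

    // Copy through the JDK copy constructor (the "before" shape).
    static <K, V> Map<K, V> copyWithConstructor(Map<K, V> source) {
        return new LinkedHashMap<>(source);
    }

    // Copy with the loop written out in the caller (the "after" shape).
    static <K, V> Map<K, V> copyWithLoop(Map<K, V> source) {
        Map<K, V> copy = new LinkedHashMap<>(capacityFor(source.size()));
        for (Map.Entry<K, V> entry : source.entrySet()) {
            copy.put(entry.getKey(), entry.getValue());
        }
        return copy;
    }

    public static void main(String[] args) {
        Map<String, String[]> params = new LinkedHashMap<>();
        params.put("b", new String[] {"2"});
        params.put("a", new String[] {"1"});

        Map<String, String[]> viaConstructor = copyWithConstructor(params);
        Map<String, String[]> viaLoop = copyWithLoop(params);

        // Both copies hold the same value references, so this prints true.
        System.out.println(viaConstructor.equals(viaLoop));
        // Insertion order is preserved by the loop copy: prints [b, a]
        System.out.println(viaLoop.keySet());
    }
}
```

Both methods return an equal map with the same iteration order; the only
intended difference is where the entry-copy loop lives, which is what the
proposed change relies on.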
There is an existing performance benchmark for these constructors,
TestParameterMapPerformance, and running it (unchanged) before-and-after shows
output similar to this when run on my machine:
Before:
Done with standard in 2553ms
Done with optimized in 2156ms
After:
Done with standard in 2140ms
Done with optimized in 1338ms
To summarize: 16% improvement on the standard constructor and 38% improvement
on the optimized constructor, with the only downside being a few extra lines of
code (and probably an explanatory comment).
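
The percentages follow from the timings above as (before - after) / before. A
small sketch of that arithmetic, using a hypothetical class and method name:

```java
import java.util.Locale;

public class ImprovementSketch {

    // Percentage reduction in elapsed time going from beforeMs to afterMs.
    static double percentFaster(long beforeMs, long afterMs) {
        return 100.0 * (beforeMs - afterMs) / beforeMs;
    }

    public static void main(String[] args) {
        // Timings copied from the benchmark output quoted in this report.
        System.out.println(String.format(Locale.ROOT,
                "standard:  %.1f%%", percentFaster(2553, 2140))); // standard:  16.2%
        System.out.println(String.format(Locale.ROOT,
                "optimized: %.1f%%", percentFaster(2156, 1338))); // optimized: 37.9%
    }
}
```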
I've attached a JMH test that demonstrates the wider problem.
And sorry, I know I can implement this myself, but I'm going to be busy
applying it internally and probably can't get to it for a couple of months.
I'll attack it later if no one else gets to it first.