https://bz.apache.org/bugzilla/show_bug.cgi?id=69820

            Bug ID: 69820
           Summary: Trivial performance improvement to ParameterMap.<init>
           Product: Tomcat 9
           Version: 9.0.98
          Hardware: All
                OS: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Catalina
          Assignee: [email protected]
          Reporter: [email protected]
  Target Milestone: -----

Created attachment 40102
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=40102&action=edit
JMH test showing impact of HashMap.<init>(Map)

Our large, latency-sensitive application shows that ParameterMap.<init>
accounts for 0.25% of our CPU (pretty big) and is on the latency-critical
path.  Previous efforts to optimize it had some success, particularly the
creation of a specialized ParameterMap.<init>(ParameterMap), which is distinct
from the typical ParameterMap.<init>(Map).  Both constructors rely on the core
call "new LinkedHashMap<>(Map)".

Profiling on a different application revealed that HashMap.<init>(Map) and
similar constructors (including LinkedHashMap's) are slow because the calls
inside them are polymorphic, and the JIT cannot optimize away the virtual
method calls within: Map.entrySet(), Set.iterator(), Iterator.hasNext(), and
Iterator.next().  This can be easily worked around on a case-by-case basis by
writing out the copy loop inside the caller method, in this case the two
ParameterMap constructors.  Note that correct sizing is important.  Example:

Before:
    public ParameterMap(Map<K,V> map) {
        delegatedMap = new LinkedHashMap<>(map);
        unmodifiableDelegatedMap = Collections.unmodifiableMap(delegatedMap);
    }


After:
    public ParameterMap(Map<K,V> map) {
        // Pre-size the new map to limit resizing while the entries are copied.
        int mapSize = map.size();
        delegatedMap = new LinkedHashMap<>((int) (mapSize * 1.5));
        for (Map.Entry<K, V> entry : map.entrySet()) {
            delegatedMap.put(entry.getKey(), entry.getValue());
        }
        unmodifiableDelegatedMap = Collections.unmodifiableMap(delegatedMap);
    }
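
To illustrate the sizing note above, here is a minimal sketch, not Tomcat code.
The class MapCopySketch and the helpers capacityFor and copyOf are hypothetical
names, and 0.75 is the default load factor documented for HashMap and
LinkedHashMap.  It derives an initial capacity from the expected entry count
and then copies entry by entry, as in the After snippet:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not part of Tomcat.
class MapCopySketch {

    // Initial capacity large enough that `expectedSize` entries stay at or
    // below the 0.75 default load factor (assumption: default load factor).
    static int capacityFor(int expectedSize) {
        return (int) Math.ceil(expectedSize / 0.75);
    }

    // Copies `source` entry by entry, so the iteration happens in this
    // method rather than inside the LinkedHashMap(Map) constructor.
    static <K, V> Map<K, V> copyOf(Map<K, V> source) {
        Map<K, V> copy = new LinkedHashMap<>(capacityFor(source.size()));
        for (Map.Entry<K, V> entry : source.entrySet()) {
            copy.put(entry.getKey(), entry.getValue());
        }
        return copy;
    }

    public static void main(String[] args) {
        Map<String, String[]> params = new LinkedHashMap<>();
        params.put("b", new String[] {"2"});
        params.put("a", new String[] {"1"});

        Map<String, String[]> copy = copyOf(params);
        System.out.println(copy.keySet());        // prints [b, a]
        System.out.println(copy.equals(params));  // prints true
    }
}
```

The copy keeps insertion order and holds the same value references as the
source, which is the same observable result as new LinkedHashMap<>(map).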


There is an existing performance benchmark for these constructors,
TestParameterMapPerformance, and running it (unchanged) before-and-after shows
output similar to this when run on my machine:

Before:
Done with standard in 2553ms
Done with optimized in 2156ms

After:
Done with standard in 2140ms
Done with optimized in 1338ms

To summarize: 16% improvement on the standard constructor and 38% improvement
on the optimized constructor, with the only downside being a few extra lines of
code (and probably an explanatory comment).

I've attached a JMH test that demonstrates the wider problem.

And sorry, I know I can implement this myself, but I'm going to be busy
applying it internally and probably can't get to it for a couple months.  I'll
attack it later if no one else gets to it first.
