[
https://issues.apache.org/jira/browse/DERBY-3882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Knut Anders Hatlen updated DERBY-3882:
--------------------------------------
Attachment: Cursors.java
check_hash.diff
Here's a patch which implements the simple approach using String.hashCode(). I
suggest that we go for that approach for now, since changing the protocol would
be a bigger and riskier task. It seems to be sufficient to get the method off
the profiler's list of big CPU consumers in my environment, and we can always
revisit the issue and change the network protocol later if someone has a
workload for which the simple fix is not sufficient.
The patch makes GenericLanguageConnectionContext.lookupCursorActivation() check
the hash codes of the two strings before calling String.equals(), and it skips
equals() if the hash codes are different, as strings with different hash codes
are never equal. This exploits the fact that the most common implementations of
java.lang.String cache the hash code, so that computing and comparing the hash
codes will be reduced to a simple comparison of two integer fields after
warm-up.
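The check described above can be sketched roughly as follows. This is only an
illustration, not the code in check_hash.diff: the class CursorNameLookup, its
methods sameName() and indexOf(), and the use of a plain list of names are
hypothetical stand-ins for whatever lookupCursorActivation() actually iterates
over.

```java
import java.util.List;

// Hypothetical illustration; not Derby's actual implementation.
final class CursorNameLookup {
    private CursorNameLookup() {}

    /** Returns true only if the two names are equal. */
    static boolean sameName(String a, String b) {
        // Different hash codes imply different strings, so equals() is skipped.
        // Equal hash codes prove nothing (collisions exist), so equals() still
        // makes the final decision in that case.
        return a.hashCode() == b.hashCode() && a.equals(b);
    }

    /** Returns the index of the first entry equal to cursorName, or -1. */
    static int indexOf(List<String> openCursorNames, String cursorName) {
        for (int i = 0; i < openCursorNames.size(); i++) {
            String candidate = openCursorNames.get(i);
            // Entries without a cursor name are modelled as null here.
            if (candidate != null && sameName(candidate, cursorName)) {
                return i;
            }
        }
        return -1;
    }
}
```

Note that the hash comparison is only a filter: "Aa" and "BB" have the same
hash code but are not equal, so the equals() call cannot be dropped.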
I've also attached a small test class (Cursors.java) to show the effect of the
patch. It repeatedly executes "VALUES 1" in an embedded connection with 50 open
statements, and each statement has a cursor name. "VALUES 1" is executed 2
million times for warm-up and then 2 million times again with the time being
recorded. Running the test 10 times with trunk and 10 times with the patch (on
OpenSolaris, Java version 1.6.0_15), the patched version completed in about 30%
less time on average. Average/min/max time in seconds for the runs is shown
below.
ij> select name, avg(tps) "AVG", min(tps) "MIN", max(tps) "MAX" from results
group by name;
NAME      |AVG       |MIN       |MAX
------------------------------------------
d3882     |9.0245    |8.515     |9.867
trunk     |12.968401 |11.732    |14.372

2 rows selected
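As a quick check of the figure quoted above, the relative reduction can be
computed from the two averages in the table. The class TimingSummary and its
method percentReduction() are hypothetical helpers written for this note; only
the two input values come from the results.

```java
// Hypothetical helper; only the two averages are taken from the results table.
final class TimingSummary {
    private TimingSummary() {}

    /** Percentage by which patched is shorter than baseline. */
    static double percentReduction(double baselineSeconds, double patchedSeconds) {
        return 100.0 * (baselineSeconds - patchedSeconds) / baselineSeconds;
    }

    public static void main(String[] args) {
        // trunk average vs. d3882 (patched) average, in seconds
        System.out.println(percentReduction(12.968401, 9.0245));
    }
}
```

With these inputs the reduction comes out at roughly 30.4%, in line with the
approximate 30% stated above.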
All the regression tests ran cleanly with the patch.
> Expensive cursor name lookup in network server
> ----------------------------------------------
>
> Key: DERBY-3882
> URL: https://issues.apache.org/jira/browse/DERBY-3882
> Project: Derby
> Issue Type: Improvement
> Components: Network Server, SQL
> Affects Versions: 10.4.2.0
> Reporter: Knut Anders Hatlen
> Assignee: Knut Anders Hatlen
> Priority: Minor
> Attachments: check_hash.diff, Cursors.java
>
>
> I have sometimes seen in a profiler that an unreasonably large share of the
> CPU time is spent in
> GenericLanguageConnectionContext.lookupCursorActivation() when the network
> server is running. That method is used to check that there is no active
> statement in the current transaction with the same cursor name as the
> statement currently being executed, and it is normally only used if the
> executing statement has a cursor name. None of the client-side statements had
> a cursor name when I saw this.
> The method is always called when the network server executes a statement
> because the network server assigns a cursor name to each statement even if no
> cursor name has been set on the client side. If the list of open statements
> is short, the method is relatively cheap. If one uses
> ClientConnectionPoolDataSource with the JDBC statement cache, the list of
> open statements can however be quite long, and lookupCursorActivation() needs
> to spend a fair amount of time iterating over the list and comparing strings.
> The time spent looking for duplicate names in lookupCursorActivation() is
> actually wasted time when it is called from the network server, since the
> network server assigns unique names to the statements it executes, even when
> there are duplicate names on the client. It would be good if we could reduce
> the cost of this operation, or perhaps eliminate it completely when the
> client doesn't use cursor names.