[ 
https://issues.apache.org/jira/browse/DERBY-3981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12661099#action_12661099
 ] 

Kristian Waagan commented on DERBY-3981:
----------------------------------------

The patch looks like a good improvement to me.
+1 to commit.

A few questions:
 1) The hashcode isn't saved. Is this because of the reuse of the SQLChar
    object?
 2) Is the hashCode method invoked as part of normal Derby usage for Clobs?
    I'm asking because SQLClob inherits the hashCode method of SQLChar, which
    causes the value to be materialized.
 3) Not a question, but we are overriding both equals and hashCode. For SQLChar
    I believe the relationship between the two methods is correct (i.e. two
    equal objects must have the same hashcode, pad characters are "ignored" in
    both methods).
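
To make point 3 concrete, here is a minimal sketch of the equals/hashCode
relationship when pad characters are ignored. It is not Derby's code: the
class PaddedString and its helper trimmedLength are hypothetical names, and
I am assuming that "pad characters" means trailing blanks.

```java
// Hypothetical illustration only; not the actual SQLChar implementation.
class PaddedString {
    private final String value;

    PaddedString(String value) {
        this.value = value;
    }

    // Length of the value once trailing blanks (the assumed pad character)
    // are dropped.
    private int trimmedLength() {
        int end = value.length();
        while (end > 0 && value.charAt(end - 1) == ' ') {
            end--;
        }
        return end;
    }

    // Two values are equal if they match up to their trailing blanks.
    @Override
    public boolean equals(Object o) {
        if (!(o instanceof PaddedString)) {
            return false;
        }
        PaddedString other = (PaddedString) o;
        int n = trimmedLength();
        return n == other.trimmedLength()
                && value.regionMatches(0, other.value, 0, n);
    }

    // The hash is computed over the same trimmed range that equals compares,
    // so equal objects always get the same hash code.
    @Override
    public int hashCode() {
        int h = 0;
        int n = trimmedLength();
        for (int i = 0; i < n; i++) {
            h = 31 * h + value.charAt(i);
        }
        return h;
    }
}
```

Because both methods stop at the same trimmed length, "abc" and "abc   "
compare equal and hash to the same value, which is the property a hash
table relies on.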

> Improve distribution of hash codes in SQLBinary and SQLChar
> -----------------------------------------------------------
>
>                 Key: DERBY-3981
>                 URL: https://issues.apache.org/jira/browse/DERBY-3981
>             Project: Derby
>          Issue Type: Improvement
>          Components: Newcomer, Performance, SQL
>    Affects Versions: 10.4.2.0
>            Reporter: Knut Anders Hatlen
>            Assignee: Knut Anders Hatlen
>            Priority: Minor
>         Attachments: d3981.diff, distinct-test.diff
>
>
> SQLBinary.hashCode() and SQLChar.hashCode() use a very simple algorithm that 
> just takes the sum of the values in the array. This gives a poor distribution 
> of hash values because similar values will have a higher probability of 
> mapping to the same hash code, and the higher bits won't be used unless the 
> array is very long. We should change these methods so that they use an 
> algorithm similar to the one used in java.lang.String.hashCode(), described 
> here: 
> <URL:http://java.sun.com/javase/6/docs/api/java/lang/String.html#hashCode()>. 
> This may have a positive effect on the performance of hash scans as it will 
> reduce the likelihood of collisions in the hash table.
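
For readers of the archive, the difference between the two approaches in the
description can be sketched as follows. This is an illustration under stated
assumptions, not the code from d3981.diff: the class HashSketch and the
methods sumHash and polyHash are hypothetical names, and the multiplier 31 is
taken from the documented java.lang.String.hashCode() formula.

```java
// Hypothetical illustration only; not the code from the attached patch.
class HashSketch {
    // Additive hash as described in the issue: the sum of the array values.
    // Any permutation of the same bytes yields the same result.
    static int sumHash(byte[] data) {
        int h = 0;
        for (byte b : data) {
            h += b;
        }
        return h;
    }

    // Polynomial hash in the style of String.hashCode():
    // data[0]*31^(n-1) + data[1]*31^(n-2) + ... + data[n-1]
    // The position of each byte affects the result.
    static int polyHash(byte[] data) {
        int h = 0;
        for (byte b : data) {
            h = 31 * h + b;
        }
        return h;
    }

    public static void main(String[] args) {
        byte[] a = {1, 2, 3};
        byte[] b = {3, 2, 1};
        System.out.println(sumHash(a) + " " + sumHash(b));   // 6 6
        System.out.println(polyHash(a) + " " + polyHash(b)); // 1026 2946
    }
}
```

The two arrays collide under the additive hash but not under the polynomial
one, and the polynomial form also reaches the higher bits after only a few
elements.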

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
