[ http://issues.apache.org/jira/browse/LUCENE-460?page=comments#action_12356325 ]

Yonik Seeley commented on LUCENE-460:
-------------------------------------

A couple of guidelines off the top of my head...
- hash codes should strive to be unique across the whole Query hierarchy, not
  just unique within one specific subclass.  For example, TermQuery(t) and
  SpanTermQuery(t) will generate exactly the same hash code.
- mix bits between different components that have any hashCode parts in
  common...
  For example, RangeQuery will produce the same hashCode whenever
  lowerTerm==upperTerm.
  Also, field[x TO y] will produce the same hashCode for *any* field, since the
  field name parts of the two terms will always cancel each other out.  This
  will also cause the hashCode of field{x TO x} to equal that of field:x.
  The hashCode of FilteredQuery will also cause many collisions because the
  bits aren't mixed in between the query and the filter.
  Remember that every query has a boost component... never just xor two query
  hashCodes together.
- make things position dependent.
  Currently, field[x TO y] will produce the same hashCode as field[y TO x]...
  not particularly important for RangeQuery, but you get the idea.
- don't be afraid of using "+" instead of "^".  They both take a single CPU
  cycle, but "+" is not quite so easily (accidentally) reversed: x ^ x is
  always 0, while x + x generally is not.
- flipping more than a single bit when hashing a boolean might be a good idea;
  it will make collisions less likely.
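
The guidelines above can be sketched in plain Java.  This is only a minimal
illustration, not the Lucene implementation: the class QueryHashSketch, its
methods and the seed constants are all hypothetical, and queries are reduced
to their field and term strings.  The naive* methods xor component hashes
together and show the collisions described above; the mixed* methods use a
per-class seed, multiply-and-add mixing, position dependence, a multi-bit
boolean and the boost.

```java
// Hypothetical sketch; none of these names come from Lucene itself.
class QueryHashSketch {

    // Naive scheme: xor the component hash codes together.
    static int naiveTerm(String field, String text) {
        return field.hashCode() ^ text.hashCode();
    }

    static int naiveRange(String field, String lower, String upper) {
        // The field hash appears twice and cancels itself out under xor.
        return naiveTerm(field, lower) ^ naiveTerm(field, upper);
    }

    // Arbitrary per-class seeds so that different Query types with the
    // same contents do not share a hash code (the values are made up).
    static final int TERM_SEED = 0x1B873593;
    static final int SPAN_TERM_SEED = 0x4CF5AD43;
    static final int RANGE_SEED = 0x2545F491;

    // Fold one component into the running hash.  The multiply makes the
    // result depend on the order in which components are added.
    static int add(int h, int component) {
        return 31 * h + component;
    }

    static int mixedTerm(int seed, String field, String text, float boost) {
        int h = seed;
        h = add(h, field.hashCode());
        h = add(h, text.hashCode());
        return add(h, Float.floatToIntBits(boost));
    }

    static int mixedRange(String field, String lower, String upper,
                          boolean inclusive, float boost) {
        int h = RANGE_SEED;
        h = add(h, field.hashCode());
        h = add(h, lower.hashCode());
        h = add(h, upper.hashCode());
        // Two unrelated constants differ in many bits, not just one.
        h = add(h, inclusive ? 0x7A3C19E5 : 0x35D2B84F);
        return add(h, Float.floatToIntBits(boost));
    }

    public static void main(String[] args) {
        // Naive: a different field and a swapped range still collide.
        System.out.println(
            naiveRange("title", "a", "b") == naiveRange("body", "b", "a"));
        // Naive: lower == upper always hashes to 0.
        System.out.println(naiveRange("title", "x", "x") == 0);
        // Mixed: the field and the term order both change the result.
        System.out.println(
            mixedRange("title", "a", "b", true, 1.0f)
                != mixedRange("body", "a", "b", true, 1.0f));
        System.out.println(
            mixedRange("title", "a", "b", true, 1.0f)
                != mixedRange("title", "b", "a", true, 1.0f));
    }
}
```

The multiplier 31 is just the conventional choice from String.hashCode; any
odd multiplier keeps each step reversible, so distinct inputs to a single
step never collide.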

http://www.concentric.net/~Ttwang/tech/inthash.htm is an interesting link on
integer hash codes (which is in effect what we are doing when we combine
multiple hash codes).  Especially interesting is the section "Parallel
Operations".
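
Since combining hash codes is a form of integer hashing, a bit-mixing step
can be applied between components.  The following is a rough sketch under
stated assumptions: it uses the finalizer from MurmurHash3 (32-bit) as a
stand-in rather than any function from the linked page, and HashMixSketch,
mix and combine are hypothetical names.

```java
// Hypothetical sketch of mixing bits between two component hash codes.
class HashMixSketch {

    // MurmurHash3 32-bit finalizer, used here as a stand-in integer hash.
    // Each step is reversible, so distinct inputs give distinct outputs.
    static int mix(int h) {
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }

    // Combine two hash codes: scramble the first, add the second, then
    // scramble again so that the bits of both components are mixed.
    static int combine(int a, int b) {
        return mix(mix(a) + b);
    }

    public static void main(String[] args) {
        // Stand-ins for the hash codes of a query and a filter.
        int queryHash = "title:lucene".hashCode();
        int filterHash = "date-filter".hashCode();
        System.out.println(Integer.toHexString(combine(queryHash, filterHash)));
        System.out.println(Integer.toHexString(combine(filterHash, queryHash)));
    }
}
```

Something like combine could fold together the query and filter hash codes
of a FilteredQuery, so that the two are mixed rather than merely xor'ed.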

> hashCode improvements
> ---------------------
>
>          Key: LUCENE-460
>          URL: http://issues.apache.org/jira/browse/LUCENE-460
>      Project: Lucene - Java
>         Type: Improvement
>   Components: Search
>     Versions: CVS Nightly - Specify date in submission
>     Reporter: Yonik Seeley
>     Priority: Minor
>      Fix For: CVS Nightly - Specify date in submission
>
> It would be nice for all Query classes to implement hashCode and equals to 
> enable them to be used as keys when caching.


