[
http://issues.apache.org/jira/browse/LUCENE-460?page=comments#action_12356325 ]
Yonik Seeley commented on LUCENE-460:
-------------------------------------
A couple of guidelines off the top of my head...
- hash codes should strive to be unique across the Query hierarchy, not just
unique within one specific subclass. For example, TermQuery(t) and
SpanTermQuery(t) will generate the exact same hash codes.
- mix bits between different components whose hashCodes have any parts in
common...
For example, RangeQuery will produce the same hashCode whenever
lowerTerm==upperTerm.
Also, field[x TO y] will produce the same hashCode for *any* field, since the
fieldname parts of the two terms will always cancel each other out. This will
also cause the hashCode of field{x TO x} to equal that of field:x.
The hashCode of FilteredQuery will also cause many collisions because the
bits aren't mixed between the query and the filter.
Remember that every query has a boost component, and equal boosts would
cancel... never just xor two query hashCodes together.
- make things position dependent.
Currently, field[x TO y] will produce the same hashCode as field[y TO x]...
not particularly important for RangeQuery, but you get the idea.
- don't be afraid of using "+" instead of "^". They both take a single CPU
cycle, but "+" is not quite so easily (accidentally) reversed.
- flipping more than a single bit when hashing a boolean might be a good idea -
it will make collisions harder.
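A minimal sketch of the cancellation problems listed above. It assumes a
hypothetical term hash that xors the field and text hashes; this is not
Lucene's actual code, and the class and method names are made up for
illustration. The "mixed" variant rotates between components, uses "+", and
encodes the boolean with two multi-bit constants (chosen arbitrarily).

```java
final class HashMixDemo {
    // Hypothetical term hash (assumption, not Lucene's implementation):
    // xor of the field-name hash and the text hash.
    static int termHash(String field, String text) {
        return field.hashCode() ^ text.hashCode();
    }

    // Naive range hash: xor of the two endpoint hashes. The field part
    // appears twice and cancels, and the result ignores endpoint order.
    static int naiveRangeHash(String field, String lower, String upper) {
        return termHash(field, lower) ^ termHash(field, upper);
    }

    // Mixed range hash: rotate the running value before adding each
    // component, so shared parts do not cancel and position matters.
    static int mixedRangeHash(String field, String lower, String upper,
                              boolean inclusive) {
        int h = field.hashCode();
        h = Integer.rotateLeft(h, 5) + lower.hashCode();
        h = Integer.rotateLeft(h, 5) + upper.hashCode();
        // Arbitrary constants: the boolean changes many bits, not just one.
        h = Integer.rotateLeft(h, 5) + (inclusive ? 0x2742e74a : 0x61a3f85b);
        return h;
    }

    public static void main(String[] args) {
        // lower == upper: the naive hash collapses to 0 for every field.
        System.out.println(naiveRangeHash("title", "x", "x") == 0);
        // The field name has no effect on the naive hash.
        System.out.println(naiveRangeHash("title", "a", "b")
                == naiveRangeHash("body", "a", "b"));
        // Swapping the endpoints has no effect on the naive hash.
        System.out.println(naiveRangeHash("title", "a", "b")
                == naiveRangeHash("title", "b", "a"));
        // The mixed hash separates the same two cases.
        System.out.println(mixedRangeHash("title", "a", "b", true)
                != mixedRangeHash("body", "a", "b", true));
        System.out.println(mixedRangeHash("title", "a", "b", true)
                != mixedRangeHash("title", "b", "a", true));
    }
}
```

Rotation and addition are each reversible on their own, so two inputs that
differ only in the first component can never collide in the mixed version.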
http://www.concentric.net/~Ttwang/tech/inthash.htm is an interesting link on
integer hash codes (what we are in effect doing when we combine multiple hash
codes). Especially interesting is the section "Parallel Operations".
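As a rough illustration of treating hash combination as integer hashing, the
sketch below passes the running value through a bit mixer before adding each
new component. The mixer is the 32-bit finalizer from MurmurHash3, swapped in
here as a well-known example; it is not taken from the page linked above. The
salt constants, class name, and method names are hypothetical.

```java
final class QueryHashSketch {
    // Hypothetical per-class salts (arbitrary values) so that two query
    // types wrapping the same term start from different states.
    static final int TERM_QUERY_SALT = 0x3c6ef372;
    static final int SPAN_TERM_QUERY_SALT = 0x1f83d9ab;

    // MurmurHash3 32-bit finalizer: an invertible mixer in which every
    // input bit can influence every output bit.
    static int mix(int h) {
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }

    // Fold the next component into the running hash: mix first, then add.
    static int combine(int running, int next) {
        return mix(running) + next;
    }

    // Hypothetical hash for a term-style query: salt, field, text, boost.
    static int queryHash(int typeSalt, String field, String text, float boost) {
        int h = typeSalt;
        h = combine(h, field.hashCode());
        h = combine(h, text.hashCode());
        h = combine(h, Float.floatToIntBits(boost));
        return h;
    }

    public static void main(String[] args) {
        int a = queryHash(TERM_QUERY_SALT, "field", "x", 1.0f);
        int b = queryHash(SPAN_TERM_QUERY_SALT, "field", "x", 1.0f);
        // Same term and boost, different query type: the hashes differ.
        System.out.println(a != b);
    }
}
```

Because mix and "+" are both reversible, inputs that differ only in the salt
always produce different results here; differences in later components are
only made unlikely to collide, not impossible.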
> hashCode improvements
> ---------------------
>
> Key: LUCENE-460
> URL: http://issues.apache.org/jira/browse/LUCENE-460
> Project: Lucene - Java
> Type: Improvement
> Components: Search
> Versions: CVS Nightly - Specify date in submission
> Reporter: Yonik Seeley
> Priority: Minor
> Fix For: CVS Nightly - Specify date in submission
>
> It would be nice for all Query classes to implement hashCode and equals to
> enable them to be used as keys when caching.