Recent changes that added automatic filter caching to IndexSearcher
uncovered some traps with using our queries as cache keys. The problem
is that some of our main queries are mutable, and modifying one while
it is used as a cache key makes the entry it keys unreachable (because
its hash code changes too) while the entry still consumes memory.
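
To make the trap concrete, here is a minimal sketch using a plain
HashMap in place of the filter cache. MutablePhrase and MutableKeyTrap
are made-up stand-ins for illustration, not Lucene classes, and the
real cache is more involved than a HashMap:

```java
import java.util.HashMap;
import java.util.Map;

final class MutableKeyTrap {
  public static void main(String[] args) {
    // Hypothetical stand-in for the filter cache: query -> cached result.
    Map<MutablePhrase, String> cache = new HashMap<>();
    MutablePhrase query = new MutablePhrase("lucene");
    cache.put(query, "cached doc id set");

    System.out.println(cache.containsKey(query)); // true

    query.setSlop(1); // the hash code changes while the query is a cache key

    System.out.println(cache.containsKey(query)); // false: lookup uses the new hash
    System.out.println(cache.size());             // 1: the entry still uses memory
  }
}

// Hypothetical mutable query; not a Lucene class.
final class MutablePhrase {
  private final String term;
  private int slop; // mutable state that takes part in equals()/hashCode()

  MutablePhrase(String term) {
    this.term = term;
  }

  void setSlop(int slop) {
    this.slop = slop;
  }

  @Override
  public boolean equals(Object o) {
    if (!(o instanceof MutablePhrase)) return false;
    MutablePhrase other = (MutablePhrase) o;
    return term.equals(other.term) && slop == other.slop;
  }

  @Override
  public int hashCode() {
    return 31 * term.hashCode() + slop;
  }
}
```

The entry can no longer be found through the key that created it, yet
the map keeps holding on to it.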

While I think most users would be unaffected, as it is rather uncommon
to modify queries after having passed them to IndexSearcher, I would
like to remove this trap by making queries immutable: everything
would be set at construction time except the boost parameter, which
could still be changed with the same clone()/setBoost() mechanism as
today.

First I would like to make sure that it sounds good to everyone and
then to discuss what the API should look like. Most of our queries
happen to be immutable already (NumericRangeQuery, TermsQuery,
SpanNearQuery, etc.) but some aren't. The main exceptions are:
 - BooleanQuery,
 - DisjunctionMaxQuery,
 - PhraseQuery,
 - MultiPhraseQuery.

We could take all parameters that are currently set through setters and
move them to constructor arguments. For BooleanQuery and
DisjunctionMaxQuery, this would mean (using varargs for ease of use):

  BooleanQuery(boolean disableCoord, int minShouldMatch,
    BooleanClause... clauses)
  DisjunctionMaxQuery(float tieBreakMul, Query... clauses)
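
As a rough sketch of how such a varargs constructor could keep the
object immutable, here is a simplified version of the DisjunctionMaxQuery
signature above. ImmutableDisMaxSketch is a hypothetical name, and
clauses are plain Strings instead of Query instances so that the
example compiles without Lucene:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch, not the Lucene class.
final class ImmutableDisMaxSketch {
  private final float tieBreakMul;
  private final List<String> clauses;

  ImmutableDisMaxSketch(float tieBreakMul, String... clauses) {
    this.tieBreakMul = tieBreakMul;
    // Copy the varargs array: the caller may keep a reference to it and
    // write to it later.
    this.clauses = Collections.unmodifiableList(Arrays.asList(clauses.clone()));
  }

  float tieBreakMul() {
    return tieBreakMul;
  }

  List<String> clauses() {
    return clauses; // read-only view
  }

  @Override
  public boolean equals(Object o) {
    if (!(o instanceof ImmutableDisMaxSketch)) return false;
    ImmutableDisMaxSketch other = (ImmutableDisMaxSketch) o;
    return Float.floatToIntBits(tieBreakMul) == Float.floatToIntBits(other.tieBreakMul)
        && clauses.equals(other.clauses);
  }

  @Override
  public int hashCode() {
    return 31 * Float.floatToIntBits(tieBreakMul) + clauses.hashCode();
  }

  public static void main(String[] args) {
    String[] terms = {"title:lucene", "body:lucene"};
    ImmutableDisMaxSketch query = new ImmutableDisMaxSketch(0.1f, terms);
    int hashBefore = query.hashCode();

    terms[0] = "title:solr"; // does not reach the query's private copy

    System.out.println(hashBefore == query.hashCode()); // true
    System.out.println(query.clauses().get(0));         // title:lucene
  }
}
```

The defensive copy matters with varargs: a caller that passes an
existing array could otherwise still change the clauses, and thus the
hash code, after construction.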

For PhraseQuery and MultiPhraseQuery, the closest to what we have
today would require adding new classes to wrap terms and positions
together, for instance:

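class TermAndPosition {
  public final BytesRef term;
  public final int position;
  public TermAndPosition(BytesRef term, int position) {
    this.term = term;
    this.position = position;
  }
}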

so that, e.g., PhraseQuery would look like:

  PhraseQuery(int slop, String field, TermAndPosition... terms)

MultiPhraseQuery would look the same, except that several terms could
share the same position.
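
Here is a small sketch of how that could fit together, including the
multi-phrase case. PhraseSketch, TermAtPosition and termsAt() are
hypothetical names for illustration, and terms are Strings rather than
BytesRef so that the example has no Lucene dependency:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch, not the Lucene class.
final class PhraseSketch {

  // Pairs a term with its position in the phrase.
  static final class TermAtPosition {
    final String term;
    final int position;

    TermAtPosition(String term, int position) {
      this.term = term;
      this.position = position;
    }
  }

  private final int slop;
  private final String field;
  private final List<TermAtPosition> terms;

  PhraseSketch(int slop, String field, TermAtPosition... terms) {
    this.slop = slop;
    this.field = field;
    // Private copy, so the caller's array cannot change the query later.
    this.terms = Collections.unmodifiableList(Arrays.asList(terms.clone()));
  }

  int slop() {
    return slop;
  }

  String field() {
    return field;
  }

  /** Returns the terms at the given position; more than one for a multi-phrase. */
  List<String> termsAt(int position) {
    List<String> result = new ArrayList<>();
    for (TermAtPosition t : terms) {
      if (t.position == position) result.add(t.term);
    }
    return result;
  }

  public static void main(String[] args) {
    // "quick (fox | dog)": two alternatives share position 1.
    PhraseSketch query = new PhraseSketch(0, "body",
        new TermAtPosition("quick", 0),
        new TermAtPosition("fox", 1),
        new TermAtPosition("dog", 1));
    System.out.println(query.termsAt(0)); // [quick]
    System.out.println(query.termsAt(1)); // [fox, dog]
  }
}
```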

Comments/ideas/concerns are highly welcome.

-- 
Adrien
