Recent changes that added automatic filter caching to IndexSearcher uncovered some traps with our queries when they are used as cache keys. The problem is that some of our main queries are mutable, and modifying them while they are in use as cache keys makes the cached entry unreachable (because the hash code changes too) even though it still uses memory.
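To illustrate the trap, here is a minimal sketch that uses a plain java.util.HashMap rather than the actual IndexSearcher cache. MutableQuery, its minShouldMatch field, and cacheThenMutate are hypothetical names made up for this example; they are not Lucene classes or methods.

```java
import java.util.HashMap;
import java.util.Map;

public class MutableKeyTrap {

    // Hypothetical stand-in for a mutable query; NOT a Lucene class.
    static final class MutableQuery {
        private int minShouldMatch;

        MutableQuery(int minShouldMatch) {
            this.minShouldMatch = minShouldMatch;
        }

        void setMinShouldMatch(int minShouldMatch) {
            this.minShouldMatch = minShouldMatch;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof MutableQuery
                && ((MutableQuery) o).minShouldMatch == minShouldMatch;
        }

        @Override
        public int hashCode() {
            return minShouldMatch;
        }
    }

    // Caches a value under the query, then mutates the query in place.
    static Map<MutableQuery, String> cacheThenMutate(MutableQuery query) {
        Map<MutableQuery, String> cache = new HashMap<>();
        cache.put(query, "cached filter result");
        // The hash code changes, but the map keeps the hash it stored at put() time.
        query.setMinShouldMatch(query.minShouldMatch + 1);
        return cache;
    }

    public static void main(String[] args) {
        MutableQuery query = new MutableQuery(1);
        Map<MutableQuery, String> cache = cacheThenMutate(query);

        // The same instance no longer finds its own entry.
        System.out.println(cache.containsKey(query));               // false
        // A fresh query equal to the original state does not find it either,
        // because equals() now compares against the mutated key.
        System.out.println(cache.containsKey(new MutableQuery(1))); // false
        // Yet the entry is still held by the map and still uses memory.
        System.out.println(cache.size());                           // 1
    }
}
```

With every field final and set in the constructor, the hash code cannot change after the query has been used as a key, so this situation cannot arise.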
While I think most users would be unaffected, as it is rather uncommon to modify queries after having passed them to IndexSearcher, I would like to remove this trap by making queries immutable: everything should be set at construction time except the boost parameter, which could still be changed with the same clone()/setBoost() mechanism as today.

First I would like to make sure that this sounds good to everyone, and then discuss what the API should look like. Most of our queries happen to be immutable already (NumericRangeQuery, TermsQuery, SpanNearQuery, etc.) but some aren't, and the main exceptions are:
 - BooleanQuery,
 - DisjunctionMaxQuery,
 - PhraseQuery,
 - MultiPhraseQuery.

We could take all parameters that are currently set through setters and move them to constructor arguments. For the above queries, this would mean (using varargs for ease of use):

  BooleanQuery(boolean disableCoord, int minShouldMatch, BooleanClause... clauses)
  DisjunctionMaxQuery(float tieBreakMul, Query... clauses)

For PhraseQuery and MultiPhraseQuery, the closest to what we have today would require adding new classes to wrap terms and positions together, for instance:

  class TermAndPosition {
    public final BytesRef term;
    public final int position;
  }

so that e.g. PhraseQuery would look like:

  PhraseQuery(int slop, String field, TermAndPosition... terms)

MultiPhraseQuery would be the same, with several terms at the same position.

Comments/ideas/concerns are highly welcome.

--
Adrien