On Fri, Jun 29, 2012 at 9:02 AM, Arjun Dhar <dhar...@yahoo.com> wrote:
> Hi,
> I'm new and that is my disclaimer to the stupid question I am about to ask.
>
> Am trying to form a conceptual picture of the relation between Query <-->
> Weight <--> IndexReader, Scorer, Searcher <--> Similarity
>
> *From what I gather : (and someone please validate or correct me) *
> 1. We want *Queries* to be RE-USABLE instances hence *Weight* is a specific
> Queries state !?

Queries are independent of a Searcher. When a Query is executed, it creates
a Weight specifically for that searcher. The Weight contains things like IDF
computations: collection-wide state.

> 2. *Searcher* is STATEFUL, and though it processes a *Query*, the state for
> that *Searcher* is delegated to the WEIGHT !?

Searcher wraps an IndexReader (usually a composite IndexReader containing
multiple segments, like a DirectoryReader) to provide search capabilities.
It also has extension points that are search-specific: one of these is
Similarity, but there are others. For example, in 4.0 you can override
methods to provide collection-wide stats where the collection is
distributed, i.e. consists of indexes across multiple machines.

> 3. *IndexReader* Reads an Index, and the *Searcher* uses the Reader to
> SEARCH, using a QUERY

Yes.

> 4. From the JavaDocs of Weight class ----> "IndexReader dependent state
> should reside in the Scorer. " -- Means, when *weights* are calculated, the
> final result of the Calculation goes into a STATEFUL object represented by
> the *Scorer* which is also Iterable !?

This could maybe be clarified to say per-segment state. So if you have an
IndexSearcher wrapping a DirectoryReader with 4 index segments, in the
typical case the Weight holds the state of the entire collection: e.g. IDF
across all 4 segments. The Weight creates 4 Scorers: one Scorer for each
segment in that DirectoryReader. Any per-segment information, such as the
document length normalization ("norms") array, resides in each of those
Scorers.

> 5. *Searcher* can be assigned a *Similarity* algorithm. ... hence using that
> algorithm, it calculates *Weight*, which eventually leads to the
> construction of an Iterable *Scorer* !?

A Similarity is a hook for term weighting. But term weighting is not the
entire scoring algorithm in many cases: Scorers don't have to use Similarity
to compute things; they can use whatever logic they want.

>
> 6. While Indexing, its simple there is a direct relation between
> IndexWriterConfig <--> Similarity

This is for computing document length normalization information ("norms")
at indexing time. Currently that's the only way that IndexWriter interacts
with Similarity.

>
> +Q) Apart from the validation of my understanding, is there a Sequence
> Diagram explaining the process of calculation, during a Query?

Have a look at https://builds.apache.org/job/Lucene-trunk/javadoc/ and click
"Searching and Scoring in Lucene". I don't think there are any diagrams
there, but there is more information available.

>
> +Q) There are different implementations of Queries. Do they differ in how
> they mash up all the other stuff?
> Looks like if i mess each of the other entities, I can pretty much produce
> whatever Query?!

See the link above for more information, especially the section on writing
custom queries.

--
lucidimagination.com
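
P.S. In case it helps with the conceptual picture, here is a toy model of the
Query / Weight / Scorer split from points 1 and 4, in plain Java. This is not
Lucene code: every class and method name below (ToyScoringModel, Segment,
TermQuery, Weight, Scorer, Searcher, demoSearcher) is made up for
illustration, and the IDF and norm formulas are simplified assumptions rather
than what any particular Similarity does. It only tries to show where each
kind of state lives: nothing index-specific in the query, collection-wide
state in the weight, per-segment state in each scorer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: these are hypothetical classes, not Lucene's API.
public class ToyScoringModel {

    /** One index segment; each document is simply an array of terms. */
    static final class Segment {
        final String[][] docs;
        Segment(String[][] docs) { this.docs = docs; }

        /** Number of documents in this segment containing the term. */
        int docFreq(String term) {
            int n = 0;
            for (String[] doc : docs) {
                if (Arrays.asList(doc).contains(term)) n++;
            }
            return n;
        }
    }

    /** Reusable query: holds the term only, no searcher or index state. */
    static final class TermQuery {
        final String term;
        TermQuery(String term) { this.term = term; }
        Weight createWeight(Searcher searcher) { return new Weight(this, searcher); }
    }

    /** Collection-wide state for one (query, searcher) pair, e.g. IDF. */
    static final class Weight {
        final TermQuery query;
        final double idf;

        Weight(TermQuery query, Searcher searcher) {
            this.query = query;
            int docFreq = 0, maxDoc = 0;
            for (Segment s : searcher.segments) {   // statistics span ALL segments
                docFreq += s.docFreq(query.term);
                maxDoc += s.docs.length;
            }
            // Simplified, assumed IDF formula for illustration.
            this.idf = 1.0 + Math.log((double) maxDoc / (docFreq + 1));
        }

        Scorer scorer(Segment segment) { return new Scorer(this, segment); }
    }

    /** Per-segment state: here, a norms array for this segment's documents. */
    static final class Scorer {
        final Weight weight;
        final Segment segment;
        final double[] norms;

        Scorer(Weight weight, Segment segment) {
            this.weight = weight;
            this.segment = segment;
            this.norms = new double[segment.docs.length];
            for (int i = 0; i < norms.length; i++) {
                int len = segment.docs[i].length;
                norms[i] = len == 0 ? 0.0 : 1.0 / Math.sqrt(len); // assumed length norm
            }
        }

        double score(int doc) {
            int tf = 0;
            for (String t : segment.docs[doc]) if (t.equals(weight.query.term)) tf++;
            return tf == 0 ? 0.0 : Math.sqrt(tf) * weight.idf * norms[doc];
        }
    }

    /** Wraps the segments and drives the query -> weight -> scorers flow. */
    static final class Searcher {
        final List<Segment> segments;
        Searcher(List<Segment> segments) { this.segments = segments; }

        List<Scorer> scorers(TermQuery query) {
            Weight weight = query.createWeight(this);  // one Weight per search
            List<Scorer> result = new ArrayList<>();
            for (Segment s : segments) result.add(weight.scorer(s)); // one Scorer per segment
            return result;
        }
    }

    /** Hypothetical sample data: a searcher over 4 small segments. */
    static Searcher demoSearcher() {
        return new Searcher(Arrays.asList(
            new Segment(new String[][] {{"lucene", "search"}, {"index"}}),
            new Segment(new String[][] {{"lucene"}}),
            new Segment(new String[][] {{"query", "weight"}}),
            new Segment(new String[][] {{"scorer", "lucene", "lucene", "x"}})));
    }

    public static void main(String[] args) {
        List<Scorer> scorers = demoSearcher().scorers(new TermQuery("lucene"));
        for (int seg = 0; seg < scorers.size(); seg++) {
            Scorer s = scorers.get(seg);
            for (int doc = 0; doc < s.segment.docs.length; doc++) {
                System.out.println("segment " + seg + " doc " + doc + " score " + s.score(doc));
            }
        }
    }
}
```

The same TermQuery instance could be handed to a second Searcher over a
different set of segments; it would simply get a different Weight there, which
is the sense in which queries are reusable.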