Chuck Williams wrote:
I was thinking of the aggressive version with an index-time solution,
although I don't know the Lucene architecture for distributed indexing
and searching well enough to formulate the idea precisely.
Conceptually, I'd like each server that owns a slice of the index in a
distributed environment to have the complete docFreq data, i.e. to have
docFreqs that represent the collection as a whole, not just its index
slice.  If this were achieved at index time, then the current
implementation would work at query time.  I.e., MultiSearch could send
the queries out to the remote Searchers and these Searchers could
consult their local indexes for the correct docFreqs to use.

This is different from what I described. I described keeping a docFreq cache at the central dispatch node, whereas you describe replicating that cache on every search node. I don't see the advantage of this replication. A single cache is both more efficient to maintain and faster to search, since fewer dictionary lookups are involved.
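
To make the central-cache idea concrete, here is a minimal sketch in plain Java. It does not use any Lucene classes: the class name CentralDocFreqCache and its methods are hypothetical, and the idf formula, 1 + ln(numDocs / (docFreq + 1)), is only assumed here as a typical choice. The dispatch node merges each slice's docFreqs once, then computes the term weights for a query itself, so the search nodes would not need to look up collection-wide docFreqs at all.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a docFreq cache kept only at the dispatch node.
final class CentralDocFreqCache {
    // Collection-wide document frequency per term, summed over all slices.
    private final Map<String, Integer> docFreq = new HashMap<>();
    // Total number of documents across all slices.
    private int numDocs = 0;

    // Merge one slice's statistics into the collection-wide totals.
    void addSlice(Map<String, Integer> sliceDocFreq, int sliceNumDocs) {
        sliceDocFreq.forEach((term, df) -> docFreq.merge(term, df, Integer::sum));
        numDocs += sliceNumDocs;
    }

    // Global docFreq for a term; 0 if no slice contains it.
    int docFreq(String term) {
        return docFreq.getOrDefault(term, 0);
    }

    int numDocs() {
        return numDocs;
    }

    // Assumed idf formula: 1 + ln(numDocs / (docFreq + 1)).
    double idf(String term) {
        return 1.0 + Math.log((double) numDocs / (docFreq(term) + 1));
    }

    // One lookup per query term at the dispatcher; the resulting weights
    // would be shipped to the search nodes along with the query.
    Map<String, Double> weightsFor(List<String> queryTerms) {
        Map<String, Double> weights = new LinkedHashMap<>();
        for (String term : queryTerms) {
            weights.put(term, idf(term));
        }
        return weights;
    }

    public static void main(String[] args) {
        CentralDocFreqCache cache = new CentralDocFreqCache();
        cache.addSlice(Map.of("lucene", 3, "index", 1), 10); // slice A
        cache.addSlice(Map.of("lucene", 2), 5);              // slice B
        System.out.println(cache.weightsFor(List.of("lucene", "index")));
    }
}
```

With N search nodes, the replicated design would do one dictionary lookup per query term on each node, i.e. N lookups per term, whereas this sketch does a single lookup per term at the dispatcher.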


Doug
