From: John Wang <[EMAIL PROTECTED]>
[...]
> sub index 1: 1 billion docs
> sub index 2: 1 billion docs
> sub index 3: 1 billion docs
> 
> federating search to these subindexes, you represent an index of 3 
> billion docs, and all internal doc ids are of type int.

That falls under Daniel's "...unless you wrap your own framework around it". 
The problem with the solution you're describing is that it's not functionally 
equivalent to a single index of 3 billion docs.

If you just create 3 independent indexes and merge the top hits from all 3, the 
ranking of the documents can come out wrong, because each index computes its 
scores from its own term statistics. You'll need to make sure that the scores 
from the different indexes can be compared. That's tricky when the score 
depends on the frequency of the terms in the whole corpus.
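
To make that concrete, here is a rough sketch in Python. It assumes a simplified, smoothed IDF formula (not necessarily the one Lucene uses) and made-up per-index document frequencies; the names idf, shards, local_idf and global_idf are purely illustrative. It only shows that the same term gets a different weight in each sub index unless the statistics are pooled first.

```python
import math


def idf(doc_count, doc_freq):
    # Simplified smoothed inverse document frequency (illustrative only,
    # not the exact formula of any particular search library).
    return 1.0 + math.log(doc_count / (doc_freq + 1.0))


# Hypothetical statistics for one query term:
# sub index name -> (docs in that index, docs containing the term)
shards = {
    "sub1": (1_000_000_000, 10),          # term is very rare here
    "sub2": (1_000_000_000, 10_000_000),  # term is common here
    "sub3": (1_000_000_000, 1_000),
}

# Each independent index weights the term using only its own counts.
local_idf = {name: idf(n, df) for name, (n, df) in shards.items()}

# A single 3-billion-doc index would use the pooled counts instead.
total_docs = sum(n for n, _ in shards.values())
total_df = sum(df for _, df in shards.values())
global_idf = idf(total_docs, total_df)

for name, weight in local_idf.items():
    print(f"{name}: local idf = {weight:.3f}")
print(f"global idf = {global_idf:.3f}")
```

With numbers like these, a hit from sub1 gets a much larger term weight than an otherwise identical hit from sub2, so merging the raw top hits favors sub1. Making the scores comparable means collecting the document frequencies from all sub indexes first and scoring every sub index with the pooled values.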

