Chris Lamprecht wrote:
One issue is that if you are splitting the index in half (for
example), getting some results from index A and some from index B,
then you need to merge the results somewhere.  But the scores coming
from the two indexes are not related at all, for example, document 100
from index A has score 0.85, document 200 from index B has score 0.90
-- this doesn't necessarily mean that document 200 should be ranked
before document 100.    This is one issue to deal with.

I think this issue has been discussed on this mailing list before. Has anyone else had to deal with this issue with a distributed index? What does Nutch do?

Please see http://issues.apache.org/jira/browse/NUTCH-92 . A solution is described there, but not implemented yet...

--
Best regards,
Andrzej Bialecki     <><
___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com



---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to