On 5/18/06, jason rutherglen <[EMAIL PROTECTED]> wrote:
I used the XML; I think using HTTP is important.
Is this written in Java? Using HTTPClient? Anything you will be able to share?

No caching on the client yet. That is a good idea; however, my personal goal is to have an index that is updated every 30 seconds or less, so I am not sure about caching on the client. The caching can be handled by the Solr servers, which should be fine. If it works correctly, the architecture is very simple and requires only two layers. The first is a Solr layer; the second is the client layer, essentially running many threads in parallel per request. It seems this would scale cheaply by adding more hardware on both layers.
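To make the two-layer idea concrete, here is a minimal sketch of the client layer: one thread per Solr server for each request, followed by a merge of the returned hits. All names (`FanOutSearch`, `Shard`, `Hit`) are hypothetical and are not part of Solr or Lucene. The `Shard` interface stands in for "send the query over HTTP and parse the XML response", which is left out here.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical client-layer sketch; none of these types exist in Solr.
final class FanOutSearch {

    // One scored hit as a shard might report it.
    static final class Hit {
        final String id;
        final float score;

        Hit(String id, float score) {
            this.id = id;
            this.score = score;
        }
    }

    // Stand-in for an HTTP call to one Solr server plus XML parsing.
    interface Shard {
        List<Hit> search(String query, int rows) throws Exception;
    }

    // Query every shard in parallel, then keep the best `rows` hits overall.
    static List<Hit> search(List<Shard> shards, String query, int rows)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, shards.size()));
        try {
            List<Callable<List<Hit>>> tasks = new ArrayList<>();
            for (Shard shard : shards) {
                tasks.add(() -> shard.search(query, rows));
            }
            List<Hit> merged = new ArrayList<>();
            // invokeAll blocks until every shard has answered or failed.
            for (Future<List<Hit>> future : pool.invokeAll(tasks)) {
                merged.addAll(future.get());
            }
            // Highest score first.
            merged.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
            return new ArrayList<>(merged.subList(0, Math.min(rows, merged.size())));
        } finally {
            pool.shutdown();
        }
    }
}
```

Merging by raw score like this is only meaningful if the scores from different servers are comparable, which is exactly the docFreq/idf question discussed in the rest of this thread. A real client would also want a shared thread pool and per-shard timeouts rather than a new pool per request.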
> If you are using RMI you could either borrow from or subclass Lucene's MultiSearcher, which implements this stuff.

Yeah, this is the real issue: whether there are any general outlines of the best way to do this with Solr. Perhaps a separate Solr call for the docFreqs? Or could this be returned in the current /select call? I'm still trying to figure this part out.
Using XML, there would definitely have to be some more API calls to return idf related stuff. I don't think everything can be done in a single call, since by the time you score docs against a query you have lost how you arrived at the composite score. It might be nice to be able to turn the distributed idf off though... people with large index segments and documents that are randomly distributed probably won't see much of a difference in scoring, but will see a performance increase. We also need to be careful of caching scores at the local level... if a different remote searcher changes, the scores cached on the others become invalid because of the global idf (yuck).

-Yonik
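As a rough illustration of the extra idf step, the sketch below assumes each server can report its document count and a docFreq per query term (no such Solr call exists in this thread; `ShardStats` and `GlobalIdf` are hypothetical names). The client sums those numbers and derives one idf per term, which is the general approach of Lucene's MultiSearcher as I understand it. The formula follows the classic Lucene DefaultSimilarity shape, 1 + ln(numDocs / (docFreq + 1)).

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of combining per-shard term statistics into a global idf.
final class GlobalIdf {

    // What one shard would have to report for the query terms (assumed API).
    static final class ShardStats {
        final int maxDoc;
        final Map<String, Integer> docFreq;

        ShardStats(int maxDoc, Map<String, Integer> docFreq) {
            this.maxDoc = maxDoc;
            this.docFreq = docFreq;
        }
    }

    // Classic Lucene-style idf: 1 + ln(numDocs / (docFreq + 1)).
    static float idf(int docFreq, int numDocs) {
        return (float) (Math.log(numDocs / (double) (docFreq + 1)) + 1.0);
    }

    // Sum document counts and docFreqs over all shards, then compute idf per term.
    static Map<String, Float> globalIdf(List<ShardStats> shards) {
        int totalDocs = 0;
        Map<String, Integer> totalFreq = new HashMap<>();
        for (ShardStats stats : shards) {
            totalDocs += stats.maxDoc;
            for (Map.Entry<String, Integer> e : stats.docFreq.entrySet()) {
                totalFreq.merge(e.getKey(), e.getValue(), Integer::sum);
            }
        }
        Map<String, Float> result = new HashMap<>();
        for (Map.Entry<String, Integer> e : totalFreq.entrySet()) {
            result.put(e.getKey(), idf(e.getValue(), totalDocs));
        }
        return result;
    }
}
```

With two shards of 100 documents each, where a term appears in 10 documents on one and 90 on the other, the local idf values differ noticeably while the global value is computed from 100 of 200 documents. When documents are spread randomly, the local and global values end up close, which is why skipping this step costs little in that case. It also shows the caching problem: any change in one shard's counts changes the global idf and therefore the scores everywhere.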