Hi Chris,

My last email message was in response to your suggestion:

    If it's a Lucene class, you may want to start by making a small proof
    of concept RMI app that just uses the Lucene core classes, once that
    works then try your changes in Solr.

I agree that this is a good starting point for narrowing things down, so my last message was the actual code of a non-Solr test of ParallelMultiSearcher with RMI calls.

As for the actual Solr code modification, the following are the relevant pieces of SolrIndexSearcher:

    // approximately line 65, SolrIndexSearcher class attributes:
    // this was the original:
    // private final IndexSearcher searcher;
    // replaced with:
    private final ParallelMultiSearcher searcher;

    // approximately line 123, the constructor:
    private SolrIndexSearcher(IndexSchema schema, String name, IndexReader r,
                              boolean closeReader, boolean enableCache) throws Exception {
      this.schema = schema;
      this.name = "Searcher@" + Integer.toHexString(hashCode()) + (name!=null ? " "+name : "");
      log.info("Opening " + this.name);
      reader = r;
      // this is the original:
      //searcher = new IndexSearcher(r);
      // replaced with:
      searcher = _initSearcher();
      ....
    }

    // and I added this to initialize searcher:
    private ParallelMultiSearcher _initSearcher() throws Exception {
      Searchable[] sch = new Searchable[3];
      // local indexes that are searchable..
      for (int i=0; i<2; i++) {
        sch[i] = new IndexSearcher("/disk" + i);
      }
      // a remote searchable available via RMI
      sch[2] = (Searchable) Naming.lookup("//somehost.com:1099/searchit");
      ParallelMultiSearcher searcher = new ParallelMultiSearcher(sch);
      return searcher;
    }

After this source modification, I do an 'ant compile', repackage solr.war, install it in the appropriate location, and start up the example ('java -jar start.jar'). Then I submit a simple search query with curl from the command line:

    curl http://localhost:8080/solr/select -d version="2.1" -d start=0 -d rows=10 -d indent=on -d submit=search -d q="body:blablabla"

Without the RMI stub as a searchable, the search works just fine. With the RMI stub as a searchable, I get an exception:

    java.rmi.MarshalException: error marshalling arguments; nested exception is:
            java.io.NotSerializableException: org.apache.lucene.search.ParallelMultiSearcher$1
            at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:122)
            at org.apache.lucene.search.RemoteSearchable_Stub.search(Unknown Source)
            at org.apache.lucene.search.ParallelMultiSearcher.search(ParallelMultiSearcher.java:172)
            at org.apache.lucene.search.Searcher.search(Searcher.java:116)
            at org.apache.lucene.search.Searcher.search(Searcher.java:95)
            at org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:794)
            at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:712)
            at org.apache.solr.search.SolrIndexSearcher.getDocList(SolrIndexSearcher.java:605)
            at org.apache.solr.request.StandardRequestHandler.handleRequest(StandardRequestHandler.java:106)
            at org.apache.solr.core.SolrCore.execute(SolrCore.java:585)
            at org.apache.solr.servlet.SolrServlet.doGet(SolrServlet.java:80)
            at org.apache.solr.servlet.SolrServlet.doPost(SolrServlet.java:70)
            at javax.servlet.http.HttpServlet.service(HttpServlet.java:767)
            at javax.servlet.http.HttpServlet.service(HttpServlet.java:860)
            at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:408)
            at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:350)
            at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:195)
            at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:164)
            at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:536)

Looking at the last place
in the source for SolrIndexSearcher.java (line 794), the exception was thrown from a search call with a newly defined HitCollector:

    searcher.search(query, new HitCollector() {
      float minScore=Float.NEGATIVE_INFINITY;  // minimum score in the priority queue
      public void collect(int doc, float score) {
        if (filt!=null && !filt.exists(doc)) return;
        if (numHits[0]++ < lastDocRequested || score >= minScore) {
          // if docs are always delivered in order, we could use "score>minScore"
          // but might BooleanScorer14 might still be used and deliver docs out-of-order?
          hq.insert(new ScoreDoc(doc, score));
          minScore = ((ScoreDoc)hq.top()).score;
        }
      }
    });

If I follow the exception trail, within Lucene it's (repeated from above for context):

    at org.apache.lucene.search.RemoteSearchable_Stub.search(Unknown Source)
    at org.apache.lucene.search.ParallelMultiSearcher.search(ParallelMultiSearcher.java:172)
    at org.apache.lucene.search.Searcher.search(Searcher.java:116)
    at org.apache.lucene.search.Searcher.search(Searcher.java:95)

which has the following source code:

Searcher.java:95

    public void search(Query query, HitCollector results)
        throws IOException {
      search(query, (Filter)null, results);
    }

Searcher.java:116

    public void search(Query query, Filter filter, HitCollector results)
        throws IOException {
      search(createWeight(query), filter, results);
    }

ParallelMultiSearcher.java:172

    public void search(Weight weight, Filter filter, final HitCollector results)
        throws IOException {
      for (int i = 0; i < searchables.length; i++) {
        final int start = starts[i];
HERE:   searchables[i].search(weight, filter, new HitCollector() {
          public void collect(int doc, float score) {
            results.collect(doc + start, score);
          }
        });
      }
    }

I'm wondering if it is a failure to deal with the HitCollector. Any ideas?

thanks,
Koji

On 5/9/06, Chris Hostetter <[EMAIL PROTECTED]> wrote:

: IndexSearcher. I replaced it with ParallelMultiSearcher, where it is
: initialized similar to the client code I mentioned above.
:
: From that, it seems like Solr itself needs to marshall and unmarshall the
: searcher instance SolrIndexSearcher holds, and because the
: ParallelMultiSearcher is initialized with RMI stubs, it fails to proceed
: with such marshall/unmarshall internal actions. As mentioned in the first
: email, if I use ParallelMultiSearcher to only look at local indexes (no RMI
: stub), Solr works just fine. So I'm wondering if there is a way to use
: SolrIndexSearcher to search both local and remote indexes, even if not
: through the RMI solution Lucene's ebook has suggested via its
: ParallelMultiSearcher class.

As I said, I don't really know a lot about RMI, but I don't think the client code is expected to marshall/unmarshall things -- but the objects you want to pass to remote methods (or receive back from remote methods) need to be serializable.

Do you know what objects you got serialization exceptions from? (you didn't include any real source -- just pseudocode, so it's not possible to use the line numbers in your stack trace to look at the code because we don't know exactly what you changed)

-Hoss
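
On the serialization question above: javac names anonymous inner classes Outer$1, Outer$2, and so on, so the ParallelMultiSearcher$1 in the NotSerializableException would be consistent with the anonymous HitCollector created at the HERE: line. The following is only a minimal sketch of that behaviour using plain Java serialization, with no Lucene or RMI involved. Collector, CountingCollector, anonymousCollector and canSerialize are hypothetical names made up for this illustration, not Lucene or Solr API, and the sketch assumes that RMI marshals a non-remote argument with ordinary Java serialization.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class CollectorSerializationDemo {

    // Hypothetical stand-in for a hit collector callback; NOT Lucene's HitCollector.
    interface Collector {
        void collect(int doc, float score);
    }

    // Hypothetical named collector that opts in to Java serialization.
    static final class CountingCollector implements Collector, Serializable {
        private static final long serialVersionUID = 1L;
        int count;

        public void collect(int doc, float score) {
            count++;
        }
    }

    // Anonymous collector, compiled as CollectorSerializationDemo$1.
    // It does not implement Serializable.
    static Collector anonymousCollector() {
        return new Collector() {
            public void collect(int doc, float score) {
                // intentionally empty
            }
        };
    }

    // Returns true if the object can be written with Java serialization,
    // false if writeObject rejects it with NotSerializableException.
    static boolean canSerialize(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            // Any other I/O failure is unexpected for an in-memory stream.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("anonymous collector serializable: "
                + canSerialize(anonymousCollector()));
        System.out.println("named Serializable collector serializable: "
                + canSerialize(new CountingCollector()));
    }
}
```

Running the class should report false for the anonymous collector and true for the named one. Note that making a collector serializable would only ship a copy of it to the remote side, so hits collected there would not reach the caller's instance; the sketch illustrates the exception, not a fix.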