Hi Chris,

My last email msg was in response to your suggestion:

If it's a Lucene class, you may want to start by making a small proof
of concept RMI app that just uses the Lucene core classes, once that
works then try your changes in Solr.

I agree that is a good starting point for narrowing things down, so my last
msg contained the actual code of a non-Solr test of ParallelMultiSearcher
with RMI calls.

As for actual solr code modification, the following are the relevant pieces:

// approximately line 65, SolrIndexSearcher class attributes:
// this was the original:
// private final IndexSearcher searcher;
// replaced with:
private final ParallelMultiSearcher searcher;


// approximately line 123, the constructor:
private SolrIndexSearcher(IndexSchema schema, String name, IndexReader r,
boolean closeReader, boolean enableCache) throws Exception {
   this.schema = schema;
   this.name = "Searcher@" + Integer.toHexString(hashCode())
      + (name!=null ? " "+name : "");

   log.info("Opening " + this.name);

   reader = r;

   // this is the original:
   //searcher = new IndexSearcher(r);
   // replaced with:
   searcher = _initSearcher();
....
}

// and i added this to initialize searcher:
private ParallelMultiSearcher _initSearcher() throws Exception {

     Searchable[] sch = new Searchable[3];

     // local indexes that are searchable..
     for (int i=0; i<2; i++) {
        sch[i] = new IndexSearcher("/disk" + i);
     }

     // a remote searchable available via RMI
     sch[2] = (Searchable) Naming.lookup("//somehost.com:1099/searchit");

     ParallelMultiSearcher searcher = new ParallelMultiSearcher(sch);
     return searcher;
}

After this source modification, I run 'ant compile', repackage solr.war,
install it in the appropriate location, and start up the example ('java -jar
start.jar').

Then I submit a simple query via curl from the command line:

curl http://localhost:8080/solr/select -d version="2.1" -d start=0 \
     -d rows=10 -d indent=on -d submit=search -d q="body:blablabla"

Without the RMI stub as a searchable, the search works just fine.  With the
RMI stub as a searchable, I get an exception:

java.rmi.MarshalException: error marshalling arguments; nested exception is:
       java.io.NotSerializableException: org.apache.lucene.search.ParallelMultiSearcher$1
       at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:122)
       at org.apache.lucene.search.RemoteSearchable_Stub.search(Unknown Source)
       at org.apache.lucene.search.ParallelMultiSearcher.search(ParallelMultiSearcher.java:172)
       at org.apache.lucene.search.Searcher.search(Searcher.java:116)
       at org.apache.lucene.search.Searcher.search(Searcher.java:95)
       at org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:794)
       at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:712)
       at org.apache.solr.search.SolrIndexSearcher.getDocList(SolrIndexSearcher.java:605)
       at org.apache.solr.request.StandardRequestHandler.handleRequest(StandardRequestHandler.java:106)
       at org.apache.solr.core.SolrCore.execute(SolrCore.java:585)
       at org.apache.solr.servlet.SolrServlet.doGet(SolrServlet.java:80)
       at org.apache.solr.servlet.SolrServlet.doPost(SolrServlet.java:70)
       at javax.servlet.http.HttpServlet.service(HttpServlet.java:767)
       at javax.servlet.http.HttpServlet.service(HttpServlet.java:860)
       at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:408)
       at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:350)
       at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:195)
       at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:164)
       at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:536)

Looking at the last Solr frame in the trace (SolrIndexSearcher.java, line
794), the exception was thrown from this search call, which passes a newly
defined HitCollector:

   searcher.search(query, new HitCollector() {
     float minScore=Float.NEGATIVE_INFINITY;  // minimum score in the priority queue
     public void collect(int doc, float score) {
       if (filt!=null && !filt.exists(doc)) return;
       if (numHits[0]++ < lastDocRequested || score >= minScore) {
         // if docs are always delivered in order, we could use "score>minScore"
         // but BooleanScorer14 might still be used and deliver docs out-of-order?
         hq.insert(new ScoreDoc(doc, score));
         minScore = ((ScoreDoc)hq.top()).score;
       }
     }
   });

If I follow the exception trail, within Lucene it's
(repeated from above for context)

       at org.apache.lucene.search.RemoteSearchable_Stub.search(Unknown Source)
       at org.apache.lucene.search.ParallelMultiSearcher.search(ParallelMultiSearcher.java:172)
       at org.apache.lucene.search.Searcher.search(Searcher.java:116)
       at org.apache.lucene.search.Searcher.search(Searcher.java:95)

which has the following src code:

Searcher.java:95
public void search(Query query, HitCollector results)
   throws IOException {
 search(query, (Filter)null, results);
}

Searcher.java:116
public void search(Query query, Filter filter, HitCollector results)
   throws IOException {
 search(createWeight(query), filter, results);
}

ParallelMultiSearcher.java:172
public void search(Weight weight, Filter filter, final HitCollector results)
   throws IOException {
 for (int i = 0; i < searchables.length; i++) {

   final int start = starts[i];

   // HERE:
   searchables[i].search(weight, filter, new HitCollector() {
       public void collect(int doc, float score) {
         results.collect(doc + start, score);
       }
     });
 }
}

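I'm wondering if the failure comes from the HitCollector: the class named in
the NotSerializableException, ParallelMultiSearcher$1, looks like the
anonymous HitCollector created at the line marked HERE, which is then handed
to the remote searchable.  Any ideas?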
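
To check this reading of the exception outside of Solr, Lucene and RMI, here
is a minimal sketch that uses only java.io.  Everything in it is made up for
illustration: Collector is a stand-in for HitCollector (not the real Lucene
class), and SerializationCheck, firstNonSerializable and
SerializableCollector are hypothetical names.  It only shows that writing an
anonymous subclass of a non-Serializable class through an ObjectOutputStream
fails with NotSerializableException; I am assuming that this is the same
mechanism RMI hits when it marshals the arguments.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SerializationCheck {

    // Hypothetical stand-in for Lucene's HitCollector; like the original,
    // it does not implement Serializable.
    static abstract class Collector {
        public abstract void collect(int doc, float score);
    }

    // A named collector that opts in to serialization, for comparison.
    static class SerializableCollector extends Collector implements Serializable {
        private static final long serialVersionUID = 1L;
        public void collect(int doc, float score) { }
    }

    // An anonymous collector, analogous to the one created inside
    // ParallelMultiSearcher.search(...).
    static Collector anonymousCollector() {
        return new Collector() {
            public void collect(int doc, float score) { }
        };
    }

    // Tries to serialize obj into an in-memory buffer.  Returns null on
    // success, or the message of the NotSerializableException (the name of
    // the offending class) on failure.
    static String firstNonSerializable(Object obj) throws IOException {
        ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream());
        try {
            out.writeObject(obj);
            return null;
        } catch (NotSerializableException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("anonymous collector:    "
            + firstNonSerializable(anonymousCollector()));
        System.out.println("serializable collector: "
            + firstNonSerializable(new SerializableCollector()));
    }
}
```

The first call reports the compiler-generated name of the anonymous class,
in the same Outer$N form as ParallelMultiSearcher$1 in the trace above; the
second call returns null because that class can be written.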

thanks,
Koji


On 5/9/06, Chris Hostetter <[EMAIL PROTECTED]> wrote:


: IndexSearcher.  I replaced it with ParallelMultiSearcher, where it is
: initialized similar to the client code I mentioned above.
:
: From that, it seems like Solr itself needs to marshall and unmarshall the
: searcher instance SolrIndexSearcher holds, and because the
: ParallelMultiSearcher is initialized with RMI stubs, it fails to proceed
: with such marshall/unmarshall internal actions.  As mentioned in the first
: email, if I use ParallelMultiSearcher to only look at local indexes (no RMI
: stub), Solr works just fine.  So I'm wondering if there is a way to use
: SolrIndexSearcher to search both local and remote indexes, even if not
: through the RMI solution Lucene's ebook has suggested via its
: ParallelMultiSearcher class.

As I said, I don't really know a lot about RMI, but I don't think the
client code is expected to marshall/unmarshall things -- but the objects
you want to pass to remote methods (or receive back from remote
methods) need to be serializable.  Do you know what objects you got
serialization exceptions from? (you didn't include any real source -- just
pseudocode, so it's not possible to use the line numbers in your stack
trace to look at the code because we don't know exactly what you changed)



-Hoss

