I am using ParallelMultiSearcher to search across 10 different servers.

Here is the problem I ran into:
Sometimes when I search for a term, my program just hangs. No results,
no response.

When I took one server (No. 10 in my list of servers) off the list,
searching was as smooth as it was supposed to be. (Very lucky.)
At first I thought I had hit a scalability issue, so I put server No. 10
back on the list and took another server off instead.
My program hung again.

Then I took some other servers off the list and left No. 10 on it. My
program still hung. But whenever I took No. 10 off the list, the other
servers worked fine together.

So I guessed that the index files on server No. 10 were corrupted, or that
there was some hardware-related issue.

However, when I used the ParallelMultiSearcher with ONLY server No. 10 left
in the list, it returned results.

Then I looked back and found the condition that causes my program to hang:
1). There is a hit in the index files on server No. 10; and, at the same time,
2). There is a hit on any other server too.

But if the hits are only on servers other than No. 10, my program runs just
fine.

I don't believe it's a hardware-related issue, because I can use the
ParallelMultiSearcher with only No. 10 in the list.
And I don't believe it's a bug in my code or in ParallelMultiSearcher,
because the other servers work fine together.

Does anybody have any idea? How should I debug this problem?

Many thanks,

Wenjie

The following are part of my code and some logging info from when it hangs:
-----------code---------------
// generate the query from the string
QueryBuilder qb = new QueryBuilder();
Query query = qb.build(this.queryStr, new StandardAnalyzer());
logger.info(query.toString());

Hits hits = null;
try {
    if (searcher != null) {
        Sort sort = new Sort(new SortField[] {
            new SortField("date", true),
            SortField.FIELD_SCORE,
            new SortField("uid")
        });
        logger.debug("BBBBB");   // before search
        hits = searcher.search(query, sort);   // searcher is a ParallelMultiSearcher
        logger.debug("AAAAA");   // after search
    }
} catch (IOException e) {
    logger.fatal("Search results can not be returned " + e.getMessage());
    respond();
    throw new RuntimeException(e);
}


--------logging------------
2006-08-02 16:20:44,261 [Thread-2] (SearchingThread.java:109) INFO  - Query from : user [0, 50]
2006-08-02 16:20:44,261 [Thread-2] (SearchingThread.java:111) INFO  - Received: +content:blah blah
2006-08-02 16:20:44,265 [Thread-145] (SearchingThread.java:144) INFO  - Num of server from DB 2
2006-08-02 16:20:44,277 [Thread-145] (SearchingThread.java:201) INFO  - rmi://vip.s-index5b:1098/1154550044272_810000
2006-08-02 16:20:44,363 [Thread-145] (SearchingThread.java:201) INFO  - rmi://vip.s-index1a:1099/1154550044290_57320000
2006-08-02 16:20:44,368 [Thread-145] (SearchingThread.java:273) INFO  - Number of servers used in Search: 2
2006-08-02 16:20:44,368 [Thread-145] (SearchingThread.java:285) DEBUG  - BBBBB
2006-08-02 16:20:44,404 [Thread-145] (SearchingThread.java:287) DEBUG  - AAAAA
2006-08-02 16:20:44,411 [Thread-145] (SearchingThread.java:329) INFO  - [user] hitCount: 2
It was fine this time, because 1a did not have a hit; both results were
returned from 5b, the No. 10 server.


2006-08-02 16:21:37,552 [Thread-2] (SearchingThread.java:109) INFO  - Query from : user [0, 50]
2006-08-02 16:21:37,552 [Thread-2] (SearchingThread.java:111) INFO  - Received: +content:blah blah
2006-08-02 16:21:37,557 [Thread-149] (SearchingThread.java:144) INFO  - Num of server from DB 2
2006-08-02 16:21:37,564 [Thread-149] (SearchingThread.java:201) INFO  - rmi://vip.s-index5b:1098/1154550097561_807000
2006-08-02 16:21:37,572 [Thread-149] (SearchingThread.java:201) INFO  - rmi://vip.s-index1b:1098/1154550097569_659000
2006-08-02 16:21:37,574 [Thread-149] (SearchingThread.java:273) INFO  - Number of servers used in Search: 2
2006-08-02 16:21:37,575 [Thread-149] (SearchingThread.java:285) DEBUG  - BBBBB
I replaced 1a with 1b. This time it printed neither AAAAA nor the hitCount,
because 1b also has a hit for this query.

2006-08-02 16:21:58,232 [Thread-2] (SearchingThread.java:109) INFO  - Query from : user [0, 50]
2006-08-02 16:21:58,232 [Thread-2] (SearchingThread.java:111) INFO  - Received: +content:blah blah
2006-08-02 16:21:58,237 [Thread-151] (SearchingThread.java:144) INFO  - Num of server from DB 1
2006-08-02 16:21:58,242 [Thread-151] (SearchingThread.java:201) INFO  - rmi://vip.s-index1b:1098/1154550118239_678000
2006-08-02 16:21:58,244 [Thread-151] (SearchingThread.java:273) INFO  - Number of servers used in Search: 1
2006-08-02 16:21:58,244 [Thread-151] (SearchingThread.java:285) DEBUG  - BBBBB
2006-08-02 16:21:58,253 [Thread-151] (SearchingThread.java:287) DEBUG  - AAAAA
2006-08-02 16:21:58,254 [Thread-151] (SearchingThread.java:329) INFO  - [user] hitCount: 2
I took 5b off; 1b alone returns two hits.
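
Since the log stops at BBBBB when the hang occurs, one possible way to see where the search call is blocked is to run it on a worker thread with a timeout and print every thread's stack if it does not return. The following is only a minimal sketch under assumptions: HangProbe, callWithTimeout and dumpAllStacks are hypothetical names that are not part of my code or of Lucene, and it uses nothing beyond java.util.concurrent and Thread.getAllStackTraces() (Java 5 or later).

```java
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, for illustration only.
class HangProbe {

    /** Formats the name, state and current stack of every live thread in this JVM. */
    static String dumpAllStacks() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            sb.append('"').append(t.getName()).append("\" state=").append(t.getState()).append('\n');
            for (StackTraceElement frame : e.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }

    /**
     * Runs the task on a worker thread. If it has not finished within
     * timeoutMillis, prints all thread stacks to stderr (before cancelling,
     * so the blocked frames are still visible) and rethrows the TimeoutException.
     */
    static <T> T callWithTimeout(Callable<T> task, long timeoutMillis) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Future<T> future = pool.submit(task);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            System.err.println(dumpAllStacks());
            future.cancel(true);   // interrupts the worker; it may not respond if blocked on I/O
            throw e;
        } finally {
            pool.shutdownNow();
        }
    }
}
```

In the code above, the searcher.search(query, sort) call would be the body of the Callable. If the dump shows the worker waiting inside a thread join, a remote call, or the result merging, that would at least say which side to look at. Sending the JVM a thread-dump signal (kill -QUIT on Unix) while it hangs should give similar information without any code change.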
