Hello,

--- Paul Harrison <[EMAIL PROTECTED]> wrote:

> I too would love to hear some answers on this one. We have a 100
> million page implementation on 5 machines, 4 GB of RAM, and 2 SATA
> drives of 250 GB each. Part of what I have noticed is that Lucene
> does some sort of strange caching in that if you do subsequent
> searches on a search the return results are quite quick. I too have
> noticed that different terms have

That's probably your OS/FS caching. Lucene doesn't cache anything.

> different search responses and that the problem gets worse with the
> number of terms in the query.

Yes, that makes sense. More complex queries will have to dig through
the index more than simple ones, consequently taking more time to
return hits.

> I have also noticed that distributed search has problems. The main
> search machine waits on other machines to serve up their results
> before it will respond. So it appears that your search is only as
> fast as your slowest responding machine or whenever the timeout hits
> (whichever comes first).

I'm no expert, but this sounds reasonable to me - what if your closest
matches happen to be in the index on the slowest search server?

Otis

> If anyone has any suggestions on tuning the distributed search or
> general suggestions on speeding up retrieval times with a large set,
> I am all ears.
>
> Thanks,
>
> Paul
>
> -----Original Message-----
> From: TL [mailto:[EMAIL PROTECTED]
> Sent: Thursday, October 13, 2005 12:15 PM
> To: [email protected]
> Subject: Nutch Search Speed Concern
>
> Search Speed
>
> What are the most important factors in nutch/lucene's search speed?
>
> I've been testing nutch's search speed on a search pool with about
> 100M records (separated evenly into 30 segments), and discovered that
> certain search terms have a significantly higher search time than
> others. Some searches take 30 ms while others take upwards of
> 3000ms.
>
> At first, there seemed to be a direct relationship between the total
> number of results from a given query and the time it took to
> complete. But after further testing, that relationship did not hold
> true for all cases. There seem to be other factors that directly
> affect the speed of a search.
>
> Has anyone else encountered this issue? Or have some insight to the
> impact of certain factors on search speed?
>
> Thanks.
>
> - T
>
> _______________________________________________
> Nutch-general mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/nutch-general
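
P.S. To make the distributed search point above concrete, here is a minimal sketch of the scatter-gather pattern Paul describes: the front end fans the query out, then waits until every shard has answered or the timeout hits, whichever comes first. This is not Nutch's actual code. The names query_shard and distributed_search, the simulated delays, and the (score, doc_id) hit format are all made up for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait


def query_shard(delay_s, hits):
    """Hypothetical stand-in for one remote search server.

    Sleeps for delay_s to simulate its response time, then returns its hits.
    """
    time.sleep(delay_s)
    return hits


def distributed_search(shards, timeout_s):
    """Fan the query out to every shard and merge what arrives in time.

    shards: list of (delay_s, hits) pairs, hits being (score, doc_id) tuples.
    Returns (merged_hits, elapsed_s), best score first.
    """
    start = time.monotonic()
    pool = ThreadPoolExecutor(max_workers=len(shards))
    futures = [pool.submit(query_shard, delay, hits) for delay, hits in shards]

    # Blocks until all shards are done or timeout_s elapses, whichever is first.
    done, _late = wait(futures, timeout=timeout_s)

    # Only shards that answered in time contribute to the merged result.
    merged = sorted((hit for f in done for hit in f.result()), reverse=True)
    elapsed = time.monotonic() - start

    pool.shutdown(wait=False)  # do not block on shards that missed the deadline
    return merged, elapsed


if __name__ == "__main__":
    # Assumed setup: the slowest shard happens to hold the best match.
    shards = [
        (0.01, [(0.90, "doc-a")]),
        (0.02, [(0.50, "doc-b")]),
        (0.30, [(0.99, "doc-c")]),
    ]
    for timeout_s in (1.0, 0.1):
        hits, elapsed = distributed_search(shards, timeout_s)
        print(f"timeout={timeout_s}s elapsed={elapsed:.2f}s hits={hits}")
```

With a generous timeout the response takes as long as the slowest shard; with a short one it comes back sooner but silently drops that shard's hits, which is the trade-off I was getting at with the "closest matches on the slowest server" question.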
