1) The index should be optimized (merged into a single segment) to make searching faster.

2) A large result set is a common cause of slow searching.
Do you retrieve any stored fields from the returned hits?

Note:
The search itself only collects document IDs, but retrieving any stored field loads the full document. This greatly increases the time needed to retrieve a large result set. Splitting the returned result into pages and retrieving stored data _only_for_the_current_page_ therefore makes the search much faster.

It's also a good idea to store the returned result (IDs and scores, or only IDs) in an array and cache it between requests.
Documents can then be retrieved with an $index->getDocument($id) call.
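
As a rough sketch of the paging advice above (not a definitive implementation): the helper below takes a cached array of hit IDs and loads documents only for the requested page. The names fetchPage, $hitIds and $loadDocument are hypothetical. In a real script, $hitIds would be filled from the ->id property of the hits returned by $index->find($query), and $loadDocument would wrap $index->getDocument($id). A stand-in loader is used here so the example runs without an index.

```php
<?php
// Sketch only: $hitIds stands in for the IDs collected from the search hits,
// and $loadDocument stands in for a wrapper around $index->getDocument($id).

/**
 * Return the stored data for a single page of a cached hit list.
 * Only the IDs on the requested page are passed to $loadDocument.
 */
function fetchPage(array $hitIds, $page, $perPage, $loadDocument)
{
    // Pages are 1-based; clamp the offset so page 0 or below behaves like page 1.
    $offset  = max(0, ($page - 1) * $perPage);
    $pageIds = array_slice($hitIds, $offset, $perPage);

    $documents = array();
    foreach ($pageIds as $id) {
        $documents[$id] = call_user_func($loadDocument, $id);
    }
    return $documents;
}

// Stand-in loader that records which IDs were actually loaded.
$loaded = array();
$loader = function ($id) use (&$loaded) {
    $loaded[] = $id;
    return "document #$id";
};

// Pretend the search matched 30,000 documents and the IDs were cached.
$hitIds  = range(1, 30000);
$pageTwo = fetchPage($hitIds, 2, 10, $loader);
```

With 10 hits per page, only the 10 documents on page 2 are loaded, however many documents the search matched.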

With best regards,
   Alexander Veremyev.

Simon Gelfand wrote:
Hi Craig,

You can see a test here: with 130,000 articles indexed, searching is
slow, taking 5-6 seconds.

I have added paging + max 250 hits displayed + memory caching to speed
browsing after an initial search.

Here is an example:
http://www.articlesbase.com/test-search.php?q=business+consulting+firms (cached)
http://www.articlesbase.com/test-search.php?q=business+ (not cached)

Any ideas for speeding up the search itself?

For example, limiting the number of results when searching (currently it
returns something like 30,000 results, which I then slice), setting a
minimum score for a result, and so on?

Simon

On 5/9/07, Craig Slusher <[EMAIL PROTECTED]> wrote:
webshark27,

When you get your articles indexed, it would be really great if you
can share your experience with searching against it. I would love to
know how well the Zend implementation of Lucene handles the load.

On 5/8/07, webshark27 <[EMAIL PROTECTED]> wrote:
>
> Hi Chris,
>
> Thanks for the quick response.
>
> Doesn't the "$doc = new Zend_Search_Lucene_Document();" just overwrite the
> old one?
>
> Also, I think $index->addDocument($doc) is filling up memory fast. I don't
> know exactly how MergeFactor, MaxMergeDocs and MaxBufferedDocs affect this
> issue.
>
> I am adding 10,000 documents each run and then committing the changes, then
> loading the script again and running ....
>
>
> Chris Blaise wrote:
> >
> >
> > It's been a few months since I worked with this, but I had some weird
> > errors. I'm not sure I ever figured out whether they were due to running
> > out of memory or to some weird corruption I was seeing that caused the
> > script to exit.
> >
> > The fix to my problem was to free memory. In your case, try setting
> > $doc to null when you're finished with it in the loop, right after the
> > $index->addDocument($doc).
> >
> >  Chris
> >
> >
>
> --
> View this message in context: http://www.nabble.com/Zend_Search_Lucene---Best-Practices-for-Indexing-100k%2B-articles-tf3712199s16154.html#a10385215
> Sent from the Zend Framework mailing list archive at Nabble.com.
>
>


--
Craig Slusher
[EMAIL PROTECTED]



