Actually, I think this is a false problem:
how often do you go to page 3 of Google? I never do: if I don't
find something useful I just change keywords.
Depending on how many results you have on a page, the last results
might not even be relevant. I've noticed that after a while the score
of the docs drops drastically, so I filter out docs with a normalized score
lower than 0.2 and rarely end up with more than 100 results.
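
A rough sketch of that kind of filter, in plain C# with no Lucene.Net calls. The Hit class and the FilterByNormalizedScore helper are hypothetical names made up for illustration, and "normalized" is assumed here to mean "divided by the best score in the result list"; in a real app the scores would come from whatever the search returned.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a search hit; NOT a Lucene.Net type.
public sealed class Hit
{
    public int DocId { get; private set; }
    public float Score { get; private set; }

    public Hit(int docId, float score)
    {
        DocId = docId;
        Score = score;
    }
}

public static class ScoreFilter
{
    // Keeps the hits whose score, divided by the highest score in the list,
    // is at least minNormalizedScore (assumed definition of "normalized").
    public static List<Hit> FilterByNormalizedScore(
        IEnumerable<Hit> hits, float minNormalizedScore)
    {
        List<Hit> list = hits.ToList();
        if (list.Count == 0)
        {
            return list;
        }

        float maxScore = list.Max(h => h.Score);
        if (maxScore <= 0f)
        {
            // Nothing meaningful to normalize against.
            return new List<Hit>();
        }

        return list
            .Where(h => h.Score / maxScore >= minNormalizedScore)
            .ToList();
    }
}
```

For example, with raw scores 2.0, 1.0 and 0.3 and a threshold of 0.2, the normalized scores are 1.0, 0.5 and 0.15, so only the first two hits survive.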

I know I didn't answer your question, and afaik there is no way
to get paged results, but this is how I dealt with "paging".

simo

On Wednesday, January 20, 2010, Markus Wolters <[email protected]> wrote:
> Hello,
>
> as you might know, I am pretty new to Lucene and integrating 2.9.1 right now
> into my current ASP.NET MVC project. And it's really working like a charm.
> (Thanks to Michael)
>
> I am curious whether I've understood correctly how to do paging with
> Lucene. I've implemented it like so:
>
>         IndexSearcher searcher = Searcher;
>
>         // Collect all resulting documents until selected page
>         TopScoreDocCollector collector =
>             TopScoreDocCollector.create((pageIndex + 1) * pageSize, false);
>         searcher.Search(query, collector);
>
>         // Get documents for selected page
>         TopDocs hits = collector.TopDocs(pageIndex * pageSize, pageSize);
>
> So if someone selects one of the last pages of a huge result set, Lucene
> would go over a lot of results, even though I just need 'pageSize' results, is
> that right? What about the performance or memory usage? I took a sneak peek
> into the Searcher code and believe I saw that Lucene creates a
> queue as big as the number of documents to get. So with a total hit count of,
> let's say, 20000, a pageSize of 20, and the last page (999) selected, even
> though I actually need just the last 20 documents, Lucene is getting (and even
> allocating memory for?) all 20000 resulting documents. Is that right?
>
> In other words, I want the equivalent of MySQL's SELECT [...] LIMIT
> pageIndex * pageSize, pageSize.
>
> Markus

-- 
Simone Chiaretta
Microsoft MVP ASP.NET - ASPInsider
Blog: http://codeclimber.net.nz
RSS: http://feeds2.feedburner.com/codeclimber
twitter: @simonech

Any sufficiently advanced technology is indistinguishable from magic
"Life is short, play hard"
