Yes, I see the catch. I had actually never noticed that Google's maximum result set is no more than 1000 items. We all know why...
It's just that my pager shows the highest possible page of the result, so if someone clicked on it, it would be displayed, even if it's the 20000th page... Maybe I should rethink that...

Markus

-----Original Message-----
From: Simone Chiaretta [mailto:[email protected]]
Sent: Wednesday, January 20, 2010 09:29
To: [email protected]
Subject: Re: What is the right way for getting paged results with Lucene?

Actually I think this is a false problem: how many times do you go to page 3 of Google? I never do: if I don't find something useful I just change keywords.

Depending on how many results you have on a page, the last results might not even be relevant: I've noticed that after a while the score of docs drops drastically. I filter out docs with a normalized score lower than 0.2, so I rarely have more than 100 results.

I know I didn't answer your question (actually there is no way, afaik, to get paged results), but this is how I dealt with "paging".

simo

On Wednesday, January 20, 2010, Markus Wolters <[email protected]> wrote:
> Hello,
>
> as you might know, I am pretty new to Lucene and integrating 2.9.1 right now
> into my current ASP.NET MVC project. And it's really working like a charm.
> (Thanks to Michael)
>
> I am curious whether I've understood correctly how to do paging with
> Lucene. I've implemented it like so:
>
> IndexSearcher searcher = Searcher;
>
> // Collect all resulting documents until selected page
> TopScoreDocCollector collector =
>     TopScoreDocCollector.create((pageIndex + 1) * pageSize, false);
> searcher.Search(query, collector);
>
> // Get documents for selected page
> TopDocs hits = collector.TopDocs(pageIndex * pageSize, pageSize);
>
> So if someone selects one of the last pages of a huge result, Lucene
> would go over a lot of results, even though I just need 'pageSize' results, is
> that right? What about the performance or memory usage? I took a sneak peek
> into the Searcher code and believe I saw that Lucene creates a
> queue as big as the number of documents to get. So in case of a total hit count of, let's say,
> 20000, a pageSize of 20 and selecting the last page (999), even though I
> actually need just the last 20 documents, Lucene is getting (and even
> allocating memory for?) all 20000 resulting documents. Is that right?
>
> In other terms, I want to do a MySQL-equivalent to SELECT [...] LIMIT
> pageIndex * pageSize, pageSize.
>
> Markus
>
>

--
Simone Chiaretta
Microsoft MVP ASP.NET - ASPInsider
Blog: http://codeclimber.net.nz
RSS: http://feeds2.feedburner.com/codeclimber
twitter: @simonech

Any sufficiently advanced technology is indistinguishable from magic
"Life is short, play hard"
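
To make the queue-size arithmetic from this thread concrete, here is a minimal stand-alone sketch in plain Java (Lucene's original language). It is not Lucene or Lucene.NET code: `PagingSketch`, `Hit`, `queueSize`, `lastPageIndex` and `page` are hypothetical names made up for illustration, and the bounded min-heap is only assumed to resemble what a top-N collector does. The `maxResults` cap models the Google-style limit of 1000 results mentioned above.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical stand-alone sketch; it does not use Lucene or Lucene.NET.
final class PagingSketch {

    /** Stand-in for a search hit: a document id plus its score. */
    static final class Hit {
        final int doc;
        final float score;
        Hit(int doc, float score) { this.doc = doc; this.score = score; }
    }

    /** Orders hits by ascending score, so the weakest hit is the heap's head. */
    static final Comparator<Hit> BY_SCORE = (a, b) -> Float.compare(a.score, b.score);

    /** Entries a top-N queue must hold to serve the given zero-based page (pageSize >= 1). */
    static int queueSize(int pageIndex, int pageSize) {
        return (pageIndex + 1) * pageSize;
    }

    /** Highest zero-based page a pager should offer when results are capped at maxResults. */
    static int lastPageIndex(int totalHits, int pageSize, int maxResults) {
        int reachable = Math.min(totalHits, maxResults);
        return reachable == 0 ? 0 : (reachable - 1) / pageSize;
    }

    /** Keeps only the best queueSize(...) hits, then returns the slice for the requested page. */
    static List<Hit> page(Iterable<Hit> allHits, int pageIndex, int pageSize) {
        int n = queueSize(pageIndex, pageSize);
        PriorityQueue<Hit> queue = new PriorityQueue<>(n, BY_SCORE);
        for (Hit h : allHits) {
            if (queue.size() < n) {
                queue.add(h);                       // queue not full yet
            } else if (h.score > queue.peek().score) {
                queue.poll();                       // evict the weakest kept hit
                queue.add(h);
            }
        }
        List<Hit> best = new ArrayList<>(queue);
        best.sort(BY_SCORE.reversed());             // best score first
        int from = Math.min(pageIndex * pageSize, best.size());
        int to = Math.min(from + pageSize, best.size());
        return new ArrayList<>(best.subList(from, to));
    }

    public static void main(String[] args) {
        System.out.println(queueSize(999, 20));             // the example from the thread
        System.out.println(lastPageIndex(20000, 20, 1000)); // pager capped at 1000 results
    }
}
```

For the numbers in the thread, `queueSize(999, 20)` is 20000, so serving the last page needs a queue as large as the whole result, while a 1000-result cap gives `lastPageIndex(20000, 20, 1000)` of 49, meaning the pager would offer 50 pages. Each queue entry in this sketch is only an id and a score; as far as I know, Lucene's queue likewise holds doc id and score entries rather than loaded documents, but treat that as an assumption to check against the collector source.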
