>>and "n" searches to get the Documents, ??? Where does the "n" come in? searcher.doc(id) is not a search. It is a call to IndexReader.document() to retrieve a specific document. Try run it. It shouldn't be slow.
----- Original Message ---- From: Alixandre Santana <[EMAIL PROTECTED]> To: java-user@lucene.apache.org Sent: Tuesday, 3 July, 2007 5:17:51 PM Subject: Re: Pagination Mark, Thanks for the code. Well..I´m doing the same thing you are: Retrieve some Doc IDs and then use the code - Document doc=searcher.doc(sd[i].doc) - to get the Document itself. But in this case, we are doing a search to get the IDs, and "n" searches to get the Documents, which is not a good practice. Is there another option of do it? Alixandre On 7/3/07, mark harwood <[EMAIL PROTECTED]> wrote: > >>I get the ids then I do look the items in the database using select item.* > >>from item where item.id in ( ids ) > > Hmm. That's likely to confuse the already confused :) > The ids referred to so far are Lucene internal document ids and are typically > only meaningful to Lucene during a single IndexReader session. I wouldn't > recommend storing them in a database because a Lucene document id can point > to an entirely different document after deletes/updates are performed on the > Lucene index and the IndexReader is reopened. > > For the avoidance of further confusion I have extended the "main" method in > my previous example (reposted below in full) to include examples of > 1) Retrieving document content > 2) Retrieving a "next" page (starting from result 11) > The values "1" and "11" used below in the calls to HitPageCollector > constructor define the page start. This value is typically something you > would get the client to pass to you e.g. note the number "10" in this URL > http://www.google.com/search?q=lucene&start=10 which is used to select > results from "10" onwards. Note also that this URL > http://www.google.com/search?q=lucene&&start=10000 does not work because > Google have placed a restriction on the maximum value for "start" - you > should too. > > Cheers > Mark > > > package lucene.pagination; > > import org.apache.lucene.document.Document; > import org.apache.lucene.index.Term; > import org.apache.lucene.search.HitCollector; > import org.apache.lucene.search.IndexSearcher; > import org.apache.lucene.search.Query; > import org.apache.lucene.search.ScoreDoc; > import org.apache.lucene.search.TermQuery; > import org.apache.lucene.util.PriorityQueue; > > /** > * A HitCollector that retrieves a specific page of results > * @author maharwood > */ > public class HitPageCollector extends HitCollector > { > //Demo code showing pagination > public static void main(String[] args) throws Exception > { > IndexSearcher s=new IndexSearcher("/indexes/nasa"); > Query q=new TermQuery(new Term("contents","sea")); > > //Retrieve page 1 (hits 1-10) > HitPageCollector hpc=new HitPageCollector(1,10); > s.search(q,hpc); > ScoreDoc[] sd = hpc.getScores(); > System.out.println("Hits "+ hpc.getStart()+" - "+ hpc.getEnd()+" of > "+hpc.getTotalAvailable()); > for (int i = 0; i < sd.length; i++) > { > Document doc=s.doc(sd[i].doc); > System.out.println(sd[i].score +" "+doc.get("title")); > } > > //Example retrieve page 2 (hits 11-20) > hpc=new HitPageCollector(11,10); > s.search(q,hpc); > sd = hpc.getScores(); > System.out.println("Hits "+ hpc.getStart()+" - "+ hpc.getEnd()+" of > "+hpc.getTotalAvailable()); > for (int i = 0; i < sd.length; i++) > { > Document doc=s.doc(sd[i].doc); > System.out.println(sd[i].score +" "+doc.get("title")); > } > > > s.close(); > } > > int nDocs; > PriorityQueue hq; > float minScore = 0.0f; > int totalHits = 0; > int start; > int maxNumHits; > int totalInThisPage; > > public HitPageCollector(int start, int maxNumHits) > { > this.nDocs = start + maxNumHits; > this.start = start; > this.maxNumHits = maxNumHits; > hq = new HitQueue(nDocs); > } > > public void collect(int doc, float score) > { > totalHits++; > if((hq.size()<nDocs)||(score >= minScore)) > { > ScoreDoc scoreDoc = new ScoreDoc(doc,score); > hq.insert(scoreDoc); // update hit queue > minScore = ((ScoreDoc)hq.top()).score; // reset minScore > } > totalInThisPage=hq.size(); > } > > > public ScoreDoc[] getScores() > { > //just returns the number of hits required from the required start > point > /* > So, given hits: > 1234567890 > and a start of 2 + maxNumHits of 3 should return: > 234 > or, given hits > 12 > should return > 2 > and so, on. > */ > if (start <= 0) > { > throw new IllegalArgumentException("Invalid start :" + start+" - > start should be >=1"); > } > int numReturned = Math.min(maxNumHits, (hq.size() - (start - 1))); > if (numReturned <= 0) > { > return new ScoreDoc[0]; > } > ScoreDoc[] scoreDocs = new ScoreDoc[numReturned]; > ScoreDoc scoreDoc; > for (int i = hq.size() - 1; i >= 0; i--) // put docs in array, > working backwards from lowest count > { > scoreDoc = (ScoreDoc) hq.pop(); > if (i < (start - 1)) > { > break; //off the beginning of the results array > } > if (i < (scoreDocs.length + (start - 1))) > { > scoreDocs[i - (start - 1)] = scoreDoc; //within scope of > results array > } > } > return scoreDocs; > } > > public int getTotalAvailable() > { > return totalHits; > } > > public int getStart() > { > return start; > } > > public int getEnd() > { > return start+totalInThisPage-1; > } > > public class HitQueue extends PriorityQueue > { > public HitQueue(int size) > { > initialize(size); > } > public final boolean lessThan(Object a, Object b) > { > ScoreDoc hitA = (ScoreDoc)a; > ScoreDoc hitB = (ScoreDoc)b; > if (hitA.score == hitB.score) > return hitA.doc > hitB.doc; > else > return hitA.score < hitB.score; > } > } > } > > > > > > > ___________________________________________________________ > Yahoo! Mail is the world's favourite email. Don't settle for less, sign up for > your free account today > http://uk.rd.yahoo.com/evt=44106/*http://uk.docs.yahoo.com/mail/winter07.html > > --------------------------------------------------------------------- > To unsubscribe, e-mail: [EMAIL PROTECTED] > For additional commands, e-mail: [EMAIL PROTECTED] > > -- Alixandre Santana --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] ___________________________________________________________ Yahoo! Answers - Got a question? Someone out there knows the answer. Try it now. http://uk.answers.yahoo.com/ --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]