> Plan A sounds better because I don't want to consider the entire collection > and then remove results from it.
Fine, your choice. > However, the same code has to work with 2 different collections. The first > one has 30.000 docs the other one 90.000. No problem. The number of docs is irrelevant. > How can I get the total amount of docs from a collection ? IndexReader.numDocs(). See also maxDoc() and numDeletedDocs(). -- Ian. > On 29 March 2011 21:48, Ian Lea <[email protected]> wrote: > >> Here are a couple of ideas. >> >> Plan A. >> >> Think of a number, say 10, retrieve n * 10 docids in your search and >> then loop round java.util.Random.nextInt(n * 10) until you've got >> enough. >> >> Plan B. >> >> Reverse your MUST NOT search to get a list of docids that you don't >> want, then loop round Random.nextInt(indexreader.numDocs()), selecting >> those that are not deleted (!indexreader.isDeleted(docid)) and are not >> in your exclusion list. >> >> >> I'm sure there are other ways, probably better. >> >> >> -- >> Ian. >> >> >> On Tue, Mar 29, 2011 at 8:00 PM, Patrick Diviacco >> <[email protected]> wrote: >> > Ok I've solved the first part of the problem. I'm now selecting all >> > documents that do not contain a given term with a BooleanFilter >> > and FilterClause, MUST NOT. >> > >> > I still have to understand how to retrieve random documents and limit the >> > number of retrieved docs to N. >> > >> > thanks >> > >> > On 29 March 2011 20:40, Patrick Diviacco <[email protected]> >> wrote: >> > >> >> Is there a Filter to get a limited number of random collection docs from >> >> the index which DO NOT contain a specific term ? >> >> >> >> i.e. term="pizza" >> >> >> >> I want to run the query against 10 random documents of the collection >> that >> >> do not contain the term "pizza". >> >> >> >> thanks >> >> >> > >> >> --------------------------------------------------------------------- >> To unsubscribe, e-mail: [email protected] >> For additional commands, e-mail: [email protected] >> >> > --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
