Yes. The trick is to store a hash value on each document. The SignatureUpdateProcessorFactory update processor can compute one at index time. Store the hash value in a hex string field.
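
To see why a hex hash field works for sampling, here is a minimal sketch in plain Python rather than Solr. It assumes SHA-256 as a stand-in hash (not necessarily what the update processor computes) and made-up document IDs; hex_hash and sample_by_prefix are hypothetical helpers. Each hex digit of a well-mixed hash takes 16 roughly equally likely values, so a one-character prefix keeps about 1/16 of the documents and a two-character prefix about 1/256.

```python
import hashlib


def hex_hash(doc_id: str) -> str:
    """Hex digest of a document's identifying text.

    SHA-256 is an assumption for this sketch; any well-mixed hash
    stored as a hex string behaves the same way for sampling.
    """
    return hashlib.sha256(doc_id.encode("utf-8")).hexdigest()


def sample_by_prefix(doc_ids, prefix):
    """Keep documents whose hex hash starts with `prefix`,
    mimicking what a prefix query on the hash field would match."""
    return [d for d in doc_ids if hex_hash(d).startswith(prefix)]


# Made-up document IDs, purely for illustration.
doc_ids = [f"doc-{i}" for i in range(100_000)]

one_char = sample_by_prefix(doc_ids, "a")    # expect about 1/16 of the set
two_char = sample_by_prefix(doc_ids, "00")   # expect about 1/256 of the set

print(len(one_char) / len(doc_ids))
print(len(two_char) / len(doc_ids))
```

The sample is repeatable: the same prefix always selects the same documents, so a different prefix is needed to get a different sample.
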
Now, do wildcard queries on the hash string: hash:a* will match roughly 1/16 of the documents, and hash:00* will match roughly 1/256. The selection is pseudo-random but repeatable, since the same documents match a given prefix every time. Facet over the documents that match the prefix query, then scale the counts up by 16 or 256 to approximate the totals for the full result set.

On Wed, May 16, 2012 at 6:43 AM, Yuval Dotan <yuvaldo...@gmail.com> wrote:
> Hi Guys
> We have an environment containing billions of documents.
> Faceting over this large result set could take many seconds, and so we
> thought we might be able to use statistical sampling of a smaller result
> set from the facet, and give an approximate result much quicker.
> Is there any way to facet only a random sample of the results?
> Thanks
> Yuval

--
Lance Norskog
goks...@gmail.com