hey chris, i was using the Hits.doc method while iterating... you've given me some hope! i will look into the FieldCache.
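In case it helps anyone else following the thread, here is roughly what I understand the FieldCache approach to look like -- a minimal sketch against the Lucene 1.4-era API, not a tested implementation; the "category" field name, the reader, and the query are made-up placeholders:

    import java.io.IOException;
    import java.util.HashMap;
    import java.util.Map;

    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.search.FieldCache;
    import org.apache.lucene.search.HitCollector;

    /**
     * Tallies hits per group value using the FieldCache, so each
     * matching doc costs one array lookup instead of a stored-field
     * read (no Hits.doc / IndexReader.document calls at all).
     */
    public class GroupCountCollector extends HitCollector {
      private final String[] groupByDoc;        // docId -> field value
      private final Map counts = new HashMap(); // value -> Integer

      public GroupCountCollector(IndexReader reader, String field)
          throws IOException {
        // Assumes the field is indexed and single-valued per doc.
        groupByDoc = FieldCache.DEFAULT.getStrings(reader, field);
      }

      public void collect(int doc, float score) {
        // group is null for docs with no value in this field
        String group = groupByDoc[doc];
        Integer n = (Integer) counts.get(group);
        counts.put(group, new Integer(n == null ? 1 : n.intValue() + 1));
      }

      public Map getCounts() { return counts; }
    }

You would then run the search through it instead of using Hits:

    GroupCountCollector groups = new GroupCountCollector(reader, "category");
    searcher.search(query, groups);
    Map counts = groups.getCounts(); // group value -> hit count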
Chris Hostetter <[EMAIL PROTECTED]> wrote:

: currently, i am iterating through about 200-300 of the top docs and
: creating the groups (so, as of now, the groups are partial). my
: response time HAS to be at most 500-600 milliseconds (query + groupings)
: or my company will probably go with a commercial search engine such as
: FAST or something of the sort

If, when you say you are "iterating through about 200-300" docs, you mean
you are using either the Hits.doc method or the IndexReader.document
method on those 200-300 docs, then I am 99.9999% confident any of the
approaches described so far will be faster than what you are doing now.

: the data i must prove lucene can handle is up to 25 million 2-4Kb
: docs. i dont think it is feasible to create groups on a result set
: consisting of a few thousand results in a 1/2 second, or am i wrong?

With 25 million docs, caching BitSets for every group may not be feasible
(depending on how many groupings you have), but using either the TermEnum
iterating or the FieldCache enumerating described recently, it should be
possible.

If you expect the number of results from each search (a few thousand) to
be a small fraction of the total number of docs (25 million), then my
money is on the FieldCache approach, because you'll only be iterating
over the matching docs (instead of over every doc and then testing to
see if it's in your results).

-Hoss
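For completeness, here is a rough sketch of the cached-BitSet variant Hoss describes, again assuming the 1.4-era API and a placeholder field name. With 25 million docs each BitSet is roughly 3 MB, which is why this only pays off when the number of distinct groups is small:

    import java.io.IOException;
    import java.util.BitSet;
    import java.util.HashMap;
    import java.util.Map;

    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.index.Term;
    import org.apache.lucene.index.TermDocs;
    import org.apache.lucene.index.TermEnum;

    /** Builds one BitSet of docIds per distinct value of the field. */
    public static Map buildGroupBitSets(IndexReader reader, String field)
        throws IOException {
      Map bitSets = new HashMap(); // field value -> BitSet
      TermEnum terms = reader.terms(new Term(field, ""));
      TermDocs docs = reader.termDocs();
      try {
        do {
          Term t = terms.term();
          if (t == null || !field.equals(t.field())) break; // past our field
          BitSet bits = new BitSet(reader.maxDoc());
          docs.seek(t);
          while (docs.next()) {
            bits.set(docs.doc());
          }
          bitSets.put(t.text(), bits);
        } while (terms.next());
      } finally {
        docs.close();
        terms.close();
      }
      return bitSets;
    }

At query time you would collect the matching docs into a BitSet of your own (e.g. with a HitCollector), then for each group clone its cached BitSet, and() it against the hits, and read cardinality() for the per-group count.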