Hi Lucene Users!

 

I've been playing around with dotLucene on a few projects for about 4
months, and I've found Lucene to be exceptionally powerful, speedy and,
thanks to LIA, really easy to use.

But I've hit an issue that I fear will cause performance problems for our
architecture and Lucene installation.

 

We have an index of about 100,000 documents with about 30 fields, built from
our database.

Each document in the index contains a TOKENIZED string field of Category
Names, so that each document can belong to many categories.

 

We have a new requirement to not only allow searches across the whole index,
but also to return the number of matching documents in each of the (150)
possible categories. This is similar to an Amazon search
(http://amazon.com/s/ref=nb_ss_gw/105-0072880-3737226?url=search-alias%3Daps
&field-keywords=diamond&Go.x=0&Go.y=0&Go=Go), where a category list is
presented on the left with the number of results in each category.

 

So far, I can think of two possible ways to implement this:

 

1.      Create a QueryFilter for the user-entered query, and perform a
category field search for each category.
2.      Create a separate index for each category, and sequentially (or
concurrently) search across all the indexes. 
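
To make option 1 more concrete, here is a rough sketch of the counting step
as I picture it. It is plain Java using java.util.BitSet as a stand-in for
the document bit sets a filter would yield; it makes no Lucene or dotLucene
calls, and the names CategoryCounts, countPerCategory, queryBits and
categoryBits are hypothetical. It assumes one bit set per category could be
computed once and reused, so each search would only need a cheap
intersection per category.

```java
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.Map;

class CategoryCounts {

    // Assumption: queryBits has bit i set when document i matched the user
    // query, and categoryBits holds one precomputed bit set per category.
    // Both are hypothetical stand-ins, not Lucene API objects.
    static Map<String, Integer> countPerCategory(BitSet queryBits,
                                                 Map<String, BitSet> categoryBits) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (Map.Entry<String, BitSet> entry : categoryBits.entrySet()) {
            // and() modifies its receiver, so intersect on a copy.
            BitSet hits = (BitSet) queryBits.clone();
            hits.and(entry.getValue());
            // cardinality() is the number of documents in both sets.
            counts.put(entry.getKey(), hits.cardinality());
        }
        return counts;
    }

    public static void main(String[] args) {
        // Documents 0, 2 and 3 matched the user query.
        BitSet query = new BitSet();
        query.set(0);
        query.set(2);
        query.set(3);

        // Two made-up categories over five documents.
        BitSet jewelry = new BitSet();
        jewelry.set(0);
        jewelry.set(1);
        jewelry.set(2);
        BitSet books = new BitSet();
        books.set(3);
        books.set(4);

        Map<String, BitSet> categories = new LinkedHashMap<>();
        categories.put("jewelry", jewelry);
        categories.put("books", books);

        System.out.println(countPerCategory(query, categories));
        // prints {jewelry=2, books=1}
    }
}
```

With 150 categories this would still be 150 intersections per search, but
each one is a bitwise AND over roughly 100,000 bits rather than a full
search. I have not measured whether that holds up in practice.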

 

Does anyone know which of these two solutions is better?

 

Both solutions seem taxing to me because they both involve "number of
categories + 1" searches.

 

Regards,

 -V

 

 
