I would suggest benchmarking this before doing any more complex
design. A field with only 10k unique integer or string values will
search very quickly.
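
As a starting point for that benchmark, here is a minimal Python sketch.
It builds OR queries like the one quoted below and times them. The names
build_or_query, random_id_sets and benchmark are hypothetical helpers, not
part of any library. The size mix (about 90% of queries under 10 IDs, the
rest up to 300) is an assumption based on the description in the quoted
message, and run_query is a placeholder for whatever function sends the
query string to your search server.

```python
import random
import statistics
import time
from typing import Callable, Iterable, List


def build_or_query(field: str, ids: Iterable[int]) -> str:
    """Return a Lucene-style query matching documents with any of the IDs."""
    return f"{field}:({' OR '.join(str(i) for i in ids)})"


def random_id_sets(n_queries: int, max_id: int = 10_000, seed: int = 0) -> List[List[int]]:
    """Generate ID sets; the size distribution below is an assumption."""
    rng = random.Random(seed)
    sets = []
    for _ in range(n_queries):
        # Assumed mix: ~90% small queries (1-9 IDs), ~10% large (10-300 IDs).
        size = rng.randint(1, 9) if rng.random() < 0.9 else rng.randint(10, 300)
        sets.append(rng.sample(range(1, max_id + 1), size))
    return sets


def benchmark(run_query: Callable[[str], object], queries: List[str]) -> dict:
    """Time each query via run_query and summarize latencies in milliseconds."""
    if not queries:
        raise ValueError("need at least one query to benchmark")
    timings = []
    for q in queries:
        start = time.perf_counter()
        run_query(q)  # placeholder: send q to the search server
        timings.append((time.perf_counter() - start) * 1000.0)
    return {
        "count": len(timings),
        "median_ms": statistics.median(timings),
        "max_ms": max(timings),
    }
```

To use it, pass benchmark() a function that issues the query against your
own index, along with something like
[build_or_query("myfield", s) for s in random_id_sets(1000)]. Because each
generated ID set is different, the timings should mostly reflect uncached
searches, which is the case the quoted message is worried about.
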
On Thu, May 6, 2010 at 7:54 AM, Nagelberg, Kallin wrote:
> Hey everyone,
>
> I'm having some difficulty figuring out the best way to optimize for a
> certain query situation. My documents have a many-valued field that stores
> lists of IDs. All in all there are probably about 10,000 distinct IDs
> throughout my index. I need to be able to query and find all documents that
> contain a given set of IDs. I.e., I want to find all documents that contain IDs
> 3, 202, 3030 or 505. Currently I'm implementing this like so:
>
> q=(myfield:3) OR (myfield:202) OR (myfield:3030) OR (myfield:505)
>
> It's possible that there could be upwards of hundreds of terms, although 90%
> of the time it will be under 10. Ideally I would like to do this with a
> filter query, but I have read that it is impossible to cache OR'd terms in a
> fq, though this feature may come soon. The problem is that the combinations
> of OR'd terms will almost always be unique, so the query cache will have a
> very low hit rate. It would be great if the individual terms could be cached
> individually, but I'm not sure how to accomplish that.
>
> Any suggestions would be welcome!
> -Kallin Nagelberg
>
>
--
Lance Norskog
goks...@gmail.com