I'm not familiar enough with Ferret, but I do this sort of filtering
and set intersection with Java Lucene, primarily through Solr, from a
Ruby on Rails front end.
I build up bit sets (using Solr's new OpenBitSet class) that
represent "all items collected", apply that filter to searches, and
also intersect it (by ANDing the bit sets) with other sets such as
"all objects from 1861" and "all poetry genre objects". I've also
customized Solr to return facet counts, so given your example it
could show how many books are in stock in each category and let you
filter down to those books easily. These set intersection operations
can even bypass the traditional Lucene search entirely, because they
work directly on efficiently structured sets of document IDs.
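To make the idea concrete, here is a rough sketch of the intersection and facet counting in plain Java. It is not my Solr code: java.util.BitSet stands in for OpenBitSet, and the StockFacets class, its method names, and the sample document IDs are all made up for illustration. In a real index the bit sets would be built from the postings for a term such as the stock flag or a category value.

```java
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Lucene or Solr. java.util.BitSet stands in
// for Solr's OpenBitSet; bit i is set when document id i belongs to the set.
final class StockFacets {

    // Intersect two document sets without modifying either input.
    static BitSet intersect(BitSet a, BitSet b) {
        BitSet result = (BitSet) a.clone();
        result.and(b);
        return result;
    }

    // For each category, count the documents that are also in the filter set.
    static Map<String, Integer> facetCounts(Map<String, BitSet> categories, BitSet filter) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (Map.Entry<String, BitSet> e : categories.entrySet()) {
            counts.put(e.getKey(), intersect(e.getValue(), filter).cardinality());
        }
        return counts;
    }

    // Build a bit set from a list of document ids (illustrative data only).
    static BitSet of(int... docIds) {
        BitSet bits = new BitSet();
        for (int id : docIds) {
            bits.set(id);
        }
        return bits;
    }

    public static void main(String[] args) {
        BitSet inStock = of(0, 2, 3, 5);
        Map<String, BitSet> categories = new LinkedHashMap<>();
        categories.put("poetry", of(0, 1, 2));
        categories.put("history", of(3, 4));
        categories.put("fiction", of(6));
        System.out.println(facetCounts(categories, inStock));
    }
}
```

A category whose count comes back as zero has nothing in stock. The cost is one AND and one bit count per category, with no need to iterate over the stocked documents themselves.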
Erik
On Jun 16, 2006, at 12:45 PM, Sergei Serdyuk wrote:
> Let me illustrate my problem a bit more.
>
> There is an index with 1.2M books in it. Every book has a category
> field, and every book can currently be in stock, which is stored in a
> stock field. Now, I generally expect 50-60% of the books to be
> stocked. So it leaves me with 600,000 books I would need to iterate
> over to find out which categories are currently stocked.
>
> It sounds like a borderline task where one would think a database
> would be more appropriate, but the ability to do advanced search over
> this collection of books is a top priority, and a database would not
> provide that.
>
> --
> Sergei Serdyuk
> Red Leaf Software LLC
> web: http://redleafsoft.com
>
> --
> Posted via http://www.ruby-forum.com/.
> _______________________________________________
> Ferret-talk mailing list
> [email protected]
> http://rubyforge.org/mailman/listinfo/ferret-talk