I'm not familiar enough with Ferret to say how to do this there, but
I do this sort of filtering and set intersection with Java Lucene,
primarily using Solr, from a Ruby on Rails front end.

I build up bit sets (using Solr's new OpenBitSet class) that
represent "all items collected", apply that filter to searches, and
also intersect it (by ANDing the bit sets) with other sets such as
"all objects from 1861", "all poetry genre objects", and so on.  I've
also customized Solr to return facet counts, so given your example it
could show how many books are in stock in each category and let you
filter down to those books easily too.  These set intersection
operations even bypass the traditional Lucene search entirely, since
they deal only with efficiently structured sets of document IDs.
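
As a rough illustration of the idea (not my actual Solr code), here is
a minimal sketch that uses the JDK's java.util.BitSet as a stand-in
for OpenBitSet.  The class name, method names, categories, and
document IDs are all made up for the example: each set has bit i
turned on when document i matches, and a facet count is just the
cardinality of a category set ANDed with the "in stock" set.

```java
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: java.util.BitSet stands in for Solr's OpenBitSet.
public class FacetSketch {

    /** Number of document IDs present in both sets (size of the intersection). */
    public static int intersectionCount(BitSet a, BitSet b) {
        BitSet copy = (BitSet) a.clone(); // and() mutates its receiver, so work on a copy
        copy.and(b);
        return copy.cardinality();
    }

    /** For each category, count how many of its documents are also in the filter set. */
    public static Map<String, Integer> facetCounts(Map<String, BitSet> categories,
                                                   BitSet filter) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (Map.Entry<String, BitSet> e : categories.entrySet()) {
            counts.put(e.getKey(), intersectionCount(e.getValue(), filter));
        }
        return counts;
    }

    /** Builds a bit set with the given document IDs turned on. */
    public static BitSet of(int... docIds) {
        BitSet bits = new BitSet();
        for (int id : docIds) {
            bits.set(id);
        }
        return bits;
    }

    public static void main(String[] args) {
        // Made-up index of 8 books, document IDs 0..7.
        BitSet inStock = of(0, 2, 3, 5, 7);

        Map<String, BitSet> categories = new LinkedHashMap<>();
        categories.put("fiction", of(0, 1, 2));
        categories.put("history", of(3, 4, 5, 6));
        categories.put("poetry", of(7));

        // Prints {fiction=2, history=2, poetry=1}
        System.out.println(facetCounts(categories, inStock));
    }
}
```

In a real index the category and stock sets would be built once from
the term data and cached, so counting stocked books per category costs
one AND plus a bit count per category rather than a pass over 600,000
documents.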

        Erik


On Jun 16, 2006, at 12:45 PM, Sergei Serdyuk wrote:

> Let me illustrate my problem a bit more.
>
> There is an index with 1.2M books in it. Every book has a category
> field, and every book can currently be in stock, which is stored in
> the stock field. Now, I generally expect 50-60% of books to be
> stocked. So it leaves me with 600,000 books I would need to iterate
> over to find out what categories are currently stocked.
>
> It sounds like a borderline task where one would think a database
> would be more appropriate, but the ability to do advanced search
> over this collection of books is a top priority, and a database
> would not provide that.
>
> --
> Sergei Serdyuk
> Red Leaf Software LLC
> web: http://redleafsoft.com
>
> -- 
> Posted via http://www.ruby-forum.com/.
> _______________________________________________
> Ferret-talk mailing list
> [email protected]
> http://rubyforge.org/mailman/listinfo/ferret-talk

