On Sat, Jun 4, 2011 at 2:57 AM, Mark Kerzner <[email protected]> wrote:

> Hi,
>
> I need to store, say, 10M-100M documents, with each document having say 100
> fields, like author, creation date, access date, etc., and then I want to
> ask questions like
>
> give me all documents whose author is like abc**, and creation date any time
> in 2010 and access date in 2010-2011, and so on, perhaps 10-20 conditions,
> matching a list of some keywords.
>
> What's best, Lucene, Katta, HBase CF with secondary indices, or plain scan
> and compare of every record?
>

I'd say give Lily a spin. Currently, we rely on Solr for search. In the next
few months, we'll take a good look at "HBase-native" secondary indexes as
well.
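
To make the Solr route concrete, here is a rough sketch of how the kind of query described above might be expressed in the standard Lucene/Solr query syntax (prefix wildcard plus date ranges). The field names (author, creation_date, access_date, text) and the helper functions are hypothetical; they depend entirely on how the index schema is defined, and nothing here is specific to Lily's own API.

```python
from datetime import datetime

# Solr date fields expect ISO 8601 timestamps in UTC with a trailing "Z".
SOLR_DATE_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def solr_range(field, start, end):
    """Build an inclusive range clause, e.g. field:[start TO end].

    Assumes start and end are already expressed in UTC.
    """
    return "{}:[{} TO {}]".format(
        field,
        start.strftime(SOLR_DATE_FORMAT),
        end.strftime(SOLR_DATE_FORMAT),
    )


def build_query(author_prefix, created, accessed, keywords):
    """Combine the conditions into one query string joined with AND.

    All field names are hypothetical placeholders for a real schema.
    Keywords are assumed to be plain words with no characters that
    would need escaping in the query syntax.
    """
    clauses = [
        "author:{}*".format(author_prefix),  # prefix match, like abc*
        solr_range("creation_date", *created),
        solr_range("access_date", *accessed),
    ]
    if keywords:
        clauses.append("text:({})".format(" OR ".join(keywords)))
    return " AND ".join(clauses)


if __name__ == "__main__":
    query = build_query(
        "abc",
        (datetime(2010, 1, 1), datetime(2010, 12, 31, 23, 59, 59)),
        (datetime(2010, 1, 1), datetime(2011, 12, 31, 23, 59, 59)),
        ["hbase", "lucene"],
    )
    print(query)
```

The resulting string would be passed as the q parameter of a Solr select request; with 10-20 conditions, the same pattern simply adds more clauses to the list.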

Lily can be found at www.lilyproject.org.

Thanks,

Steven.
-- 
Steven Noels
http://outerthought.org/
Scalable Smart Data
Makers of Kauri, Daisy CMS and Lily
