Re: What's the best approach to search in Cassandra

2011-06-16 Thread Jake Luciani
Mark,

Solandra doesn't use secondary indexes; their functionality is too limited for
the Lucene API. It maintains its own indexes in regular column families.
I suggest you look at Solr and decide whether it offers the functionality you
need. Solandra offers the same API, but on Cassandra's distributed model.
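
To make "its own indexes in regular column families" a bit more concrete, here
is a minimal Python sketch of the general idea: an inverted index kept as wide
rows, one row per (field, term), with document ids as the column names. This is
an illustration only and not Solandra's actual schema. The dict stands in for a
column family, and index_doc and search are hypothetical names.

```python
from collections import defaultdict

# Stand-in for a column family (assumption, not Solandra's real layout):
# row key is (field, term); the "columns" in that row are document ids.
term_index = defaultdict(set)


def index_doc(doc_id, fields):
    """Add one document's field values to the inverted index."""
    for field, value in fields.items():
        for term in str(value).lower().split():
            term_index[(field, term)].add(doc_id)


def search(**conditions):
    """Return ids of documents that match every field=term condition."""
    postings = [
        term_index.get((field, term.lower()), set())
        for field, term in conditions.items()
    ]
    if not postings:
        return set()
    # Intersect starting from the shortest posting list.
    postings.sort(key=len)
    result = set(postings[0])
    for posting in postings[1:]:
        result &= posting
    return result


index_doc("doc1", {"author": "abc", "year": 2010})
index_doc("doc2", {"author": "abc", "year": 2011})
index_doc("doc3", {"author": "xyz", "year": 2010})

print(sorted(search(author="abc", year="2010")))
```

Each extra condition is one more row read plus a set intersection, which is why
this layout copes with many conditions per query better than scanning every
record.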

-Jake

On Thu, Jun 16, 2011 at 12:56 AM, Mark Kerzner markkerz...@gmail.com wrote:

 Jake,

 *You need to maintain a huge number of distinct indexes.*

 Are we talking about secondary indexes? If yes, this sounds like exactly
 my problem. There is so little documentation! - but I think that if I read
 all there is on GitHub, I can probably start using it.

 Thank you,
 Mark

 On Fri, Jun 3, 2011 at 8:07 PM, Jake Luciani jak...@gmail.com wrote:

 Mark,

 Check out Solandra.  http://github.com/tjake/Solandra


 On Fri, Jun 3, 2011 at 7:56 PM, Mark Kerzner markkerz...@gmail.com wrote:

 Hi,

 I need to store, say, 10M-100M documents, with each document having say
 100 fields, like author, creation date, access date, etc., and then I want
 to ask questions like

 give me all documents whose author is like abc**, and creation date any
 time in 2010 and access date in 2010-2011, and so on, perhaps 10-20
 conditions, matching a list of some keywords.

 What's best: Lucene, Katta, a Cassandra CF with secondary indices, or a plain
 scan and compare of every record?

 Thanks a bunch!

 Mark




 --
 http://twitter.com/tjake





-- 
http://twitter.com/tjake


Re: What's the best approach to search in Cassandra

2011-06-15 Thread Mark Kerzner
Jake,

*You need to maintain a huge number of distinct indexes.*

Are we talking about secondary indexes? If yes, this sounds like exactly my
problem. There is so little documentation! - but I think that if I read all
there is on GitHub, I can probably start using it.

Thank you,
Mark

On Fri, Jun 3, 2011 at 8:07 PM, Jake Luciani jak...@gmail.com wrote:

 Mark,

 Check out Solandra.  http://github.com/tjake/Solandra


 On Fri, Jun 3, 2011 at 7:56 PM, Mark Kerzner markkerz...@gmail.com wrote:

 Hi,

 I need to store, say, 10M-100M documents, with each document having say
 100 fields, like author, creation date, access date, etc., and then I want
 to ask questions like

 give me all documents whose author is like abc**, and creation date any
 time in 2010 and access date in 2010-2011, and so on, perhaps 10-20
 conditions, matching a list of some keywords.

 What's best: Lucene, Katta, a Cassandra CF with secondary indices, or a plain
 scan and compare of every record?

 Thanks a bunch!

 Mark




 --
 http://twitter.com/tjake



Re: What's the best approach to search in Cassandra

2011-06-15 Thread Sasha Dolgy
DataStax has reasonably good documentation on secondary indexes on its site.

On Jun 16, 2011 6:57 AM, Mark Kerzner markkerz...@gmail.com wrote:
 Jake,

 *You need to maintain a huge number of distinct indexes.*

 Are we talking about secondary indexes? If yes, this sounds like exactly
 my problem. There is so little documentation! - but I think that if I read
 all there is on GitHub, I can probably start using it.

 Thank you,
 Mark

 On Fri, Jun 3, 2011 at 8:07 PM, Jake Luciani jak...@gmail.com wrote:

 Mark,

 Check out Solandra. http://github.com/tjake/Solandra


 On Fri, Jun 3, 2011 at 7:56 PM, Mark Kerzner markkerz...@gmail.com wrote:

 Hi,

 I need to store, say, 10M-100M documents, with each document having say
 100 fields, like author, creation date, access date, etc., and then I want
 to ask questions like

 give me all documents whose author is like abc**, and creation date any
 time in 2010 and access date in 2010-2011, and so on, perhaps 10-20
 conditions, matching a list of some keywords.

 What's best: Lucene, Katta, a Cassandra CF with secondary indices, or a plain
 scan and compare of every record?

 Thanks a bunch!

 Mark




 --
 http://twitter.com/tjake



Re: What's the best approach to search in Cassandra

2011-06-04 Thread Paul Loy
I use ElasticSearch myself, which is a distributed Lucene.

http://www.elasticsearch.org
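
As a rough idea of what the query in the original question might look like
there, here is a small Python sketch that only builds the JSON request body. It
follows the bool / prefix / range / match clauses of the current Elasticsearch
query DSL, which may differ from what a 2011 release accepted. The field names
(author, created, accessed, body), the ISO date format, and the build_query
helper are all assumptions made for illustration.

```python
import json


def build_query(author_prefix, created_year, accessed_from, accessed_to, keywords):
    """Combine a keyword match with prefix and date-range filters."""
    return {
        "query": {
            "bool": {
                # Scored clause: by default a match query on several
                # words matches documents containing any of them.
                "must": [{"match": {"body": " ".join(keywords)}}],
                # Non-scored clauses: every filter has to hold.
                "filter": [
                    {"prefix": {"author": author_prefix}},
                    {"range": {"created": {
                        "gte": f"{created_year}-01-01",
                        "lte": f"{created_year}-12-31",
                    }}},
                    {"range": {"accessed": {
                        "gte": accessed_from,
                        "lte": accessed_to,
                    }}},
                ],
            }
        }
    }


body = build_query("abc", 2010, "2010-01-01", "2011-12-31", ["cassandra", "search"])
print(json.dumps(body, indent=2))
```

Further conditions, up to the 10-20 mentioned in the question, would simply be
more entries in the filter list.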

On Sat, Jun 4, 2011 at 1:56 AM, Mark Kerzner markkerz...@gmail.com wrote:

 Hi,

 I need to store, say, 10M-100M documents, with each document having say 100
 fields, like author, creation date, access date, etc., and then I want to
 ask questions like

 give me all documents whose author is like abc**, and creation date any
 time in 2010 and access date in 2010-2011, and so on, perhaps 10-20
 conditions, matching a list of some keywords.

 What's best: Lucene, Katta, a Cassandra CF with secondary indices, or a plain
 scan and compare of every record?

 Thanks a bunch!

 Mark




-- 
-
Paul Loy
p...@keteracel.com
http://uk.linkedin.com/in/paulloy


What's the best approach to search in Cassandra

2011-06-03 Thread Mark Kerzner
Hi,

I need to store, say, 10M-100M documents, with each document having say 100
fields, like author, creation date, access date, etc., and then I want to
ask questions like

give me all documents whose author is like abc**, and creation date any time
in 2010 and access date in 2010-2011, and so on, perhaps 10-20 conditions,
matching a list of some keywords.

What's best: Lucene, Katta, a Cassandra CF with secondary indices, or a plain
scan and compare of every record?

Thanks a bunch!

Mark


Re: What's the best approach to search in Cassandra

2011-06-03 Thread Jake Luciani
Mark,

Check out Solandra.  http://github.com/tjake/Solandra

On Fri, Jun 3, 2011 at 7:56 PM, Mark Kerzner markkerz...@gmail.com wrote:

 Hi,

 I need to store, say, 10M-100M documents, with each document having say 100
 fields, like author, creation date, access date, etc., and then I want to
 ask questions like

 give me all documents whose author is like abc**, and creation date any
 time in 2010 and access date in 2010-2011, and so on, perhaps 10-20
 conditions, matching a list of some keywords.

 What's best: Lucene, Katta, a Cassandra CF with secondary indices, or a plain
 scan and compare of every record?

 Thanks a bunch!

 Mark




-- 
http://twitter.com/tjake