Re: What's the best approach to search in Cassandra
Mark,

Solandra doesn't use secondary indexes; their functionality is too limited for the Lucene API. It maintains its own indexes in regular column families. I suggest you look at Solr and decide whether it offers the functionality you need. Solandra offers the same API, but on Cassandra's distributed model.

-Jake

On Thu, Jun 16, 2011 at 12:56 AM, Mark Kerzner markkerz...@gmail.com wrote:

Jake,

*You need to maintain a huge number of distinct indexes.*

Are we talking about secondary indexes? If yes, this sounds like exactly my problem. There is so little documentation, but I think that if I read all there is on GitHub, I can probably start using it.

Thank you,
Mark

On Fri, Jun 3, 2011 at 8:07 PM, Jake Luciani jak...@gmail.com wrote:

Mark,

Check out Solandra. http://github.com/tjake/Solandra

On Fri, Jun 3, 2011 at 7:56 PM, Mark Kerzner markkerz...@gmail.com wrote:

Hi,

I need to store, say, 10M-100M documents, each with about 100 fields such as author, creation date, access date, etc. I then want to ask questions like "give me all documents whose author is like abc**, whose creation date is any time in 2010, and whose access date is in 2010-2011", and so on, with perhaps 10-20 conditions, plus matching a list of keywords.

What's best: Lucene, Katta, a Cassandra CF with secondary indices, or a plain scan and compare of every record?

Thanks a bunch!
Mark

--
http://twitter.com/tjake
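To make "maintains its own indexes in regular column families" a little more concrete, here is a minimal Python sketch of the general idea of an inverted index: each (field, term) pair maps to the set of document ids that contain it, and a multi-condition query intersects those sets. This is not Solandra's actual schema or code; the function names, the dict-based storage, and the sample documents are all assumptions made for illustration.

```python
from collections import defaultdict


def build_index(docs):
    """Build a toy inverted index: (field, term) -> set of document ids.

    `docs` maps a document id to a dict of field values. In a store like
    Cassandra, each (field, term) key could correspond to a row and each
    document id to a column, but that mapping is only an assumption here.
    """
    index = defaultdict(set)
    for doc_id, fields in docs.items():
        for field, value in fields.items():
            # Naive tokenization: lowercase and split on whitespace.
            for term in str(value).lower().split():
                index[(field, term)].add(doc_id)
    return index


def search(index, *field_terms):
    """Return ids of documents that match every (field, term) condition."""
    postings = [index.get((field, term.lower()), set())
                for field, term in field_terms]
    return set.intersection(*postings) if postings else set()


# Hypothetical sample data.
docs = {
    "d1": {"author": "abc smith", "year": 2010},
    "d2": {"author": "abc jones", "year": 2011},
    "d3": {"author": "xyz", "year": 2010},
}
index = build_index(docs)
result = search(index, ("author", "abc"), ("year", "2010"))
```

With an index like this, each extra condition costs one more lookup and a set intersection rather than a pass over every document.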
Re: What's the best approach to search in Cassandra
Jake,

*You need to maintain a huge number of distinct indexes.*

Are we talking about secondary indexes? If yes, this sounds like exactly my problem. There is so little documentation, but I think that if I read all there is on GitHub, I can probably start using it.

Thank you,
Mark

On Fri, Jun 3, 2011 at 8:07 PM, Jake Luciani jak...@gmail.com wrote:

Mark,

Check out Solandra. http://github.com/tjake/Solandra

On Fri, Jun 3, 2011 at 7:56 PM, Mark Kerzner markkerz...@gmail.com wrote:

Hi,

I need to store, say, 10M-100M documents, each with about 100 fields such as author, creation date, access date, etc. I then want to ask questions like "give me all documents whose author is like abc**, whose creation date is any time in 2010, and whose access date is in 2010-2011", and so on, with perhaps 10-20 conditions, plus matching a list of keywords.

What's best: Lucene, Katta, a Cassandra CF with secondary indices, or a plain scan and compare of every record?

Thanks a bunch!
Mark

--
http://twitter.com/tjake
Re: What's the best approach to search in Cassandra
DataStax has reasonably thorough documentation on their site for secondary indexes.

On Jun 16, 2011 6:57 AM, Mark Kerzner markkerz...@gmail.com wrote:

Jake,

*You need to maintain a huge number of distinct indexes.*

Are we talking about secondary indexes? If yes, this sounds like exactly my problem. There is so little documentation, but I think that if I read all there is on GitHub, I can probably start using it.

Thank you,
Mark

On Fri, Jun 3, 2011 at 8:07 PM, Jake Luciani jak...@gmail.com wrote:

Mark,

Check out Solandra. http://github.com/tjake/Solandra

On Fri, Jun 3, 2011 at 7:56 PM, Mark Kerzner markkerz...@gmail.com wrote:

Hi,

I need to store, say, 10M-100M documents, each with about 100 fields such as author, creation date, access date, etc. I then want to ask questions like "give me all documents whose author is like abc**, whose creation date is any time in 2010, and whose access date is in 2010-2011", and so on, with perhaps 10-20 conditions, plus matching a list of keywords.

What's best: Lucene, Katta, a Cassandra CF with secondary indices, or a plain scan and compare of every record?

Thanks a bunch!
Mark

--
http://twitter.com/tjake
Re: What's the best approach to search in Cassandra
I use ElasticSearch myself, which is a distributed Lucene. http://www.elasticsearch.org

On Sat, Jun 4, 2011 at 1:56 AM, Mark Kerzner markkerz...@gmail.com wrote:

Hi,

I need to store, say, 10M-100M documents, each with about 100 fields such as author, creation date, access date, etc. I then want to ask questions like "give me all documents whose author is like abc**, whose creation date is any time in 2010, and whose access date is in 2010-2011", and so on, with perhaps 10-20 conditions, plus matching a list of keywords.

What's best: Lucene, Katta, a Cassandra CF with secondary indices, or a plain scan and compare of every record?

Thanks a bunch!
Mark

--
Paul Loy
p...@keteracel.com
http://uk.linkedin.com/in/paulloy
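As a rough illustration of how the query from the original question might be expressed against ElasticSearch, here is a Python sketch that only builds the JSON request body. It uses the `bool`, `prefix`, `range`, and `terms` clauses from the query DSL of recent Elasticsearch releases, which differs from what was available in 2011. The field names (`author`, `creation_date`, `access_date`, `keywords`) and the helper `build_query` are assumptions, and nothing here talks to a real cluster.

```python
import json


def build_query(author_prefix, created, accessed, keywords):
    """Build a search request body combining several filter conditions.

    `created` and `accessed` are (start, end) ISO date strings, inclusive.
    All field names are hypothetical and depend on the index mapping.
    """
    return {
        "query": {
            "bool": {
                # Every clause under "filter" must match.
                "filter": [
                    {"prefix": {"author": author_prefix}},
                    {"range": {"creation_date": {"gte": created[0],
                                                 "lte": created[1]}}},
                    {"range": {"access_date": {"gte": accessed[0],
                                               "lte": accessed[1]}}},
                    # "terms" matches documents containing any listed value.
                    {"terms": {"keywords": list(keywords)}},
                ]
            }
        }
    }


body = build_query(
    "abc",
    ("2010-01-01", "2010-12-31"),
    ("2010-01-01", "2011-12-31"),
    ["cassandra", "search"],
)
body_json = json.dumps(body, indent=2)
```

The resulting JSON would be sent as the body of a search request; further conditions can be appended to the `filter` list in the same way.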
What's the best approach to search in Cassandra
Hi,

I need to store, say, 10M-100M documents, each with about 100 fields such as author, creation date, access date, etc. I then want to ask questions like "give me all documents whose author is like abc**, whose creation date is any time in 2010, and whose access date is in 2010-2011", and so on, with perhaps 10-20 conditions, plus matching a list of keywords.

What's best: Lucene, Katta, a Cassandra CF with secondary indices, or a plain scan and compare of every record?

Thanks a bunch!
Mark
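For reference, the "scan and compare every record" option mentioned in the question can be sketched in a few lines of Python. This is only a baseline under stated assumptions: the `Document` class, its field names, and the reading of "matching a list of keywords" as "contains all of them" are hypothetical choices, not something specified in the question.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Document:
    # Hypothetical subset of the ~100 fields described in the question.
    doc_id: str
    author: str
    created: date
    accessed: date
    keywords: frozenset


def matches(doc, author_prefix, created_range, accessed_range, required_keywords):
    """Return True if the document satisfies every condition."""
    return (
        doc.author.startswith(author_prefix)
        and created_range[0] <= doc.created <= created_range[1]
        and accessed_range[0] <= doc.accessed <= accessed_range[1]
        # Assumption: the document must contain all required keywords.
        and required_keywords <= doc.keywords
    )


def scan(docs, **conditions):
    """Full scan: check every document against the conditions."""
    return [d.doc_id for d in docs if matches(d, **conditions)]


docs = [
    Document("d1", "abcde", date(2010, 3, 1), date(2011, 6, 1),
             frozenset({"cassandra", "search"})),
    Document("d2", "abxyz", date(2010, 5, 1), date(2010, 7, 1),
             frozenset({"cassandra"})),
    Document("d3", "abc", date(2009, 12, 31), date(2010, 1, 2),
             frozenset({"cassandra", "search"})),
]

hits = scan(
    docs,
    author_prefix="abc",
    created_range=(date(2010, 1, 1), date(2010, 12, 31)),
    accessed_range=(date(2010, 1, 1), date(2011, 12, 31)),
    required_keywords=frozenset({"cassandra"}),
)
```

Every query touches every document, so the cost grows linearly with the collection size; at 10M-100M documents that is the main argument for an index-based approach.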
Re: What's the best approach to search in Cassandra
Mark,

Check out Solandra. http://github.com/tjake/Solandra

On Fri, Jun 3, 2011 at 7:56 PM, Mark Kerzner markkerz...@gmail.com wrote:

Hi,

I need to store, say, 10M-100M documents, each with about 100 fields such as author, creation date, access date, etc. I then want to ask questions like "give me all documents whose author is like abc**, whose creation date is any time in 2010, and whose access date is in 2010-2011", and so on, with perhaps 10-20 conditions, plus matching a list of keywords.

What's best: Lucene, Katta, a Cassandra CF with secondary indices, or a plain scan and compare of every record?

Thanks a bunch!
Mark

--
http://twitter.com/tjake