DataStax Enterprise integrates Cassandra and Apache Solr, with Solr acting as a 
secondary index. The Solr index is kept in sync with the Cassandra data 
automatically, and it can be fully reindexed with a single request if your 
index mapping changes. So C* provides the fully distributed, durable data 
store, and embedded Solr provides full-featured rich queries, including 
faceting, sorting, grouping, and full-text keyword, wildcard, fuzzy, and range 
queries.

See:
http://www.datastax.com/what-we-offer/products-services/datastax-enterprise

Elasticsearch and Solr are both based on Lucene for the core underlying 
indexing and query layers.

-- Jack Krupansky

From: Jon Haddad 
Sent: Saturday, May 3, 2014 4:03 AM
To: user@cassandra.apache.org 
Subject: Re: Cassandra vs Elasticsearch.

Agreed w/ ES not being the durable data store.  I would recommend treating it 
as ephemeral, and using Cassandra as your source of truth.  Keep in mind if you 
change your ES index mapping, you’ll require a full reindex in order to search 
the data properly.  It’s not like adding a secondary index w/ a DB, where it’ll 
go back and take care of it for you. 
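
To make the "ephemeral index, durable source of truth" pattern concrete, here is 
a minimal sketch. It uses plain in-memory Python stand-ins, not the Cassandra or 
Elasticsearch client APIs; the names build_index, search, and source_rows are 
hypothetical and exist only for illustration. The point is that a mapping change 
is handled by discarding the index and rebuilding it from the original rows.

```python
# Hypothetical in-memory stand-ins; NOT the Cassandra or Elasticsearch APIs.

def build_index(rows, mapping):
    """Build an inverted index, (field, term) -> set of row ids, over the mapped fields."""
    index = {}
    for row_id, row in rows.items():
        for field in mapping:
            for term in str(row.get(field, "")).lower().split():
                index.setdefault((field, term), set()).add(row_id)
    return index


def search(index, field, term):
    """Return the sorted row ids whose `field` contains `term`."""
    return sorted(index.get((field, term.lower()), set()))


# Source of truth (stands in for a Cassandra table). It is never derived from the index.
source_rows = {
    1: {"sensor": "pump a", "status": "ok"},
    2: {"sensor": "pump b", "status": "failed"},
}

# The initial mapping only indexes "sensor", so "status" is not searchable.
index = build_index(source_rows, mapping=["sensor"])
not_indexed_yet = search(index, "status", "failed")

# Mapping change: throw the old index away and rebuild it from the source rows.
index = build_index(source_rows, mapping=["sensor", "status"])
failed = search(index, "status", "failed")
pumps = search(index, "sensor", "pump")
```

Because the index is always derivable from the source rows, losing it or 
changing its mapping costs a rebuild but never any data.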

Jon

On May 3, 2014, at 12:31 AM, DuyHai Doan <doanduy...@gmail.com> wrote:


  Hello Tim


  You're absolutely right about ES for the query part. This is the perfect fit 
for complex queries. Now regarding your question:

  "What advantages does Cassandra give me over ES?" --> linear scalability & 
durability. ES is just a super index cluster. I've talked to the ES guys. If 
they do not sell ES right now as a "database for complex search", it's because 
there is no strong guarantee about the durability of your data. Many people 
just live with it and it's fine. Also, if you store the original data elsewhere 
and just pump it into ES, that's fine too.






  On Sat, May 3, 2014 at 9:14 AM, Tim Uckun <timuc...@gmail.com> wrote:

    Hey all.


    I have been trying out some data stores for time series data and Cassandra 
was the first on my list because so many people are using it for the same 
purpose.  I have read many articles on how to model my time series data and 
tried several variations of schemas which I thought made sense for my data but 
I have really struggled to run some complex queries I need to run.  This has 
led me down a kind of a rabbit hole of trying to create various "materialized 
views" and shotgunning the data into multiple tables which might be able to run 
my queries.


    In the meantime I also took the same data and pumped it into Elasticsearch 
and was able to run almost all the queries I needed without doing anything 
fancy. Just put the data in, and run your query. The new aggregations in ES are 
pretty slick although they don't seem to be 100% accurate compared to running 
the same query in Postgres.


    My question is this.  What advantages does Cassandra give me over ES?  Does 
it compact the data better? Is it faster to query once your data sizes are 
huge? Does it use less bandwidth? Is it easier to administer? 


    I know there must be very compelling reasons to use C* because so many 
companies are depending on it for their bread and butter so I'd love to hear 
your take.


    Thanks.

