Re: Using elasticsearch on cassandra nodes
Brian,

We've taken the same approach, with ElasticSearch running on the same nodes as Cassandra.

Make sure you have enough memory on your nodes, because ElasticSearch and Cassandra both love memory. We started with 12 GB of memory, allocating 5 GB each to ElasticSearch and Cassandra, and it wasn't enough. We ended up upgrading to 48 GB of memory, allocating 15 GB to Cassandra and 15 GB to ElasticSearch, and haven't had any issues since.

Thanks,
Mike Peters

On 10/18/2011 9:14 AM, Brian O'Neill wrote:

Anthony,

We've been looking at elastic search as well. Presently we have SOLR in place, but it is cumbersome dealing with SOLR schemas when indexing information out of Cassandra (since you can't anticipate all the columns ahead of time).

What are you using as your bridge between Cassandra and ES? Are you developing a Cassandra river?

-brian
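To put Mike's numbers side by side, here is a rough Python sketch of the memory budget. It assumes the figures are per node and that the allocations are JVM heap sizes, neither of which the message states outright; `headroom_gb` is a made-up helper name. What is left over is what the operating system has for everything else, including its file cache.

```python
def headroom_gb(total_gb, cassandra_gb, es_gb):
    """Memory left on a node once both allocations are reserved.

    Assumption: the allocations are JVM heap sizes on a single node.
    """
    return total_gb - cassandra_gb - es_gb


# First setup from the message: 12 GB total, 5 GB each.
original = headroom_gb(12, 5, 5)

# Setup after the upgrade: 48 GB total, 15 GB each.
upgraded = headroom_gb(48, 15, 15)

print(f"12 GB node: {original} GB left outside the two allocations")
print(f"48 GB node: {upgraded} GB left outside the two allocations")
```

The upgrade tripled each allocation and also left far more memory unallocated, so the message alone does not say which of the two changes made the difference.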
Re: Using elasticsearch on cassandra nodes
Rock on. Thanks for the pointer, Aaron. We're giving this a try right now to index our column families.

cheers,
-brian

-- 
Brian ONeill
Lead Architect, Health Market Science (http://healthmarketscience.com)
mobile:215.588.6024
blog: http://weblogs.java.net/blog/boneill42/
blog: http://brianoneill.blogspot.com/

On Thu, Oct 20, 2011 at 4:26 PM, aaron morton aa...@thelastpickle.com wrote:

Solr can use a dynamic schema…
https://github.com/apache/lucene-solr/blob/trunk/solr/example/solr/conf/schema.xml#L538

But you may still want to define a schema so you can adjust the index-time and query-time processing/typing of the field values.

Cheers
- Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
Re: Using elasticsearch on cassandra nodes
Solr can use a dynamic schema…
https://github.com/apache/lucene-solr/blob/trunk/solr/example/solr/conf/schema.xml#L538

But you may still want to define a schema so you can adjust the index-time and query-time processing/typing of the field values.

Cheers
- Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 20/10/2011, at 2:20 AM, Brian O'Neill wrote:

Anthony,

We're in exactly the same boat. We are waiting on DataStax Enterprise to see if it can ease the pain of SOLR schemas.

In the meantime, I just submitted a native REST layer for Cassandra.
https://issues.apache.org/jira/browse/CASSANDRA-3380
(Hopefully, it will get integrated soon. Vote it up ;)

With a simple REST layer, I'm making the case that we can use Cassandra just like CouchDB (so we don't have to deploy both). Extending that assertion, I think I could enhance the REST layer to provide a stream of changes just like CouchDB does. Elastic Search could tap into that stream as a river, just like this:
http://www.elasticsearch.org/guide/reference/river/couchdb.html

That combination would be pretty powerful. If we can't get that set up, we may fall back to an AOPish strategy as well.

Definitely let me know where you end up. I'll share our findings as well.

cheers,
-brian
Re: Using elasticsearch on cassandra nodes
Anthony,

We're in exactly the same boat. We are waiting on DataStax Enterprise to see if it can ease the pain of SOLR schemas.

In the meantime, I just submitted a native REST layer for Cassandra.
https://issues.apache.org/jira/browse/CASSANDRA-3380
(Hopefully, it will get integrated soon. Vote it up ;)

With a simple REST layer, I'm making the case that we can use Cassandra just like CouchDB (so we don't have to deploy both). Extending that assertion, I think I could enhance the REST layer to provide a stream of changes just like CouchDB does. Elastic Search could tap into that stream as a river, just like this:
http://www.elasticsearch.org/guide/reference/river/couchdb.html

That combination would be pretty powerful. If we can't get that set up, we may fall back to an AOPish strategy as well.

Definitely let me know where you end up. I'll share our findings as well.

cheers,
-brian

Brian O'Neill
Lead Architect, Software Development
Health Market Science | 2700 Horizon Drive | King of Prussia, PA 19406
p: 215.588.6024
blog: http://weblogs.java.net/blog/boneill42/
blog: http://brianoneill.blogspot.com/

From: Anthony Ikeda anthony.ikeda@gmail.com
Reply-To: user@cassandra.apache.org
Date: Tue, 18 Oct 2011 14:18:17 -0700
To: user@cassandra.apache.org
Subject: Re: Using elasticsearch on cassandra nodes

At the moment we are only prototyping so we haven't bridged the two at all. We had planned on creating a write-through operation that allowed us to filter the calls (AOP perhaps?) to manage the indexing as we stored it in Cassandra.

We are still trying to work out whether to go the elastic search route, as DataStax will be releasing DataStax Enterprise 2.0 early next year with Solr built in and, as you said, the index schemas seem to be difficult to deal with. I really don't want to have to configure Solr; the no-schema approach sounds much faster to get up and running.

Anthony
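The "stream of changes feeding a river" idea Brian describes can be sketched in Python with in-memory stand-ins. Nothing here is the proposed Cassandra REST layer or the ElasticSearch river API: `ChangeLog` and `River` are hypothetical names, the search index is just a dict, and the only thing borrowed from CouchDB is the general pattern of a changes feed that is read from a sequence number onward.

```python
class ChangeLog:
    """Append-only log of writes, standing in for a CouchDB-style changes feed."""

    def __init__(self):
        self._entries = []  # list of (seq, key, columns)

    def record(self, key, columns):
        """Append one write and return its sequence number."""
        seq = len(self._entries) + 1
        self._entries.append((seq, key, dict(columns)))
        return seq

    def since(self, last_seq):
        """Return every change with a sequence number greater than last_seq."""
        return [entry for entry in self._entries if entry[0] > last_seq]


class River:
    """Polls the change log and applies each change to a search index (a dict here)."""

    def __init__(self, log):
        self.log = log
        self.last_seq = 0  # checkpoint: last change already indexed
        self.index = {}

    def poll(self):
        """Index all changes since the checkpoint; return how many were applied."""
        changes = self.log.since(self.last_seq)
        for seq, key, columns in changes:
            self.index[key] = columns
            self.last_seq = seq
        return len(changes)


log = ChangeLog()
log.record("user:1", {"name": "Ada"})
log.record("user:2", {"name": "Grace"})

river = River(log)
first = river.poll()   # picks up both initial writes

log.record("user:1", {"name": "Ada", "city": "London"})
second = river.poll()  # picks up only the new write
```

Because the river remembers the last sequence number it applied, it can stop and resume without re-reading the whole log, and the writer never needs to know the index exists.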
Re: Using elasticsearch on cassandra nodes
Anthony,

We've been looking at elastic search as well. Presently we have SOLR in place, but it is cumbersome dealing with SOLR schemas when indexing information out of Cassandra (since you can't anticipate all the columns ahead of time).

What are you using as your bridge between Cassandra and ES? Are you developing a Cassandra river?

-brian

On Mon, Oct 17, 2011 at 5:29 PM, Anthony Ikeda anthony.ikeda@gmail.com wrote:

I've already posted to the elasticsearch groups and thought it prudent to also ask here.

We are looking at using elastic search to index our data that we currently store in Cassandra. I was wondering if there are any concerns running elastic search on the same nodes that we use for Cassandra?

We have a ring of 6 nodes (2 DCs each with 3 nodes). I was thinking of installing elastic search on 2 nodes in each datacentre - maybe all three. The only reason I'd use the same infrastructure would be because we have the distributed visibility already in place.

Has anyone else taken this approach? Pros? Cons?

Anthony
Re: Using elasticsearch on cassandra nodes
At the moment we are only prototyping so we haven't bridged the two at all. We had planned on creating a write-through operation that allowed us to filter the calls (AOP perhaps?) to manage the indexing as we stored it in Cassandra.

We are still trying to work out whether to go the elastic search route, as DataStax will be releasing DataStax Enterprise 2.0 early next year with Solr built in and, as you said, the index schemas seem to be difficult to deal with. I really don't want to have to configure Solr; the no-schema approach sounds much faster to get up and running.

Anthony

On Tue, Oct 18, 2011 at 6:14 AM, Brian O'Neill b...@alumni.brown.edu wrote:

Anthony,

We've been looking at elastic search as well. Presently we have SOLR in place, but it is cumbersome dealing with SOLR schemas when indexing information out of Cassandra (since you can't anticipate all the columns ahead of time).

What are you using as your bridge between Cassandra and ES? Are you developing a Cassandra river?

-brian
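The write-through idea Anthony mentions, intercepting the store call and indexing the same data, might look roughly like this in Python. It is only a sketch: a decorator stands in for an AOP aspect, and `FakeColumnFamily`, `FakeSearchIndex`, `indexed`, and `save` are all invented for illustration rather than taken from any Cassandra or ElasticSearch client.

```python
import functools


class FakeColumnFamily:
    """In-memory stand-in for a Cassandra column family."""

    def __init__(self):
        self.rows = {}

    def insert(self, key, columns):
        self.rows.setdefault(key, {}).update(columns)


class FakeSearchIndex:
    """In-memory stand-in for a schemaless search index."""

    def __init__(self):
        self.docs = {}

    def index(self, key, columns):
        self.docs.setdefault(key, {}).update(columns)


def indexed(search_index):
    """Decorator playing the role of the aspect: index whatever the write stores."""
    def decorator(write):
        @functools.wraps(write)
        def wrapper(key, columns):
            result = write(key, columns)       # store first
            search_index.index(key, columns)   # then index the same columns
            return result
        return wrapper
    return decorator


store = FakeColumnFamily()
search = FakeSearchIndex()


@indexed(search)
def save(key, columns):
    store.insert(key, columns)


# Columns are not declared anywhere up front; each call can bring new ones.
save("user:1", {"name": "Ada"})
save("user:1", {"email": "ada@example.com"})
```

One thing the sketch glosses over: if the index call fails after the store call has succeeded, the two drift apart, so a real write-through layer would need some retry or repair path.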
Using elasticsearch on cassandra nodes
I've already posted to the elasticsearch groups and thought it prudent to also ask here.

We are looking at using elastic search to index our data that we currently store in Cassandra. I was wondering if there are any concerns running elastic search on the same nodes that we use for Cassandra?

We have a ring of 6 nodes (2 DCs each with 3 nodes). I was thinking of installing elastic search on 2 nodes in each datacentre - maybe all three. The only reason I'd use the same infrastructure would be because we have the distributed visibility already in place.

Has anyone else taken this approach? Pros? Cons?

Anthony