Thanks for sharing your experience! I also found a similar solution in TitanDB[1], but that also seems to be intended for development use. The consensus here seems to be that one should not embed Cassandra into another JVM.
> For production, we have to support single node clusters (not
> embedded though), and it has been challenging for pretty much
> all the reasons you find people saying not to do so.

What challenges did you face with single-node Cassandra deployment?

[1]: https://github.com/thinkaurelius/titan/blob/titan10/titan-cassandra/src/main/java/com/thinkaurelius/titan/diskstorage/cassandra/utils/CassandraDaemonWrapper.java

On Sun, Feb 14, 2016 at 11:05 AM, John Sanda <john.sa...@gmail.com> wrote:

> The project I work on day to day uses an embedded instance of Cassandra,
> but it is intended primarily for development. We embed Cassandra in a
> WildFly (i.e., JBoss) server. It is packaged and deployed as an EAR. I
> personally do not do this. I use and recommend ccm
> <https://github.com/pcmanus/ccm> for development. If you do use WildFly,
> there is also wildfly-cassandra
> <https://github.com/hawkular/wildfly-cassandra> which deploys Cassandra
> as a custom WildFly extension. In other words it is deployed in WildFly
> like other subsystems like EJB, web, etc, not like an application. There
> isn't a whole lot of active development on this, but it could be another
> option.
>
> For production, we have to support single node clusters (not embedded
> though), and it has been challenging for pretty much all the reasons you
> find people saying not to do so.
>
> As for failure detection and cluster membership changes, are you using the
> Datastax driver? You can register an event listener with the driver to
> receive notifications for those things.
>
> On Sat, Feb 13, 2016 at 6:33 PM, Jonathan Haddad <j...@jonhaddad.com>
> wrote:
>
>> +1 to what Jack said. Don't mess with embedded till you understand the
>> basics of the db. You're not making your system any less complex, I'd say
>> you're most likely going to shoot yourself in the foot.
>>
>> On Sat, Feb 13, 2016 at 2:22 PM Jack Krupansky <jack.krupan...@gmail.com>
>> wrote:
>>
>>> HA requires an odd number of replicas - 3, 5, 7 - so that split-brain
>>> can be avoided. Two nodes would not support HA. You need to be able to
>>> reach a quorum, which is defined as n/2+1 where n is the number of
>>> replicas. IOW, you cannot update the data if a quorum cannot be reached.
>>> The data on any given node needs to be replicated on at least two other
>>> nodes.
>>>
>>> Embedded Cassandra is only for extremely sophisticated developers - not
>>> those who are new to Cassandra, with a "superficial understanding".
>>>
>>> As a general proposition, you should not be running application code on
>>> Cassandra nodes.
>>>
>>> That said, if any of the senior Cassandra developers wish to personally
>>> support your efforts towards embedded clusters, they are certainly free to
>>> do so. We'll see if any of them step forward.
>>>
>>>
>>> -- Jack Krupansky
>>>
>>> On Sat, Feb 13, 2016 at 3:47 PM, Binil Thomas <
>>> binil.thomas.pub...@gmail.com> wrote:
>>>
>>>> Hi all,
>>>>
>>>> TL;DR: I have a very superficial understanding of Cassandra and am
>>>> currently evaluating it for a project.
>>>>
>>>> * Can Cassandra be embedded into another JVM application?
>>>> * Can such embedded instances form a cluster?
>>>> * Can the application use the failure detection and cluster
>>>> membership dissemination infrastructure of embedded Cassandra?
>>>>
>>>> ----
>>>>
>>>> I am in the process of re-packaging a SaaS system written in Java to be
>>>> deployed on-premise by customers. The SaaS system currently uses AWS
>>>> DynamoDB. The data storage needs for this application are modest, but I
>>>> would like to keep the deployment complexity to a minimum. Here are three
>>>> different usecases the on-premise system should support:
>>>>
>>>> 1. single-node deployments with minimal complexity
>>>> 2. two-node HA deployments; the data and processing needs dictated by
>>>> the load on the system are well under what a single node can do, but the
>>>> second node is there to satisfy the HA requirement as a hot standby
>>>> 3. a multi-node clustered deployment, where higher operational
>>>> complexity is justified
>>>>
>>>> I am considering Cassandra for these usecases.
>>>>
>>>> For usecase #1, I hope to embed Cassandra into the same JVM as my
>>>> application. I read on the web that CassandraDaemon can be used this way.
>>>> Is that accurate? What other applications embed Cassandra this way? I
>>>> *think* JetBrains Upsource does, but do you know other ones? (Incidentally,
>>>> my Java application embeds Jetty webserver also).
>>>>
>>>> For usecase #2, I am hoping that I can deploy two instances of this
>>>> ensemble and have the embedded Cassandra instances form a cluster. If I
>>>> configure every write to be replicated on both nodes synchronously, then it
>>>> will satisfy the HA needs of this usecase. Is it feasible to form clusters
>>>> of embedded Cassandra instances?
>>>>
>>>> For usecase #3, I can form a large cluster of the ensemble where all
>>>> writes are replicated synchronously to a quorum of nodes.
>>>>
>>>> Finally, in usecase #2 and #3, I'd like to use the failure detection
>>>> and cluster membership dissemination infrastructure of Cassandra from
>>>> within my application. Is it possible to be notified of membership changes
>>>> when embedding Cassandra? I could use a separate library to do this (say,
>>>> with JGroups or Akka) but I fear that if this library and the embedded
>>>> Cassandra instances disagree, it could lead to subtle bugs.
>>>>
>>>> Thanks,
>>>> Binil
>>>>
>>>> PS: Cross-posted at
>>>> http://stackoverflow.com/questions/35384983/forming-a-cluster-of-embedded-cassandra-instances
>>>>
>>>
>
> --
>
> - John
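
For reference, the quorum arithmetic Jack describes in the quoted thread (n/2+1, where n is the number of replicas) can be sketched in a few lines of Java. This is only an illustration of the formula as stated there, assuming integer division; the class and method names (QuorumSketch, quorum, tolerableFailures) are made up for this example and are not part of Cassandra or any driver.

```java
// Hypothetical helper illustrating the quorum formula quoted in the thread.
final class QuorumSketch {

    // Quorum size for a given replica count: n/2 + 1 with integer division.
    public static int quorum(int replicas) {
        if (replicas < 1) {
            throw new IllegalArgumentException("replicas must be >= 1");
        }
        return replicas / 2 + 1;
    }

    // Number of replicas that can be unreachable while a quorum is still reachable.
    public static int tolerableFailures(int replicas) {
        return replicas - quorum(replicas);
    }

    public static void main(String[] args) {
        // Print the quorum size and failure tolerance for a few replica counts.
        for (int n : new int[] {1, 2, 3, 5, 7}) {
            System.out.printf("replicas=%d quorum=%d tolerable_failures=%d%n",
                    n, quorum(n), tolerableFailures(n));
        }
    }
}
```

With two replicas the quorum is two, so losing either node makes a quorum unreachable. That is why the two-node layout in usecase #2 does not give HA for quorum operations, while three replicas is the smallest count that tolerates one failure.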