Interesting. Can this be used in conjunction with bare metal? As in, does it
present containers in place of the “real” node until the node is up and running?


--
Rahul Singh
rahul.si...@anant.us

Anant Corporation

On Mar 9, 2018, 10:56 AM -0500, Vangelis Koukis <vkou...@arrikto.com>, wrote:
> Hello all,
>
> My name is Vangelis Koukis and I am a Founder and the CTO of Arrikto.
>
> I'm writing to share our thoughts on how people run distributed,
> stateful applications such as Cassandra on modern infrastructure,
> and would love to get the community's feedback and comments.
>
> The fundamental question is: Where does a Cassandra node find its data?
> Does it run over local storage, e.g., a super-fast NVMe device, or does
> it run over some sort of external, managed storage, e.g., EBS on AWS?
>
> Going in one of the two directions is a tradeoff between flexibility on
> one hand, and performance/cost on the other.
>
> * External storage, e.g., EBS:
>
> Easy backups as thin/instant EBS snapshots, and easy node recovery
> in the case of instance failure by re-attaching the EBS data volume
> to a newly-created instance. But then, I/O bandwidth, I/O latency,
> and cost suffer.
>
> * Local NVMe:
>
> Blazing fast, with very low latency, excellent bandwidth, a
> fraction of the cost, but then it is not obvious how one backs up
> their data, or recovers from node failure.
>
> At Arrikto we are building decentralized storage to tackle this problem
> for cloud-native apps. Our software, Rok, allows you to run stateful
> apps directly over fast, local NVMe storage on-prem or in the cloud, and
> still be able to snapshot the containers and distribute them
> efficiently: across machines of the same cluster, or across distinct
> locations and administrative domains over a decentralized network.
>
> Rok runs alongside Cassandra, which accesses local storage
> directly. It only has to intervene during snapshot-based node recovery,
> which is transparent to the application. It does not invoke an
> application-wide data recovery and rebalancing operation, which would
> put load on the whole cluster and impact application responsiveness.
> Instead, it performs block-level recovery of this specific node from the
> Rok snapshot store, e.g., S3, with predictable performance.
>
> This solves four important issues we have seen people running Cassandra
> at scale face today:
>
> * Node recovery / node migration:
>
> If you lose an entire Cassandra node, then your database will
> continue operating normally, as Rok in combination with your
> Container Orchestrator (e.g., Kubernetes) will present another
> Cassandra node. This node will have the data of the latest
> snapshot that resides on the Rok snapshot store. In this case,
> Cassandra only has to recover the changed parts, which is just a
> small fraction of the node data, and does not cause CPU load on
> the whole cluster. Similarly, you can migrate a Cassandra node
> from one physical host to another, without depending on external,
> EBS-like storage.
>
> * Backup and recovery:
>
> You can use Rok to take a full backup of your whole application,
> along with the DB, as a group-consistent snapshot of its VMs or
> containers, and store it externally. This does not depend on app-
> or Cassandra-specific functionality.
>
> * Data mobility:
>
> You can synchronize these snapshots to different locations, e.g.,
> across regions or cloud providers, and across administrative
> domains, i.e., share them with others without giving them direct
> access to your Cassandra DB. You can then spawn your entire
> application stack in the new location.
>
> * Testing / analytics:
>
> Being able to spawn a copy of your Cassandra DB as a thin clone
> means you can have test & dev workflows running in parallel, on
> independent, mutable clones, with real data underneath. Similarly,
> your analytics team can run their lengthy reporting and analytics
> workloads on an independent clone of your transactional DB, on
> completely distinct hardware, or even in a different location.
>
> So far, initial validation of our solution with early adopters shows
> significant performance gains at a fraction of the cost of external
> storage, while enabling a multi-region setup.
>
> Here are some numbers and a whitepaper to support this:
> https://journal.arrikto.com/why-your-cassandra-needs-local-nvme-and-rok-1787b9fc286d
> http://arrikto.com/wp-content/uploads/2018/03/20180206-rok_decentralized_storage_for_the_cloud_native_world.pdf
>
> If the above sounds interesting, we are eager to hear from you, learn
> about your potential use cases, and include you in our beta test
> program.
>
> Thank you,
> Vangelis.
>
> --
> Vangelis Koukis
> CTO, Arrikto Inc.
> 3505 El Camino Real, Palo Alto, CA 94306
> www.arrikto.com
