Re: Multiple cassandra instances per physical node
I had the exact same question, but I think this is what Nate was thinking: if you're running multiple nodes on a single server, vnodes give you no control over which instance owns which key (whereas without vnodes you can assign initial tokens). You could therefore end up with two of your three replicas on the same physical server, and if that server goes down you can't read or write at quorum.

However, can't you use the topology snitch to put both nodes in the same rack? Won't that prevent the issue and still allow you to maintain quorum if a single server goes down? If I have a 20-node cluster with 2 nodes on each physical server, can I use 10 racks to properly segment my partitions?

On Sun, May 24, 2015 at 5:38 PM, Jonathan Haddad j...@jonhaddad.com wrote:
> What impact would vnodes have on strong consistency? I think the problem you're describing exists with or without them.
>
> On Sat, May 23, 2015 at 2:30 PM Nate McCall n...@thelastpickle.com wrote:
>>> So my question is: suppose I take a 12-disk JBOD and run 2 Cassandra nodes (each with 5 data disks, 1 commit log disk) and either give each its own container IP or change the listen ports. Will this work? What are the risks? Will/should Cassandra support this better in the future?
>>
>> Don't use vnodes if any operations need strong consistency (reading or writing at quorum). Otherwise, at RF=3, if you lose a single node you will have only one replica left for some portion of the ring.

--
Ken Hancock | System Architect, Advanced Advertising, SeaChange International
Re: Multiple cassandra instances per physical node
@Sean - You can manually change the ports used by the DataStax agent in the address.yaml file in the agent install directory.

+1 on using racks to separate it out... but it will increase operational complexity somewhat.

On 26 May 2015 at 08:11, Nate McCall n...@thelastpickle.com wrote:
>> If you're running multiple nodes on a single server, vnodes give you no control over which instance has which key (whereas you can assign initial tokens). Therefore you could have two of your three replicas on the same physical server which, if it goes down, you can't read or write at quorum.
>
> Yep. You *will* have overlapping ranges on each physical server so long as vnodes > 'number of nodes in the cluster'.
>
>> However, can't you use the topology snitch to put both nodes in the same rack? Won't that prevent the issue and still allow you to maintain quorum if a single server goes down? If I have a 20-node cluster with 2 nodes on each physical server, can I use 10 racks to properly segment my partitions?
>
> That's a good point, yes. I'd still personally prefer the operational simplicity of spacing out token assignments, but YMMV.

--
Ben Bromhead
Instaclustr | www.instaclustr.com | @instaclustr
Re: Multiple cassandra instances per physical node
> If you're running multiple nodes on a single server, vnodes give you no control over which instance has which key (whereas you can assign initial tokens). Therefore you could have two of your three replicas on the same physical server which, if it goes down, you can't read or write at quorum.

Yep. You *will* have overlapping ranges on each physical server so long as vnodes > 'number of nodes in the cluster'.

> However, can't you use the topology snitch to put both nodes in the same rack? Won't that prevent the issue and still allow you to maintain quorum if a single server goes down? If I have a 20-node cluster with 2 nodes on each physical server, can I use 10 racks to properly segment my partitions?

That's a good point, yes. I'd still personally prefer the operational simplicity of spacing out token assignments, but YMMV.

--
Nate McCall
Austin, TX | @zznate
Co-Founder, Sr. Technical Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com
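As a rough illustration of what "spacing out token assignments" could look like, here is a minimal sketch. It assumes single-token nodes (no vnodes), the Murmur3Partitioner token range, and a placement rule where the replicas of a range are simply the next RF nodes clockwise on the ring. The host names and the helper functions are hypothetical, not anything Nate described.

```python
# Assumption: Murmur3Partitioner, whose tokens span [-2**63, 2**63 - 1].
MIN_TOKEN = -2**63
RING_SIZE = 2**64


def evenly_spaced_tokens(node_count):
    """Return node_count tokens spread evenly around the ring."""
    step = RING_SIZE // node_count
    return [MIN_TOKEN + i * step for i in range(node_count)]


def assign_tokens(hosts, instances_per_host):
    """Pair each token with a (host, instance) owner, in ring order.

    The order is instance 0 of every host, then instance 1 of every host,
    so the two instances of one host sit len(hosts) positions apart.
    """
    owners = [(host, inst) for inst in range(instances_per_host) for host in hosts]
    return list(zip(evenly_spaced_tokens(len(owners)), owners))


def replicas_share_host(ring, rf):
    """True if any rf consecutive ring positions contain the same host twice."""
    n = len(ring)
    for start in range(n):
        hosts = [ring[(start + k) % n][1][0] for k in range(rf)]
        if len(set(hosts)) < rf:
            return True
    return False


hosts = [f"server{i:02d}" for i in range(10)]  # hypothetical physical servers
ring = assign_tokens(hosts, instances_per_host=2)
print("replicas share a host:", replicas_share_host(ring, rf=3))
```

Each instance's token would then go into its own `initial_token` setting. The check only models the "next RF nodes" rule, so treat it as a sanity test of the ordering rather than a model of every replication strategy.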
Re: Multiple cassandra instances per physical node
> If I have a 20-node cluster with 2 nodes on each physical server, can I use 10 racks to properly segment my partitions?

Yes.

--
http://twitter.com/tjake
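To see why 10 racks (one per physical server) help in the 20-node case, here is a small simulation. It is a simplified model of rack-aware placement, not Cassandra's actual implementation: it walks the ring clockwise and takes one node per rack not yet used, which is the behaviour one would expect when there are at least RF racks. The server names, vnode count, and random seed are all illustrative assumptions.

```python
import random

RF = 3


def build_ring(servers, instances_per_server=2, vnodes_per_instance=16, seed=42):
    """Return a token-sorted ring of (token, node, server) entries."""
    rng = random.Random(seed)
    ring = []
    for server in servers:
        for instance in range(instances_per_server):
            node = f"{server}/inst{instance}"
            for _ in range(vnodes_per_instance):
                ring.append((rng.randrange(-2**63, 2**63), node, server))
    return sorted(ring)


def replicas(ring, start, rf, rack_aware):
    """Walk clockwise from ring[start] and pick rf replicas.

    rack_aware=True skips nodes whose rack (here: the physical server) is
    already represented; rack_aware=False only requires distinct nodes.
    """
    chosen, seen = [], set()
    for step in range(len(ring)):
        _, node, server = ring[(start + step) % len(ring)]
        key = server if rack_aware else node
        if key not in seen:
            seen.add(key)
            chosen.append((node, server))
            if len(chosen) == rf:
                break
    return chosen


def worst_loss_when_one_server_dies(ring, rf, rack_aware):
    """Largest number of replicas of any single range that live on one server."""
    worst = 0
    for start in range(len(ring)):
        servers = [server for _, server in replicas(ring, start, rf, rack_aware)]
        worst = max(worst, max(servers.count(s) for s in set(servers)))
    return worst


servers = [f"server{i:02d}" for i in range(10)]  # hypothetical host names
ring = build_ring(servers)
print("rack-aware:  ", worst_loss_when_one_server_dies(ring, RF, rack_aware=True))
print("rack-unaware:", worst_loss_when_one_server_dies(ring, RF, rack_aware=False))
```

With one rack per server, the rack-aware walk never places two replicas of a range on the same machine, so a single server failure costs each range at most one replica and quorum at RF=3 is still reachable.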
Re: Multiple cassandra instances per physical node
What impact would vnodes have on strong consistency? I think the problem you're describing exists with or without them.

On Sat, May 23, 2015 at 2:30 PM Nate McCall n...@thelastpickle.com wrote:
>> So my question is: suppose I take a 12-disk JBOD and run 2 Cassandra nodes (each with 5 data disks, 1 commit log disk) and either give each its own container IP or change the listen ports. Will this work? What are the risks? Will/should Cassandra support this better in the future?
>
> Don't use vnodes if any operations need strong consistency (reading or writing at quorum). Otherwise, at RF=3, if you lose a single node you will have only one replica left for some portion of the ring.
Re: Multiple cassandra instances per physical node
> So my question is: suppose I take a 12-disk JBOD and run 2 Cassandra nodes (each with 5 data disks, 1 commit log disk) and either give each its own container IP or change the listen ports. Will this work? What are the risks? Will/should Cassandra support this better in the future?

Don't use vnodes if any operations need strong consistency (reading or writing at quorum). Otherwise, at RF=3, if you lose a single node you will have only one replica left for some portion of the ring.

--
Nate McCall
Austin, TX | @zznate
Co-Founder, Sr. Technical Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com
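The arithmetic behind the RF=3 concern is short enough to write down. This sketch assumes the usual majority definition of quorum, floor(RF / 2) + 1; the function names are made up for illustration.

```python
def quorum(replication_factor: int) -> int:
    """Smallest number of replicas that forms a majority."""
    return replication_factor // 2 + 1


def can_reach_quorum(replication_factor: int, replicas_down: int) -> bool:
    """True if enough replicas remain up to satisfy a QUORUM request."""
    alive = replication_factor - replicas_down
    return alive >= quorum(replication_factor)


rf = 3
for down in range(rf + 1):
    print(f"RF={rf}, replicas down={down}: quorum reachable = {can_reach_quorum(rf, down)}")
```

At RF=3 the quorum is 2, so losing one replica is tolerable but losing two is not. That is the case that arises when one physical server hosts two of the three replicas of a range.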
RE: Multiple cassandra instances per physical node
We run 2 nodes (from 2 different rings) on the same physical host. One is for a random ring; the other is byte-ordered to support some alphabetic range queries. Each instance has its own binary install, data directory, and ports.

One limitation: with one install of the OpsCenter agent, it can only connect to one of the rings. We haven't tried two OpsCenter agent installs yet.

Sean Durity

From: Jonathan Haddad [mailto:j...@jonhaddad.com]
Sent: Thursday, May 21, 2015 5:26 PM
To: user@cassandra.apache.org
Subject: Re: Multiple cassandra instances per physical node

> Yep, that would be one way to handle it.
>
> On Thu, May 21, 2015 at 2:07 PM Dan Kinder dkin...@turnitin.com wrote:
>> @James Rothering yeah, I was thinking of "container" in a broad sense: either full virtual machines, Docker containers, straight LXC, or whatever else would allow the Cassandra nodes to have their own IPs and bind to default ports.
>>
>> @Jonathan Haddad thanks for the blog post. To ensure the same host does not replicate its own data, would I basically need the nodes on a single host to be labeled as one rack? (Assuming I use vnodes.)
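For the "own ports per instance" approach Sean describes, one way to keep the assignments straight is to derive each instance's ports from the defaults with a fixed stride. This is only a sketch: the stride of 100 and the `jmx_port` label are assumptions (JMX is normally set through `JMX_PORT` in cassandra-env.sh rather than cassandra.yaml), and whether instances of the same cluster may use different storage ports depends on the Cassandra version, so it fits the separate-rings case best.

```python
# Default ports; the first four are cassandra.yaml settings.
DEFAULT_PORTS = {
    "storage_port": 7000,
    "ssl_storage_port": 7001,
    "native_transport_port": 9042,
    "rpc_port": 9160,
    "jmx_port": 7199,  # hypothetical label; set via JMX_PORT in cassandra-env.sh
}


def ports_for_instance(index, stride=100):
    """Shift every default port by index * stride."""
    return {name: port + index * stride for name, port in DEFAULT_PORTS.items()}


def plan_ports(instance_count, stride=100):
    """Return one port map per instance, refusing plans that collide."""
    plans = [ports_for_instance(i, stride) for i in range(instance_count)]
    used = [port for plan in plans for port in plan.values()]
    if len(used) != len(set(used)):
        raise ValueError("port collision; choose a different stride")
    return plans


for index, plan in enumerate(plan_ports(2)):
    print(f"instance {index}: {plan}")
```

The collision check matters because a small stride can make one instance's `storage_port` land on another's `ssl_storage_port`.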
Re: Multiple cassandra instances per physical node
Hi,

I also advise against running multiple instances on the same hardware. If you have really big boxes, why not virtualize?

Another option is to experiment with CCM, although it has some limitations (e.g., JNA is disabled).

If you follow up on this, I would like to hear how it went.

On 21/05/2015 19:33, Dan Kinder dkin...@turnitin.com wrote:
> Hi,
>
> I'd just like some clarity and advice regarding running multiple Cassandra instances on a single large machine (big JBOD array, plenty of CPU/RAM).
>
> First, I am aware this was not Cassandra's original design, and doing this seems to go unreasonably against the commodity-hardware intentions of that design. In general it seems to be recommended against (at least as far as I've heard from @Rob Coli and others).
>
> However, maybe the term "commodity" is changing... my hardware/ops team argues that due to cooling, power, and other datacenter costs, having slightly larger nodes (>=32G RAM, >=24 CPU, >=8 disks JBOD) is actually a better price point. Now, I am not a hardware guy, so if this is not actually true I'd love to hear why; otherwise I pretty much need to take them at their word.
>
> Now, Cassandra features seem to have improved such that JBOD works fairly well, but especially with memory/GC this seems to be reaching its limit. One Cassandra instance can only scale up so much.
>
> So my question is: suppose I take a 12-disk JBOD and run 2 Cassandra nodes (each with 5 data disks, 1 commit log disk) and either give each its own container IP or change the listen ports. Will this work? What are the risks? Will/should Cassandra support this better in the future?
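The disk split in Dan's question (12 disks, two instances, 5 data disks plus 1 commit log disk each) could be planned along these lines. The mount points are hypothetical; `data_file_directories` and `commitlog_directory` are the cassandra.yaml settings each instance's values would go into.

```python
def split_disks(disks, instance_count, data_disks_per_instance):
    """Give each instance its own data disks plus one commit log disk."""
    per_instance = data_disks_per_instance + 1
    if len(disks) < instance_count * per_instance:
        raise ValueError("not enough disks for the requested layout")
    layout = []
    for i in range(instance_count):
        chunk = disks[i * per_instance:(i + 1) * per_instance]
        layout.append({
            "data_file_directories": [f"{disk}/data" for disk in chunk[:-1]],
            "commitlog_directory": f"{chunk[-1]}/commitlog",
        })
    return layout


disks = [f"/mnt/disk{n:02d}" for n in range(12)]  # hypothetical mount points
for index, settings in enumerate(split_disks(disks, 2, 5)):
    print(f"instance {index}:")
    print("  data_file_directories:", settings["data_file_directories"])
    print("  commitlog_directory:  ", settings["commitlog_directory"])
```

No disk appears in two instances, so neither data nor commit log disks are shared between the two nodes.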
Re: Multiple cassandra instances per physical node
JBOD: just a bunch of disks, no RAID.

All the best,

Sebastián Estévez
Solutions Architect | DataStax | http://www.datastax.com/

On Thu, May 21, 2015 at 4:00 PM, James Rothering jrother...@codojo.me wrote:
> Hmmm ... Not familiar with JBOD. Is that just RAID-0? Also ... wrt the container talk, is that a Docker container you're talking about?
Re: Multiple cassandra instances per physical node
@James Rothering yeah, I was thinking of "container" in a broad sense: either full virtual machines, Docker containers, straight LXC, or whatever else would allow the Cassandra nodes to have their own IPs and bind to default ports.

@Jonathan Haddad thanks for the blog post. To ensure the same host does not replicate its own data, would I basically need the nodes on a single host to be labeled as one rack? (Assuming I use vnodes.)

On Thu, May 21, 2015 at 1:02 PM, Sebastian Estevez sebastian.este...@datastax.com wrote:
> JBOD: just a bunch of disks, no RAID.

--
Dan Kinder
Senior Software Engineer
Turnitin | www.turnitin.com
dkin...@turnitin.com
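Dan's idea of labeling the nodes on a single host as one rack can be made concrete with a small generator for the `IP=DC:RACK` lines that PropertyFileSnitch reads from cassandra-topology.properties. The IP addresses, rack names, and datacenter name below are placeholders. With GossipingPropertyFileSnitch the same labels would instead go into each node's cassandra-rackdc.properties as `dc=` and `rack=`.

```python
def topology_lines(servers, datacenter="DC1"):
    """Build 'IP=DC:RACK' lines; every instance on a server gets that server's rack.

    servers maps a rack name (one per physical server) to the IPs of the
    Cassandra instances running on it.
    """
    lines = []
    for rack, ips in sorted(servers.items()):
        for ip in ips:
            lines.append(f"{ip}={datacenter}:{rack}")
    return lines


# Hypothetical layout: 10 physical servers, 2 instances (IPs) on each.
servers = {f"rack{i:02d}": [f"10.0.{i}.1", f"10.0.{i}.2"] for i in range(1, 11)}
print("\n".join(topology_lines(servers)))
```

Because both instances on a server share a rack label, a rack-aware replication strategy should avoid placing two replicas of the same range on that server as long as there are at least RF racks.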
Re: Multiple cassandra instances per physical node
You could use Docker, but it's not required. You could use LXC if you wanted.

Shameless self promo: http://rustyrazorblade.com/2013/08/advanced-devops-with-vagrant-and-lxc/

On Thu, May 21, 2015 at 1:00 PM James Rothering jrother...@codojo.me wrote:
> Hmmm ... Not familiar with JBOD. Is that just RAID-0? Also ... wrt the container talk, is that a Docker container you're talking about?
Re: Multiple cassandra instances per physical node
Hi,

we do operate multiple instances (possibly of different versions) of Cassandra on rather thick nodes. The only problem we have encountered so far was sharing the same physical data disk among multiple instances, which proved not to be the best idea. Sharing commitlog disks has caused no trouble so far. Other than that, it works without any problems.

We manage the instances with a set of helper scripts (which change the environment variables so that nodetool and similar tools operate on the right instance) and Puppet templates.

Jiri Horky

On Thu, May 21, 2015 at 11:06 PM, Dan Kinder dkin...@turnitin.com wrote:
> @James Rothering yeah, I was thinking of "container" in a broad sense: either full virtual machines, Docker containers, straight LXC, or whatever else would allow the Cassandra nodes to have their own IPs and bind to default ports.
>
> @Jonathan Haddad thanks for the blog post. To ensure the same host does not replicate its own data, would I basically need the nodes on a single host to be labeled as one rack? (Assuming I use vnodes.)
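Jiri does not show his helper scripts, so the following is only a guess at the general shape of such a wrapper: it picks the configuration directory and JMX port for a named instance before calling nodetool. The instance names, paths, and ports are invented; `CASSANDRA_CONF` and nodetool's `-h`/`-p` options are the only real interfaces assumed here.

```python
import os
import subprocess
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    name: str
    conf_dir: str   # directory holding this instance's cassandra.yaml
    jmx_port: int   # JMX port this instance listens on


# Hypothetical instances on one host.
INSTANCES = {
    "a": Instance("a", "/etc/cassandra-a", 7199),
    "b": Instance("b", "/etc/cassandra-b", 7299),
}


def nodetool_invocation(instance_name, *args):
    """Return the (environment, argv) pair needed to reach one instance."""
    inst = INSTANCES[instance_name]
    env = dict(os.environ, CASSANDRA_CONF=inst.conf_dir)
    argv = ["nodetool", "-h", "127.0.0.1", "-p", str(inst.jmx_port), *args]
    return env, argv


def run_nodetool(instance_name, *args):
    """Run nodetool against the chosen instance (requires nodetool on PATH)."""
    env, argv = nodetool_invocation(instance_name, *args)
    return subprocess.run(argv, env=env, check=True)


env, argv = nodetool_invocation("b", "status")
print(" ".join(argv))
```

Keeping the environment and argument construction separate from the actual `subprocess.run` call makes the mapping easy to inspect without a running cluster.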
Re: Multiple cassandra instances per physical node
Hmmm ... Not familiar with JBOD. Is that just RAID-0? Also ... wrt the container talk, is that a Docker container you're talking about?

On Thu, May 21, 2015 at 12:48 PM, Jonathan Haddad j...@jonhaddad.com wrote:
> If you run it in a container with dedicated IPs it'll work just fine. Just be sure you aren't using the same machine to replicate its own data.
>
> On Thu, May 21, 2015 at 12:43 PM Manoj Khangaonkar khangaon...@gmail.com wrote:
>> +1. I agree we need to be able to run multiple server instances on one physical machine. This is especially necessary in development and test environments, where one is experimenting and needs a cluster but does not have access to multiple physical machines.
>>
>> If you google, you can find a few blogs that talk about how to do this, but it is less than ideal. We need to be able to do it by changing ports in cassandra.yaml (the way it is done easily with Hadoop, Apache Kafka, Redis, and many other distributed systems).
>>
>> regards
>> --
>> http://khangaonkar.blogspot.com/
Re: Multiple cassandra instances per physical node
Yep, that would be one way to handle it. On Thu, May 21, 2015 at 2:07 PM Dan Kinder dkin...@turnitin.com wrote: @James Rothering yeah I was thinking of container in a broad sense: either full virtual machines, docker containers, straight LXC, or whatever else would allow the Cassandra nodes to have their own IPs and bind to default ports. @Jonathan Haddad thanks for the blog post. To ensure the same host does not replicate its own data, would I basically need the nodes on a single host to be labeled as one rack? (Assuming I use vnodes) On Thu, May 21, 2015 at 1:02 PM, Sebastian Estevez sebastian.este...@datastax.com wrote: JBOD -- just a bunch of disks, no raid. All the best, [image: datastax_logo.png] http://www.datastax.com/ Sebastián Estévez Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com [image: linkedin.png] https://www.linkedin.com/company/datastax [image: facebook.png] https://www.facebook.com/datastax [image: twitter.png] https://twitter.com/datastax [image: g+.png] https://plus.google.com/+Datastax/about http://feeds.feedburner.com/datastax http://cassandrasummit-datastax.com/ DataStax is the fastest, most scalable distributed database technology, delivering Apache Cassandra to the world’s most innovative enterprises. Datastax is built to be agile, always-on, and predictably scalable to any size. With more than 500 customers in 45 countries, DataStax is the database technology and transactional backbone of choice for the worlds most innovative companies such as Netflix, Adobe, Intuit, and eBay. On Thu, May 21, 2015 at 4:00 PM, James Rothering jrother...@codojo.me wrote: Hmmm ... Not familiar with JBOD. Is that just RAID-0? Also ... wrt the container talk, is that a Docker container you're talking about? On Thu, May 21, 2015 at 12:48 PM, Jonathan Haddad j...@jonhaddad.com wrote: If you run it in a container with dedicated IPs it'll work just fine. Just be sure you aren't using the same machine to replicate it's own data. 
On Thu, May 21, 2015 at 12:43 PM Manoj Khangaonkar khangaon...@gmail.com wrote:

+1. I agree we need to be able to run multiple server instances on one physical machine. This is especially necessary in development and test environments where one is experimenting and needs a cluster but does not have access to multiple physical machines.

If you google, you can find a few blogs that talk about how to do this, but it is less than ideal. We need to be able to do it by changing ports in cassandra.yaml (the way it is done easily with Hadoop, Apache Kafka, Redis, and many other distributed systems).

regards

On Thu, May 21, 2015 at 10:32 AM, Dan Kinder dkin...@turnitin.com wrote:

Hi, I'd just like some clarity and advice regarding running multiple Cassandra instances on a single large machine (big JBOD array, plenty of CPU/RAM).

First, I am aware this was not Cassandra's original design, and doing this seems to go unreasonably against the commodity hardware intentions of Cassandra's design. In general it seems to be recommended against (at least as far as I've heard from @Rob Coli and others).

However, maybe the term "commodity" is changing... my hardware/ops team argues that due to cooling, power, and other datacenter costs, having slightly larger nodes (>=32G RAM, >=24 CPU, >=8 disks JBOD) is actually a better price point. Now, I am not a hardware guy, so if this is not actually true I'd love to hear why; otherwise I pretty much need to take them at their word.

Now, Cassandra features seem to have improved such that JBOD works fairly well, but especially with memory/GC this seems to be reaching its limit. One Cassandra instance can only scale up so much.

So my question is: suppose I take a 12-disk JBOD and run 2 Cassandra nodes (each with 5 data disks, 1 commit log disk) and either give each its own container IP or change the listen ports. Will this work? What are the risks? Will/should Cassandra support this better in the future?
--
http://khangaonkar.blogspot.com/

--
Dan Kinder
Senior Software Engineer
Turnitin – www.turnitin.com
dkin...@turnitin.com
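
To make the "one rack per physical host" idea concrete, here is a rough Python sketch. It is a simplified stand-in for rack-aware replica placement (walk the ring clockwise and take at most one node per rack), not Cassandra's actual implementation. The host names, the 8 tokens per node, the 10-host/20-node layout, and the sample count are all assumptions made for illustration.

```python
import bisect
import random


def build_ring(nodes, vnodes_per_node=8, seed=1):
    """Give every node a few random tokens; return a sorted list of (token, node)."""
    rng = random.Random(seed)
    return sorted((rng.randrange(2**32), n) for n in nodes for _ in range(vnodes_per_node))


def pick_replicas(ring, rack_of, key_token, rf=3, rack_aware=True):
    """Walk clockwise from key_token and collect rf distinct nodes.

    With rack_aware=True a rack is used at most once (simplified rule,
    valid here because there are at least rf racks).
    """
    start = bisect.bisect_left(ring, (key_token, ""))
    chosen, used_racks = [], set()
    for i in range(len(ring)):
        node = ring[(start + i) % len(ring)][1]
        if node in chosen:
            continue
        if rack_aware and rack_of[node] in used_racks:
            continue
        chosen.append(node)
        used_racks.add(rack_of[node])
        if len(chosen) == rf:
            break
    return chosen


def keys_sharing_a_host(ring, rack_of, host_of, rack_aware, samples=1000, seed=2):
    """Count sampled keys whose replica set has two or more nodes on one host."""
    rng = random.Random(seed)
    bad = 0
    for _ in range(samples):
        reps = pick_replicas(ring, rack_of, rng.randrange(2**32), rack_aware=rack_aware)
        if len({host_of[n] for n in reps}) < len(reps):
            bad += 1
    return bad


# Hypothetical layout: 10 physical hosts, 2 Cassandra instances on each.
host_of = {f"host{h}-{i}": f"host{h}" for h in range(10) for i in "ab"}
rack_of = dict(host_of)  # label both instances on a host as the same rack
ring = build_ring(sorted(host_of))

print("rack-aware, keys with co-located replicas:  ",
      keys_sharing_a_host(ring, rack_of, host_of, rack_aware=True))
print("rack-unaware, keys with co-located replicas:",
      keys_sharing_a_host(ring, rack_of, host_of, rack_aware=False))
```

Under the rack-aware rule no replica set can contain both instances of a host, so losing one host removes at most one of the three replicas and a quorum of two remains. Without the rack labels, some keys end up with two replicas on the same machine.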
Re: Multiple cassandra instances per physical node
If you run it in a container with dedicated IPs it'll work just fine. Just be sure you aren't using the same machine to replicate its own data.
Re: Multiple cassandra instances per physical node
+1. I agree we need to be able to run multiple server instances on one physical machine. This is especially necessary in development and test environments where one is experimenting and needs a cluster but does not have access to multiple physical machines.

If you google, you can find a few blogs that talk about how to do this, but it is less than ideal. We need to be able to do it by changing ports in cassandra.yaml (the way it is done easily with Hadoop, Apache Kafka, Redis, and many other distributed systems).

regards

--
http://khangaonkar.blogspot.com/
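
For the layout Dan asks about (a 12-disk JBOD split between two nodes, each with 5 data disks and 1 commit log disk, and a dedicated IP per node), the per-instance settings might be worked out roughly as below. This is only a sketch: the mount points under `/mnt` and the two IP addresses are made up, and the helper `instance_settings` is hypothetical. The dictionary keys are the usual cassandra.yaml option names; with a separate IP per instance the default ports can stay as they are.

```python
def instance_settings(disks, addresses, data_disks_per_node=5):
    """Split disks between instances: N data disks plus one commit log disk each."""
    per_node = data_disks_per_node + 1
    if len(disks) < per_node * len(addresses):
        raise ValueError("not enough disks for the requested layout")
    settings = []
    for i, addr in enumerate(addresses):
        chunk = disks[i * per_node:(i + 1) * per_node]
        settings.append({
            "listen_address": addr,  # each instance binds to its own IP
            "rpc_address": addr,
            "data_file_directories": [f"{d}/data" for d in chunk[:-1]],
            "commitlog_directory": f"{chunk[-1]}/commitlog",
        })
    return settings


# Hypothetical mount points and addresses, for illustration only.
disks = [f"/mnt/disk{n:02d}" for n in range(12)]
nodes = instance_settings(disks, ["10.0.0.11", "10.0.0.12"])

for node in nodes:
    print(node["listen_address"])
    print("  data:     ", ", ".join(node["data_file_directories"]))
    print("  commitlog:", node["commitlog_directory"])
```

The point of the split is that no data or commit log directory is shared between the two instances, so each node owns its disks outright.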