Re: Multiple cassandra instances per physical node

2015-05-26 Thread Ken Hancock
I had the exact same question, but I think this is what Nate was thinking:

If you're running multiple nodes on a single server, vnodes give you no
control over which instance has which key (whereas you can assign initial
tokens).  Therefore you could have two of your three replicas on the same
physical server which, if it goes down, you can't read or write at quorum.
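
The arithmetic behind this point can be sketched as follows. This is a minimal illustration, assuming the usual quorum size of floor(RF / 2) + 1; the host names and helper functions are hypothetical.

```python
def quorum(replication_factor):
    """Replicas that must respond at QUORUM: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def quorum_survives(replicas_per_host, failed_host):
    """True if QUORUM is still reachable after one physical host fails.

    replicas_per_host maps a (hypothetical) host name to the number of
    replicas of a single token range stored on that host.
    """
    rf = sum(replicas_per_host.values())
    remaining = rf - replicas_per_host.get(failed_host, 0)
    return remaining >= quorum(rf)


# Two of the three replicas share "server-a": losing it leaves 1 replica.
colocated = {"server-a": 2, "server-b": 1}
# One replica per host: losing any single host leaves 2 replicas.
spread = {"server-a": 1, "server-b": 1, "server-c": 1}

print(quorum(3))
print(quorum_survives(colocated, "server-a"))
print(quorum_survives(spread, "server-a"))
```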

However, can't you use the topology snitch to put both nodes in the same
rack?  Won't that prevent the issue and still allow you to maintain quorum
if a single server goes down?  If I have a 20-node cluster with 2 nodes on
each physical server, can I use 10 racks to properly segment my partitions?
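
A rough way to reason about the 10-rack idea is to model it. The sketch below is a simplified model, not Cassandra's actual placement code: it walks a ring of random tokens clockwise and, when rack-aware, skips a node while its rack already holds a replica. Node names, the vnode count, and the seed are all made up for illustration.

```python
import random

RF = 3
HOSTS = 10            # physical servers
NODES_PER_HOST = 2    # Cassandra instances per server
VNODES = 8            # tokens per instance, kept small for the sketch


def build_topology():
    """Node "h3-n1" is instance 1 on host 3; each host is its own rack."""
    return {f"h{h}-n{n}": f"rack{h}"
            for h in range(HOSTS) for n in range(NODES_PER_HOST)}


def build_ring(rack_of, seed=42):
    """Return a sorted list of (token, node) pairs with random tokens."""
    rng = random.Random(seed)
    return sorted((rng.getrandbits(32), node)
                  for node in rack_of for _ in range(VNODES))


def replicas_for(ring, start, rack_of, rack_aware):
    """Walk clockwise from ring index `start` and pick RF distinct nodes."""
    chosen, racks = [], set()
    for step in range(len(ring)):
        node = ring[(start + step) % len(ring)][1]
        if node in chosen:
            continue
        if rack_aware and rack_of[node] in racks:
            continue  # this rack (physical host) already holds a replica
        chosen.append(node)
        racks.add(rack_of[node])
        if len(chosen) == RF:
            break
    return chosen


def worst_case_replicas_on_one_host(rack_aware):
    """Largest number of replicas of any range that share one host."""
    rack_of = build_topology()
    ring = build_ring(rack_of)
    worst = 0
    for start in range(len(ring)):
        hosts = [rack_of[n] for n in replicas_for(ring, start, rack_of, rack_aware)]
        worst = max(worst, max(hosts.count(h) for h in hosts))
    return worst


print("rack-aware:", worst_case_replicas_on_one_host(True))
print("rack-unaware:", worst_case_replicas_on_one_host(False))
```

In this model the rack-aware walk never puts two replicas of a range on one host, because there are more racks than replicas. The rack-unaware result depends on the random tokens.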



On Sun, May 24, 2015 at 5:38 PM, Jonathan Haddad j...@jonhaddad.com wrote:

 What impact would vnodes have on strong consistency?  I think the problem
 you're describing exists with or without them.

 On Sat, May 23, 2015 at 2:30 PM Nate McCall n...@thelastpickle.com
 wrote:


 So my question is: suppose I take a 12 disk JBOD and run 2 Cassandra
 nodes (each with 5 data disks, 1 commit log disk) and either give each its
 own container & IP or change the listen ports. Will this work? What are the
 risks? Will/should Cassandra support this better in the future?


 Don't use vnodes if any operations need strong consistency (reading or
 writing at quorum). Otherwise, at RF=3, if you lose a single node you will
 only have one replica left for some portion of the ring.



 --
 -
 Nate McCall
 Austin, TX
 @zznate

 Co-Founder & Sr. Technical Consultant
 Apache Cassandra Consulting
 http://www.thelastpickle.com




-- 
*Ken Hancock *| System Architect, Advanced Advertising
SeaChange International
50 Nagog Park
Acton, Massachusetts 01720
ken.hanc...@schange.com | www.schange.com | NASDAQ:SEAC
http://www.schange.com/en-US/Company/InvestorRelations.aspx
Office: +1 (978) 889-3329 | Google Talk: ken.hanc...@schange.com |
Skype: hancockks | Yahoo IM: hancockks | LinkedIn:
http://www.linkedin.com/in/kenhancock


Re: Multiple cassandra instances per physical node

2015-05-26 Thread Ben Bromhead
@Sean - You can manually change the ports used by Datastax agent using the
address.yaml file in the agent install directory.

+1 on using racks to separate it out... but it will increase operational
complexity somewhat

On 26 May 2015 at 08:11, Nate McCall n...@thelastpickle.com wrote:


 If you're running multiple nodes on a single server, vnodes give you no
 control over which instance has which key (whereas you can assign initial
 tokens).  Therefore you could have two of your three replicas on the same
 physical server which, if it goes down, you can't read or write at quorum.


 Yep. You *will* have overlapping ranges on each physical server so long as
 Vnodes > 'number of nodes in the cluster'.




 However, can't you use the topology snitch to put both nodes in the same
 rack?  Won't that prevent the issue and still allow you to maintain quorum
 if a single server goes down?  If I have a 20-node cluster with 2 nodes on
 each physical server, can I use 10 racks to properly segment my partitions?


 That's a good point, yes. I'd still personally prefer the operational
 simplicity of simply spacing out token assignments though, but YMMV.







-- 

Ben Bromhead

Instaclustr | www.instaclustr.com | @instaclustr
http://twitter.com/instaclustr | (650) 284 9692


Re: Multiple cassandra instances per physical node

2015-05-26 Thread Nate McCall


 If you're running multiple nodes on a single server, vnodes give you no
 control over which instance has which key (whereas you can assign initial
 tokens).  Therefore you could have two of your three replicas on the same
 physical server which, if it goes down, you can't read or write at quorum.


Yep. You *will* have overlapping ranges on each physical server so long as
Vnodes > 'number of nodes in the cluster'.




 However, can't you use the topology snitch to put both nodes in the same
 rack?  Won't that prevent the issue and still allow you to maintain quorum
 if a single server goes down?  If I have a 20-node cluster with 2 nodes on
 each physical server, can I use 10 racks to properly segment my partitions?


That's a good point, yes. I'd still personally prefer the operational
simplicity of simply spacing out token assignments though, but YMMV.
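
Spacing out token assignments could look roughly like the sketch below. It assumes Murmur3Partitioner (token range -2**63 to 2**63 - 1), one token per instance, and hypothetical host names; ring positions are dealt out round-robin so that neighbouring tokens land on different physical servers.

```python
def initial_tokens(node_count):
    """Evenly spaced tokens over the Murmur3Partitioner range [-2**63, 2**63)."""
    return [i * 2**64 // node_count - 2**63 for i in range(node_count)]


def assign(hosts, instances_per_host):
    """Map each token to (host, instance) so ring neighbours sit on different hosts."""
    tokens = initial_tokens(hosts * instances_per_host)
    return [(tok, f"host{i % hosts}", i // hosts) for i, tok in enumerate(tokens)]


ring = assign(hosts=10, instances_per_host=2)
for token, host, instance in ring[:3]:
    # Each value would go into that instance's cassandra.yaml as initial_token.
    print(f"{host} instance {instance}: initial_token: {token}")
```

With this layout any three consecutive ring positions belong to three different hosts, so a keyspace that places replicas on consecutive nodes at RF=3 would not put two copies on one server.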



-- 
-
Nate McCall
Austin, TX
@zznate

Co-Founder & Sr. Technical Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com


Re: Multiple cassandra instances per physical node

2015-05-26 Thread Jake Luciani

  If I have a 20-node cluster with 2 nodes on each physical server, can I
 use 10 racks to properly segment my partitions?


Yes.




-- 
http://twitter.com/tjake


Re: Multiple cassandra instances per physical node

2015-05-24 Thread Jonathan Haddad
What impact would vnodes have on strong consistency?  I think the problem
you're describing exists with or without them.

On Sat, May 23, 2015 at 2:30 PM Nate McCall n...@thelastpickle.com wrote:


 So my question is: suppose I take a 12 disk JBOD and run 2 Cassandra
 nodes (each with 5 data disks, 1 commit log disk) and either give each its
 own container & IP or change the listen ports. Will this work? What are the
 risks? Will/should Cassandra support this better in the future?


 Don't use vnodes if any operations need strong consistency (reading or
 writing at quorum). Otherwise, at RF=3, if you lose a single node you will
 only have one replica left for some portion of the ring.






Re: Multiple cassandra instances per physical node

2015-05-23 Thread Nate McCall


 So my question is: suppose I take a 12 disk JBOD and run 2 Cassandra nodes
 (each with 5 data disks, 1 commit log disk) and either give each its own
 container & IP or change the listen ports. Will this work? What are the
 risks? Will/should Cassandra support this better in the future?


Don't use vnodes if any operations need strong consistency (reading or
writing at quorum). Otherwise, at RF=3, if you lose a single node you will
only have one replica left for some portion of the ring.



-- 
-
Nate McCall
Austin, TX
@zznate

Co-Founder & Sr. Technical Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com


RE: Multiple cassandra instances per physical node

2015-05-22 Thread SEAN_R_DURITY
We run 2 nodes (from 2 different rings) on the same physical host. One is for a 
random ring; the other is byteordered to support some alphabetic range queries. 
Each instance has its own binary install, data directory and ports. One 
limitation - with one install of OpsCenter agent, it can only connect to one of 
the rings. We haven’t tried two OpsCenter agent installs, yet.
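
Giving each instance its own ports can be planned with something like the sketch below. The defaults listed are the stock values for these settings as far as I know; the instance names, the fixed offset scheme, and the helper functions are assumptions made only for illustration.

```python
# Stock defaults; jmx_port is set via JMX_PORT in cassandra-env.sh,
# the others are cassandra.yaml settings.
DEFAULT_PORTS = {
    "storage_port": 7000,
    "ssl_storage_port": 7001,
    "native_transport_port": 9042,
    "rpc_port": 9160,
    "jmx_port": 7199,
}


def ports_for_instance(index, stride=100):
    """Shift every default port by index * stride for the given instance."""
    return {name: port + index * stride for name, port in DEFAULT_PORTS.items()}


def find_collisions(instances):
    """Return (port, first_user, second_user) for every port used twice."""
    seen, collisions = {}, []
    for inst, ports in instances.items():
        for name, port in ports.items():
            user = f"{inst}.{name}"
            if port in seen:
                collisions.append((port, seen[port], user))
            else:
                seen[port] = user
    return collisions


safe = {"ring_a": ports_for_instance(0), "ring_b": ports_for_instance(1)}
tight = {"ring_a": ports_for_instance(0, stride=1),
         "ring_b": ports_for_instance(1, stride=1)}

print(find_collisions(safe))
print(find_collisions(tight))
```

The second layout shows why a stride of 1 is a bad idea: the second instance's storage_port lands on the first instance's ssl_storage_port.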


Sean Durity

From: Jonathan Haddad [mailto:j...@jonhaddad.com]
Sent: Thursday, May 21, 2015 5:26 PM
To: user@cassandra.apache.org
Subject: Re: Multiple cassandra instances per physical node

Yep, that would be one way to handle it.
On Thu, May 21, 2015 at 2:07 PM Dan Kinder dkin...@turnitin.com wrote:
@James Rothering yeah I was thinking of container in a broad sense: either full 
virtual machines, docker containers, straight LXC, or whatever else would allow 
the Cassandra nodes to have their own IPs and bind to default ports.

@Jonathan Haddad thanks for the blog post. To ensure the same host does not 
replicate its own data, would I basically need the nodes on a single host to be 
labeled as one rack? (Assuming I use vnodes)

On Thu, May 21, 2015 at 1:02 PM, Sebastian Estevez sebastian.este...@datastax.com wrote:
JBOD -- just a bunch of disks, no raid.


All the best,



http://www.datastax.com/

Sebastián Estévez

Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

DataStax is the fastest, most scalable distributed database technology,
delivering Apache Cassandra to the world’s most innovative enterprises.
DataStax is built to be agile, always-on, and predictably scalable to any size.
With more than 500 customers in 45 countries, DataStax is the database
technology and transactional backbone of choice for the world’s most innovative
companies such as Netflix, Adobe, Intuit, and eBay.

On Thu, May 21, 2015 at 4:00 PM, James Rothering jrother...@codojo.me wrote:
Hmmm ... Not familiar with JBOD. Is that just RAID-0?

Also ... wrt  the container talk, is that a Docker container you're talking 
about?



On Thu, May 21, 2015 at 12:48 PM, Jonathan Haddad j...@jonhaddad.com wrote:
If you run it in a container with dedicated IPs it'll work just fine.  Just be
sure you aren't using the same machine to replicate its own data.

On Thu, May 21, 2015 at 12:43 PM Manoj Khangaonkar khangaon...@gmail.com wrote:
+1.
I agree we need to be able to run multiple server instances on one physical
machine. This is especially necessary in development and test environments
where one is experimenting and needs a cluster, but does not have access to
multiple physical machines.
If you google, you can find a few blogs that talk about how to do this.

But it is less than ideal. We need to be able to do it by changing ports in
cassandra.yaml (the way it is done easily with Hadoop, Apache Kafka,
Redis, and many other distributed systems).

regards



On Thu, May 21, 2015 at 10:32 AM, Dan Kinder dkin...@turnitin.com wrote:
Hi, I'd just like some clarity and advice regarding running multiple cassandra 
instances on a single large machine (big JBOD array, plenty of CPU/RAM).

First, I am aware this was not Cassandra's original design, and doing this 
seems to unreasonably go against the commodity hardware intentions of 
Cassandra's design. In general it seems to be recommended against (at least as 
far as I've heard from @Rob Coli and others).

However maybe this term commodity is changing... my hardware/ops team argues
that due to cooling, power, and other datacenter costs, having slightly larger
nodes (>=32G RAM, >=24 CPU, >=8 disks JBOD) is actually a better price point.
Now, I am not a hardware guy, so if this is not actually true I'd love to hear
why, otherwise I pretty much need to take them at their word.

Now, Cassandra features seem to have improved such that JBOD works fairly
well, but especially with memory/GC this seems to be reaching its limit. One
Cassandra instance can only scale up so much.

So my question is: suppose I take a 12 disk JBOD and run 2 Cassandra nodes
(each with 5 data disks, 1 commit log disk) and either give each its own
container & IP or change the listen ports. Will this work? What are the risks?
Will/should Cassandra support this better in the future?


--
http://khangaonkar.blogspot.com/





--
Dan

Re: Multiple cassandra instances per physical node

2015-05-21 Thread Carlos Rolo
Hi,

I also advise against multiple instances on the same hardware. If you have
really big boxes why not virtualize?

Another option is to experiment with CCM, although there are some limitations
with CCM (e.g. JNA is disabled).

If you follow up on this I would like to hear how it went.
On 21/05/2015 19:33, Dan Kinder dkin...@turnitin.com wrote:

 Hi, I'd just like some clarity and advice regarding running multiple
 cassandra instances on a single large machine (big JBOD array, plenty of
 CPU/RAM).

 First, I am aware this was not Cassandra's original design, and doing this
 seems to unreasonably go against the commodity hardware intentions of
 Cassandra's design. In general it seems to be recommended against (at least
 as far as I've heard from @Rob Coli and others).

 However maybe this term commodity is changing... my hardware/ops team
 argues that due to cooling, power, and other datacenter costs, having
 slightly larger nodes (>=32G RAM, >=24 CPU, >=8 disks JBOD) is actually a
 better price point. Now, I am not a hardware guy, so if this is not
 actually true I'd love to hear why, otherwise I pretty much need to take
 them at their word.

 Now, Cassandra features seem to have improved such that JBOD works
 fairly well, but especially with memory/GC this seems to be reaching its
 limit. One Cassandra instance can only scale up so much.

 So my question is: suppose I take a 12 disk JBOD and run 2 Cassandra nodes
 (each with 5 data disks, 1 commit log disk) and either give each its own
 container & IP or change the listen ports. Will this work? What are the
 risks? Will/should Cassandra support this better in the future?



Re: Multiple cassandra instances per physical node

2015-05-21 Thread Sebastian Estevez
JBOD -- just a bunch of disks, no raid.

All the best,


http://www.datastax.com/

Sebastián Estévez

Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com


On Thu, May 21, 2015 at 4:00 PM, James Rothering jrother...@codojo.me
wrote:

 Hmmm ... Not familiar with JBOD. Is that just RAID-0?

 Also ... wrt  the container talk, is that a Docker container you're
 talking about?



Re: Multiple cassandra instances per physical node

2015-05-21 Thread Dan Kinder
@James Rothering yeah I was thinking of container in a broad sense: either
full virtual machines, docker containers, straight LXC, or whatever else
would allow the Cassandra nodes to have their own IPs and bind to default
ports.

@Jonathan Haddad thanks for the blog post. To ensure the same host does not
replicate its own data, would I basically need the nodes on a single host
to be labeled as one rack? (Assuming I use vnodes)
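
Labelling the instances on one host as one rack might be scripted along these lines. The sketch assumes PropertyFileSnitch, whose cassandra-topology.properties file uses `IP=DC:RACK` entries; the IP addresses, host names, and rack naming are invented for illustration, and rack labels only influence placement for keyspaces that use NetworkTopologyStrategy.

```python
def topology_lines(hosts, datacenter="DC1"):
    """Build IP=DC:RACK entries, giving every physical host its own rack.

    hosts maps a physical host name to the IPs of the instances on it.
    """
    lines = []
    for rack_index, (host, ips) in enumerate(sorted(hosts.items()), start=1):
        for ip in ips:
            lines.append(f"{ip}={datacenter}:RAC{rack_index}")
    return lines


# Hypothetical layout: three servers, two containers (IPs) on each.
hosts = {
    "server-a": ["10.0.0.11", "10.0.0.12"],
    "server-b": ["10.0.0.21", "10.0.0.22"],
    "server-c": ["10.0.0.31", "10.0.0.32"],
}

print("\n".join(topology_lines(hosts)))
```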

On Thu, May 21, 2015 at 1:02 PM, Sebastian Estevez 
sebastian.este...@datastax.com wrote:

 JBOD -- just a bunch of disks, no raid.

-- 
Dan Kinder
Senior Software Engineer
Turnitin – www.turnitin.com
dkin...@turnitin.com


Re: Multiple cassandra instances per physical node

2015-05-21 Thread Jonathan Haddad
You could use docker but it's not required.  You could use LXC if you
wanted.

Shameless self promo:
http://rustyrazorblade.com/2013/08/advanced-devops-with-vagrant-and-lxc/


On Thu, May 21, 2015 at 1:00 PM James Rothering jrother...@codojo.me
wrote:

 Hmmm ... Not familiar with JBOD. Is that just RAID-0?

 Also ... wrt  the container talk, is that a Docker container you're
 talking about?



Re: Multiple cassandra instances per physical node

2015-05-21 Thread Horký , Jiří
Hi,
we do operate multiple instances (of possibly different versions) of
Cassandra on rather thick nodes. The only problem we have encountered so far
was sharing the same physical data disk among multiple instances - it proved
not to be the best idea. Sharing commitlog disks has caused no trouble so far.
Other than that, it works without any problems. We manage the instances with
a set of helper scripts (which change the env variables, so nodetool and
such operate on the right instance) and puppet templates.

Jiri Horky
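
A helper of the kind described above might look like the following sketch. It is not the scripts mentioned in this message: the instance names, config paths, and JMX ports are hypothetical, and the use of the CASSANDRA_CONF environment variable is an assumption about how each install locates its configuration. It only builds the command; nothing is executed.

```python
import shlex

# Hypothetical per-instance settings on one physical host.
INSTANCES = {
    "ring_a": {"conf": "/etc/cassandra/ring_a", "jmx_port": 7199},
    "ring_b": {"conf": "/etc/cassandra/ring_b", "jmx_port": 7299},
}


def nodetool_command(instance, *args, host="127.0.0.1"):
    """Return (env, argv) that point nodetool at one specific instance."""
    cfg = INSTANCES[instance]
    env = {"CASSANDRA_CONF": cfg["conf"]}  # assumed config-dir variable
    argv = ["nodetool", "-h", host, "-p", str(cfg["jmx_port"]), *args]
    return env, argv


env, argv = nodetool_command("ring_b", "status")
print(" ".join(shlex.quote(a) for a in argv))
```

The returned pair could be passed to `subprocess.run(argv, env={**os.environ, **env})` when the command should actually run.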

On Thu, May 21, 2015 at 11:06 PM, Dan Kinder dkin...@turnitin.com wrote:

 @James Rothering yeah I was thinking of container in a broad sense: either
 full virtual machines, docker containers, straight LXC, or whatever else
 would allow the Cassandra nodes to have their own IPs and bind to default
 ports.

 @Jonathan Haddad thanks for the blog post. To ensure the same host does
 not replicate its own data, would I basically need the nodes on a single
 host to be labeled as one rack? (Assuming I use vnodes)


Re: Multiple cassandra instances per physical node

2015-05-21 Thread James Rothering
Hmmm ... Not familiar with JBOD. Is that just RAID-0?

Also ... wrt  the container talk, is that a Docker container you're talking
about?



On Thu, May 21, 2015 at 12:48 PM, Jonathan Haddad j...@jonhaddad.com wrote:

 If you run it in a container with dedicated IPs it'll work just fine.
 Just be sure you aren't using the same machine to replicate its own data.


Re: Multiple cassandra instances per physical node

2015-05-21 Thread Jonathan Haddad
Yep, that would be one way to handle it.

On Thu, May 21, 2015 at 2:07 PM Dan Kinder dkin...@turnitin.com wrote:

 @James Rothering yeah I was thinking of container in a broad sense: either
 full virtual machines, docker containers, straight LXC, or whatever else
 would allow the Cassandra nodes to have their own IPs and bind to default
 ports.

 @Jonathan Haddad thanks for the blog post. To ensure the same host does
 not replicate its own data, would I basically need the nodes on a single
 host to be labeled as one rack? (Assuming I use vnodes)
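
 A rough way to see why the rack labelling helps. The sketch below is a
 simplified model of rack-aware placement (one token per node, a single
 datacenter); it is not Cassandra's actual NetworkTopologyStrategy code, and
 the node names, host names and ring order are made up for illustration. It
 walks the ring and skips any node whose rack already holds a replica, so if
 every physical host is its own rack, no host ends up with two copies of the
 same range.

```python
def rack_aware_replicas(ring, rack_of, start, rf):
    """Pick `rf` replicas clockwise from `start`, preferring unseen racks.

    Simplified model: nodes on a rack that already holds a replica are
    skipped, and only used as a fallback when there are fewer racks than rf.
    """
    chosen, seen_racks, skipped = [], set(), []
    for i in range(len(ring)):
        node = ring[(start + i) % len(ring)]
        if rack_of[node] not in seen_racks:
            chosen.append(node)
            seen_racks.add(rack_of[node])
        else:
            skipped.append(node)
        if len(chosen) == rf:
            return chosen
    # Fewer racks than rf: fill the remainder from skipped nodes, in ring order.
    return chosen + skipped[: rf - len(chosen)]


# Hypothetical layout: two Cassandra instances per physical host,
# and each host labelled as its own rack.
rack_of = {"a1": "hostA", "a2": "hostA",
           "b1": "hostB", "b2": "hostB",
           "c1": "hostC", "c2": "hostC"}
ring = ["a1", "a2", "b1", "b2", "c1", "c2"]  # worst case: siblings adjacent

for start in range(len(ring)):
    replicas = rack_aware_replicas(ring, rack_of, start, rf=3)
    print(ring[start], "->", replicas)
```

 With three hosts and RF=3, every replica set in this model spans all three
 hosts, so losing one machine leaves two of three replicas available.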

 On Thu, May 21, 2015 at 1:02 PM, Sebastian Estevez 
 sebastian.este...@datastax.com wrote:

 JBOD -- just a bunch of disks, no raid.

 All the best,


 Sebastián Estévez

 Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com


 --
 Dan Kinder
 Senior Software Engineer
 Turnitin – www.turnitin.com
 dkin...@turnitin.com



Re: Multiple cassandra instances per physical node

2015-05-21 Thread Jonathan Haddad
If you run it in a container with dedicated IPs it'll work just fine.  Just
be sure you aren't using the same machine to replicate its own data.
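
To make the "same machine replicating its own data" risk concrete, here is a
minimal sketch. It assumes a simplified placement model (one token per node,
replicas are simply the next RF nodes clockwise on the ring, no rack
awareness), and the node and host names are hypothetical. It reports the ring
positions whose replica set puts two or more copies on one physical host.

```python
from collections import Counter


def simple_replicas(ring, start, rf):
    """Take the next `rf` nodes clockwise from `start` (no rack awareness)."""
    return [ring[(start + i) % len(ring)] for i in range(rf)]


def same_host_replica_sets(ring, host_of, rf):
    """Return (position, replicas) pairs where one host holds 2+ replicas."""
    bad = []
    for start in range(len(ring)):
        replicas = simple_replicas(ring, start, rf)
        per_host = Counter(host_of[node] for node in replicas)
        if max(per_host.values()) > 1:
            bad.append((start, replicas))
    return bad


# Hypothetical cluster: two instances on each of three physical hosts.
host_of = {"a1": "hostA", "a2": "hostA",
           "b1": "hostB", "b2": "hostB",
           "c1": "hostC", "c2": "hostC"}

adjacent = ["a1", "a2", "b1", "b2", "c1", "c2"]     # siblings next to each other
interleaved = ["a1", "b1", "c1", "a2", "b2", "c2"]  # siblings spread apart

print(len(same_host_replica_sets(adjacent, host_of, rf=3)))
print(len(same_host_replica_sets(interleaved, host_of, rf=3)))
```

In this toy model the adjacent ordering doubles up on a host for every range,
while the interleaved ordering never does. With vnodes you do not control the
ordering, which is why some form of rack or host awareness matters.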




Re: Multiple cassandra instances per physical node

2015-05-21 Thread Manoj Khangaonkar
+1.

I agree we need to be able to run multiple server instances on one physical
machine. This is especially necessary in development and test environments
where one is experimenting and needs a cluster but does not have access to
multiple physical machines.

If you google, you can find a few blogs that talk about how to do this.

But it is less than ideal. We need to be able to do it by changing ports in
cassandra.yaml (the way it is done easily with Hadoop, Apache Kafka, Redis,
and many other distributed systems).


regards



On Thu, May 21, 2015 at 10:32 AM, Dan Kinder dkin...@turnitin.com wrote:

 Hi, I'd just like some clarity and advice regarding running multiple
 cassandra instances on a single large machine (big JBOD array, plenty of
 CPU/RAM).

 First, I am aware this was not Cassandra's original design, and doing this
 seems to unreasonably go against the commodity hardware intentions of
 Cassandra's design. In general it seems to be recommended against (at least
 as far as I've heard from @Rob Coli and others).

 However maybe this term "commodity" is changing... my hardware/ops team
 argues that due to cooling, power, and other datacenter costs, having
 slightly larger nodes (>=32G RAM, >=24 CPU, >=8 disks JBOD) is actually a
 better price point. Now, I am not a hardware guy, so if this is not
 actually true I'd love to hear why; otherwise I pretty much need to take
 them at their word.

 Now, Cassandra features seem to have improved such that JBOD works
 fairly well, but especially with memory/GC this seems to be reaching its
 limit. One Cassandra instance can only scale up so much.

 So my question is: suppose I take a 12 disk JBOD and run 2 Cassandra nodes
 (each with 5 data disks, 1 commit log disk) and either give each its own
 container & IP or change the listen ports. Will this work? What are the
 risks? Will/should Cassandra support this better in the future?
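
 For what it's worth, the disk split described above could be sketched like
 this. It only builds per-instance overrides for a few cassandra.yaml
 settings (listen_address, rpc_address, data_file_directories,
 commitlog_directory); the mount points, IP addresses and directory naming
 are assumptions for illustration, and it says nothing about whether the
 layout performs well.

```python
def instance_settings(index, ip, disks):
    """Build cassandra.yaml-style overrides for one instance.

    `disks` is this instance's share of the JBOD: the last disk holds the
    commit log and the remaining disks hold data.
    """
    *data_disks, commitlog_disk = disks
    return {
        "listen_address": ip,
        "rpc_address": ip,
        "data_file_directories": [f"{d}/cassandra{index}/data" for d in data_disks],
        "commitlog_directory": f"{commitlog_disk}/cassandra{index}/commitlog",
    }


# Hypothetical 12-disk JBOD and one dedicated IP per instance.
all_disks = [f"/mnt/disk{i:02d}" for i in range(12)]
ips = ["10.0.0.11", "10.0.0.12"]

# Six disks per instance: 5 data + 1 commit log.
instances = [
    instance_settings(i, ip, all_disks[i * 6:(i + 1) * 6])
    for i, ip in enumerate(ips)
]

for settings in instances:
    for key, value in settings.items():
        print(f"{key}: {value}")
    print()
```

 Giving each instance its own IP lets both bind to the default ports, which
 matches the container approach rather than the change-the-ports approach.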




-- 
http://khangaonkar.blogspot.com/