Re: New application - separate column family or separate cluster?

2014-07-09 Thread Jeremy Jongsma
Thanks Tupshin. I think #2 is the way to go in my case, and I always
have the option of migrating column families to a new cluster if needed.

Parag, at the traffic volumes I'm talking about, #2 (and especially #3)
will have a lot more total VM nodes, because the other apps are used
lightly enough that there is no reason to add capacity specifically for
them to an already large cluster. But app-specific clusters would need at
least 3 nodes each (for redundancy) when the actual traffic load would
require less than one node's capacity, hence the increased node costs.
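
To make the node-count arithmetic concrete, here is a rough sketch of how the
3-node redundancy floor drives up the total for options #2 and #3. The load
figures (one heavy app needing about 12 nodes, four light apps needing a
fraction of a node each) and the function names are hypothetical and chosen
only for illustration; they are not numbers from the actual deployment.

```python
import math

MIN_NODES = 3  # redundancy floor per cluster, as described above


def nodes_needed(load, min_nodes=MIN_NODES):
    """Nodes for one cluster: enough for the load, never below the floor.

    `load` is measured in hypothetical "node-equivalents" of traffic.
    """
    return max(min_nodes, math.ceil(load))


def total_nodes(clusters):
    """Total nodes across clusters; each cluster is a list of app loads."""
    return sum(nodes_needed(sum(apps)) for apps in clusters)


# Hypothetical loads: one heavy app and four lightly used apps.
heavy = 12.0
light = [0.2, 0.1, 0.3, 0.2]

option_1 = total_nodes([[heavy] + light])                  # one shared cluster
option_2 = total_nodes([[heavy], light])                   # heavy + everything else
option_3 = total_nodes([[heavy]] + [[app] for app in light])  # one cluster per app

print(option_1, option_2, option_3)  # prints: 13 15 24
```

Under these assumed loads, the light apps add one node to a shared cluster,
three nodes as a single "everything else" cluster, and twelve nodes when each
gets its own cluster.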




RE: New application - separate column family or separate cluster?

2014-07-09 Thread Parag Patel
In your scenario #1, is the total number of nodes staying the same? Meaning,
if you launch multiple clusters for #2, you’d have N total nodes – are we
assuming #1 has N or fewer than N?

If #1 and #2 both have N, wouldn’t the performance be the same, since
Cassandra’s performance scales linearly with the number of nodes?



Re: New application - separate column family or separate cluster?

2014-07-08 Thread Tupshin Harper
I've seen a lot of deployments, and I think you captured the scenarios and
reasoning quite well. You can apply other nuances and details to #2 (e.g.
segment based on SLA or topology), but I agree with all of your reasoning.

-Tupshin
-Global Field Strategy
-Datastax
On Jul 8, 2014 10:54 AM, "Jeremy Jongsma" wrote:

> Do you prefer purpose-specific Cassandra clusters that support a single
> application's data set, or a single Cassandra cluster that contains column
> families for many applications? I realize there is no ideal answer for
> every situation, but what have your experiences been in this area for
> cluster planning?
>
> My reason for asking is that we have one application with high data volume
> (multiple TB, thousands of writes/sec) that caused us to adopt Cassandra in
> the first place. Now we have the tools and cluster management
> infrastructure built up to the point where it is not a major investment to
> store smaller sets of data for other applications in C* also, and I am
> debating whether to:
>
> 1) Store everything in one large cluster (no isolation, low cost)
> 2) Use one cluster for the high-volume data, and one for everything else
> (good isolation, medium cost)
> 3) Give every major service its own cluster, even if they have small
> amounts of data (best isolation, highest cost)
>
> I suspect #2 is the way to go for balancing hosting costs and
> application performance isolation. Are there any pros or cons I am missing?
>
> -j
>