Cassandra hardware - balancing CPU/memory/iops/disk space

2010-03-05 Thread Rosenberry, Eric
I am looking for advice from others who are further along in deploying Cassandra in production environments than we are. I want to know what you are finding your bottlenecks to be. I would feel silly purchasing dual-processor quad-core 2.93 GHz Nehalem machines with 192 GB of RAM just to fin

Re: Unreliable transport layer

2010-03-05 Thread Jonathan Ellis
In 0.6 gossip is over TCP. On Fri, Mar 5, 2010 at 6:54 PM, Ashwin Jayaprakash wrote: > Hey guys! I have a simple question. I'm a casual observer, not a real > Cassandra user yet. So, excuse my ignorance. > > I see that the Gossip feature uses UDP. I was curious to know if you guys > faced issues

Unreliable transport layer

2010-03-05 Thread Ashwin Jayaprakash
Hey guys! I have a simple question. I'm a casual observer, not a real Cassandra user yet. So, excuse my ignorance. I see that the Gossip feature uses UDP. I was curious to know if you guys faced issues with unreliable transports in your production clusters? Like faulty switches, dropped packets et

Re: ConcurrentModificationException

2010-03-05 Thread Jonathan Ellis
Fixed, thanks. On Fri, Mar 5, 2010 at 11:12 AM, B. Todd Burruss wrote: > https://issues.apache.org/jira/browse/CASSANDRA-853 > > On Thu, 2010-03-04 at 19:00 -0800, Jonathan Ellis wrote: > > This is the 0.6 beta yes? Looks like a regression, please open a ticket. > > On Thu, Mar 4, 2010 at 8:54 P

Re: Anti-compaction Diskspace issue even when latest patch applied

2010-03-05 Thread shiv shivaji
Ah, will look at the jmx console. Thought it was under nodetool. cont...@cl201 ~/swell/cassandra $ iostat -x Linux 2.6.30-gentoo-r4pb (cl201) 03/05/10 _x86_64_ (8 CPU) avg-cpu: %user %nice %system %iowait %steal %idle 9.66 0.00 2.18 4.98 0.00 83.18 Devic

Re: Dynamically Switching from Ordered Partitioner to Random?

2010-03-05 Thread shiv shivaji
Point taken. Was thinking of switching in parallel using a 2nd cassandra instance (perhaps on the same set of machines). This way if loadbalancing is too slow, I can try this version. From: Stu Hood To: cassandra-user@incubator.apache.org Sent: Fri, March 5,

Re: Anti-compaction Diskspace issue even when latest patch applied

2010-03-05 Thread Jonathan Ellis
On Fri, Mar 5, 2010 at 1:36 PM, shiv shivaji wrote: > Sorry, how to get compaction progress with 0.6. Is it in nodetool or > somewhere else? I tried a few options after nodetool and did not get this > info. it's under CompactionManager in jmx. I'm not sure if nodetool exposes this but it's easy

Re: Dynamically Switching from Ordered Partitioner to Random?

2010-03-05 Thread Stu Hood
But rather than switching, you should definitely try the 'loadbalance' approach first, and see whether OrderPP works out for you. -Original Message- From: "Chris Goffinet" Sent: Friday, March 5, 2010 1:43pm To: cassandra-user@incubator.apache.org Subject: Re: Dynamically Switching from O

Re: Dynamically Switching from Ordered Partitioner to Random?

2010-03-05 Thread Chris Goffinet
At this time, you have to re-import the data. -Chris On Fri, Mar 5, 2010 at 11:42 AM, shiv shivaji wrote: > I started with the ordered partitioner as I was hoping to make use of the > map-reduce functionality. However, my data was likely lopped onto 2 key > machines with most of it on one (as s

Dynamically Switching from Ordered Partitioner to Random?

2010-03-05 Thread shiv shivaji
I started with the ordered partitioner as I was hoping to make use of the map-reduce functionality. However, my data was likely lopped onto 2 key machines with most of it on one (as seen from another thread. There were also machine failures to blame for the uneven distribution). One solution whi

Re: Anti-compaction Diskspace issue even when latest patch applied

2010-03-05 Thread shiv shivaji
Sorry, how to get compaction progress with 0.6. Is it in nodetool or somewhere else? I tried a few options after nodetool and did not get this info. My vmstats are procs ---memory-- ---swap-- -io -system-- cpu r b swpd free buff cache si so bi b

Re: ColumnFamilies vs composite rows in one table.

2010-03-05 Thread Jonathan Ellis
Generally, you want to have different types of data in different CFs so you can tune them separately (key / row caches). Mixing different row types in one CF also makes doing get_slice_range scans difficult. On Fri, Mar 5, 2010 at 12:04 PM, Erik Holstad wrote: > What are the benefits of using mu

Re: ColumnFamilies vs composite rows in one table.

2010-03-05 Thread David Strauss
On 2010-03-05 18:30, David Strauss wrote: > On 2010-03-05 18:04, Erik Holstad wrote: >> So you can either have >> ColumnFamilyFrom:userTo:{userFrom->messageid} >> ColumnFamilyTo:userFrom:{userTo->messageid} >> >> or something like >> ColumnFamily:user_to:{user1_messageId, user2_messageId} >> Column

Re: ColumnFamilies vs composite rows in one table.

2010-03-05 Thread David Strauss
On 2010-03-05 18:04, Erik Holstad wrote: > What are the benefits of using multiple ColumnFamilies compared to using > a composite row name? Just for terminology's sake, I'll note that rows have keys, not names. Only columns and supercolumns have names. I'm not the top expert here by any means, bu

ColumnFamilies vs composite rows in one table.

2010-03-05 Thread Erik Holstad
What are the benefits of using multiple ColumnFamilies compared to using a composite row name? Example: You have messages that you want to index on sent and to. So you can either have ColumnFamilyFrom:userTo:{userFrom->messageid} ColumnFamilyTo:userFrom:{userTo->messageid} or something like Colu

Re: ConcurrentModificationException

2010-03-05 Thread B. Todd Burruss
https://issues.apache.org/jira/browse/CASSANDRA-853 On Thu, 2010-03-04 at 19:00 -0800, Jonathan Ellis wrote: > This is the 0.6 beta yes? Looks like a regression, please open a ticket. > > On Thu, Mar 4, 2010 at 8:54 PM, Todd Burruss wrote: > > i'm seeing a lot of these ... any idea? > > > > 20

Re: ConcurrentModificationException

2010-03-05 Thread B. Todd Burruss
yes, 0.6 beta2 i'll open ticket On Thu, 2010-03-04 at 19:00 -0800, Jonathan Ellis wrote: > This is the 0.6 beta yes? Looks like a regression, please open a ticket. > > On Thu, Mar 4, 2010 at 8:54 PM, Todd Burruss wrote: > > i'm seeing a lot of these ... any idea? > > > > 2010-03-04 18:53:21,4

Re: Anti-compaction Diskspace issue even when latest patch applied

2010-03-05 Thread Jonathan Ellis
On Fri, Mar 5, 2010 at 2:13 AM, shiv shivaji wrote: > 1. Is there a way to estimate the time it would take to compact this work > load? I hope the load balancing will be much faster after the compaction. > Curious how fast I can get the transfer once compaction is done. 0.6 gives you compaction p

Re: Questions while evaluating Cassandra

2010-03-05 Thread Eran Kutner
Thank you Jonathan! On Fri, Mar 5, 2010 at 00:03, Jonathan Ellis wrote: > On Thu, Mar 4, 2010 at 2:51 AM, Eran Kutner wrote: >> On Tue, Mar 2, 2010 at 15:44, Jonathan Ellis wrote: >>> >>> On Tue, Mar 2, 2010 at 6:43 AM, Eran Kutner wrote: >>> > Is the procedure described in the description of

Re: Anti-compaction Diskspace issue even when latest patch applied

2010-03-05 Thread shiv shivaji
Thanks for the pointer. Wanted to figure out if this is the real bottleneck as there might be something else contributing to the low speed. Let me explain our setup in more detail: We are using cassandra to store about 700 million images. This includes image metadata and the image (in binary for