High latency on 5 node Cassandra Cluster

2014-06-04 Thread Arup Chakrabarti
Hello. We had some major latency problems yesterday with our 5-node Cassandra cluster. I wanted to get some feedback on where we could start looking to figure out what was causing the issue. If there is more info I should provide, please let me know. Here are the basics of the cluster: Clients:

memtable mem usage off by 10?

2014-06-04 Thread Idrén, Johan
Hi, I'm seeing some strange behavior from the memtables, both in 1.2.13 and 2.0.7: basically they look like they're using a tenth of the memory they should based on the documentation and options. 10GB heap for both clusters. 1.2.x should use 1/3 of the heap for memtables, but it uses at most ~300MB
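
For reference, the ceilings under discussion live in cassandra.yaml; a minimal sketch with illustrative values (not the poster's config):

    # cassandra.yaml (1.2.x); when left blank, memtable space defaults
    # to 1/3 of the heap
    memtable_total_space_in_mb: 3413
    # Emergency valve: flush the largest memtables when heap usage
    # exceeds this fraction
    flush_largest_memtables_at: 0.75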

Re: memtable mem usage off by 10?

2014-06-04 Thread Benedict Elliott Smith
If you are storing small values in your columns, the object overhead is very substantial, so what is 400MB on disk may well be 4GB in memtables. If you are measuring the memtable size by the resulting sstable size, you are not getting an accurate picture. This overhead has been reduced by about

RE: memtable mem usage off by 10?

2014-06-04 Thread Idrén, Johan
I'm not measuring memtable size by looking at the sstables on disk, no. I'm looking through the JMX data. So I would believe (or hope) that I'm getting relevant data. If I have a heap of 10GB and set the memtable usage to 20GB, I would expect to hit other problems, but I'm not seeing memory

Re: memtable mem usage off by 10?

2014-06-04 Thread Benedict Elliott Smith
These measurements tell you the amount of user data stored in the memtables, not the amount of heap used to store it, so the same applies.
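
One way to watch the figure in question from the command line (it reports the user data size, not the heap consumed to hold it; field names vary slightly by version):

    $ nodetool cfstats | grep -i memtable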

Re: High latency on 5 node Cassandra Cluster

2014-06-04 Thread Laing, Michael
I would first check to see if there was a time synchronization issue among nodes that triggered and/or perpetuated the event. ml
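
A minimal sketch of that check, assuming ntpd is running; hostnames are placeholders:

    # Peer offsets in milliseconds; large or drifting offsets suggest skew
    $ ntpq -p
    # Rough cross-node comparison of wall-clock time
    $ for h in node1 node2 node3 node4 node5; do ssh "$h" date +%s; done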

RE: memtable mem usage off by 10?

2014-06-04 Thread Idrén, Johan
Aha, ok. Thanks. Trying to understand what my cluster is doing: cassandra.db.memtable_data_size only gets me the actual data but not the memtable heap memory usage. Is there a way to check for heap memory usage? I would expect to hit the flush_largest_memtables_at value, and this would be

Re: memtable mem usage off by 10?

2014-06-04 Thread Benedict Elliott Smith
Unfortunately it looks like the heap utilisation of memtables was not exposed in earlier versions, because they only maintained an estimate. The overhead scales linearly with the amount of data in your memtables (assuming the size of each cell is approx. constant). flush_largest_memtables_at is

RE: memtable mem usage off by 10?

2014-06-04 Thread Idrén, Johan
Ok, so the overhead is a constant modifier, right. The 3x I arrived at with the following assumptions:
- heap is 10GB
- default memory for memtable usage is 1/4 of heap in C* 2.0
- max memory used for memtables is 2.5GB (10/4)
- flush_largest_memtables_at is 0.75
- flush largest memtables when

Re: memtable mem usage off by 10?

2014-06-04 Thread Benedict Elliott Smith
I'm confused: there is no flush_largest_memtables_at property in C* 2.0?

RE: memtable mem usage off by 10?

2014-06-04 Thread Idrén, Johan
Oh, well ok that explains why I'm not seeing a flush at 750MB. Sorry, I was going by the documentation. It claims that the property is around in 2.0. If we skip that, part of my reply still makes sense: Having memtable_total_size_in_mb set to 20480, memtables are flushed at a reported value

Re: Multi-DC Environment Question

2014-06-04 Thread Vasileios Vlachos
Hello Matt, nodetool status:

Datacenter: MAN
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address     Load      Owns (effective)  Host ID                               Token                 Rack
UN  10.2.1.103  89.34 KB  99.2%             b7f8bc93-bf39-475c-a251-8fbe2c7f7239  -9211685935328163899  RAC1
UN  10.2.1.102  86.32 KB  0.7%

Re: memtable mem usage off by 10?

2014-06-04 Thread Jack Krupansky
Yeah, it is in the doc: http://www.datastax.com/documentation/cassandra/2.0/cassandra/configuration/configCassandra_yaml_r.html And I don’t find a Jira issue mentioning it being removed, so... what’s the full story there?! -- Jack Krupansky

Re: memtable mem usage off by 10?

2014-06-04 Thread Benedict Elliott Smith
But something else is wrong, as Cassandra will crash if you supply an invalid property, implying it's not sourcing the config file you're

Re: memtable mem usage off by 10?

2014-06-04 Thread Idrén, Johan
I wasn’t supplying it, I was assuming it was using the default. It does not exist in my config file. Sorry for the confusion.

Re: memtable mem usage off by 10?

2014-06-04 Thread Jack Krupansky
And sorry that the doc confused you as well! -- Jack Krupansky

Re: memtable mem usage off by 10?

2014-06-04 Thread Benedict Elliott Smith
In that case I would assume the problem is that for some reason JAMM is failing to load, and so the liveRatio it would ordinarily calculate is defaulting to 10 - are you using the bundled cassandra launch scripts?
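
A quick way to verify the agent is configured, assuming a package-style install path and a bundled jamm jar (version varies by release):

    # cassandra-env.sh normally wires in JAMM as a -javaagent
    $ grep jamm /etc/cassandra/cassandra-env.sh
    JVM_OPTS="$JVM_OPTS -javaagent:$CASSANDRA_HOME/lib/jamm-0.2.5.jar"
    # Confirm the running JVM actually received the flag
    $ ps aux | grep '[j]amm'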

Re: migration to a new model

2014-06-04 Thread Laing, Michael
OK Marcelo, I'll work on it today. -ml On Tue, Jun 3, 2014 at 8:24 PM, Marcelo Elias Del Valle marc...@s1mbi0se.com.br wrote: Hi Michael, For sure I would be interested in this program! I am new to both Python and CQL. I started creating this copier, but was having problems with

Customized Compaction Strategy: Dev Questions

2014-06-04 Thread Redmumba
Good morning! I've asked (and seen other people ask) about the ability to drop old sstables, basically creating a FIFO-like clean-up process. Since we're using Cassandra as an auditing system, this is particularly appealing to us because it means we can maximize the amount of auditing data we

Re: Customized Compaction Strategy: Dev Questions

2014-06-04 Thread Russell Bradberry
You mean this: https://issues.apache.org/jira/browse/CASSANDRA-5228 ?

Cassandra 2.0 unbalanced ring with vnodes after adding new node

2014-06-04 Thread Владимир Рудев
Hello to everyone! Please, can someone explain where we made a mistake? We have a cluster of 4 nodes which uses vnodes (256 per node, default settings); the snitch is the default on every node: SimpleSnitch. These four nodes have been in the cluster from the beginning. In this cluster we have a keyspace with this

Re: Customized Compaction Strategy: Dev Questions

2014-06-04 Thread Redmumba
Not quite; if I'm at say 90% disk usage, I'd like to drop the oldest sstable rather than simply run out of space. The problem with using TTLs is that I have to try and guess how much data is being put in--since this is auditing data, the usage can vary wildly depending on time of year, verbosity
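
For context, the TTL route being ruled out would look roughly like this (hypothetical table; picking the right window is exactly the problem described):

    INSERT INTO audit (item_id, event_time, details)
    VALUES ('item42', '2014-06-04 12:00:00', 'login')
    USING TTL 7776000;  -- 90 days in seconds, a guess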

Re: Customized Compaction Strategy: Dev Questions

2014-06-04 Thread Russell Bradberry
hmm, I see. So something similar to Capped Collections in MongoDB.

Re: Customized Compaction Strategy: Dev Questions

2014-06-04 Thread Redmumba
Thanks, Russell--yes, a similar concept, just applied to sstables. I'm assuming this would require changes to both major compactions, and probably GC (to remove the old tables), but since I'm not super-familiar with the C* internals, I wanted to make sure it was feasible with the current toolset

Re: Customized Compaction Strategy: Dev Questions

2014-06-04 Thread Russell Bradberry
I’m not sure what you want to do is feasible. At a high level I can see you running into issues with RF etc. The SSTables node to node are not identical, so if you drop a full SSTable on one node there is no corresponding SSTable on the adjacent nodes to drop. You would need to choose

Linux containers, docker, SSD, and RAID.

2014-06-04 Thread Kevin Burton
Hey guys, a question about using containers with Cassandra. I think we will eventually deploy on containers… lxc with docker probably. Our first config will have one Cassandra daemon per box. Of course there are issues here. A larger per-VM heap means more GC time and potential stop-the-world and

Re: Too Many Open Files (sockets) - VNodes - Map/Reduce Job

2014-06-04 Thread Michael Shuler
(this is probably a better question for the user list - cc/reply-to set) Allow more files to be open :) http://www.datastax.com/documentation/cassandra/1.2/cassandra/install/installRecommendSettings.html -- Kind regards, Michael
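
The limits that page recommends are roughly the following (values as in the DataStax docs of the era; substitute the Cassandra user for * as appropriate):

    # /etc/security/limits.conf
    * - memlock unlimited
    * - nofile  100000
    * - nproc   32768
    * - as      unlimited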

Re: High latency on 5 node Cassandra Cluster

2014-06-04 Thread Nate McCall
That is a pretty old version of Cassandra at this point. If you are using counters anywhere, you are probably seeing https://issues.apache.org/jira/browse/CASSANDRA-4578, which only shows up after you hit some arbitrary traffic threshold. If you don't want to upgrade (you really should), there

Re: Customized Compaction Strategy: Dev Questions

2014-06-04 Thread Redmumba
Let's say I run a major compaction every day, so that the oldest sstable contains only the data for January 1st. Assuming all the nodes are in-sync and have had at least one repair run before the table is dropped (so that all information for that time period is the same), wouldn't it be safe to

Re: Customized Compaction Strategy: Dev Questions

2014-06-04 Thread Russell Bradberry
Maybe I’m misunderstanding something, but what makes you think that running a major compaction every day will cause the data from January 1st to exist in only one SSTable and not have data from other days in the SSTable as well? Are you talking about making a new compaction strategy that

Re: Customized Compaction Strategy: Dev Questions

2014-06-04 Thread Redmumba
Sorry, yes, that is what I was looking to do--i.e., create a TopologicalCompactionStrategy or similar.

Re: Customized Compaction Strategy: Dev Questions

2014-06-04 Thread Jonathan Haddad
I'd suggest creating 1 table per day, and dropping the tables you don't need once you're done.
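
A minimal sketch of the table-per-day approach, with hypothetical schema names:

    -- One table per day; dropping a whole table reclaims its space without
    -- waiting on gc_grace or compaction (snapshots still need clearing)
    CREATE TABLE audit_2014_01_01 (
        item_id    text,
        event_time timestamp,
        details    text,
        PRIMARY KEY (item_id, event_time)
    );
    -- Once the retention window passes:
    DROP TABLE audit_2014_01_01;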

Re: Customized Compaction Strategy: Dev Questions

2014-06-04 Thread Redmumba
That still involves quite a bit of infrastructure work--it also means that to query the data, I would have to make N queries, one per table, to query for audit information (audit information is sorted by a key identifying the item, and then the date). I don't think this would yield any benefit

Re: Customized Compaction Strategy: Dev Questions

2014-06-04 Thread Russell Bradberry
Well, DELETE will not free up disk space until after GC grace has passed and the next major compaction has run. So in essence, if you need to free up space right away, then creating daily/monthly tables would be one way to go. Just remember to clear your snapshots after dropping, though.
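
DROP TABLE auto-snapshots the table by default (auto_snapshot in cassandra.yaml), so the space only returns once the snapshot is cleared; a sketch with a hypothetical keyspace name:

    # Remove all snapshots for the keyspace on this node
    $ nodetool clearsnapshot audit_ks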

Re: High latency on 5 node Cassandra Cluster

2014-06-04 Thread Robert Coli
On Wed, Jun 4, 2014 at 12:12 AM, Arup Chakrabarti a...@pagerduty.com wrote: Size: 5 nodes (2 in AWS US-West-1, 2 in AWS US-West-2, 1 in Linode Fremont) Replication Factor: 5 You're operating with a single-DC strategy across multiple data centers? If so, I'm surprised you get sane latency
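
For contrast, a multi-DC keyspace would normally pin replica counts per data center with NetworkTopologyStrategy; a sketch where the keyspace and DC names are hypothetical and must match what a DC-aware snitch reports:

    CREATE KEYSPACE myks WITH replication = {
        'class': 'NetworkTopologyStrategy',
        'us-west-1': 2,
        'us-west-2': 2,
        'fremont': 1
    };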

Re: New node Unable to gossip with any seeds

2014-06-04 Thread Chris Burroughs
This generally means that the seed node's address, as you describe it on the new node, doesn't match exactly how it's listed in the second node's seeds list. CASSANDRA-6523 has some links that might be helpful. On 05/26/2014 12:07 AM, Tim Dunphy wrote: Hello, I am trying to spin up a new node
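
For comparison, the seed list lives in cassandra.yaml, and the new node must reference a seed by exactly the address the seed itself uses; a sketch with illustrative IPs:

    seed_provider:
        - class_name: org.apache.cassandra.locator.SimpleSeedProvider
          parameters:
              - seeds: "10.0.0.1,10.0.0.2"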

Re: alternative vnode upgrade strategy?

2014-06-04 Thread Chris Burroughs
On 05/28/2014 02:18 PM, William Oberman wrote:
1.) Upgrade all N nodes to vnodes in place
Start loop
2.) Boot a new node and let it bootstrap
3.) Decommission an old node
End loop
It's been a while since I had to think about the vnode migration, but I think this would fall prey to

Re: Number of rows under one partition key

2014-06-04 Thread Robert Coli
On Wed, Jun 4, 2014 at 12:39 PM, Chris Burroughs chris.burrou...@gmail.com wrote: https://engineering.eventbrite.com/what-version-of-cassandra-should-i-run/ Although by the simplistic version-count heuristic, the sheer quantity of releases in the 2.0.x line would now satisfy the constraint.

Snapshot the data with 3 node and replicationfactor=3

2014-06-04 Thread ng
Is there any reason to take a snapshot of a column family on each node when the cluster consists of 3 nodes and the keyspace has replication factor 3? I am thinking of taking a snapshot of the CF on only one node. For the restore, I will follow the steps below:
1. Drop and recreate the CF on node1
2. Copy
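
A sketch of the single-node snapshot step, with hypothetical keyspace and tag names:

    # Named snapshot of one keyspace on a single node
    $ nodetool snapshot -t nightly my_ks
    # Snapshots land under the data directory, e.g.
    #   <data_dir>/my_ks/<cf>/snapshots/nightly/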

Re: problem removing dead node from ring

2014-06-04 Thread Robert Coli
On Tue, Jun 3, 2014 at 9:03 PM, Matthew Allen matthew.j.al...@gmail.com wrote: Thanks Robert, this makes perfect sense. Do you know if CASSANDRA-6961 will be ported to 1.2.x? I just asked driftx, he said not gonna happen. And apologies if these appear to be dumb questions, but is a

Re: Snapshot the data with 3 node and replicationfactor=3

2014-06-04 Thread Robert Coli
Unless all read/write occurs with CL.ALL (which is an availability

Re: Snapshot the data with 3 node and replicationfactor=3

2014-06-04 Thread ng
I am not worried about eventually consistent data. I just wanted to get rough data in close approximation. ng

nodetool move seems slow

2014-06-04 Thread Jason Tyler
Hello, We have a 5-node cluster running Cassandra 1.2.16, with a significant amount of data:

Address        Rack  Status  State  Load  Owns  Token
                                                6783174585269344219
10.198.xx.xx1

Re: Consolidating records and TTL

2014-06-04 Thread Tyler Hobbs
Just use an atomic batch that holds both the insert and deletes: http://www.datastax.com/dev/blog/atomic-batches-in-cassandra-1-2 On Tue, Jun 3, 2014 at 2:13 PM, Charlie Mason charlie@gmail.com wrote: Hi All. I have a system that's going to make possibly several concurrent changes to a
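
A minimal sketch of such a batch against a hypothetical table; a logged (atomic) batch guarantees that either all statements eventually apply or none do:

    BEGIN BATCH
        INSERT INTO my_ks.records (id, version, body) VALUES (1, 2, 'merged');
        DELETE FROM my_ks.records WHERE id = 1 AND version = 1;
    APPLY BATCH;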

Re: nodetool move seems slow

2014-06-04 Thread Robert Coli
On Wed, Jun 4, 2014 at 2:34 PM, Jason Tyler jaty...@yahoo-inc.com wrote: I wrote 'apparent progress' because it reports “MOVING” and the Pending Commands/Responses are changing over time. However, I haven’t seen the individual .db files’ progress go above 0%. Your move is hung. Restart the
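
A sketch of how to confirm whether the move is actually streaming, using standard nodetool commands; the token is a placeholder:

    # Streaming status; a move whose files sit at 0% for a long time is
    # likely hung
    $ nodetool netstats
    # After restarting the node, re-issue the move
    $ nodetool move <new_token>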

Re: migration to a new model

2014-06-04 Thread Laing, Michael
BTW you might want to put a LIMIT clause on your SELECT for testing. -ml On Wed, Jun 4, 2014 at 6:04 PM, Laing, Michael michael.la...@nytimes.com wrote: Marcelo, Here is a link to the preview of the python fast copy program: https://gist.github.com/michaelplaing/37d89c8f5f09ae779e47 It