Re: why do I have to use internal IP for EC2 nodes?
@Yang: Sounds right; internal is not the same as external. But beware: internal traffic is only free within the same availability zone. Traffic between zones in the same region is charged a small amount (~$0.01).

With kind regards,
Robin Verlangen
Software engineer
W http://www.robinverlangen.nl
E ro...@us2.nl

2012/9/5 Yang tedd...@gmail.com

Thanks, but if the communication between cluster nodes all resolves internal-to-internal, Amazon will not charge the traffic as external traffic, right?

On Tue, Sep 4, 2012 at 7:08 PM, aaron morton aa...@thelastpickle.com wrote:

See http://aws.amazon.com/articles/1145?_encoding=UTF8jiveRedirect=1#12

The external DNS will resolve to the internal IP when resolved internally. Using the internal IP means you are not charged for IO, and it makes it clear you do not expect this service to be accessed from outside.

Cheers
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 5/09/2012, at 7:37 AM, Yang tedd...@gmail.com wrote:

http://www.datastax.com/docs/1.1/initialize/cluster_init says: "Note: In the - seeds list property, include the internal IP addresses of each seed node."

Why do I have to use the internal IP? On an EC2 node, hostname resolution seems to directly give its internal IP:

$ host aws1devbic1.biqa.ctgrd.com
aws1devbic1.biqa.ctgrd.com is an alias for ec2-50-17-3-229.compute-1.amazonaws.com.
ec2-50-17-3-229.compute-1.amazonaws.com has address 10.28.166.83

So using the public DNS or the internal IP seems to be the same thing, or is there something I'm missing?

Thanks
Yang
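The distinction Aaron describes can be checked mechanically: from inside EC2 the public hostname resolves to a 10.x address, which sits in an RFC 1918 private range, while external resolution returns the public address. A minimal sketch (the two addresses are taken from the thread; the helper function is ours, not part of any AWS tooling):

```python
# Sketch: classify an address as EC2-internal (RFC 1918 private) or public.
# Intra-AZ traffic to internal addresses is free; traffic sent to the
# public address is billed as external I/O.
import ipaddress

def is_internal(addr: str) -> bool:
    """True if addr falls in a private range such as EC2's 10.0.0.0/8."""
    return ipaddress.ip_address(addr).is_private

print(is_internal("10.28.166.83"))  # address seen when resolving inside EC2 -> True
print(is_internal("50.17.3.229"))   # the public-facing address -> False
```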
Re: java.lang.NoClassDefFoundError when trying to do anything on one CF on one node
Thanks for the help Aaron. I've checked NodeIdInfo and LocationInfo as below. What am I looking at? I'm guessing the first row in NodeIdInfo represents the ring with the node ids, but the second row perhaps dead nodes with old schemas? That's a total guess; I'd be very interested to know what it and the LocationInfo are. If there's anything else you'd like me to check let me know, otherwise I'll attempt your workaround later today.

[default@system] list NodeIdInfo ;
Using default limit of 100
---
RowKey: 4c6f63616c
=> (column=b10552c0-ea0f-11e0--cb1f02ccbcff, value=0a1020d2, timestamp=1317241393645)
=> (column=e64fc8f0-595b-11e1--51be601cd0d7, value=0a1020d2, timestamp=1329478703871)
=> (column=732d4690-a596-11e1--51be601cd09f, value=0a1020d2, timestamp=1337860139385)
=> (column=bffd9d40-aa45-11e1--51be601cd0fe, value=0a1020d2, timestamp=1338375234836)
=> (column=01efa5d0-e133-11e1--51be601cd0ff, value=0a1020d2, timestamp=1344414498989)
=> (column=92109b80-ea0a-11e1--51be601cd0af, value=0a1020d2, timestamp=1345386691897)
---
RowKey: 43757272656e744c6f63616c
=> (column=01efa5d0-e133-11e1--51be601cd0ff, value=0a1020d2, timestamp=1344414498989)
=> (column=92109b80-ea0a-11e1--51be601cd0af, value=0a1020d2, timestamp=1345386691897)

2 Rows Returned.
Elapsed time: 128 msec(s).

[default@system] list LocationInfo ;
Using default limit of 100
---
RowKey: 52696e67
=> (column=00, value=0a1080d2, timestamp=134104900)
=> (column=04a7128b6c83505dcd618720f92028f4, value=0a1020b7, timestamp=1332360971660)
=> (column=09249249249249249249249249249249, value=0a1080cd, timestamp=1341136002862)
=> (column=12492492492492492492492492492492, value=0a1020d3, timestamp=1341135999465)
=> (column=1500, value=0a1060d3, timestamp=134104671)
=> (column=1555, value=0a1020d3, timestamp=1344530188382)
=> (column=1b6db6db6db6db6db6db6db6db6db6db, value=0a1020b1, timestamp=1341135997643)
=> (column=1c71c71c71c71bff, value=0a1080d2, timestamp=1317241889689)
=> (column=24924924924924924924924924924924, value=0a1060d3, timestamp=1341135996555)
=> (column=29ff, value=0a1020d3, timestamp=1317241534292)
=> (column=2aaa, value=0a1060d3, timestamp=1344530187539)
=> (column=38e38e38e38e37ff, value=0a1060d3, timestamp=1317241257569)
=> (column=38e38e38e38e38e38e38e38e38e38e38, value=0a1060d3, timestamp=1343136501647)
=> (column=393170e0207a17d8519f0c1bfe325d51, value=0a1020d3, timestamp=1345381375120)
=> (column=3fff, value=0a1080d3, timestamp=134104939)
=> (column=471c71c71c71c71c71c71c71c71c71c6, value=0a1080d3, timestamp=1343133153701)
=> (column=471c71c71c71c7ff, value=0a1080d3, timestamp=1317241786636)
=> (column=49249249249249249249249249249249, value=0a1080d3, timestamp=1341136002693)
=> (column=52492492492492492492492492492492, value=0a106010, timestamp=1341136002626)
=> (column=53ff, value=0a1020d4, timestamp=1328473688357)
=> (column=5554, value=0a1060d4, timestamp=134104910)
=> (column=5b6db6db6db6db6db6db6db6db6db6da, value=0a1060d4, timestamp=1332389784945)
=> (column=5b6db6db6db6db6db6db6db6db6db6db, value=0a1060d4, timestamp=1341136001027)
=> (column=638e38e38e38e38e38e38e38e38e38e2, value=0a1060d4, timestamp=1343125208462)
=> (column=638e38e38e38e3ff, value=0a1060d4, timestamp=1317241257577)
=> (column=6c00, value=0a1020d3, timestamp=134104789)
---
RowKey: 4c
=> (column=436c75737465724e616d65, value=4d6f6e737465724d696e642050726f6420436c7573746572, timestamp=1317241251097000)
=> (column=47656e65726174696f6e, value=50447e78, timestamp=134104152000)
=> (column=50617274696f6e6572, value=6f72672e6170616368652e63617373616e6472612e6468742e52616e646f6d506172746974696f6e6572, timestamp=1317241251097000)
=> (column=546f6b656e, value=2a00, timestamp=134104214)
---
RowKey: 436f6f6b696573
=> (column=48696e7473207075726765642061732070617274206f6620757067726164696e672066726f6d20302e362e7820746f20302e37, value=6f68207965732c20697420746865792077657265207075726765642e, timestamp=1317241251249)
=> (column=5072652d312e302068696e747320707572676564, value=6f68207965732c2074686579207765726520707572676564, timestamp=1326274339337)
---
RowKey: 426f6f747374726170
=> (column=42, value=01, timestamp=134104213)

4 Rows Returned.
Elapsed time: 34 msec(s).

On Wed, Sep 5, 2012 at 2:42 AM, aaron morton aa...@thelastpickle.com wrote:

Hmmm, this looks like an error in the ctor for NodeId$LocalNodeIdHistory. Are there any other ERROR log messages? Do you see either of these two messages in the log: "No saved local node id, using newly generated: {}" or "Saved local node id: {}"? Can you use cassandra-cli / cqlsh to print the contents of the
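As an aside, the row keys (and many of the values) in these system-keyspace dumps are hex-encoded ASCII, so a throwaway decode makes them readable: the two NodeIdInfo rows are "Local" and "CurrentLocal", and the LocationInfo rows are "Ring", "L", "Cookies", and "Bootstrap" (with "436c75737465724e616d65" decoding to "ClusterName"). A quick sketch, standard Python only, no Cassandra required:

```python
# Decode cassandra-cli hex row keys / values into ASCII.
def decode_hex(s: str) -> str:
    return bytes.fromhex(s).decode("ascii")

for key in ("4c6f63616c", "43757272656e744c6f63616c", "52696e67",
            "436f6f6b696573", "426f6f747374726170"):
    print(key, "->", decode_hex(key))
# 4c6f63616c               -> Local
# 43757272656e744c6f63616c -> CurrentLocal
# 52696e67                 -> Ring
# 436f6f6b696573           -> Cookies
# 426f6f747374726170       -> Bootstrap
```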
Re: Practical node size limits
You can try increasing the streaming throttle.

2012/9/4 Dustin Wenz dustinw...@ebureau.com

I'm following up on this issue, which I've been monitoring for the last several weeks. I thought people might find my observations interesting.

Ever since increasing the heap size to 64GB, we've had no OOM conditions that resulted in a JVM termination. Our nodes have around 2.5TB of data each, and the replication factor is four. IO on the cluster seems to be fine, though I haven't been paying particular attention to any GC hangs.

The bottleneck now seems to be the repair time. If any node becomes too inconsistent, or needs to be replaced, the rebuild time is over a week. That issue alone makes this cluster configuration unsuitable for production use.

- .Dustin

On Jul 30, 2012, at 2:04 PM, Dustin Wenz dustinw...@ebureau.com wrote:

Thanks for the pointer! It sounds likely that's what I'm seeing. CFStats reports that the bloom filter size is currently several gigabytes. Is there any way to estimate how much heap space a repair would require? Is it a function of simply adding up the filter file sizes, plus some fraction of neighboring nodes?

I'm still curious about the largest heap sizes that people are running with on their deployments. I'm considering increasing ours to 64GB (with 96GB physical memory) to see where that gets us. Would it be necessary to keep the young-gen size small to avoid long GC pauses? I also suspect that I may need to keep my memtable sizes small to avoid long flushes; maybe in the 1-2GB range.

- .Dustin

On Jul 29, 2012, at 10:45 PM, Edward Capriolo edlinuxg...@gmail.com wrote:

Yikes. You should read: http://wiki.apache.org/cassandra/LargeDataSetConsiderations

Essentially what it sounds like you are now running into is this: the BloomFilters for each SSTable must exist in main memory. Repair tends to create some extra data which normally gets compacted away later. Your best bet is to temporarily raise the Xmx heap and adjust the index sampling size, if you need to save the data (if it is just test data you may want to give up and start fresh).

Generally the issue with large disk configurations is that it is hard to keep a good RAM/disk ratio; most reads then turn into disk seeks and the throughput is low. I get the vibe people believe large stripes are going to help Cassandra. The issue is that stripes generally only increase sequential throughput, but Cassandra is a random-read system. How much RAM/disk you need is case dependent, but a 1/5 ratio of RAM to disk is where I think most people want to be, unless their system is carrying SSD disks. Again, you have to keep your bloom filters in Java heap memory, so any design that tries to create a quadrillion small rows is going to have memory issues as well.

On Sun, Jul 29, 2012 at 10:40 PM, Dustin Wenz dustinw...@ebureau.com wrote:

I'm trying to determine if there are any practical limits on the amount of data that a single node can handle efficiently, and if so, whether I've hit that limit or not.

We've just set up a new 7-node cluster with Cassandra 1.1.2 running under OpenJDK6. Each node is a 12-core Xeon with 24GB of RAM and is connected to a stripe of 10 3TB disk mirrors (a total of 20 spindles each) via dual SATA-3 interconnects. I can read and write around 900MB/s sequentially on the arrays.

I started out with Cassandra tuned with all-default values, with the exception of the compaction throughput, which was increased from 16MB/s to 100MB/s. These defaults set the heap size to 6GB. Our schema is pretty simple: only 4 column families, each with one secondary index. The replication factor was set to four, and compression disabled. Our access patterns are intended to be about equal numbers of inserts and selects, with no updates and the occasional delete.

The first thing we did was begin to load data into the cluster. We could perform about 3000 inserts per second, which stayed mostly flat.

Things started to go wrong around the time the nodes exceeded 800GB. Cassandra began to generate a lot of "mutation messages dropped" warnings, and was complaining that the heap was over 75% capacity. At that point, we stopped all activity on the cluster and attempted a repair. We did this so we could be sure that the data was fully consistent before continuing. Our mistake was probably trying to repair all of the nodes simultaneously: within an hour, Java terminated on one of the nodes with a heap out-of-memory message.

I then increased all of the heap sizes to 8GB, and reduced the heap_newsize to 800MB. All of the nodes were restarted, and there was no outside activity on the cluster. I then began a repair on a single node. Within a few hours, it OOMed again and exited. I then increased the heap to 12GB, and attempted the same thing. This time, the repair ran for about 7 hours before
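Edward's ~1/5 RAM-to-disk heuristic can be put against the hardware described in this thread. Back-of-envelope only: the figures are from the messages above, and the ratio is a rule of thumb for spinning disks, not a hard limit:

```python
# Back-of-envelope check of the ~1/5 RAM-to-disk heuristic from the thread.
ram_gb = 24                # per node, as described above
data_per_node_gb = 2500    # ~2.5 TB of data per node

print(f"actual ratio ~= 1:{data_per_node_gb / ram_gb:.0f}")
print(f"RAM for a 1:5 ratio ~= {data_per_node_gb / 5:.0f} GB")
# actual ratio ~= 1:104
# RAM for a 1:5 ratio ~= 500 GB
```

On that heuristic the cluster is roughly 20x short of the suggested RAM for its data volume, which is consistent with the read and repair pain reported above.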
Re: java.lang.NoClassDefFoundError when trying to do anything on one CF on one node
forgot to answer your first question. I see this:

INFO 14:31:31,896 No saved local node id, using newly generated: 92109b80-ea0a-11e1--51be601cd0af

On Wed, Sep 5, 2012 at 8:41 AM, Thomas van Neerijnen t...@bossastudios.com wrote:

Thanks for the help Aaron. I've checked NodeIdInfo and LocationInfo as below. What am I looking at? I'm guessing the first row in NodeIdInfo represents the ring with the node ids, but the second row perhaps dead nodes with old schemas? That's a total guess; I'd be very interested to know what it and the LocationInfo are. If there's anything else you'd like me to check let me know, otherwise I'll attempt your workaround later today.
Cannot bootstrap new nodes in 1.0.11 ring - schema issue
Hey folks,

I have a 1.0.11 ring running in production with 6 nodes. Trying to bootstrap a new node in, and I'm getting the following consistently:

INFO [main] 2012-09-05 04:24:13,317 StorageService.java (line 668) JOINING: waiting for schema information to complete

After waiting for over 30 minutes, I restarted the node to try again, and got the same thing. Tried wiping out the data dir on the new node, as well. Same result. Turned on DEBUG, and got the following:

INFO [main] 2012-09-05 03:58:55,205 StorageService.java (line 668) JOINING: waiting for schema information to complete
DEBUG [MigrationStage:1] 2012-09-05 03:59:11,440 DefinitionsUpdateVerbHandler.java (line 70) Applying UpdateColumnFamily from /10.140.128.218
DEBUG [MigrationStage:1] 2012-09-05 03:59:11,440 DefinitionsUpdateVerbHandler.java (line 80) Migration not applied Previous version mismatch. cannot apply.
DEBUG [MigrationStage:1] 2012-09-05 03:59:11,631 DefinitionsUpdateVerbHandler.java (line 70) Applying UpdateColumnFamily from /10.140.128.218
DEBUG [MigrationStage:1] 2012-09-05 03:59:11,631 DefinitionsUpdateVerbHandler.java (line 80) Migration not applied Previous version mismatch. cannot apply.

The logs continue with a bunch of failed migration errors from each node in the ring. So I'm guessing that there is a schema history problem on one of my nodes? Any clues on how I can fix this? I had considered wiping out the schema on one of my running nodes and starting it back up, but I'm worried it might not come back if it gets the same errors.

Also, as a random question: is there any way to 'merge' historical schema changes together?

Thanks, Jason
Re: configure KeyCache to use Non-Heap memory?
Hello Aaron,

Thanks a lot for the response. Raised a request: https://issues.apache.org/jira/browse/CASSANDRA-4619

Here is the nodetool dump (from one of the two nodes in the cluster):

Token: 0
Gossip active: true
Thrift active: true
Load: 147.64 GB
Generation No: 1346635362
Uptime (seconds): 182707
Heap Memory (MB): 4884.33 / 8032.00
Data Center: datacenter1
Rack: rack1
Exceptions: 0
Key Cache: size 777651120 (bytes), capacity 777651120 (bytes), 44354999 hits, 98275175 requests, 0.451 recent hit rate, 14400 save period in seconds
Row Cache: size 0 (bytes), capacity 0 (bytes), 0 hits, 0 requests, NaN recent hit rate, 0 save period in seconds

Number of rows in the 2-node cluster is 74+ million.

Regards, Ananth

From: aaron morton aa...@thelastpickle.com
Reply-To: user@cassandra.apache.org
Date: Wednesday, September 5, 2012 11:33 AM
To: user@cassandra.apache.org
Subject: Re: configure KeyCache to use Non-Heap memory?

Is there any way I can configure KeyCache to use Non-Heap memory?

No. You could add a feature request here: https://issues.apache.org/jira/browse/CASSANDRA

Could you post some stats on the current key cache size and hit rate? (from nodetool info) It would be interesting to know how many keys it contains vs the number of rows on the box, and the hit rate.

Cheers
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 4/09/2012, at 3:01 PM, Ananth Gundabattula agundabatt...@threatmetrix.com wrote:

Is there any way I can configure KeyCache to use Non-Heap memory? We have large-memory nodes (~96GB per node) and are effectively using only 8GB configured for heap (to avoid GC issues because of a large heap). We have two constraints:
1. Row cache models don't reflect our data query patterns, hence we can only optimize on the key cache.
2. We are time-constrained to change our schema to be more NoSQL-specific.

Regards, Ananth
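For reference, the hit rate Aaron asks about is just hits divided by requests. Computing the lifetime ratio from the counters in the nodetool info output above (in this dump it happens to match the "recent hit rate" nodetool reports, though "recent" is strictly a windowed figure):

```python
# Key cache hit rate from the nodetool info counters quoted above.
hits = 44_354_999
requests = 98_275_175
print(f"key cache hit rate = {hits / requests:.3f}")
# key cache hit rate = 0.451
```

A ~45% hit rate over 74M+ rows suggests the 777MB key cache is far smaller than the working set, which is the comparison Aaron was after.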
Schema Disagreement after migration from 1.0.6 to 1.1.4
Hi list,

We have a 5-node Cassandra cluster with a single 1.0.9 installation and four 1.0.6 installations. We have tried installing 1.1.4 on one of the 1.0.6 nodes (following the instructions on http://www.datastax.com/docs/1.1/install/upgrading). After bringing up 1.1.4 there are no errors in the log, but the cluster now suffers from schema disagreement:

[default@unknown] describe cluster;
Cluster Information:
Snitch: org.apache.cassandra.locator.SimpleSnitch
Partitioner: org.apache.cassandra.dht.RandomPartitioner
Schema versions:
59adb24e-f3cd-3e02-97f0-5b395827453f: [10.10.29.67] <- the new 1.1.4 node
943fc0a0-f678-11e1--339cf8a6c1bf: [10.10.87.228, 10.10.153.45, 10.10.145.90, 10.38.127.80] <- nodes in the old cluster

The recipe for recovering from schema disagreement (http://wiki.apache.org/cassandra/FAQ#schema_disagreement) doesn't cover the new directory layout. The system/Schema directory is empty save for a snapshots subdirectory; system/schema_columnfamilies and system/schema_keyspaces contain some files.

As described in DataStax's documentation, we tried running nodetool upgradesstables. When this had finished, describe schema in the cli showed a schema definition which seemed correct, but was indeed different from the schema on the other nodes in the cluster.

Any clues on how we should proceed?

Thanks, /Martin Koch
Re: Schema Disagreement after migration from 1.0.6 to 1.1.4
I would try nodetool resetlocalschema.

On 12-09-05 07:08 AM, Martin Koch wrote:

Hi list, We have a 5-node Cassandra cluster with a single 1.0.9 installation and four 1.0.6 installations. We have tried installing 1.1.4 on one of the 1.0.6 nodes (following the instructions on http://www.datastax.com/docs/1.1/install/upgrading). After bringing up 1.1.4 there are no errors in the log, but the cluster now suffers from schema disagreement.

--
Edward Sargisson
Senior Java Developer
Global Relay
edward.sargis...@globalrelay.net
SurgeCon 2012
Surge [1] is a scalability-focused conference in late September, hosted in Baltimore. It's a pretty cool conference with a good mix of operationally minded people interested in scalability, distributed systems, systems-level performance, and good stuff like that. You should go! [2]

For those of you who like historical trivia, Mike Malone gave a well-received Cassandra talk at the first SurgeCon in 2010 [3].

This year there is organised room for BoFs and such, with several one-hour slots on Wednesday and Thursday evenings between 9 p.m. and midnight. Last year a few of us got together informally around lunch time [4]. Interested in getting together again this year? Think we have critical mass for a BoF?

[1] http://omniti.com/surge/2012
[2] http://omniti.com/surge/2012/register
[3] http://omniti.com/surge/2010/speakers/mike-malone
[4] http://mail-archives.apache.org/mod_mbox/cassandra-user/201109.mbox/%3c4e82140a.5070...@gmail.com%3E
Re: unsubscribe
http://wiki.apache.org/cassandra/FAQ#unsubscribe

On Wed, Aug 29, 2012 at 3:57 PM, Juan Antonio Gomez Moriano mori...@exciteholidays.com wrote:

--
Juan Antonio Gomez Moriano
Developer Team Leader, Excite Holidays
T +61 2 8061 2917
E mori...@exciteholidays.com
W www.exciteholidays.com
A Suite 1901, 101 Grafton St, Bondi Junction, NSW 2022, Australia

--
=Robert Coli
AIMGTALK - rc...@palominodb.com
YAHOO - rcoli.palominob
SKYPE - rcoli_palominodb
Re: Schema Disagreement after migration from 1.0.6 to 1.1.4
Do you see exceptions like "java.lang.UnsupportedOperationException: Not a time-based UUID" in the log files of the nodes running 1.0.6 and 1.0.9? Then it's probably due to [1], explained here [2]. In this case you either have to upgrade all nodes to 1.1.4 or, if you prefer keeping a mixed-version cluster, note that the 1.0.6 and 1.0.9 nodes won't be able to join the cluster again unless you temporarily upgrade them to 1.0.11.

Cheers, Omid

[1] https://issues.apache.org/jira/browse/CASSANDRA-1391
[2] https://issues.apache.org/jira/browse/CASSANDRA-4195

On Wed, Sep 5, 2012 at 4:08 PM, Martin Koch m...@issuu.com wrote:

Hi list, We have a 5-node Cassandra cluster with a single 1.0.9 installation and four 1.0.6 installations. We have tried installing 1.1.4 on one of the 1.0.6 nodes (following the instructions on http://www.datastax.com/docs/1.1/install/upgrading). After bringing up 1.1.4 there are no errors in the log, but the cluster now suffers from schema disagreement.
Monitoring replication lag/latency in multi DC setup
Hi,

We have a multi-DC Cassandra ring with a 2-DC setup. We use LOCAL_QUORUM for writes and reads. The network between the DCs is sometimes flaky, with outages lasting from a few minutes to a few tens of minutes. I wanted to know the best way to measure/monitor either the lag or the replication latency between the data centers. Are there any metrics I can monitor to find the backlog of data that needs to be transferred?

Thanks in advance.
VR
Re: Monitoring replication lag/latency in multi DC setup
As far as I know, Cassandra doesn't use an internal queueing mechanism specific to replication. Cassandra sends the write to the remote DC, and after that it's up to the TCP/IP stack to deal with buffering. If requests start to time out, Cassandra will use hinted handoff (HH) up to a certain window; for a longer outage you would have to run repair.

Also look at TCP/IP tuning parameters that are helpful for your scenario: http://kaivanov.blogspot.com/2010/09/linux-tcp-tuning.html

Run iperf and test the latency.

On Wed, Sep 5, 2012 at 8:22 PM, Venkat Rama venkata.s.r...@gmail.com wrote:

Hi, We have a multi-DC Cassandra ring with a 2-DC setup. We use LOCAL_QUORUM for writes and reads. The network between the DCs is sometimes flaky, with outages lasting from a few minutes to a few tens of minutes.
Re: Practical node size limits
On Sun, Jul 29, 2012 at 7:40 PM, Dustin Wenz dustinw...@ebureau.com wrote:

We've just set up a new 7-node cluster with Cassandra 1.1.2 running under OpenJDK6.

It's worth noting that the Cassandra project recommends the Sun JRE. Without the Sun JRE, you might not be able to use JAMM to determine the live ratio. Very few people use OpenJDK in production, so using it also increases the likelihood that you might be the first to encounter a given issue. FWIW!

=Rob
Re: Monitoring replication lag/latency in multi DC setup
Thanks for the quick reply, Mohit. Can we measure/monitor the size of hinted handoffs? Would it be a good enough indicator of my backlog? Although we know when the network is flaky, we are interested in knowing how much data is piling up in the local DC that needs to be transferred.

Greatly appreciate your help.
VR

On Wed, Sep 5, 2012 at 8:33 PM, Mohit Anchlia mohitanch...@gmail.com wrote:

As far as I know, Cassandra doesn't use an internal queueing mechanism specific to replication. Cassandra sends the write to the remote DC, and after that it's up to the TCP/IP stack to deal with buffering.
Re: Monitoring replication lag/latency in multi DC setup
Cassandra exposes a lot of metrics through JMX; you might be able to get some of this information from JConsole.

On Wed, Sep 5, 2012 at 8:47 PM, Venkat Rama venkata.s.r...@gmail.com wrote:

Thanks for the quick reply, Mohit. Can we measure/monitor the size of hinted handoffs? Would it be a good enough indicator of my backlog?
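One concrete place to look for the backlog VR asks about is the pending count for the HintedHandoff pool in `nodetool tpstats`. A hedged sketch of pulling that number out of the command's output; the sample text and its counts below are illustrative, not from a real node, and the exact column layout varies by Cassandra version:

```python
# Sketch: extract the pending HintedHandoff task count from `nodetool tpstats`
# output as a rough proxy for the replication backlog to a flaky remote DC.
def pending_hints(tpstats_output: str) -> int:
    for line in tpstats_output.splitlines():
        fields = line.split()
        if fields and fields[0] == "HintedHandoff":
            # columns assumed: Pool Name, Active, Pending, Completed, ...
            return int(fields[2])
    raise ValueError("HintedHandoff pool not found")

# Illustrative sample output (hypothetical numbers):
sample = """Pool Name                    Active   Pending      Completed
MutationStage                     0         0       73659476
HintedHandoff                     1        42           1193
"""
print(pending_hints(sample))  # -> 42
```

Note this counts pending hint-delivery tasks, not bytes of hinted data; for data volume you would have to inspect the HintsColumnFamily itself.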
Re: Schema Disagreement after migration from 1.0.6 to 1.1.4
Thanks, this is exactly it. We'd like to do a rolling upgrade (this is a production cluster), so I guess we'll upgrade 1.0.6 -> 1.0.11 -> 1.1.4, then. /Martin

On Thu, Sep 6, 2012 at 2:35 AM, Omid Aladini omidalad...@gmail.com wrote:

Do you see exceptions like "java.lang.UnsupportedOperationException: Not a time-based UUID" in the log files of the nodes running 1.0.6 and 1.0.9? Then it's probably due to [1], explained here [2]. In this case you either have to upgrade all nodes to 1.1.4, or, if you prefer keeping a mixed-version cluster, temporarily upgrade the 1.0.6 and 1.0.9 nodes to 1.0.11, since otherwise they won't be able to rejoin the cluster. Cheers, Omid

[1] https://issues.apache.org/jira/browse/CASSANDRA-1391
[2] https://issues.apache.org/jira/browse/CASSANDRA-4195

On Wed, Sep 5, 2012 at 4:08 PM, Martin Koch m...@issuu.com wrote:

Hi list, We have a 5-node Cassandra cluster with one node on 1.0.9 and four nodes on 1.0.6. We tried installing 1.1.4 on one of the 1.0.6 nodes (following the instructions at http://www.datastax.com/docs/1.1/install/upgrading). After bringing up 1.1.4 there are no errors in the log, but the cluster now suffers from schema disagreement:

[default@unknown] describe cluster;
Cluster Information:
   Snitch: org.apache.cassandra.locator.SimpleSnitch
   Partitioner: org.apache.cassandra.dht.RandomPartitioner
   Schema versions:
      59adb24e-f3cd-3e02-97f0-5b395827453f: [10.10.29.67]  (the new 1.1.4 node)
      943fc0a0-f678-11e1--339cf8a6c1bf: [10.10.87.228, 10.10.153.45, 10.10.145.90, 10.38.127.80]  (nodes in the old cluster)

The recipe for recovering from schema disagreement (http://wiki.apache.org/cassandra/FAQ#schema_disagreement) doesn't cover the new directory layout: the system/Schema directory is empty save for a snapshots subdirectory, while system/schema_columnfamilies and system/schema_keyspaces contain some files. As described in DataStax's documentation, we tried running nodetool upgradesstables. When this was done, describe schema in the cli showed a schema definition which seemed correct, but was indeed different from the schema on the other nodes in the cluster. Any clues on how we should proceed? Thanks, /Martin Koch
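For anyone monitoring for this condition, the disagreement is visible as more than one schema-version UUID in the `describe cluster` output, which is easy to check programmatically. A minimal sketch (Python; the sample text and the second UUID below are illustrative placeholders, not values from this cluster):

```python
# Sketch: parse cassandra-cli `describe cluster` output and flag schema
# disagreement, i.e. more than one schema-version UUID in the ring.
import re

# Illustrative sample; the second UUID is a placeholder, not real data.
SAMPLE = """\
Cluster Information:
   Snitch: org.apache.cassandra.locator.SimpleSnitch
   Partitioner: org.apache.cassandra.dht.RandomPartitioner
   Schema versions:
      59adb24e-f3cd-3e02-97f0-5b395827453f: [10.10.29.67]
      11111111-2222-3333-4444-555555555555: [10.10.87.228, 10.10.153.45]
"""

def schema_versions(describe_output):
    """Map each schema-version UUID to the list of node IPs reporting it."""
    versions = {}
    for line in describe_output.splitlines():
        m = re.match(r"\s*([0-9a-f-]{20,}):\s*\[(.*)\]", line)
        if m:
            versions[m.group(1)] = [ip.strip() for ip in m.group(2).split(",")]
    return versions

if __name__ == "__main__":
    versions = schema_versions(SAMPLE)
    print(len(versions) > 1)  # True -> schema disagreement
```

A healthy cluster reports exactly one schema version, so `len(versions) > 1` is a reasonable alerting condition during a rolling upgrade.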
Secondary index read/write explanation
Hi All, I am a newbie to Cassandra and am trying to understand how secondary indexes work. I have been going over the discussion of local secondary indexes on https://issues.apache.org/jira/browse/CASSANDRA-749, and the interesting question at http://www.mail-archive.com/user@cassandra.apache.org/msg16966.html. The discussion seems to assume that the most common use cases are ones with range queries. Is this right? I am trying to understand the low-cardinality reasoning and how a read gets executed. I have the following questions, hoping I can explain them well :)

1. When a write request is received, the data is written to the base CF and the secondary index entry to a secondary (hidden) CF. If this is right, will the secondary index entry be written locally on the node, or will it follow RP/OPP and be written to other nodes?

2. When a coordinator receives a read request with, say, predicate x=y where column x has a secondary index, how does the coordinator query the relevant node(s)? How does it avoid sending the request to all nodes if the index is local?

If there is any article/blog that can help me understand this better, please let me know. Thanks again in advance. VR
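For question 2, the key point about *local* secondary indexes is that there is no single node holding the whole index: each node indexes only the rows it owns, so the coordinator generally has to scatter the query across the ring and merge the local answers. A toy model of this (Python; a deliberate simplification for illustration, not Cassandra's actual read path):

```python
# Toy model of local secondary indexes, simplified from the CASSANDRA-749
# design. Each node indexes only its own rows; an equality query on an
# indexed column is fanned out and each node answers from its local index.

class Node:
    def __init__(self, name):
        self.name = name
        self.rows = {}    # row_key -> {column: value}
        self.index = {}   # (column, value) -> set of row keys on THIS node

    def write(self, key, columns):
        # The index entry is written locally, alongside the base row.
        self.rows[key] = columns
        for col, val in columns.items():
            self.index.setdefault((col, val), set()).add(key)

    def local_query(self, col, val):
        return sorted(self.index.get((col, val), set()))

def coordinator_query(nodes, col, val):
    # No global index exists: scatter to every node, gather, merge.
    hits = []
    for node in nodes:
        hits.extend(node.local_query(col, val))
    return sorted(hits)

if __name__ == "__main__":
    nodes = [Node("n1"), Node("n2")]
    nodes[0].write("user1", {"state": "CA"})
    nodes[1].write("user2", {"state": "CA"})
    nodes[1].write("user3", {"state": "WA"})
    print(coordinator_query(nodes, "state", "CA"))  # -> ['user1', 'user2']
```

This also illustrates the low-cardinality reasoning: if the indexed value appears on most nodes anyway (e.g. state=CA), the fan-out costs little extra, whereas a very selective value still pays the cost of querying nodes that hold no matches.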
Re: Monitoring replication lag/latency in multi DC setup
Is there a specific metric you can recommend? VR

On Wed, Sep 5, 2012 at 9:19 PM, Mohit Anchlia mohitanch...@gmail.com wrote: Cassandra exposes a lot of metrics through JConsole. You might be able to get some information from JConsole. [...]