Re: Streaming from 1 node only when adding a new DC
Thanks. Created the issue: https://issues.apache.org/jira/browse/CASSANDRA-12015

2016-06-15 15:25 GMT+02:00 Paulo Motta <pauloricard...@gmail.com>:
> For rebuild, replace and -Dcassandra.consistent.rangemovement=false in general, we currently pick the closest replica (as indicated by the snitch) which has the range, which will often map to the same node due to the dynamic snitch, especially when N=RF. This is good for picking a node in the same DC or rack for transferring, but we can probably improve this to distribute streaming load more evenly among candidate source nodes in the same rack/DC.
>
> Would you mind opening a ticket for improving this?
>
> 2016-06-14 17:35 GMT-03:00 Fabien Rousseau <fabifab...@gmail.com>:
>> We've tested with C* version 2.1.14.
>> Yes, vnodes with 256 tokens.
>> Once all the nodes in dc2 are added, the schema is modified to have RF=3 in dc1 and RF=3 in dc2.
>> Then, on each node of dc2: nodetool rebuild dc1
>>
>> On 14 June 2016 10:39, "kurt Greaves" <k...@instaclustr.com> wrote:
>>> What version of Cassandra are you using? Also, what command are you using to run the rebuilds? Are you using vnodes?
>>>
>>> On 13 June 2016 at 09:01, Fabien Rousseau <fabifab...@gmail.com> wrote:
>>>> Hello,
>>>>
>>>> We've tested adding a new DC from an existing DC having 3 nodes and RF=3 (i.e. all nodes have all data). During the rebuild process, only one node of the first DC streamed data to the 3 nodes of the second DC.
>>>>
>>>> Our goal is to minimise the time it takes to rebuild a DC, and we would like to be able to stream from all nodes.
>>>>
>>>> Starting C* with debug logs, it appears that all nodes, when computing their "streaming plan", return the same node for all ranges. This is probably because all nodes in DC2 have the same view of the ring.
>>>>
>>>> I understand that when bootstrapping a new node it's preferable to stream from the node being replaced, but when rebuilding a new DC, it should probably select sources "randomly" (rather than always selecting the same source for a specific range).
>>>> What do you think?
>>>>
>>>> Best Regards,
>>>> Fabien
>>>
>>> --
>>> Kurt Greaves
>>> k...@instaclustr.com
>>> www.instaclustr.com
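The behaviour discussed in this thread — every rebuilding node computing an identical streaming plan, so one source node streams everything — can be sketched in a few lines. This is a simplified model, not Cassandra's code: `closest_source` stands in for the deterministic dynamic-snitch ordering, and `random_source` is the randomised selection the ticket proposes.

```python
import random

def closest_source(range_id, replicas):
    # Deterministic "closest replica" stand-in: every rebuilding node
    # orders the replicas the same way, so they all pick the same source.
    return sorted(replicas)[0]

def random_source(range_id, replicas, rng):
    # Proposed improvement: pick uniformly among replicas owning the
    # range, spreading streaming load across the source DC.
    return rng.choice(sorted(replicas))

replicas = ["dc1-node1", "dc1-node2", "dc1-node3"]  # 3 nodes, RF=3: all own every range
ranges = list(range(256 * 3))  # vnodes: 256 tokens per rebuilding node

snitch_sources = {closest_source(r, replicas) for r in ranges}
rng = random.Random(42)
random_sources = {random_source(r, replicas, rng) for r in ranges}

print(len(snitch_sources))   # 1 -> a single dc1 node streams everything
print(len(random_sources))   # 3 -> all dc1 nodes participate
```

With N=RF as in the thread, the deterministic choice collapses to one source; randomising per range makes all candidates share the load.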
Re: Streaming from 1 node only when adding a new DC
We've tested with C* version 2.1.14.
Yes, vnodes with 256 tokens.
Once all the nodes in dc2 are added, the schema is modified to have RF=3 in dc1 and RF=3 in dc2.
Then, on each node of dc2: nodetool rebuild dc1

On 14 June 2016 10:39, "kurt Greaves" <k...@instaclustr.com> wrote:
> What version of Cassandra are you using? Also, what command are you using to run the rebuilds? Are you using vnodes?
>
> On 13 June 2016 at 09:01, Fabien Rousseau <fabifab...@gmail.com> wrote:
>> Hello,
>>
>> We've tested adding a new DC from an existing DC having 3 nodes and RF=3 (i.e. all nodes have all data). During the rebuild process, only one node of the first DC streamed data to the 3 nodes of the second DC.
>>
>> Our goal is to minimise the time it takes to rebuild a DC, and we would like to be able to stream from all nodes.
>>
>> Starting C* with debug logs, it appears that all nodes, when computing their "streaming plan", return the same node for all ranges. This is probably because all nodes in DC2 have the same view of the ring.
>>
>> I understand that when bootstrapping a new node it's preferable to stream from the node being replaced, but when rebuilding a new DC, it should probably select sources "randomly" (rather than always selecting the same source for a specific range).
>> What do you think?
>>
>> Best Regards,
>> Fabien
>
> --
> Kurt Greaves
> k...@instaclustr.com
> www.instaclustr.com
Streaming from 1 node only when adding a new DC
Hello,

We've tested adding a new DC from an existing DC having 3 nodes and RF=3 (i.e. all nodes have all data). During the rebuild process, only one node of the first DC streamed data to the 3 nodes of the second DC.

Our goal is to minimise the time it takes to rebuild a DC, and we would like to be able to stream from all nodes.

Starting C* with debug logs, it appears that all nodes, when computing their "streaming plan", return the same node for all ranges. This is probably because all nodes in DC2 have the same view of the ring.

I understand that when bootstrapping a new node it's preferable to stream from the node being replaced, but when rebuilding a new DC, it should probably select sources "randomly" (rather than always selecting the same source for a specific range).
What do you think?

Best Regards,
Fabien
Re: MX4J support broken in cassandra 3.0.5?
Hi Robert,

This could be related to: https://issues.apache.org/jira/plugins/servlet/mobile#issue/CASSANDRA-9242
(Maybe you can try commenting out this option and try again.)

On 27 Apr 2016 15:21, "Robert Sicoie" wrote:
> Hi guys,
>
> I'm upgrading from Cassandra 2.1 to Cassandra 3.0.5 and MX4J support seems to be broken. An empty HTML page is shown:
>
> > GET / HTTP/1.1
> > Host: localhost:8081
> > User-Agent: curl/7.43.0
> > Accept: */*
> >
> * HTTP 1.0, assume close after body
> < HTTP/1.0 200 OK
> < expires: now
> < Server: MX4J-HTTPD/1.0
> < Cache-Control: no-cache
> < pragma: no-cache
> < Content-Type: text/html
>
> This is what I have in cassandra-env.sh:
> ...
> MX4J_PORT="-Dmx4jport=8081"
> ...
> And the mx4j-tools.jar is in place.
>
> It worked fine with Cassandra 2.1. Is there a new configuration needed in 3.0.5?
>
> Any advice?
>
> Thanks,
> Robert
>
> In order to protect our email recipients, the Paddy Power Betfair plc group of companies use MessageLabs to scan all Incoming and Outgoing mail for viruses. Paddy Power Betfair may monitor the content of email sent and received for the purpose of ensuring compliance with its policies and procedures.
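Robert's curl output (HTTP 200 but an empty body) is easy to probe for from a script while testing candidate fixes. A minimal sketch, assuming MX4J listening on port 8081 as configured in his cassandra-env.sh:

```python
from urllib.request import urlopen

def mx4j_body(host="localhost", port=8081, timeout=5):
    """Fetch the MX4J root page and return its body as text."""
    with urlopen(f"http://{host}:{port}/", timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")

# A healthy MX4J endpoint returns a non-empty HTML page listing MBean
# domains; an empty body (as in the report above) reproduces the symptom.
# body = mx4j_body()
# print(len(body) > 0)
```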
Re: Network / GC / Latency spike
Hi Alain,

Maybe it's possible to confirm this by testing on a small cluster:
- create a cluster of 2 nodes (using https://github.com/pcmanus/ccm for example)
- create a fake wide row of a few MB (using the Python driver for example)
- drain and stop one of the two nodes
- remove the sstables of the stopped node (to provoke inconsistencies)
- start it again
- select a small portion of the wide row (many times; use nodetool tpstats to know when a read repair has been triggered)
- nodetool flush (on the previously stopped node)
- check the size of the sstable (if a few KB, then only the selected slice was repaired, but if a few MB then the whole row was repaired)

The wild guess was: if a read repair was triggered when reading a small portion of a wide row, and if it resulted in streaming the whole wide row, it could explain a network burst. (But on second thought, it makes more sense to only repair the small portion being read...)

2015-09-01 12:05 GMT+02:00 Alain RODRIGUEZ <arodr...@gmail.com>:
> Hi Fabien, thanks for your help.
>
> I did not mention it, but I indeed saw a correlation between latency and read repair spikes. Though this is like going from 5 RR per second to 10 per second cluster-wide according to OpsCenter: http://img42.com/L6gx1
>
> I have indeed some wide rows and this explanation looks reasonable to me, I mean this makes sense. Yet isn't this amount of read repair too low to induce such a "shitstorm" (even if it spikes x2, I got network x10)? Also, wide rows are present on heavily used tables (sadly...), so I should be using more network all the time (why only a few spikes per day (like 2/3 max))?
>
> How could I confirm this, without removing RR and waiting a week I mean? Is there a way to see the size of the data being repaired through this mechanism?
>
> C*heers
>
> Alain
>
> 2015-09-01 0:11 GMT+02:00 Fabien Rousseau <fabifab...@gmail.com>:
>> Hi Alain,
>>
>> Could it be wide rows + read repair? (Let's suppose the "read repair" repairs the full row, and it may not be subject to the stream throughput limit.)
>>
>> Best Regards
>> Fabien
>>
>> 2015-08-31 15:56 GMT+02:00 Alain RODRIGUEZ <arodr...@gmail.com>:
>>> I just realised that I have no idea about how this mailing list handles attached files.
>>>
>>> Please find screenshots there --> http://img42.com/collection/y2KxS
>>>
>>> Alain
>>>
>>> 2015-08-31 15:48 GMT+02:00 Alain RODRIGUEZ <arodr...@gmail.com>:
>>>> Hi,
>>>>
>>>> Running a 2.0.16 C* on AWS (private VPC, 2 DC).
>>>>
>>>> I am facing an issue on our EU DC where I have a network burst (alongside a GC and latency increase).
>>>>
>>>> My first thought was a sudden application burst, though I see no corresponding evolution in reads/writes or even CPU.
>>>>
>>>> So I thought that this might come from the nodes themselves, as IN network almost equals OUT. I tried lowering stream throughput on the whole DC to 1 Mbps, with ~30 nodes --> 30 Mbps --> ~4 MB/s max. My network went a lot higher, about 30 M on both sides (see screenshots attached).
>>>>
>>>> I have tried to use iftop to see where this network is headed to, but I was not able to because the bursts are very short.
>>>>
>>>> So, the questions are:
>>>>
>>>> - Has anyone experienced something similar already? If so, any clue would be appreciated :).
>>>> - How can I know (monitor, capture) where this big amount of network traffic is headed to or due to?
>>>> - Am I right in trying to figure out what this network traffic is, or should I follow another lead?
>>>>
>>>> Notes: I also noticed that CPU does not spike nor does R, but disk reads also spike!
>>>>
>>>> C*heers,
>>>>
>>>> Alain
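The experiment above hinges on comparing repaired data volume under the two hypotheses. A toy model (cell sizes and column names are made up) shows why whole-row read repair of a wide row would produce bursts of the observed magnitude, while slice-only repair would not:

```python
# Model a wide row as a list of (column_name, size_in_bytes) cells.
ROW = [(f"col{i:05d}", 100) for i in range(50_000)]  # ~5 MB wide row

def repaired_bytes(row, slice_start, slice_end, whole_row):
    # whole_row=True models the "wild guess": read repair streams the
    # full row. whole_row=False models repairing only the cells read.
    if whole_row:
        return sum(size for _, size in row)
    return sum(size for name, size in row if slice_start <= name < slice_end)

slice_only = repaired_bytes(ROW, "col00000", "col00050", whole_row=False)
full_row = repaired_bytes(ROW, "col00000", "col00050", whole_row=True)
print(slice_only)  # 5000 bytes (~5 KB): only the 50-cell slice that was read
print(full_row)    # 5000000 bytes (~5 MB): enough to explain a network burst
```

A thousand-fold difference per triggered read repair, which is why even a modest RR rate could matter under the whole-row hypothesis.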
Re: Network / GC / Latency spike
Hi Alain,

Could it be wide rows + read repair? (Let's suppose the "read repair" repairs the full row, and it may not be subject to the stream throughput limit.)

Best Regards
Fabien

2015-08-31 15:56 GMT+02:00 Alain RODRIGUEZ:
> I just realised that I have no idea about how this mailing list handles attached files.
>
> Please find screenshots there --> http://img42.com/collection/y2KxS
>
> Alain
>
> 2015-08-31 15:48 GMT+02:00 Alain RODRIGUEZ:
>> Hi,
>>
>> Running a 2.0.16 C* on AWS (private VPC, 2 DC).
>>
>> I am facing an issue on our EU DC where I have a network burst (alongside a GC and latency increase).
>>
>> My first thought was a sudden application burst, though I see no corresponding evolution in reads/writes or even CPU.
>>
>> So I thought that this might come from the nodes themselves, as IN network almost equals OUT. I tried lowering stream throughput on the whole DC to 1 Mbps, with ~30 nodes --> 30 Mbps --> ~4 MB/s max. My network went a lot higher, about 30 M on both sides (see screenshots attached).
>>
>> I have tried to use iftop to see where this network is headed to, but I was not able to because the bursts are very short.
>>
>> So, the questions are:
>>
>> - Has anyone experienced something similar already? If so, any clue would be appreciated :).
>> - How can I know (monitor, capture) where this big amount of network traffic is headed to or due to?
>> - Am I right in trying to figure out what this network traffic is, or should I follow another lead?
>>
>> Notes: I also noticed that CPU does not spike nor does R, but disk reads also spike!
>>
>> C*heers,
>>
>> Alain
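The throttle arithmetic in Alain's quoted message can be double-checked quickly; the per-node value of 1 Mbps is the one he set, and the megabits-versus-megabytes conversion is the usual trap:

```python
nodes = 30
stream_throughput_mbps = 1          # per-node limit set on the whole DC, in megabits/s
total_mbps = nodes * stream_throughput_mbps
total_mb_per_s = total_mbps / 8     # 8 bits per byte

print(total_mbps)      # 30 Mbps cluster-wide
print(total_mb_per_s)  # 3.75 MB/s, i.e. the "~4 MB/s max" from the thread
```

Since the observed bursts were ~30 MB/s in both directions, roughly 8x the streaming cap, this supports the conclusion that the traffic was not throttled streaming, consistent with the read-repair hypothesis (read repair is not subject to the stream throughput limit).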
Re: sstableloader Could not retrieve endpoint ranges
Hi,

I already got this error on a 2.1 cluster because Thrift was disabled. So you should check that Thrift is enabled and accessible from the sstableloader process.

Hope this helps
Fabien

On 19 June 2015 05:44, Mitch Gitman <mgit...@gmail.com> wrote:
> I'm using sstableloader to bulk-load a table from one cluster to another. I can't just copy sstables because the clusters have different topologies. While we're looking to upgrade soon to Cassandra 2.0.x, we're on Cassandra 1.2.19. The source data comes from a nodetool snapshot.
>
> Here's the command I ran:
> sstableloader -d IP_ADDRESSES_OF_SEED_NODES /SNAPSHOT_DIRECTORY/
>
> Here's the result I got:
> Could not retrieve endpoint ranges:
>  -pr,--principal             kerberos principal
>  -k,--keytab                 keytab location
>  --ssl-keystore              ssl keystore location
>  --ssl-keystore-password     ssl keystore password
>  --ssl-keystore-type         ssl keystore type
>  --ssl-truststore            ssl truststore location
>  --ssl-truststore-password   ssl truststore password
>  --ssl-truststore-type       ssl truststore type
>
> Not sure what to make of this, what with the hints at security arguments that pop up. The source and destination clusters have no security. Hoping this might ring a bell with someone out there.
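As Fabien notes, this error can simply mean the Thrift endpoint is unreachable from the loader host. A quick TCP reachability check against the seed nodes can confirm that before digging further; a minimal sketch, where the 9160 default Thrift port and the example IP are assumptions to adapt to your cluster:

```python
import socket

def port_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Example: from the machine running sstableloader, check the Thrift
# port (9160 by default) on each seed node passed via -d:
# print(port_reachable("10.0.0.1", 9160))
```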
Re: Problems after trying a migration
Hi David,

There is an excellent article which describes exactly what you want to do (i.e. migrate from one DC to another DC): http://planetcassandra.org/blog/cassandra-migration-to-ec2/

2015-03-18 17:05 GMT+01:00 David CHARBONNIER <david.charbonn...@rgsystem.com>:
> Hi,
>
> We're using Cassandra through the Datastax Enterprise package in version 4.5.1 (Cassandra version 2.0.8.39) with 7 nodes in a single datacenter. We need to move our Cassandra cluster from France to another country. To do this, we want to add a second 7-node datacenter to our cluster and stream all data between the two countries before dropping the first datacenter.
>
> On January 31st, we tried doing so but we had some problems:
> - New nodes in the other country were installed like the French nodes, except for the Datastax Enterprise version (4.5.1 in France and 4.6.1 in the other country, which means Cassandra version 2.0.8.39 in France and 2.0.12.200 in the other country).
> - The following procedure was followed: http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_add_dc_to_cluster_t.html but an error occurred during step 3. New nodes were started before the cassandra-topology.properties file was updated on the original datacenter. The new nodes appeared in the original datacenter instead of the new one.
> - To recover our original cluster, we decommissioned every node of the new datacenter with the nodetool decommission command.
>
> On February 9th, nodes in the second datacenter were restarted and joined the cluster. We had to decommission them just like before.
>
> On February 11th, we added disk space on our 7 running French nodes. To achieve this, we restarted the cluster, but the nodes updated their peering information and the nodes from Luxembourg (decommissioned on February 9th) were present. This behaviour is described here: https://issues.apache.org/jira/browse/CASSANDRA-7825. So we cleaned the system.peers table content.
>
> On March 11th, we needed to add an 8th node to our existing French cluster. We installed the same Datastax Enterprise version (4.5.1 with Cassandra 2.0.8.39) and tried to add this node to the cluster with this procedure: http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_add_node_to_cluster_t.html. In OpsCenter, the node was joining the cluster and data streaming got stuck at 100%. After several hours, nodetool status showed us that the node was still joining, but nothing in the logs let us know there was a problem. We restarted the node but it had no effect. Then we cleaned the data and commitlog contents and tried to add the node to the cluster again, but without result. The last try was to add the node with auto_bootstrap: false in order to add the node to the cluster manually, but it messed up the data. So we shut down the node and removed it (with nodetool removenode). The whole cluster has been repaired and we stopped doing anything.
>
> Now, our cluster has only 7 French nodes to which we can't add any node. The OpsCenter data has disappeared and we work without any information about how our cluster is running. You'll find attached to this email our current configuration and a screenshot of our OpsCenter metrics page.
>
> Do you have some idea of how to clean up the mess and get our cluster running cleanly before we start our migration (France to another country, as described at the beginning of this email)?
>
> Thank you. Best regards,
>
> David CHARBONNIER
> Sysadmin
> T : +33 411 934 200
> david.charbonn...@rgsystem.com
> ZAC Aéroport
> 125 Impasse Adam Smith
> 34470 Pérols - France
> www.rgsystem.com

--
Fabien Rousseau
aur...@yakaz.com
www.yakaz.com
Re: Migrate data to new cluster using datacenters?
Hi,

We did it once and it worked well. These two links should help (this is more or less what we've done):
http://www.datastax.com/documentation/cassandra/1.2/webhelp/cassandra/operations/ops_add_dc_to_cluster_t.html
http://www.datastax.com/documentation/cassandra/1.2/webhelp/cassandra/operations/ops_decomission_dc_t.html

2013/12/12 Andrew Cooper <andrew.coo...@nisc.coop>:
> Hello,
>
> We are in the process of isolating multiple applications currently running in one large Cassandra cluster to individual smaller clusters. Each application runs in its own keyspace. In order to reduce/eliminate downtime for a migration, I was curious if anyone had attempted the following process to migrate data to a new cluster:
>
> 1) Add new cluster nodes as a new datacenter to the existing cluster
> 2) Set RF for the specific keyspace to non-zero for the new cluster, use nodetool rebuild on the new nodes to stream data
> 3) Change application node connections to point to the new cluster
> 4) Set RF to 0 for the original cluster (stop new writes from going to the original cluster)
> 5) Break the connection between nodes so the new nodes become a standalone cluster??? - Is this possible? What would be the high-level steps?
>
> If this is an extremely bad or misinformed idea, I would like to know that as well! I am aware of other tools available, including sstableloader, etc., but this seemed like a more elegant solution, leveraging Cassandra's active-active features.
>
> Thanks,
> -Andrew
> NISC

--
Fabien Rousseau
aur...@yakaz.com
www.yakaz.com
Re: OOM while reading key cache
A few months ago, we had a similar issue on 1.2.6: https://issues.apache.org/jira/browse/CASSANDRA-5706
But it has been fixed and we have not encountered this issue anymore (we're also on 1.2.10).

2013/11/14 olek.stas...@gmail.com <olek.stas...@gmail.com>:
> Yes, as I wrote in the first e-mail: when I removed the key cache file, Cassandra started without further problems.
> regards
> Olek
>
> 2013/11/13 Robert Coli <rc...@eventbrite.com>:
>> On Wed, Nov 13, 2013 at 12:35 AM, Tom van den Berge <t...@drillster.com> wrote:
>>> I'm having the same problem, after upgrading from 1.2.3 to 1.2.10. I can remember this was a bug that was solved in the 1.0 or 1.1 version some time ago, but apparently it got back. A workaround is to delete the contents of the saved_caches directory before starting up.
>>
>> Yours is not the first report of this I've heard resulting from a 1.2.x to 1.2.x upgrade. Reports are of the form "I had to nuke my saved_caches or I couldn't start my node, it OOMed", etc.
>>
>> https://issues.apache.org/jira/browse/CASSANDRA-6325
>> Exists, but doesn't seem to be the same issue.
>>
>> https://issues.apache.org/jira/browse/CASSANDRA-5986
>> Similar, but doesn't seem to be an issue triggered by upgrade.
>>
>> If I were one of the posters on this thread, I would strongly consider filing a JIRA on point.
>>
>> @OP (olek): did removing the saved_caches also fix your problem?
>>
>> =Rob

--
Fabien Rousseau
aur...@yakaz.com
www.yakaz.com
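The workaround mentioned by Tom (wiping the saved_caches directory while the node is stopped) can be scripted. A minimal sketch, demonstrated on a throwaway directory since the real path varies per install (it is set by saved_caches_directory in cassandra.yaml, commonly /var/lib/cassandra/saved_caches):

```python
import os
import tempfile

def clear_saved_caches(path):
    """Delete every saved-cache file (KeyCache*, RowCache*, ...) in path.
    Only run while the node is stopped; caches are rebuilt on startup."""
    removed = []
    for name in os.listdir(path):
        full = os.path.join(path, name)
        if os.path.isfile(full):
            os.remove(full)
            removed.append(name)
    return removed

# Demo on a throwaway directory standing in for saved_caches:
demo = tempfile.mkdtemp()
open(os.path.join(demo, "KeyCache-b.db"), "w").close()
open(os.path.join(demo, "RowCache-b.db"), "w").close()
print(sorted(clear_saved_caches(demo)))  # ['KeyCache-b.db', 'RowCache-b.db']
print(os.listdir(demo))                  # []
```

Losing the saved caches only costs a cold cache after restart; no data is affected.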
Re: disappointed
Hi Paul,

Concerning large rows which are not compacting, I've probably managed to reproduce your problem. I suppose you're using collections, but also TTLs? Anyway, I opened an issue here: https://issues.apache.org/jira/browse/CASSANDRA-5799

Hope this helps

2013/7/24 Christopher Wirt <chris.w...@struq.com>:
> Hi Paul,
>
> Sorry to hear you're having a low point.
>
> We ended up not using the collection features of 1.2, instead storing a compressed string containing the map and handling it client-side.
>
> We only have fixed-schema short rows, so no experience with large row compaction.
>
> File descriptors have never got that high for us. But if you only have a couple of physical nodes with loads of data and small sstables, maybe they could get that high?
>
> The only time I've had file descriptors get out of hand was when compaction got slightly confused with a new schema when I dropped and recreated instead of truncating: https://issues.apache.org/jira/browse/CASSANDRA-4857. Restarting the node fixed the issue.
>
> From my limited experience, I think Cassandra is a dangerous choice for a young, limited-funding/experience start-up expecting to scale fast. We are a fairly mature start-up with funding. We've just spent 3-5 months moving from Mongo to Cassandra. It's been expensive and painful getting Cassandra to read like Mongo, but we've made it :)
>
> From: Paul Ingalls [mailto:paulinga...@gmail.com]
> Sent: 24 July 2013 06:01
> To: user@cassandra.apache.org
> Subject: disappointed
>
> I want to check in. I'm sad, mad and afraid. I've been trying to get a 1.2 cluster up and working with my data set for three weeks with no success. I've been running a 1.1 cluster for 8 months now with no hiccups, but for me at least 1.2 has been a disaster. I had high hopes for leveraging the new features of 1.2, specifically vnodes and collections. But at this point I can't release my system into production, and will probably need to find a new back end. As a small startup, this could be catastrophic. I'm mostly mad at myself. I took a risk moving to the new tech. I forgot sometimes when you gamble, you lose.
>
> First, the performance of 1.2.6 was horrible when using collections. I wasn't able to push through 500k rows before the cluster became unusable. With a lot of digging, and way too much time, I discovered I was hitting a bug that had just been fixed, but was unreleased. This scared me, because the release was already at 1.2.6 and I would have expected something like https://issues.apache.org/jira/browse/CASSANDRA-5677 to have been addressed long before. But gamely I grabbed the latest code from the 1.2 branch, built it, and I was finally able to get past half a million rows.
>
> But then I hit ~4 million rows, and a multitude of problems. Even with the fix above, I was still seeing a ton of compactions failing, specifically the ones for large rows. Not a single large row will compact; they all assert with the wrong size. Worse, and this is what kills the whole thing, I keep hitting a wall with open files, even after dumping the whole DB, dropping vnodes and trying again. Seriously, 650k open file descriptors? When it hits this limit, the whole DB craps out and is basically unusable. This isn't that many rows. I have close to half a billion in 1.1...
>
> I'm now at a standstill. I figure I have two options unless someone here can help me, and neither of them involves 1.2. I can either go back to 1.1 and remove the features that collections added to my service, or I find another data backend that has similar performance characteristics to Cassandra but allows collections-type behaviour in a scalable manner. Because as far as I can tell, 1.2 doesn't scale. Which makes me sad; I was proud of what I accomplished with 1.1...
>
> Does anyone know why there are so many open file descriptors? Any ideas on why a large row won't compact?
>
> Paul

--
Fabien Rousseau
aur...@yakaz.com
www.yakaz.com
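For watching a descriptor leak like the one Paul describes, lsof is slow once counts reach the hundreds of thousands; on Linux, counting entries under /proc/&lt;pid&gt;/fd is cheap enough to poll. A small monitoring sketch (Linux-only; substituting the Cassandra process pid is left as an assumption):

```python
import os
import resource

def open_fd_count(pid="self"):
    """Count open file descriptors for a process (Linux /proc only)."""
    return len(os.listdir(f"/proc/{pid}/fd"))

soft_limit, hard_limit = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"open fds: {open_fd_count()} / soft limit: {soft_limit}")

# For a Cassandra node, pass its pid instead of "self", e.g.
# open_fd_count(pid=12345), and alert well before the count
# approaches the soft limit rather than waiting for the node to fail.
```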
Re: Performance issues with CQL3 collections?
IMHO, having many tombstones can slow down reads and writes in the following cases:
- For reads, it is slow if the requested slice contains many tombstones.
- For writes, it is slower if the row in the memtable contains many tombstones. This is because, if the IntervalTree contains N intervals and one tombstone must be added, then a new IntervalTree must be recreated. But it's true that writes are less impacted than reads.

Sylvain, if you need/want some help/info for CASSANDRA-5677, don't hesitate to ask.

2013/6/28 Sylvain Lebresne <sylv...@datastax.com>:
> As documented at http://cassandra.apache.org/doc/cql3/CQL.html#collections, lists have 3 operations that require a read before a write (and should thus be avoided in performance-sensitive code), namely setting and deleting by index, and removing by value. Outside of that, collections involve no read before write. But, as you said, if you do overwrite a collection, the previous collection is removed (using a range tombstone) while the new one is added. This should have almost no impact on the insertion itself, however (the tombstone is in the same internal mutation as the update itself; it's not 2 operations). But yes, if you often overwrite collections in the same partition, this might have some impact on reads due to CASSANDRA-5677, and we'll look at fixing that.
>
> So in theory collections should have no special impact on writes, at least nothing that is by design. If you do observe differently and have a way to reproduce, feel free to open a JIRA issue. But I'm afraid we'll need more than "two guys on Stack Overflow claim they've seen write performance degradation due to collections" to get going.
>
> --
> Sylvain
>
> On Fri, Jun 28, 2013 at 7:30 AM, Theo Hultberg <t...@iconara.net> wrote:
>> The thing I was doing was definitely triggering the range tombstone issue. This is what I was doing:
>>
>> UPDATE clocks SET clock = ? WHERE shard = ?
>>
>> in this table:
>>
>> CREATE TABLE clocks (shard INT PRIMARY KEY, clock MAP<TEXT, TIMESTAMP>)
>>
>> However, from the Stack Overflow posts it sounds like they aren't necessarily overwriting their collections. I've tried to replicate their problem with these two statements:
>>
>> INSERT INTO clocks (shard, clock) VALUES (?, ?)
>> UPDATE clocks SET clock = clock + ? WHERE shard = ?
>>
>> The first one should create range tombstones because it overwrites the map on every insert, and the second should not because it adds to the map. Neither of those seems to have any performance issues, at least not on inserts. And it's the slowdown on inserts that confuses me; both the Stack Overflow questioners say that they saw a drop in insert performance. I never saw that in my application, I just got slow reads (and Fabien's explanation makes complete sense for that). I don't understand how insert performance could be affected at all, and I know that for non-counter columns Cassandra doesn't read before it writes, but is it the same for collections too? They are a bit special, but how special are they?
>>
>> T#
>>
>> On Fri, Jun 28, 2013 at 7:04 AM, aaron morton <aa...@thelastpickle.com> wrote:
>>> Can you provide details of the mutation statements you are running? The Stack Overflow posts don't seem to include them.
>>>
>>> Cheers
>>>
>>> -
>>> Aaron Morton
>>> Freelance Cassandra Consultant
>>> New Zealand
>>> @aaronmorton
>>> http://www.thelastpickle.com
>>>
>>> On 27/06/2013, at 5:58 AM, Theo Hultberg <t...@iconara.net> wrote:
>>>> Do I understand it correctly if I think that collection modifications are done by reading the collection, writing a range tombstone that would cover the collection and then re-writing the whole collection again? Or is it just the modified parts of the collection that are covered by the range tombstones, but you still get massive amounts of them and it's just their number that is the problem? Would this explain the slowdown of writes too? I guess it would if Cassandra needed to read the collection before it wrote the new values; otherwise I don't understand how this affects writes, but that only says how much I know about how this works.
>>>>
>>>> T#
>>>>
>>>> On Wed, Jun 26, 2013 at 10:48 AM, Fabien Rousseau <fab...@yakaz.com> wrote:
>>>>> Hi,
>>>>>
>>>>> I'm pretty sure that it's related to this ticket: https://issues.apache.org/jira/browse/CASSANDRA-5677
>>>>>
>>>>> I'd be happy if someone tests this patch. It should apply easily on 1.2.5 / 1.2.6. After applying the patch, by default, the current implementation is still used, but modify your cassandra.yaml to add the following line: interval_tree_provider: IntervalTreeAvlProvider
>>>>> (Note that implementations should be interchangeable, because they share the same serializers and deserializers.)
>>>>>
>>>>> Also, please note that this patch has not been reviewed nor intensively tested... So it may not be production ready.
>>>>>
>>>>> Fabien
>>>>>
>>>>> 2013/6/26 Theo Hultberg <t...@iconara.net>:
>>>>>> Hi,
>>>>>>
>>>>>> I've seen a couple of people on Stack Overflow having problems with performance when they have maps that they continuously update.
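Fabien's point about writes — an immutable IntervalTree that must be rebuilt each time a range tombstone is added — can be illustrated with a toy stand-in that counts how many existing intervals get copied per insert. This models only the cost shape, not Cassandra's actual interval-tree code:

```python
copies = 0

def add_interval(tree, interval):
    """Immutable-style insert: build a brand-new sorted structure,
    copying all existing intervals (which makes the Nth insert cost O(N))."""
    global copies
    copies += len(tree)          # every existing interval is copied over
    return sorted(tree + [interval])

tree = []
n = 1000  # 1000 range tombstones accumulating in one memtable row
for i in range(n):
    tree = add_interval(tree, (i, i + 1))

print(copies)  # 499500 == n*(n-1)//2 copies: quadratic total work
```

This is why many tombstones in a single memtable row make each subsequent write to that row more expensive, even though writes are still less affected than reads.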
Re: Errors while upgrading from 1.1.10 version to 1.2.4 version
Hello,

Have a look at: https://issues.apache.org/jira/browse/CASSANDRA-5476

2013/6/28 Ananth Gundabattula <agundabatt...@threatmetrix.com>:
> Hello Everybody,
>
> We were performing an upgrade of our cluster from version 1.1.10 to 1.2.4. We tested the upgrade process in a QA environment and found no issues. However, on the production node we faced loads of errors and had to abort the upgrade process.
>
> I was wondering how we ran into such a situation. The main difference between the QA environment and the production environment is the replication factor: in QA, RF=1, and in production, RF=3.
>
> Example stack traces as seen on the other nodes: http://pastebin.com/fSnMAd8q
>
> The other observation is that the node which was being upgraded is a seed node in 1.1.10. We aborted right after the first node gave the above issues.
>
> Does this mean that application downtime will be required if we go for a rolling upgrade on a live cluster from version 1.1.10 to version 1.2.4?
>
> Regards,
> Ananth

--
Fabien Rousseau
aur...@yakaz.com
www.yakaz.com
Re: Performance issues with CQL3 collections?
Hi,

I'm pretty sure that it's related to this ticket: https://issues.apache.org/jira/browse/CASSANDRA-5677

I'd be happy if someone tests this patch. It should apply easily on 1.2.5 / 1.2.6. After applying the patch, by default, the current implementation is still used, but modify your cassandra.yaml to add the following line: interval_tree_provider: IntervalTreeAvlProvider
(Note that implementations should be interchangeable, because they share the same serializers and deserializers.)

Also, please note that this patch has not been reviewed nor intensively tested... So it may not be production ready.

Fabien

2013/6/26 Theo Hultberg <t...@iconara.net>:
> Hi,
>
> I've seen a couple of people on Stack Overflow having problems with performance when they have maps that they continuously update, and in hindsight I think I might have run into the same problem myself (but I didn't suspect it as the reason, designed differently, and by accident didn't use maps anymore). Is there any reason that maps (or lists or sets) in particular would become a performance issue when they're heavily modified? As I've understood them they're not special, and shouldn't be any different performance-wise from overwriting regular columns. Is there something different going on that I'm missing?
>
> Here are the Stack Overflow questions:
> http://stackoverflow.com/questions/17282837/cassandra-insert-perfomance-issue-into-a-table-with-a-map-type/17290981
> http://stackoverflow.com/questions/17082963/bad-performance-when-writing-log-data-to-cassandra-with-timeuuid-as-a-column-nam/17123236
>
> yours,
> Theo

--
Fabien Rousseau
aur...@yakaz.com
www.yakaz.com