Re: Cassandra on Kubernetes
Hi Jain,

Thanks for your comments about CassKop. We began the development of CassKop at the beginning of 2018. At that time, some K8S objects (i.e. statefulsets, operators, …) were still in beta and we discovered a few strange behaviours. We upgraded to K8S 1.12 in mid-2018. After this upgrade, we did not encounter any real problems. However, we continue our tests; for example, we have a cluster (16 nodes in 3 DCs in the same region) which was deployed via CassKop and has been working correctly for 3 months, and we will continue to observe it during the next months before hopefully deploying a C* cluster via CassKop in production. In parallel, we are running other robustness and performance tests; right now, we have no real issue to report about K8S for our use case. I can't say more right now.

Thanks for your question. I would be more than happy to read other answers.

Le mer. 30 oct. 2019 à 18:46, Akshit Jain a écrit :
> Hi Jean
> Thanks for replying. I had seen CassKop and the amount of functionality it provides is quite awesome compared to other operators.
>
> I would like to know how stable Kubernetes is for stateful/database applications right now?
>
> I haven't read/heard of any major production stateful application running on k8s.
>
> -Akshit
>
> On Wed, 30 Oct, 2019, 8:12 PM Jean-Armel Luce, wrote:
>> Hi,
>>
>> We are currently developing CassKop, a Cassandra operator for K8S. This operator is developed in Go, based on the operator-sdk framework.
>>
>> At this stage of the project, the goal is to deploy a Cassandra cluster in 1 Kubernetes datacenter, but this will change in future versions to deal with Kubernetes in multi-datacenter setups.
>>
>> The following features are already supported by CassKop:
>> - Deployment of a C* cluster (rack or AZ aware)
>> - Scaling up the cluster (with cleanup)
>> - Scaling down the cluster (with decommission prior to Kubernetes scale down)
>> - Pods operations (removenode, upgradesstables, cleanup, rebuild…)
>> - Adding a Cassandra DC
>> - Removing a Cassandra DC
>> - Setting and modifying configuration files
>> - Setting and modifying configuration parameters
>> - Update of the Cassandra docker image
>> - Rolling update of a Cassandra cluster
>> - Update of Cassandra version (including upgradesstables in case of major upgrade)
>> - Update of JVM
>> - Update of configuration
>> - Stopping a Kubernetes node for maintenance
>> - Process a remove node (and create a new Cassandra node on another Kubernetes node)
>> - Process a replace address (of the old Cassandra node on another Kubernetes node)
>> - Manage operations on pods through the CassKop plugin (cleanup, rebuild, upgradesstables, removenode…)
>> - Monitoring (using the Instaclustr Prometheus exporter to Prometheus/Grafana)
>> - Pause/Restart & rolling restart operations through the CassKop plugin
>>
>> We also use Cassandra Reaper for scheduling repair sessions.
>>
>> If you would like more information about this operator, you may have a look here: https://github.com/Orange-OpenSource/cassandra-k8s-operator
>>
>> Please feel free to download it and try it. We would be more than happy to receive your feedback.
>>
>> If you have any question about this operator, feel free to contact us via our mailing list: prj.casskop.supp...@list.orangeportails.net or on our Slack: https://casskop.slack.com
>>
>> Note: this operator is still in alpha version and works only in a mono-region architecture for now. We are currently working hard on adding new features in order to run it in a multi-region architecture.
>>
>> Thanks.
>>
>> Le mer. 30 oct. 2019 à 13:56, Akshit Jain a écrit :
>>> Hi everyone,
>>>
>>> Is there anyone who is running Cassandra on K8s clusters? It would be great if you can share your experience, the operator you are using and the overall stability of stateful sets in Kubernetes.
>>>
>>> -Akshit
Re: Cassandra on Kubernetes
Hi,

We are currently developing CassKop, a Cassandra operator for K8S. This operator is developed in Go, based on the operator-sdk framework.

At this stage of the project, the goal is to deploy a Cassandra cluster in 1 Kubernetes datacenter, but this will change in future versions to deal with Kubernetes in multi-datacenter setups.

The following features are already supported by CassKop:
- Deployment of a C* cluster (rack or AZ aware)
- Scaling up the cluster (with cleanup)
- Scaling down the cluster (with decommission prior to Kubernetes scale down)
- Pods operations (removenode, upgradesstables, cleanup, rebuild…)
- Adding a Cassandra DC
- Removing a Cassandra DC
- Setting and modifying configuration files
- Setting and modifying configuration parameters
- Update of the Cassandra docker image
- Rolling update of a Cassandra cluster
- Update of Cassandra version (including upgradesstables in case of major upgrade)
- Update of JVM
- Update of configuration
- Stopping a Kubernetes node for maintenance
- Process a remove node (and create a new Cassandra node on another Kubernetes node)
- Process a replace address (of the old Cassandra node on another Kubernetes node)
- Manage operations on pods through the CassKop plugin (cleanup, rebuild, upgradesstables, removenode…)
- Monitoring (using the Instaclustr Prometheus exporter to Prometheus/Grafana)
- Pause/Restart & rolling restart operations through the CassKop plugin

We also use Cassandra Reaper for scheduling repair sessions.

If you would like more information about this operator, you may have a look here: https://github.com/Orange-OpenSource/cassandra-k8s-operator

Please feel free to download it and try it. We would be more than happy to receive your feedback.

If you have any question about this operator, feel free to contact us via our mailing list: prj.casskop.supp...@list.orangeportails.net or on our Slack: https://casskop.slack.com

Note: this operator is still in alpha version and works only in a mono-region architecture for now. We are currently working hard on adding new features in order to run it in a multi-region architecture.

Thanks.

Le mer. 30 oct. 2019 à 13:56, Akshit Jain a écrit :
> Hi everyone,
>
> Is there anyone who is running Cassandra on K8s clusters? It would be great if you can share your experience, the operator you are using and the overall stability of stateful sets in Kubernetes.
>
> -Akshit
CassKop : a Cassandra operator for Kubernetes developped by Orange
Hi folks,

We are excited to announce that CassKop, a Cassandra operator for Kubernetes developed by Orange teams, is now ready for beta testing.

CassKop works as a usual K8S controller (reconciling the real state with a desired state) and automates Cassandra operations through JMX. All the operations are launched by calling standard K8S APIs (kubectl apply …) or by using a K8S plugin (kubectl casskop …). CassKop is developed in Go, based on the CoreOS operator-sdk framework.

Main features already available:
- deploying a rack aware cluster (or AZ aware cluster)
- scaling up & down (including cleanups)
- setting and modifying configuration parameters (C* and JVM parameters)
- adding / removing a datacenter in Cassandra (all datacenters must be in the same region)
- rebuilding nodes
- removing a node or replacing a node (in case of hardware failure)
- upgrading C* or Java versions (including upgradesstables)
- monitoring (using Prometheus/Grafana)
- ...

By using local and persistent volumes, it is possible to handle failures or stop/start nodes for maintenance operations with no transfer of data between nodes. Moreover, we can deploy cassandra-reaper in K8S and use it for scheduling repair sessions.

For now, we can deploy a C* cluster only as a mono-region cluster. We will work during the next weeks on being able to deploy a C* cluster as a multi-region cluster.

Still in the roadmap:
- Network encryption
- Monitoring (exporting logs and metrics)
- Backup & restore
- Multi-region support

We'd be interested to have you try this and let us know what you think! Please read the description and installation instructions on https://github.com/Orange-OpenSource/cassandra-k8s-operator. For a quick start, you can also follow this step by step guide: https://orange-opensource.github.io/cassandra-k8s-operator/index.html?slides=Slides-CassKop-demo.md#1

The CassKop Team
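[Editor's note: as a sketch of the "kubectl apply" workflow described above, a CassandraCluster custom resource might look like the following; the field names here are illustrative assumptions, so check the CRD schema in the CassKop repository for the real ones.]

```yaml
# Hypothetical CassandraCluster custom resource (field names are
# illustrative; see the CassKop repo for the actual CRD schema).
apiVersion: db.orange.com/v1alpha1
kind: CassandraCluster
metadata:
  name: cassandra-demo
spec:
  nodesPerRacks: 2            # scale the cluster by editing this value
  cassandraImage: cassandra:3.11
  dataCapacity: 200Gi         # persistent volume size per node
  topology:
    dc:
      - name: dc1
        rack:
          - name: rack1
          - name: rack2
```

Applying it with `kubectl apply -f cassandra-demo.yaml` hands the desired state to the operator, which then reconciles the actual cluster to match.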
Re: Compaction Strategy guidance
Hi Andrei, Hi Nikolai,

Which version of C* are you using?

There are some recommendations about the max storage per node: http://www.datastax.com/dev/blog/performance-improvements-in-cassandra-1-2
"For 1.0 we recommend 300-500GB. For 1.2 we are looking to be able to handle 10x (3-5TB)."

I have the feeling that those recommendations are sensitive to many criteria such as:
- your hardware
- the compaction strategy
- ...

It seems that LCS relaxes those limitations. Increasing the size of sstables might help if you have enough CPU and you can put more load on your I/O system (@Andrei, I am interested in the results of your experimentation with large sstable files).

From my point of view, there are some usage patterns where it is better to have many small servers than a few large servers. Probably, it is better to have many small servers if you need LCS for large tables.

Just my 2 cents.

Jean-Armel

2014-11-24 19:56 GMT+01:00 Robert Coli rc...@eventbrite.com:
> On Mon, Nov 24, 2014 at 6:48 AM, Nikolai Grigoriev ngrigor...@gmail.com wrote:
>> One of the obvious recommendations I have received was to run more than one instance of C* per host. Makes sense - it will reduce the amount of data per node and will make better use of the resources.
> This is usually a Bad Idea to do in production.
> =Rob
Re: Compaction Strategy guidance
Hi Nikolai,

Thanks for that information. Please could you clarify a little bit what you call a large amount of data?

2014-11-24 4:37 GMT+01:00 Nikolai Grigoriev ngrigor...@gmail.com:
> Just to clarify - when I was talking about the large amount of data I really meant a large amount of data per node in a single CF (table). LCS does not seem to like it when it gets thousands of sstables (makes 4-5 levels). When bootstrapping a new node you'd better enable that option from CASSANDRA-6621 (the one that disables STCS in L0). But it will still be a mess - I have a node that I bootstrapped ~2 weeks ago. Initially it had 7.5K pending compactions; now it has almost stabilized at 4.6K. It does not go down. The number of sstables at L0 is over 11K and it is slowly, slowly building the upper levels. The total number of sstables is 4x the normal amount. Now I am not entirely sure this node will ever get back to normal life. And believe me - this is not because of I/O; I have SSDs everywhere and 16 physical cores. This machine is barely using 1-3 cores most of the time. The problem is that allowing the STCS fallback is not a good option either - it will quickly result in a few 200GB+ sstables in my configuration and then these sstables will never be compacted. Plus, it will require close to 2x disk space on EVERY disk in my JBOD configuration... this will kill the node sooner or later. This is all because all sstables after bootstrap end up at L0 and then the process slowly moves them to other levels. If you have write traffic to that CF then the number of sstables at L0 will grow quickly - like it happens in my case now. Once something like https://issues.apache.org/jira/browse/CASSANDRA-8301 is implemented it may be better.
>
> On Sun, Nov 23, 2014 at 4:53 AM, Andrei Ivanov aiva...@iponweb.net wrote:
>> Stephane,
>> We have a somewhat similar C* load profile. Hence some comments in addition to Nikolai's answer.
>> 1. Fallback to STCS - you can disable it actually
>> 2. Based on our experience, if you have a lot of data per node, LCS may work just fine. That is, till the moment you decide to join another node - chances are that the newly added node will not be able to compact what it gets from the old nodes. In your case, if you switch strategy the same thing may happen. This is all due to the limitations mentioned by Nikolai.
>> Andrei
>>
>> On Sun, Nov 23, 2014 at 8:51 AM, Servando Muñoz G. smg...@gmail.com wrote:
>>> ABUSE - I DON'T WANT ANY MORE MAILS, I AM FROM MEXICO
>>> From: Nikolai Grigoriev [mailto:ngrigor...@gmail.com]
>>> Sent: Saturday, November 22, 2014 07:13 PM
>>> To: user@cassandra.apache.org
>>> Subject: Re: Compaction Strategy guidance
>>> Importance: High
>>>
>>> Stephane,
>>> Like everything good, LCS comes at a certain price. LCS will put the most load on your I/O system (if you use spindles - you may need to be careful about that) and on the CPU. Also, LCS (by default) may fall back to STCS if it is falling behind (which is very possible with heavy write activity) and this will result in higher disk space usage. Also, LCS has a certain limitation I have discovered lately. Sometimes LCS may not be able to use all your node's resources (algorithm limitations) and this reduces the overall compaction throughput. This may happen if you have a large column family with lots of data per node. STCS won't have this limitation. By the way, the primary goal of LCS is to reduce the number of sstables C* has to look at to find your data. With LCS properly functioning this number will most likely be between something like 1 and 3 for most of the reads. But if you do few reads and are not concerned about the latency today, most likely LCS may only save you some disk space.
>>>
>>> On Sat, Nov 22, 2014 at 6:25 PM, Stephane Legay sle...@looplogic.com wrote:
>>>> Hi there,
>>>> Use case:
>>>> - Heavy write app, few reads.
>>>> - Lots of updates of rows / columns.
>>>> - Current performance is fine, for both writes and reads.
>>>> - Currently using SizeTieredCompactionStrategy
>>>> We're trying to limit the amount of storage used during compaction. Should we switch to LeveledCompactionStrategy?
>>>> Thanks
>>>
>>> --
>>> Nikolai Grigoriev
>>> (514) 772-5178
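[Editor's note: for reference, the strategy switch discussed in this thread is a single schema change; the keyspace/table names below are placeholders, and the sstable_size_in_mb value is only an example of the "larger sstables" idea mentioned above.]

```sql
-- Switch a table from size-tiered to leveled compaction (placeholder names).
ALTER TABLE my_keyspace.my_table
  WITH compaction = {
    'class': 'LeveledCompactionStrategy',
    'sstable_size_in_mb': 160  -- larger sstables reduce the sstable count
  };
```

Be aware that after the switch every existing sstable lands in L0, so the node goes through exactly the kind of heavy re-leveling compaction described in this thread.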
Re: Compaction Strategy guidance
Hi Nikolai,

Please could you clarify a little bit what you call a large amount of data? How many tables? How many rows in your largest table? How many GB in your largest table? How many GB per node?

Thanks.

2014-11-24 8:27 GMT+01:00 Jean-Armel Luce jaluc...@gmail.com:
> Hi Nikolai,
> Thanks for that information. Please could you clarify a little bit what you call a large amount of data?
>
> 2014-11-24 4:37 GMT+01:00 Nikolai Grigoriev ngrigor...@gmail.com:
>> Just to clarify - when I was talking about the large amount of data I really meant a large amount of data per node in a single CF (table). LCS does not seem to like it when it gets thousands of sstables (makes 4-5 levels). When bootstrapping a new node you'd better enable that option from CASSANDRA-6621 (the one that disables STCS in L0). But it will still be a mess - I have a node that I bootstrapped ~2 weeks ago. Initially it had 7.5K pending compactions; now it has almost stabilized at 4.6K. It does not go down. The number of sstables at L0 is over 11K and it is slowly, slowly building the upper levels. The total number of sstables is 4x the normal amount. Now I am not entirely sure this node will ever get back to normal life. And believe me - this is not because of I/O; I have SSDs everywhere and 16 physical cores. This machine is barely using 1-3 cores most of the time. The problem is that allowing the STCS fallback is not a good option either - it will quickly result in a few 200GB+ sstables in my configuration and then these sstables will never be compacted. Plus, it will require close to 2x disk space on EVERY disk in my JBOD configuration... this will kill the node sooner or later. This is all because all sstables after bootstrap end up at L0 and then the process slowly moves them to other levels. If you have write traffic to that CF then the number of sstables at L0 will grow quickly - like it happens in my case now. Once something like https://issues.apache.org/jira/browse/CASSANDRA-8301 is implemented it may be better.
>>
>> On Sun, Nov 23, 2014 at 4:53 AM, Andrei Ivanov aiva...@iponweb.net wrote:
>>> Stephane,
>>> We have a somewhat similar C* load profile. Hence some comments in addition to Nikolai's answer.
>>> 1. Fallback to STCS - you can disable it actually
>>> 2. Based on our experience, if you have a lot of data per node, LCS may work just fine. That is, till the moment you decide to join another node - chances are that the newly added node will not be able to compact what it gets from the old nodes. In your case, if you switch strategy the same thing may happen. This is all due to the limitations mentioned by Nikolai.
>>> Andrei
>>>
>>> On Sun, Nov 23, 2014 at 8:51 AM, Servando Muñoz G. smg...@gmail.com wrote:
>>>> ABUSE - I DON'T WANT ANY MORE MAILS, I AM FROM MEXICO
>>>> From: Nikolai Grigoriev [mailto:ngrigor...@gmail.com]
>>>> Sent: Saturday, November 22, 2014 07:13 PM
>>>> To: user@cassandra.apache.org
>>>> Subject: Re: Compaction Strategy guidance
>>>> Importance: High
>>>>
>>>> Stephane,
>>>> Like everything good, LCS comes at a certain price. LCS will put the most load on your I/O system (if you use spindles - you may need to be careful about that) and on the CPU. Also, LCS (by default) may fall back to STCS if it is falling behind (which is very possible with heavy write activity) and this will result in higher disk space usage. Also, LCS has a certain limitation I have discovered lately. Sometimes LCS may not be able to use all your node's resources (algorithm limitations) and this reduces the overall compaction throughput. This may happen if you have a large column family with lots of data per node. STCS won't have this limitation. By the way, the primary goal of LCS is to reduce the number of sstables C* has to look at to find your data. With LCS properly functioning this number will most likely be between something like 1 and 3 for most of the reads. But if you do few reads and are not concerned about the latency today, most likely LCS may only save you some disk space.
>>>>
>>>> On Sat, Nov 22, 2014 at 6:25 PM, Stephane Legay sle...@looplogic.com wrote:
>>>>> Hi there,
>>>>> Use case:
>>>>> - Heavy write app, few reads.
>>>>> - Lots of updates of rows / columns.
>>>>> - Current performance is fine, for both writes and reads.
>>>>> - Currently using SizeTieredCompactionStrategy
>>>>> We're trying to limit the amount of storage used during compaction. Should we switch to LeveledCompactionStrategy?
>>>>> Thanks
>>>>
>>>> --
>>>> Nikolai Grigoriev
>>>> (514) 772-5178
Re: Why repair -pr doesn't work when RF=0 for 1 DC
Hi Fabrice and Yuki,

Thanks for your questions and answers.

From my point of view, it is now a common architecture to have 1 or more DCs for online queries, and 1 DC for stats. It is also very common (and recommended?) to specify the -pr option for anti-entropy repair operations.

In my case, I have a very large table, and I don't want to compute statistics from the data stored in that table. In that case, a convenient and obvious solution is to create a specific keyspace with RF={s1:3,stats:0,b1:3}, as explained by Fabrice. It means that I can use the -pr option for anti-entropy repair operations for keyspaces having RF > 0 in DC stats. However, I have to execute a full repair (without -pr) for keyspace(s) having RF = 0 in DC stats.

If my understanding is correct, I guess it would be very useful to explain this point a little bit more in the documentation:
http://www.datastax.com/documentation/cassandra/1.2/cassandra/tools/toolsNodetool_r.html
http://www.datastax.com/documentation/cassandra/1.2/cassandra/operations/ops_repair_nodes_c.html?scroll=concept_ds_ebj_d3q_gk
https://wiki.apache.org/cassandra/Operations#Frequency_of_nodetool_repair

Something like: /!\ If you use NTS and set RF=0 for 1 (or more) keyspace(s) in 1 (or more) DCs, the -pr option for anti-entropy repair operations will not repair all the token ranges for those keyspaces. If what you want is a safe anti-entropy operation, you need to repair without the -pr option.

I have read that 2.0 will come with a lot of exciting improvements for repair operations. Is it still a concern with 2.0? With 2.1?

Regards.
Jean-Armel

2014-02-27 20:06 GMT+01:00 Yuki Morishita mor.y...@gmail.com:
> Yes.
>
> On Thu, Feb 27, 2014 at 12:49 PM, Fabrice Facorat fabrice.faco...@gmail.com wrote:
>> So if I understand well from CASSANDRA-5424 and CASSANDRA-5608, as the stats DC doesn't own data, repair -pr will not repair the data. Only a full repair will do it. Once we add an RF to the stats DC, repair -pr will work again. Is that correct?
>>
>> 2014-02-27 19:15 GMT+01:00 Yuki Morishita mor.y...@gmail.com:
>>> Yes, it is expected behavior since 1.2.5 (https://issues.apache.org/jira/browse/CASSANDRA-5424). Since you set foobar not to replicate to the stats DC, the primary range of the foobar keyspace for nodes in stats is empty.
>>>
>>> On Thu, Feb 27, 2014 at 10:16 AM, Fabrice Facorat fabrice.faco...@gmail.com wrote:
>>>> Hi, we have a cluster with 3 DCs, and for one DC (stats), RF=0 for a keyspace using NetworkTopologyStrategy.
>>>>
>>>> cqlsh> SELECT * FROM system.schema_keyspaces WHERE keyspace_name='foobar';
>>>>
>>>>  keyspace_name | durable_writes | strategy_class                                       | strategy_options
>>>> ---------------+----------------+------------------------------------------------------+----------------------
>>>>  foobar        | True           | org.apache.cassandra.locator.NetworkTopologyStrategy | {s1:3,stats:0,b1:3}
>>>>
>>>> When doing a nodetool repair -pr foobar on a node in DC stats, we notice that the repair doesn't do anything: it just skips the keyspace. Is this normal behavior? I guess that some keys belonging to DC stats's primary token range should have been repaired in the two other DCs? Am I wrong?
>>>> We are using Cassandra 1.2.13, with 256 vnodes and Murmur3Partitioner
>>>> --
>>>> Close the World, Open the Net
>>>> http://www.linux-wizard.net
>>>
>>> --
>>> Yuki Morishita t:yukim (http://twitter.com/yukim)
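[Editor's note: the thread's conclusion, summarized as commands; keyspace names are the ones from the example above, and these are operational sketches to run against a live cluster, not something copy-pasteable as-is.]

```
# On each node, in every DC: primary-range repair is fine for keyspaces
# replicated to all DCs (cheaper; no overlapping work between nodes).
nodetool repair -pr some_fully_replicated_keyspace

# For a keyspace with RF=0 in one DC (e.g. foobar with {s1:3,stats:0,b1:3}),
# -pr run from a node in the stats DC repairs nothing, because that node's
# primary range for the keyspace is empty. Run a full repair instead:
nodetool repair foobar
```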
Re: Upgrading 1.1 to 1.2 in-place
Hi,

I don't know how your application works, but I explained during the last Cassandra Summit Europe how we did the migration from a relational database to Cassandra without any interruption of service. You can have a look at the video http://www.youtube.com/watch?v=mefOE9K7sLI and use the mod_dup module https://github.com/Orange-OpenSource/mod_dup

For copying data from your Cassandra 1.1 cluster to the Cassandra 1.2 cluster, you can back up your data and then use sstableloader (in this case, you will not have to modify the timestamp as I did for the migration from relational to Cassandra).

Hope that helps!!

Jean-Armel

2013/12/30 Tupshin Harper tups...@tupshin.com:
> No. This is not going to work. The vnodes feature requires the murmur3 partitioner which was introduced with Cassandra 1.2. Since you are currently using 1.1, you must be using the random partitioner, which is not compatible with vnodes. Because the partitioner determines the physical layout of all of your data on disk and across the cluster, it is not possible to change partitioner without taking some downtime to rewrite all of your data. You should probably plan on an upgrade to 1.2 but without also switching to vnodes at this point.
> -Tupshin
>
> On Dec 30, 2013 9:46 AM, Katriel Traum katr...@google.com wrote:
>> Hello list,
>> I have a 2 DC set up with DC1:3, DC2:3 replication factor. DC1 has 6 nodes, DC2 has 3. This whole setup runs on AWS, running Cassandra 1.1. Here's my nodetool ring:
>>
>> 1.1.1.1  eu-west  1a  Up  Normal   55.07 GB   50.00%   0
>> 2.2.2.1  us-east  1b  Up  Normal  107.82 GB  100.00%   1
>> 1.1.1.2  eu-west  1b  Up  Normal   53.98 GB   50.00%   28356863910078205288614550619314017622
>> 1.1.1.3  eu-west  1c  Up  Normal   54.85 GB   50.00%   56713727820156410577229101238628035242
>> 2.2.2.2  us-east  1d  Up  Normal  107.25 GB  100.00%   56713727820156410577229101238628035243
>> 1.1.1.4  eu-west  1a  Up  Normal   54.99 GB   50.00%   85070591730234615865843651857942052863
>> 1.1.1.5  eu-west  1b  Up  Normal   55.1 GB    50.00%   113427455640312821154458202477256070484
>> 2.2.2.3  us-east  1e  Up  Normal  106.78 GB  100.00%   113427455640312821154458202477256070485
>> 1.1.1.6  eu-west  1c  Up  Normal   55.01 GB   50.00%   141784319550391026443072753096570088105
>>
>> I am going to upgrade my machine type, upgrade to 1.2 and go from 6 nodes to 3. I will have to do it on the live system. I'd appreciate any comments about my plan:
>> 1. Decommission a 1.1 node.
>> 2. Bootstrap a new one in-place, Cassandra 1.2, vnodes enabled (I am trying to avoid a re-balance later on).
>> 3. When done, decommission nodes 4-6 at DC1.
>> Issues I've spotted:
>> 1. I'm guessing I will have an unbalanced cluster for the time period where I have 1.2+vnodes and 1.1 mixed.
>> 2. Rollback is cumbersome, snapshots won't help here.
>> Any feedback appreciated
>> Katriel
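[Editor's note: the backup-and-reload path suggested above can be sketched as follows; host names and paths are placeholders, and sstableloader ships with the Cassandra distribution.]

```
# 1. Take a snapshot of the keyspace on each source (1.1) node.
nodetool snapshot -t migration my_keyspace

# 2. Copy the snapshot sstables to a host that can reach the target (1.2)
#    cluster, laid out as <keyspace>/<table>/ with the sstables inside.

# 3. Stream the sstables into the new cluster; -d takes initial contact nodes.
sstableloader -d new-node1,new-node2 /path/to/my_keyspace/my_table
```

Because sstableloader streams data to whichever nodes own each token range, this also works when the target cluster has a different topology (e.g. 3 nodes with vnodes instead of 6 without).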
Re: is there a key to sstable index file?
@Michal: look at this for the improvement of read performance: https://issues.apache.org/jira/browse/CASSANDRA-2498

Best regards.
Jean-Armel

2013/7/18 Michał Michalski mich...@opera.com:
> SSTables are immutable - once they're written to disk, they cannot be changed. On read C* checks *all* SSTables [1], but to make it faster, it uses Bloom Filters, which can tell you if a row is *not* in a specific SSTable, so you don't have to read it at all. However, *if* you do have to read it, you don't read a whole SSTable - there's an in-memory Index Sample that is used for a binary search, returning only a (relatively) small block of the real (full, on-disk) index, which you then scan to find the place to retrieve the data from in the SSTable. Additionally, you have a KeyCache to make reads faster - it points to the location of the data in the SSTable, so you don't have to touch the Index Sample and Index at all. Once C* retrieves all the data parts (including the Memtable part), timestamps are used to find the most recent version of the data.
>
> [1] I believe that it's not true for all cases, as I saw a piece of code somewhere in the source that starts checking SSTables in order from the newest to the oldest one (in terms of data timestamps - AFAIR SSTable metadata stores info about the smallest and largest timestamp in the SSTable), and once the newest data for all columns has been retrieved (assuming that a schema is defined), retrieval stops and older SSTables are not checked. If someone could confirm that it works this way and it's not something that I saw in my dream and now believe is real, I'd be glad ;-)
>
> On 17.07.2013 at 22:58, S Ahmed wrote:
>> Since SSTables are immutable, and they are ordered, does this mean that there is an index of the key ranges that each SSTable holds, and the value could be in 1 or more sstables that have to be scanned, with the latest one chosen?
>> e.g. Say I write a value abc to CF1. This gets stored in an sstable. Then I write def to CF1; this gets stored in another sstable eventually. Now, when I go to fetch the value, it has to scan 2 sstables and then figure out which is the latest entry, correct? So is there an index of keys to sstables, and there can be 1 or more sstables per key? (This is assuming compaction hasn't occurred yet.)
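[Editor's note: as a toy model of the reconciliation step described above - not Cassandra code, just the idea: each immutable "sstable" maps keys to (timestamp, value), a plain membership check stands in for the Bloom filter (a real Bloom filter only guarantees the *absence* of a key and may give false positives), and a read merges all candidates by timestamp.]

```python
# Toy illustration of Cassandra's read path: several immutable "sstables"
# may hold versions of the same key; the newest timestamp wins.

def read(key, sstables):
    """Return the most recent value for key across all sstables, or None."""
    best = None
    for table in sstables:
        if key not in table:          # "Bloom filter" says: definitely not here
            continue
        ts, value = table[key]
        if best is None or ts > best[0]:
            best = (ts, value)
    return best[1] if best else None

# Two flushes of CF1: "abc" written first, then overwritten with "def".
sstable1 = {"row1": (100, "abc")}
sstable2 = {"row1": (200, "def"), "row2": (150, "xyz")}

assert read("row1", [sstable1, sstable2]) == "def"   # latest timestamp wins
assert read("row2", [sstable1, sstable2]) == "xyz"
assert read("row3", [sstable1, sstable2]) is None    # filtered out everywhere
```

Compaction is then just this same merge materialized back to disk, which is why reads get cheaper after sstables are compacted together.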
Re: Ten talks you shouldn’t miss at the Cassandra Summit
To be sent to Pierre: http://www.datastax.com/dev/blog/ten-talks-you-shouldnt-miss-at-the-cassandra-summit

2013/5/8 Jonathan Ellis jbel...@gmail.com:
> The Cassandra Summit is just over a month away! I wrote up my thoughts on the talks I'm most excited about here: http://www.datastax.com/dev/blog/ten-talks-you-shouldnt-miss-at-the-cassandra-summit
> Don't forget to register with the code SFSummit25 for a 25% discount: http://datastax.regsvc.com/E2
> (Want to go, but your company won't pay? Let me know off-list and I'll see what I can do.)
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder, http://www.datastax.com
> @spyced
Re: Really have to repair ?
Hi Cyril,

According to the documentation (http://wiki.apache.org/cassandra/Operations#Frequency_of_nodetool_repair), I understand that it is not necessary to repair every node before gc_grace_seconds if you are sure that you never fail to run a repair each time a node is down longer than gc_grace_seconds:

"*IF* your operations team is sufficiently on the ball, you can get by without repair as long as you do not have hardware failure -- in that case, HintedHandoff is adequate to repair successful updates that some replicas have missed"

Am I wrong? Thoughts?

2013/4/4 cscetbon@orange.com:
> Hi,
> I know that deleted rows can reappear if node repair is not run on every node before *gc_grace_seconds* seconds. However, do we really need to obey this rule if we run node repair on nodes that are down for more than *max_hint_window_in_ms* milliseconds?
> Thanks
> --
> Cyril SCETBON
>
> This message and its attachments may contain confidential or privileged information that may be protected by law; they should not be distributed, used or copied without authorisation. If you have received this email in error, please notify the sender and delete this message and its attachments. As emails may be altered, France Telecom - Orange is not liable for messages that have been modified, changed or falsified. Thank you.
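[Editor's note: for teams that would rather not depend on a "sufficiently on the ball" operations process, the usual belt-and-braces approach is to schedule a primary-range repair on each node well within gc_grace_seconds (which defaults to 864000 s, i.e. 10 days). An illustrative crontab fragment, with the start hour staggered per node so repairs don't pile up:]

```
# /etc/cron.d/cassandra-repair (illustrative) - weekly primary-range repair,
# run as the cassandra user; stagger the day/hour per node.
# gc_grace_seconds defaults to 864000 s (10 days), so weekly stays within it.
0 3 * * 1  cassandra  nodetool repair -pr
```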
Re: Consistency level for system_auth keyspace
Hi Aaron,

I have opened a ticket in Jira: https://issues.apache.org/jira/browse/CASSANDRA-5310

Reading the user with the QUORUM consistency level means that in case of a network outage, you are unable to open a connection, and all your data becomes unavailable.

Regards.
Jean-Armel

2013/3/4 aaron morton aa...@thelastpickle.com:
>> In this case, it means that if there is a network split between the 2 datacenters, it is impossible to get the quorum, and all connections will be rejected.
> Yes.
>> Is there a reason why Cassandra uses the QUORUM consistency level?
> I would guess to ensure there is a single, cluster-wide set of permissions. Using LOCAL_QUORUM or ONE could result in some requests that are rejected being allowed on other nodes.
> Cheers
> -
> Aaron Morton
> Freelance Cassandra Developer
> New Zealand
> @aaronmorton
> http://www.thelastpickle.com
>
> On 1/03/2013, at 6:40 AM, Jean-Armel Luce jaluc...@gmail.com wrote:
>> Hi,
>> I am using Cassandra 1.2.2. There are 16 nodes in my cluster in 2 datacenters (8 nodes in each datacenter). I am using NetworkTopologyStrategy. For information, I set RF = 6 (3 replicas in each datacenter). With 1.2.2, I am using the new authentication backend PasswordAuthenticator with the authorizer CassandraAuthorizer. In the documentation (http://www.datastax.com/docs/1.2/security/security_keyspace_replication#security-keyspace-replication), it is written that for all system_auth-related queries, Cassandra uses the QUORUM consistency level. In this case, it means that if there is a network split between the 2 datacenters, it is impossible to get the quorum, and all connections will be rejected. Is there a reason why Cassandra uses the QUORUM consistency level? Maybe a LOCAL_QUORUM consistency level (or a ONE consistency level) could do the job?
>> Regards
>> Jean-Armel
Re: Consistency level for system_auth keyspace
Hi Dean, The new authentication modules currently uses a QUORUM consistency level when checking the user. That is the reason why it doesn't work in version 1.2.2. I thing that using LOCAL_QUORUM or ONE CL instead of QUORUM should solve this problem. But I didn't see any option in 1.2.2. Regards. Jean Armel 2013/3/4 Hiller, Dean dean.hil...@nrel.gov I thought there was already a LOCAL_QUOROM option so things continue to work when you get data center split. There was also TWO I think as well which allowed 4 nodes(2 in each data center) so you can continue to write when data center splits. Dean From: Jean-Armel Luce jaluc...@gmail.commailto:jaluc...@gmail.com Reply-To: user@cassandra.apache.orgmailto:user@cassandra.apache.org user@cassandra.apache.orgmailto:user@cassandra.apache.org Date: Monday, March 4, 2013 9:12 AM To: user@cassandra.apache.orgmailto:user@cassandra.apache.org user@cassandra.apache.orgmailto:user@cassandra.apache.org Subject: Re: Consistency level for system_auth keyspace Hi Aaron, I have open a ticket in Jira : https://issues.apache.org/jira/browse/CASSANDRA-5310 Reading the user using the QUORUM consistency level means that in case of network outage, you are unable to open a connection, and all your data become unavailable. Regards. Jean Armel 2013/3/4 aaron morton aa...@thelastpickle.commailto: aa...@thelastpickle.com In this case, it means that if there is a network split between the 2 datacenters, it is impossible to get the quorum, and all connections will be rejected. Yes. Is there a reason why Cassandra uses the Quorum consistency level ? I would guess to ensure there is a single, cluster wide, set of permissions. Using LOCAL or one could result in some requests that are rejected being allowed on other nodes. Cheers - Aaron Morton Freelance Cassandra Developer New Zealand @aaronmorton http://www.thelastpickle.com On 1/03/2013, at 6:40 AM, Jean-Armel Luce jaluc...@gmail.commailto: jaluc...@gmail.com wrote: Hi, I am using Cassandra 1.2.2. 
There are 16 nodes in my cluster in 2 datacenters (8 nodes in each datacenter). I am using NetworkTopologyStrategy. For information, I set RF = 6 (3 replicas in each datacenter). With 1.2.2, I am using the new authentication backend PasswordAuthenticator with the authorizer CassandraAuthorizer. In the documentation (http://www.datastax.com/docs/1.2/security/security_keyspace_replication#security-keyspace-replication), it is written that for all system_auth-related queries, Cassandra uses the QUORUM consistency level. In this case, it means that if there is a network split between the 2 datacenters, it is impossible to get the quorum, and all connections will be rejected. Is there a reason why Cassandra uses the QUORUM consistency level? Maybe a LOCAL_QUORUM consistency level (or a ONE consistency level) could do the job? Regards Jean Armel
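As the DataStax documentation linked above discusses, part of the mitigation is to replicate the system_auth keyspace to every datacenter so that the authentication replicas are spread across DCs. A minimal sketch, assuming the two datacenter names from the thread (b1 and s1) and the RF used above; adjust to your own topology:

```sql
-- Sketch only: DC names and replication factors must match your cluster.
ALTER KEYSPACE system_auth WITH replication =
  {'class': 'NetworkTopologyStrategy', 'b1': 3, 's1': 3};
```

Note that this alone does not remove the cross-DC QUORUM problem described in the thread; that is what CASSANDRA-5310 was opened for.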
Re: Other nodes are seen down with rpc_address 0.0.0.0 in version 1.2.2
Hi Michal, Thanks for your quick answer. I am going to apply the patch defined in CASSANDRA-5299. Regards. Jean Armel
Re: Adding new nodes in a cluster with virtual nodes
Hi Aaron, I tried again to add a node in the cluster. This time, I added the new node to the seeds list after the bootstrap (the first time, I added the new node to the seeds list before the bootstrap). And it works !!! Thanks Aaron. Regards. Jean Armel.

2013/2/22 Jean-Armel Luce jaluc...@gmail.com
Thanks Aaron. I shall investigate more about this next week. Regards. Jean Armel

2013/2/22 aaron morton aa...@thelastpickle.com
So, it looks that the repair is required if we want to add new nodes in our platform, but I don't understand why. Bootstrapping should take care of it. But new seed nodes do not bootstrap. Check the logs on the nodes you added to see what messages have bootstrap in them. Anytime you are worried about things like this, throw in a nodetool repair. If you are using QUORUM for reads and writes you will still be getting consistent data, so long as you have only added one node. Or one node every RF'th nodes. Cheers - Aaron Morton Freelance Cassandra Developer New Zealand @aaronmorton http://www.thelastpickle.com

On 22/02/2013, at 9:55 PM, Jean-Armel Luce jaluc...@gmail.com wrote: Hi Aaron, Thanks for your answer. I apologize, I made a mistake in my 1st mail. The cluster was only 12 nodes instead of 16 (it is a test cluster). There are 2 datacenters, b1 and s1.
Here is the result of nodetool status after adding a new node in the 1st datacenter (dc s1):

root@node007:~# nodetool status
Datacenter: b1
==============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens  Owns (effective)  Host ID                               Rack
UN  10.234.72.135  10.71 GB   256     44.6%             2fc583b2-822f-4347-9fab-5e9d10d548c9  c01
UN  10.234.72.134  16.74 GB   256     63.7%             f209a8c5-7e1b-45b5-aa80-ed679bbbdbd1  e01
UN  10.234.72.139  17.09 GB   256     62.0%             95661392-ccd8-4592-a76f-1c99f7cdf23a  e07
UN  10.234.72.138  10.96 GB   256     42.9%             0d6725f0-1357-423d-85c1-153fb94257d5  e03
UN  10.234.72.137  11.09 GB   256     45.7%             492190d7-3055-4167-8699-9c6560e28164  e03
UN  10.234.72.136  11.91 GB   256     41.1%             3872f26c-5f2d-4fb3-9f5c-08b4c7762466  c01
Datacenter: s1
==============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens  Owns (effective)  Host ID                               Rack
UN  10.98.255.139  16.94 GB   256     43.8%             3523e80c-8468-4502-b334-79eabc3357f0  g10
UN  10.98.255.138  12.62 GB   256     42.4%             a2bcddf1-393e-453b-9d4f-9f7111c01d7f  i02
UN  10.98.255.137  10.59 GB   256     38.4%             f851b6ee-f1e4-431b-8beb-e7b173a77342  i02
UN  10.98.255.136  11.89 GB   256     42.9%             36fe902f-3fb1-4b6d-9e2c-71e601fa0f2e  a09
UN  10.98.255.135  10.29 GB   256     40.4%             e2d020a5-97a9-48d4-870c-d10b59858763  a09
UN  10.98.255.134  16.19 GB   256     52.3%             73e3376a-5a9f-4b8a-a119-c87ae1fafdcb  h06
UN  10.98.255.140  127.84 KB  256     39.9%             3d5c33e6-35d0-40a0-b60d-2696fd5cbf72  g10

We can see that the new node (10.98.255.140) contains only 127.84 KB. We also saw that there was no network traffic between the nodes.
Then we added a new node in the 2nd datacenter (dc b1):

root@node007:~# nodetool status
Datacenter: b1
==============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens  Owns (effective)  Host ID                               Rack
UN  10.234.72.135  12.95 GB   256     42.0%             2fc583b2-822f-4347-9fab-5e9d10d548c9  c01
UN  10.234.72.134  20.11 GB   256     53.1%             f209a8c5-7e1b-45b5-aa80-ed679bbbdbd1  e01
UN  10.234.72.140  122.25 KB  256     41.9%             501ea498-8fed-4cc8-a23a-c99492bc4f26  e07
UN  10.234.72.139  20.46 GB   256     40.2%             95661392-ccd8-4592-a76f-1c99f7cdf23a  e07
UN  10.234.72.138  13.21 GB   256     40.9%             0d6725f0-1357-423d-85c1-153fb94257d5  e03
UN  10.234.72.137  13.34 GB   256     42.9%             492190d7-3055-4167-8699-9c6560e28164  e03
UN  10.234.72.136  14.16 GB   256     39.0%             3872f26c-5f2d-4fb3-9f5c-08b4c7762466  c01
Datacenter: s1
==============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens  Owns (effective)  Host ID                               Rack
UN  10.98.255.139  19.19 GB   256     43.8%             3523e80c-8468-4502-b334-79eabc3357f0  g10
UN  10.98.255.138  14.9 GB    256     42.4%             a2bcddf1-393e-453b-9d4f-9f7111c01d7f  i02
UN  10.98.255.137  12.49 GB   256     38.4%             f851b6ee-f1e4-431b-8beb-e7b173a77342  i02
UN  10.98.255.136  14.13 GB   256     42.9%             36fe902f-3fb1-4b6d-9e2c-71e601fa0f2e  a09
UN  10.98.255.135  12.16 GB   256     40.4%             e2d020a5-97a9-48d4-870c-d10b59858763  a09
UN  10.98.255.134  18.85 GB   256     52.3%             73e3376a-5a9f-4b8a-a119-c87ae1fafdcb  h06
UN  10.98.255.140  2.24 GB    256     39.9%             3d5c33e6-35d0-40a0-b60d-2696fd5cbf72  g10

We can see that the 2nd new node (10.234.72.140) contains only 122.25 KB.
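The fix described above (adding the new node to the seeds list only after bootstrap) can be sketched in cassandra.yaml; the seed addresses below are illustrative, taken from the existing nodes in the thread:

```yaml
# cassandra.yaml on the joining node (seed addresses are illustrative).
# A node that appears in its own seed list skips bootstrap, so keep the
# new node's IP OUT of the seeds until after it has finished joining.
seed_provider:
  - class_name: org.apache.cassandra.locator.SimpleSeedProvider
    parameters:
      - seeds: "10.98.255.134,10.234.72.134"
auto_bootstrap: true
```

Once the node is up and owns its ranges, it can be added to the seed lists and the nodes restarted on a rolling basis.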
Re: Adding new nodes in a cluster with virtual nodes
Hi Aaron, Thanks for your answer. I apologize, I made a mistake in my 1st mail. The cluster was only 12 nodes instead of 16 (it is a test cluster). There are 2 datacenters, b1 and s1. Here is the result of nodetool status after adding a new node in the 1st datacenter (dc s1):

root@node007:~# nodetool status
Datacenter: b1
==============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens  Owns (effective)  Host ID                               Rack
UN  10.234.72.135  10.71 GB   256     44.6%             2fc583b2-822f-4347-9fab-5e9d10d548c9  c01
UN  10.234.72.134  16.74 GB   256     63.7%             f209a8c5-7e1b-45b5-aa80-ed679bbbdbd1  e01
UN  10.234.72.139  17.09 GB   256     62.0%             95661392-ccd8-4592-a76f-1c99f7cdf23a  e07
UN  10.234.72.138  10.96 GB   256     42.9%             0d6725f0-1357-423d-85c1-153fb94257d5  e03
UN  10.234.72.137  11.09 GB   256     45.7%             492190d7-3055-4167-8699-9c6560e28164  e03
UN  10.234.72.136  11.91 GB   256     41.1%             3872f26c-5f2d-4fb3-9f5c-08b4c7762466  c01
Datacenter: s1
==============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens  Owns (effective)  Host ID                               Rack
UN  10.98.255.139  16.94 GB   256     43.8%             3523e80c-8468-4502-b334-79eabc3357f0  g10
UN  10.98.255.138  12.62 GB   256     42.4%             a2bcddf1-393e-453b-9d4f-9f7111c01d7f  i02
UN  10.98.255.137  10.59 GB   256     38.4%             f851b6ee-f1e4-431b-8beb-e7b173a77342  i02
UN  10.98.255.136  11.89 GB   256     42.9%             36fe902f-3fb1-4b6d-9e2c-71e601fa0f2e  a09
UN  10.98.255.135  10.29 GB   256     40.4%             e2d020a5-97a9-48d4-870c-d10b59858763  a09
UN  10.98.255.134  16.19 GB   256     52.3%             73e3376a-5a9f-4b8a-a119-c87ae1fafdcb  h06
UN  10.98.255.140  127.84 KB  256     39.9%             3d5c33e6-35d0-40a0-b60d-2696fd5cbf72  g10

We can see that the new node (10.98.255.140) contains only 127.84 KB. We also saw that there was no network traffic between the nodes.
Then we added a new node in the 2nd datacenter (dc b1):

root@node007:~# nodetool status
Datacenter: b1
==============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens  Owns (effective)  Host ID                               Rack
UN  10.234.72.135  12.95 GB   256     42.0%             2fc583b2-822f-4347-9fab-5e9d10d548c9  c01
UN  10.234.72.134  20.11 GB   256     53.1%             f209a8c5-7e1b-45b5-aa80-ed679bbbdbd1  e01
UN  10.234.72.140  122.25 KB  256     41.9%             501ea498-8fed-4cc8-a23a-c99492bc4f26  e07
UN  10.234.72.139  20.46 GB   256     40.2%             95661392-ccd8-4592-a76f-1c99f7cdf23a  e07
UN  10.234.72.138  13.21 GB   256     40.9%             0d6725f0-1357-423d-85c1-153fb94257d5  e03
UN  10.234.72.137  13.34 GB   256     42.9%             492190d7-3055-4167-8699-9c6560e28164  e03
UN  10.234.72.136  14.16 GB   256     39.0%             3872f26c-5f2d-4fb3-9f5c-08b4c7762466  c01
Datacenter: s1
==============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens  Owns (effective)  Host ID                               Rack
UN  10.98.255.139  19.19 GB   256     43.8%             3523e80c-8468-4502-b334-79eabc3357f0  g10
UN  10.98.255.138  14.9 GB    256     42.4%             a2bcddf1-393e-453b-9d4f-9f7111c01d7f  i02
UN  10.98.255.137  12.49 GB   256     38.4%             f851b6ee-f1e4-431b-8beb-e7b173a77342  i02
UN  10.98.255.136  14.13 GB   256     42.9%             36fe902f-3fb1-4b6d-9e2c-71e601fa0f2e  a09
UN  10.98.255.135  12.16 GB   256     40.4%             e2d020a5-97a9-48d4-870c-d10b59858763  a09
UN  10.98.255.134  18.85 GB   256     52.3%             73e3376a-5a9f-4b8a-a119-c87ae1fafdcb  h06
UN  10.98.255.140  2.24 GB    256     39.9%             3d5c33e6-35d0-40a0-b60d-2696fd5cbf72  g10

We can see that the 2nd new node (10.234.72.140) contains only 122.25 KB. The new node in the 1st datacenter now contains 2.24 GB because we were inserting data in the cluster while adding the new nodes. Then we started a repair from the new node in the 2nd datacenter: time nodetool repair. We can see that the old nodes are sending data to the new node:

root@node007:~# nodetool netstats
Mode: NORMAL
Not sending any streams.
Streaming from: /10.98.255.137
  hbxtest: /var/opt/hosting/db/iof/cassandra/data/hbxtest/medium_column/hbxtest-medium_column-ia-3-Data.db sections=130 progress=0/15598366 - 0%
  hbxtest: /var/opt/hosting/db/iof/cassandra/data/hbxtest/medium_column/hbxtest-medium_column-ia-198-Data.db sections=107 progress=0/429517 - 0%
  hbxtest: /var/opt/hosting/db/iof/cassandra/data/hbxtest/medium_column/hbxtest-medium_column-ia-17-Data.db sections=109 progress=0/696057 - 0%
  hbxtest: /var/opt/hosting/db/iof/cassandra/data/hbxtest/medium_column/hbxtest-medium_column-ia-119-Data.db sections=57 progress=0/189844 - 0%
  hbxtest: /var/opt/hosting/db/iof/cassandra/data/hbxtest/medium_column/hbxtest-medium_column-ia-199-Data.db sections=124 progress=56492032/4597955 - 1228%
  hbxtest: /var/opt/hosting/db/iof/cassandra/data/hbxtest/medium_column/hbxtest-medium_column-ia-196-Data.db
Adding new nodes in a cluster with virtual nodes
Hello, We are using Cassandra 1.2.0. We have a cluster of 16 physical nodes; each node has 256 virtual nodes. We want to add 2 new nodes to our cluster. We follow the procedure explained here: http://www.datastax.com/docs/1.2/operations/add_replace_nodes. After starting one of the new nodes, we can see that this new node has 256 tokens == looks good. We can see that this node is in the ring (using nodetool status) == looks good. After the bootstrap is finished on the new node, no data has been moved automatically from the old nodes to this new node. However, when we send insert queries to our cluster, the new node accepts the new rows. Please, could you tell me if we need to perform a nodetool repair after the bootstrap of the new node? What happens if we perform a nodetool cleanup on the old nodes before doing the nodetool repair? (Is there a risk of losing some data?) Regards. Jean Armel
In README.txt : CQL3 instead of CQL2 ?
Hello, I have installed the 1.2 beta2 (downloaded source + compiled). The CREATE SCHEMA fails if I do as explained in README.txt:

bin/cqlsh --cql3    == --cql3 is the default in 1.2 so it is not needed
Connected to Test Cluster at localhost:9160.
[cqlsh 2.3.0 | Cassandra 1.2.0-beta2-SNAPSHOT | CQL spec 3.0.0 | Thrift protocol 19.35.0]
Use HELP for help.
cqlsh> create keyspace jaltest WITH strategy_class = 'SimpleStrategy' AND strategy_options:replication_factor='1';
Bad Request: line 1:82 mismatched input ':' expecting '='

If I give the CREATE SCHEMA in CQL3, it works :-)

cqlsh> create keyspace jaltest with replication = {'class': 'SimpleStrategy', 'replication_factor': '1'};
cqlsh> describe keyspace jaltest;
CREATE KEYSPACE jaltest WITH replication = { 'class': 'SimpleStrategy', 'replication_factor': '1' };

It looks like the syntax of the CREATE SCHEMA command in the README is CQL2, while the syntax for connecting to cqlsh is for CQL3. From my point of view, it would be friendlier to write the CREATE SCHEMA command using the CQL3 syntax rather than the CQL2 syntax in README.txt. Best regards. Jean Armel
Re: How to set LeveledCompactionStrategy for an existing table
Hello Aaron. Thanks for your answer. Jira ticket 4597 created: https://issues.apache.org/jira/browse/CASSANDRA-4597 Jean-Armel

2012/8/31 aaron morton aa...@thelastpickle.com
Looks like a bug. Can you please create a ticket on https://issues.apache.org/jira/browse/CASSANDRA and update the email thread? Can you include this: CFPropDefs.applyToCFMetadata() does not set the compaction class on CFM. Thanks - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com

On 31/08/2012, at 7:05 AM, Jean-Armel Luce jaluc...@gmail.com wrote: I tried as you said with cassandra-cli, still unsuccessfully:

[default@unknown] use test1;
Authenticated to keyspace: test1
[default@test1] UPDATE COLUMN FAMILY pns_credentials with compaction_strategy='LeveledCompactionStrategy';
8ed12919-ef2b-327f-8f57-4c2de26c9d51
Waiting for schema agreement...
... schemas agree across the cluster

And then, when I check the compaction strategy, it is still SizeTieredCompactionStrategy:

[default@test1] describe pns_credentials;
ColumnFamily: pns_credentials
  Key Validation Class: org.apache.cassandra.db.marshal.UTF8Type
  Default column value validator: org.apache.cassandra.db.marshal.UTF8Type
  Columns sorted by: org.apache.cassandra.db.marshal.UTF8Type
  GC grace seconds: 864000
  Compaction min/max thresholds: 4/32
  Read repair chance: 0.1
  DC Local Read repair chance: 0.0
  Replicate on write: true
  Caching: KEYS_ONLY
  Bloom Filter FP chance: default
  Built indexes: []
  Column Metadata:
    Column Name: isnew
      Validation Class: org.apache.cassandra.db.marshal.Int32Type
    Column Name: ts
      Validation Class: org.apache.cassandra.db.marshal.DateType
    Column Name: mergestatus
      Validation Class: org.apache.cassandra.db.marshal.Int32Type
    Column Name: infranetaccount
      Validation Class: org.apache.cassandra.db.marshal.UTF8Type
    Column Name: user_level
      Validation Class: org.apache.cassandra.db.marshal.Int32Type
    Column Name: msisdn
      Validation Class: org.apache.cassandra.db.marshal.LongType
    Column Name: mergeusertype
      Validation Class: org.apache.cassandra.db.marshal.Int32Type
  Compaction Strategy: org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy
  Compression Options:
    sstable_compression: org.apache.cassandra.io.compress.SnappyCompressor

I also tried to create a new table with LeveledCompactionStrategy (using cqlsh), and when I check the compaction strategy, SizeTieredCompactionStrategy is set for this table:

cqlsh:test1> CREATE TABLE pns_credentials3 (
         ...   ise text PRIMARY KEY,
         ...   isnew int,
         ...   ts timestamp,
         ...   mergestatus int,
         ...   infranetaccount text,
         ...   user_level int,
         ...   msisdn bigint,
         ...   mergeusertype int
         ... ) WITH
         ...   comment='' AND
         ...   read_repair_chance=0.10 AND
         ...   gc_grace_seconds=864000 AND
         ...   compaction_strategy_class='LeveledCompactionStrategy' AND
         ...   compression_parameters:sstable_compression='SnappyCompressor';
cqlsh:test1> describe table pns_credentials3;

CREATE TABLE pns_credentials3 (
  ise text PRIMARY KEY,
  isnew int,
  ts timestamp,
  mergestatus int,
  infranetaccount text,
  user_level int,
  msisdn bigint,
  mergeusertype int
) WITH
  comment='' AND
  comparator=text AND
  read_repair_chance=0.10 AND
  gc_grace_seconds=864000 AND
  default_validation=text AND
  min_compaction_threshold=4 AND
  max_compaction_threshold=32 AND
  replicate_on_write='true' AND
  compaction_strategy_class='SizeTieredCompactionStrategy' AND
  compression_parameters:sstable_compression='SnappyCompressor';

Maybe something is wrong in my server. Any idea? Thanks. Jean-Armel

2012/8/30 feedly team feedly...@gmail.com
In cassandra-cli, I did something like: update column family xyz with compaction_strategy='LeveledCompactionStrategy'

On Thu, Aug 30, 2012 at 5:20 AM, Jean-Armel Luce jaluc...@gmail.com wrote: Hello, I am using Cassandra 1.1.1 and CQL3. I have a cluster with 1 node (test environment). Could you tell me how to set the compaction strategy to Leveled Compaction for an existing table?

I have a table pns_credentials:

jal@jal-VirtualBox:~/cassandra/apache-cassandra-1.1.1/bin$ ./cqlsh -3
Connected to Test Cluster at localhost:9160.
[cqlsh 2.2.0 | Cassandra 1.1.1 | CQL spec 3.0.0 | Thrift protocol 19.32.0]
Use HELP for help.
cqlsh> use test1;
cqlsh:test1> describe table pns_credentials;

CREATE TABLE pns_credentials (
  ise text PRIMARY KEY,
  isnew int,
  ts timestamp,
  mergestatus int,
  infranetaccount text,
  user_level int,
  msisdn bigint,
  mergeusertype int
) WITH comment
How to set LeveledCompactionStrategy for an existing table
Hello, I am using Cassandra 1.1.1 and CQL3. I have a cluster with 1 node (test environment). Could you tell me how to set the compaction strategy to Leveled Compaction for an existing table?

I have a table pns_credentials:

jal@jal-VirtualBox:~/cassandra/apache-cassandra-1.1.1/bin$ ./cqlsh -3
Connected to Test Cluster at localhost:9160.
[cqlsh 2.2.0 | Cassandra 1.1.1 | CQL spec 3.0.0 | Thrift protocol 19.32.0]
Use HELP for help.
cqlsh> use test1;
cqlsh:test1> describe table pns_credentials;

CREATE TABLE pns_credentials (
  ise text PRIMARY KEY,
  isnew int,
  ts timestamp,
  mergestatus int,
  infranetaccount text,
  user_level int,
  msisdn bigint,
  mergeusertype int
) WITH
  comment='' AND
  comparator=text AND
  read_repair_chance=0.10 AND
  gc_grace_seconds=864000 AND
  default_validation=text AND
  min_compaction_threshold=4 AND
  max_compaction_threshold=32 AND
  replicate_on_write='true' AND
  compaction_strategy_class='SizeTieredCompactionStrategy' AND
  compression_parameters:sstable_compression='SnappyCompressor';

I want to set the LeveledCompaction strategy for this table, so I execute the following ALTER TABLE:

cqlsh:test1> alter table pns_credentials
         ... WITH compaction_strategy_class='LeveledCompactionStrategy'
         ... AND compaction_strategy_options:sstable_size_in_mb=10;

In the Cassandra logs, I see some information:

INFO 10:23:52,532 Enqueuing flush of Memtable-schema_columnfamilies@965212657(1391/1738 serialized/live bytes, 20 ops)
INFO 10:23:52,533 Writing Memtable-schema_columnfamilies@965212657(1391/1738 serialized/live bytes, 20 ops)
INFO 10:23:52,629 Completed flushing /var/lib/cassandra/data/system/schema_columnfamilies/system-schema_columnfamilies-hd-94-Data.db (1442 bytes) for commitlog position ReplayPosition(segmentId=3556583843054, position=1987)

However, when I look at the description of the table, the table is still using SizeTieredCompactionStrategy:

cqlsh:test1> describe table pns_credentials;

CREATE TABLE pns_credentials (
  ise text PRIMARY KEY,
  isnew int,
  ts timestamp,
  mergestatus int,
  infranetaccount text,
  user_level int,
  msisdn bigint,
  mergeusertype int
) WITH
  comment='' AND
  comparator=text AND
  read_repair_chance=0.10 AND
  gc_grace_seconds=864000 AND
  default_validation=text AND
  min_compaction_threshold=4 AND
  max_compaction_threshold=32 AND
  replicate_on_write='true' AND
  compaction_strategy_class='SizeTieredCompactionStrategy' AND
  compression_parameters:sstable_compression='SnappyCompressor';

In the schema_columnfamilies table (in the system keyspace), the table pns_credentials is still using SizeTieredCompactionStrategy:

cqlsh:test1> use system;
cqlsh:system> select * from schema_columnfamilies;
...
 test1 | pns_credentials | null | KEYS_ONLY | [] |  | org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy | {} | org.apache.cassandra.db.marshal.UTF8Type | {sstable_compression:org.apache.cassandra.io.compress.SnappyCompressor} | org.apache.cassandra.db.marshal.UTF8Type | 864000 | 1029 | ise | org.apache.cassandra.db.marshal.UTF8Type | 0 | 32 | 4 | 0.1 | True | null | Standard | null
...

I stopped/started the Cassandra node, but the table still uses SizeTieredCompactionStrategy. I tried using cassandra-cli, but the alter is still unsuccessful. Is there anything I am missing? Thanks. Jean-Armel
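For reference, later Cassandra versions (1.2 and onward) replaced the compaction_strategy_class / compaction_strategy_options:... syntax used above with a single map-valued compaction property. A hedged sketch, assuming a 1.2+ node rather than the 1.1.1 one used in this thread:

```sql
-- CQL3 map syntax for compaction settings (Cassandra 1.2+),
-- equivalent in intent to the 1.1.1 ALTER TABLE attempted above.
ALTER TABLE pns_credentials
  WITH compaction = {'class': 'LeveledCompactionStrategy',
                     'sstable_size_in_mb': 10};
```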
Re: How to set LeveledCompactionStrategy for an existing table
I tried as you said with cassandra-cli, still unsuccessfully:

[default@unknown] use test1;
Authenticated to keyspace: test1
[default@test1] UPDATE COLUMN FAMILY pns_credentials with compaction_strategy='LeveledCompactionStrategy';
8ed12919-ef2b-327f-8f57-4c2de26c9d51
Waiting for schema agreement...
... schemas agree across the cluster

And then, when I check the compaction strategy, it is still SizeTieredCompactionStrategy:

[default@test1] describe pns_credentials;
ColumnFamily: pns_credentials
  Key Validation Class: org.apache.cassandra.db.marshal.UTF8Type
  Default column value validator: org.apache.cassandra.db.marshal.UTF8Type
  Columns sorted by: org.apache.cassandra.db.marshal.UTF8Type
  GC grace seconds: 864000
  Compaction min/max thresholds: 4/32
  Read repair chance: 0.1
  DC Local Read repair chance: 0.0
  Replicate on write: true
  Caching: KEYS_ONLY
  Bloom Filter FP chance: default
  Built indexes: []
  Column Metadata:
    Column Name: isnew
      Validation Class: org.apache.cassandra.db.marshal.Int32Type
    Column Name: ts
      Validation Class: org.apache.cassandra.db.marshal.DateType
    Column Name: mergestatus
      Validation Class: org.apache.cassandra.db.marshal.Int32Type
    Column Name: infranetaccount
      Validation Class: org.apache.cassandra.db.marshal.UTF8Type
    Column Name: user_level
      Validation Class: org.apache.cassandra.db.marshal.Int32Type
    Column Name: msisdn
      Validation Class: org.apache.cassandra.db.marshal.LongType
    Column Name: mergeusertype
      Validation Class: org.apache.cassandra.db.marshal.Int32Type
  Compaction Strategy: org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy
  Compression Options:
    sstable_compression: org.apache.cassandra.io.compress.SnappyCompressor

I also tried to create a new table with LeveledCompactionStrategy (using cqlsh), and when I check the compaction strategy, SizeTieredCompactionStrategy is set for this table:

cqlsh:test1> CREATE TABLE pns_credentials3 (
         ...   ise text PRIMARY KEY,
         ...   isnew int,
         ...   ts timestamp,
         ...   mergestatus int,
         ...   infranetaccount text,
         ...   user_level int,
         ...   msisdn bigint,
         ...   mergeusertype int
         ... ) WITH
         ...   comment='' AND
         ...   read_repair_chance=0.10 AND
         ...   gc_grace_seconds=864000 AND
         ...   compaction_strategy_class='LeveledCompactionStrategy' AND
         ...   compression_parameters:sstable_compression='SnappyCompressor';
cqlsh:test1> describe table pns_credentials3;

CREATE TABLE pns_credentials3 (
  ise text PRIMARY KEY,
  isnew int,
  ts timestamp,
  mergestatus int,
  infranetaccount text,
  user_level int,
  msisdn bigint,
  mergeusertype int
) WITH
  comment='' AND
  comparator=text AND
  read_repair_chance=0.10 AND
  gc_grace_seconds=864000 AND
  default_validation=text AND
  min_compaction_threshold=4 AND
  max_compaction_threshold=32 AND
  replicate_on_write='true' AND
  compaction_strategy_class='SizeTieredCompactionStrategy' AND
  compression_parameters:sstable_compression='SnappyCompressor';

Maybe something is wrong in my server. Any idea? Thanks. Jean-Armel

2012/8/30 feedly team feedly...@gmail.com
In cassandra-cli, I did something like: update column family xyz with compaction_strategy='LeveledCompactionStrategy'

On Thu, Aug 30, 2012 at 5:20 AM, Jean-Armel Luce jaluc...@gmail.com wrote: Hello, I am using Cassandra 1.1.1 and CQL3. I have a cluster with 1 node (test environment). Could you tell me how to set the compaction strategy to Leveled Compaction for an existing table?

I have a table pns_credentials:

jal@jal-VirtualBox:~/cassandra/apache-cassandra-1.1.1/bin$ ./cqlsh -3
Connected to Test Cluster at localhost:9160.
[cqlsh 2.2.0 | Cassandra 1.1.1 | CQL spec 3.0.0 | Thrift protocol 19.32.0]
Use HELP for help.
cqlsh> use test1;
cqlsh:test1> describe table pns_credentials;

CREATE TABLE pns_credentials (
  ise text PRIMARY KEY,
  isnew int,
  ts timestamp,
  mergestatus int,
  infranetaccount text,
  user_level int,
  msisdn bigint,
  mergeusertype int
) WITH
  comment='' AND
  comparator=text AND
  read_repair_chance=0.10 AND
  gc_grace_seconds=864000 AND
  default_validation=text AND
  min_compaction_threshold=4 AND
  max_compaction_threshold=32 AND
  replicate_on_write='true' AND
  compaction_strategy_class='SizeTieredCompactionStrategy' AND
  compression_parameters:sstable_compression='SnappyCompressor';

I want to set the LeveledCompaction strategy for this table, so I execute the following ALTER TABLE:

cqlsh:test1> alter table pns_credentials
         ... WITH compaction_strategy_class='LeveledCompactionStrategy'
         ... AND compaction_strategy_options:sstable_size_in_mb=10;

In the Cassandra logs, I see some information
Re: Secondary index and/or row key in the read path ?
Hi Aaron, Thank you for your answer. So, I shall do post-processing for selecting a row using a row key *and* applying a column-level filter. Best Regards, Jean-Armel

2012/8/21 aaron morton aa...@thelastpickle.com
- do we need to post-process (filter) the result of the query in our application? That's the one :) Right now the code paths don't exist to select a row using a row key *and* apply a column-level filter. The RPC API does not work that way and I'm not sure if this is something that is planned for CQL. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com

On 20/08/2012, at 6:33 PM, Jean-Armel Luce jaluc...@gmail.com wrote: Hello, I am using Cassandra 1.1.1 and CQL3. Could you tell me what is the best strategy for retrieving a row using a condition on a row key (operator =) and also a filter on a 2nd column? For example, I create a table named testwhere with a row key on column mykey and 2 other columns, col1 and col2. I would like to retrieve the row with the key 'key1' only if col1 = 'abcd'. I send the request:

SELECT mykey, col1 from testwhere where mykey = 'key1' and col1 = 'abcd';

As you can see, the 1st condition in the WHERE clause is based on the row key. However, the request doesn't work if no secondary index is created on the column used in the 2nd condition of the WHERE clause. It works only if a secondary index is created on this 2nd column (see below). Does that mean that the secondary index is used in the read path instead of the row key, even if there is a condition on the row key in the WHERE clause? Here is an example:

jal@jal-VirtualBox:~/cassandra/apache-cassandra-1.1.1/bin$ ./cqlsh -3
Connected to Test Cluster at localhost:9160.
[cqlsh 2.2.0 | Cassandra 1.1.1 | CQL spec 3.0.0 | Thrift protocol 19.32.0]
Use HELP for help.
cqlsh> use test1;
cqlsh:test1> CREATE TABLE testwhere (mykey varchar PRIMARY KEY,
         ... col1 varchar,
         ... col2 varchar);
cqlsh:test1> INSERT INTO testwhere (mykey, col1, col2) VALUES ('key1', 'abcd', 'efgh');
cqlsh:test1> SELECT mykey, col1 from testwhere where mykey = 'key1';
 mykey | col1
-------+------
  key1 | abcd
cqlsh:test1> SELECT mykey, col1 from testwhere where mykey = 'key1' and col1 = 'abcd';
Bad Request: No indexed columns present in by-columns clause with Equal operator
cqlsh:test1> CREATE INDEX col1_idx ON testwhere (col1);
cqlsh:test1> SELECT mykey, col1 from testwhere where mykey = 'key1' and col1 = 'abcd';
 mykey | col1
-------+------
  key1 | abcd

My understanding is: the 1st SELECT works because there is only the row key in the WHERE clause. The 2nd SELECT does not work because the row key is in the WHERE clause, but there is no index on col1. The 3rd SELECT (which is the same as the 2nd SELECT) works because the row key is in the WHERE clause, and a secondary index is created on col1. For this use case, what are the recommendations of the Cassandra community?
- do we need to create a secondary index for each column we want to filter?
- do we need to post-process (filter) the result of the query in our application?
- or is there another solution?
Thanks. Jean-Armel
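The post-processing Aaron recommends above can be sketched client-side: fetch by row key, then apply the column-level predicate in the application. A minimal plain-Python sketch, where the row dicts are hypothetical stand-ins for driver results:

```python
# Client-side post-filtering: select by row key, then filter columns
# in the application, since Cassandra 1.1 cannot combine a row-key
# lookup with a column-level filter in one query.
def filter_rows(rows, column, expected):
    """Keep only rows whose `column` equals `expected`."""
    return [row for row in rows if row.get(column) == expected]

# Hypothetical result set for: SELECT ... WHERE mykey = 'key1'
rows = [{"mykey": "key1", "col1": "abcd", "col2": "efgh"}]

# Apply the col1 = 'abcd' predicate client-side.
print(filter_rows(rows, "col1", "abcd"))
# → [{'mykey': 'key1', 'col1': 'abcd', 'col2': 'efgh'}]
```

The trade-off against a secondary index is that the client pays the filtering cost, but no extra index has to be maintained on write.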
Secondary index and/or row key in the read path ?
Hello, I am using Cassandra 1.1.1 and CQL3. Could you tell me what is the best strategy for retrieving a row using a condition on a row key (operator =) and also a filter on a 2nd column? For example, I create a table named testwhere with a row key on column mykey and 2 other columns, col1 and col2. I would like to retrieve the row with the key 'key1' only if col1 = 'abcd'. I send the request:

SELECT mykey, col1 from testwhere where mykey = 'key1' and col1 = 'abcd';

As you can see, the 1st condition in the WHERE clause is based on the row key. However, the request doesn't work if no secondary index is created on the column used in the 2nd condition of the WHERE clause. It works only if a secondary index is created on this 2nd column (see below). Does that mean that the secondary index is used in the read path instead of the row key, even if there is a condition on the row key in the WHERE clause? Here is an example:

jal@jal-VirtualBox:~/cassandra/apache-cassandra-1.1.1/bin$ ./cqlsh -3
Connected to Test Cluster at localhost:9160.
[cqlsh 2.2.0 | Cassandra 1.1.1 | CQL spec 3.0.0 | Thrift protocol 19.32.0]
Use HELP for help.
cqlsh> use test1;
cqlsh:test1> CREATE TABLE testwhere (mykey varchar PRIMARY KEY,
         ... col1 varchar,
         ... col2 varchar);
cqlsh:test1> INSERT INTO testwhere (mykey, col1, col2) VALUES ('key1', 'abcd', 'efgh');
cqlsh:test1> SELECT mykey, col1 from testwhere where mykey = 'key1';
 mykey | col1
-------+------
  key1 | abcd
cqlsh:test1> SELECT mykey, col1 from testwhere where mykey = 'key1' and col1 = 'abcd';
Bad Request: No indexed columns present in by-columns clause with Equal operator
cqlsh:test1> CREATE INDEX col1_idx ON testwhere (col1);
cqlsh:test1> SELECT mykey, col1 from testwhere where mykey = 'key1' and col1 = 'abcd';
 mykey | col1
-------+------
  key1 | abcd

My understanding is: the 1st SELECT works because there is only the row key in the WHERE clause. The 2nd SELECT does not work because the row key is in the WHERE clause, but there is no index on col1. The 3rd SELECT (which is the same as the 2nd SELECT) works because the row key is in the WHERE clause, and a secondary index is created on col1. For this use case, what are the recommendations of the Cassandra community?
- do we need to create a secondary index for each column we want to filter?
- do we need to post-process (filter) the result of the query in our application?
- or is there another solution?
Thanks. Jean-Armel
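For completeness: later Cassandra versions (1.2 and onward) added an ALLOW FILTERING clause that lets the server evaluate the non-indexed predicate itself, at a potential performance cost. This did not exist in the 1.1.1 release used above; a hedged sketch assuming a 1.2+ node:

```sql
-- Requires Cassandra 1.2+; not available in the 1.1.1 used above.
-- The server scans the selected row(s) and applies the col1 filter.
SELECT mykey, col1 FROM testwhere
  WHERE mykey = 'key1' AND col1 = 'abcd'
  ALLOW FILTERING;
```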