Re: Announcing Mutagen
On 5/16/13 10:22 PM, Todd Fast wrote:

Mutagen Cassandra is a framework providing schema versioning and mutation for Apache Cassandra. It is similar to Flyway for SQL databases. https://github.com/toddfast/mutagen-cassandra

Mutagen is a lightweight framework for applying versioned changes (known as mutations) to a resource, in this case a Cassandra schema. Mutagen takes into account the resource's existing state and only applies changes that haven't yet been applied.

Hi Todd,

Looking at your code, I see you have the ColumnPrefixDistributedRowLock commented out. Could it be that the mutation is taking longer than a second to run? Are the failures only happening when testing simultaneous updates? Maybe the locks aren't being cleaned up?

Funny timing: I'm working on porting Scala Migrations [1] to Cassandra and have a working implementation. It's not as fancy as Scala Migrations (it doesn't scan a package for migration subclasses and it currently doesn't do rollbacks), but it gets the basics done. I'm hoping to release the code in the near future.

Differences from Mutagen:
1) Mutations are written only in Scala.
2) Since it's a new project, it uses a Java Driver session instead of an Astyanax connection, since I only intend to use CQL3 tables.

Blair

[1] http://code.google.com/p/scala-migrations/
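Both Mutagen and the Scala Migrations port share the same core idea: record the highest schema version already applied, and run only the mutations above it, in order. A minimal sketch of that logic (the names here are hypothetical, not Mutagen's actual API):

```python
# Minimal sketch (not Mutagen's actual API) of version-gated schema
# mutations: each mutation carries a version number, and only those
# above the recorded schema version are applied, in ascending order.

def apply_pending(mutations, current_version, execute):
    """mutations: dict {version: cql_string}; execute: callback that runs CQL.

    Returns the list of versions applied; the new schema version is the
    last element, if any.
    """
    applied = []
    for version in sorted(mutations):
        if version > current_version:
            execute(mutations[version])
            applied.append(version)
    return applied
```

In a real framework the current version would be read from, and written back to, a versioning table in the schema itself, under a distributed lock such as the ColumnPrefixDistributedRowLock discussed above.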
Re: how to access data only on specific node
Oh, I finally understand. As I read records one by one, they aren't necessarily read from a single node, so if I got 965 records out of 1000, some of them could have been read from other nodes which have all 1000 records.

And about range scans - as far as I understand, a range scan can be done only with the Order Preserving Partitioner, but not with the Random Partitioner... It would be cool to have a consistency level of LOCAL to examine the content of the local node for test purposes.

2013/5/17 aaron morton aa...@thelastpickle.com

Are you using a multi get or a range slice? Read Repair does not run for range slice queries.

Cheers - Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com

On 15/05/2013, at 6:51 PM, Sergey Naumov sknau...@gmail.com wrote:

I see that RR works, but sometimes the number of records read degrades.

RR is enabled on a random 10% of requests, see the read_repair_chance setting for the CF.

OK, but I forgot to mention the main thing - each node in my config is a standalone datacenter and the distribution is DC1:1, DC2:1, DC3:1. So when I try to read 1000 records with consistency ONE multiple times while connected to a node that has just been turned on, I get the following counts of records read (approximately): 120 220 310 390 950 960 965 !! 955 !! 970 ... If all other nodes contain 1000 records and read repair has already delivered 965 records to the local DC (and so to the local node), why do I sometimes see the total number of records read degrade?

2013/5/15 aaron morton aa...@thelastpickle.com

I see that RR works, but sometimes the number of records read degrades.

RR is enabled on a random 10% of requests, see the read_repair_chance setting for the CF.

If so, then the question is: how do I perform local reads to examine the content of a specific node?
You can check which nodes are replicas for a key using nodetool getendpoints. If you want to read all the rows held by a particular node, you need to use a range scan and limit it by the token ranges assigned to the node.

Cheers - Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com

On 14/05/2013, at 10:29 PM, Sergey Naumov sknau...@gmail.com wrote:

Hello. I'm playing with a demo cassandra cluster and decided to test read repair + hinted handoff. One node of the cluster was put down deliberately, and on the other nodes I inserted some records (say 1000). HH is off on all nodes. Then I turned the node on, connected to it with cql (locally, so to localhost) and performed 1000 reads by row key (with consistency ONE). I see that RR works, but sometimes the number of records read degrades. Is it because consistency ONE and local reads are not the same thing? If so, then the question is: how do I perform local reads to examine the content of a specific node? Thanks in advance, Sergey Naumov.
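Limiting a range scan by a node's token ranges, as suggested above, amounts to issuing one SELECT per range the node owns. A sketch of building those statements (the table name and token values are made up; real ranges come from nodetool ring, and the token() comparison syntax assumed here is the CQL3 form available in Cassandra 1.2):

```python
# Sketch: build CQL SELECTs restricted to a node's token ranges.
# A range (start, end] that wraps around the ring (start > end) must be
# split into two non-wrapping queries.

RING_MIN, RING_MAX = 0, 2**127 - 1  # RandomPartitioner token space

def range_queries(cf, ranges):
    """ranges: list of (start, end] token pairs owned by the node."""
    queries = []
    for start, end in ranges:
        if start < end:
            parts = [(start, end)]
        else:  # wrapping range: split at the top of the ring
            parts = [(start, RING_MAX), (RING_MIN - 1, end)]
        for lo, hi in parts:
            queries.append(
                "SELECT * FROM %s WHERE token(key) > %d AND token(key) <= %d"
                % (cf, lo, hi))
    return queries
```

Running each of these against the node (with consistency ONE, connected locally) approximates the "read only what this node holds" inspection the thread is after, though replicas of other ranges stored on the node are still reachable.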
Re: How to add new DC to cluster when GossipingPropertyFileSnitch is used
If I understand you correctly, GossipingPropertyFileSnitch is useful for manipulations with nodes within a single DC, but to add a new DC without having to restart every node in all DCs (because seeds are specified in cassandra.yaml and I need to restart a node after adding a seed from the newly created DC), I anyway have to use cassandra-topology.properties and edit it on every node of the cluster.

By the way, is it necessary to specify seeds if I use PropertyFileSnitch and there is info in cassandra-topology.properties about all nodes of the cluster?

2013/5/17 aaron morton aa...@thelastpickle.com

You should configure the seeds as recommended regardless of the snitch used. You need to update the yaml file to start using the GossipingPropertyFileSnitch, but after that it reads the cassandra-rackdc.properties file to get information about the node. It uses the information in gossip to get information about the other nodes in the cluster. If there is no info in gossip about a remote node, because say it has not been upgraded, it will fall back to using cassandra-topology.properties. Hope that helps.

- Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com

On 15/05/2013, at 8:10 PM, Sergey Naumov sknau...@gmail.com wrote:

As far as I understand, GossipingPropertyFileSnitch is supposed to provide more flexibility in node addition/removal. But what about the addition of a DC? In the datastax documentation (http://www.datastax.com/docs/1.2/operations/add_replace_nodes#add-dc) it is said that cassandra-topology.properties can be updated without a restart for PropertyFileSnitch. But here (http://www.datastax.com/docs/1.0/initialize/cluster_init_multi_dc) it is said that you MUST include at least one node from EACH data center, that it is a best practice to have more than one seed node per data center, and that the seed list should be the same for each node.
At first glance it seems that PropertyFileSnitch will get the necessary info from cassandra-topology.properties, but for GossipingPropertyFileSnitch a modification of cassandra.yaml and a restart of all nodes in all DCs will be required. Could somebody clarify this topic? Thanks in advance, Sergey Naumov.
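For reference, the per-node file that GossipingPropertyFileSnitch reads is tiny: each node declares only its own location, and gossip propagates it to the rest of the cluster. A sketch with example values (the DC/rack names are placeholders):

```properties
# cassandra-rackdc.properties on each node (values are examples):
# the node announces only its own location; gossip spreads it to the
# cluster, so no per-node edit of a full topology file is needed.
dc=DC2
rack=RAC1
```

This is the key operational difference from PropertyFileSnitch, where every node's cassandra-topology.properties must list every node in every DC.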
Re: How to add new DC to cluster when GossipingPropertyFileSnitch is used
I see no reason to restart all nodes. You can continue to use a seed from the first DC - the seed is used for loading the ring configuration (locations, token ranges, etc.), not data.

On 05/17/2013 10:34 AM, Sergey Naumov wrote:

If I understand you correctly, GossipingPropertyFileSnitch is useful for manipulations with nodes within a single DC, but to add a new DC without having to restart every node in all DCs (because seeds are specified in cassandra.yaml and I need to restart a node after adding a seed from the newly created DC), I anyway have to use cassandra-topology.properties and edit it on every node of the cluster. By the way, is it necessary to specify seeds if I use PropertyFileSnitch and there is info in cassandra-topology.properties about all nodes of the cluster?

Yes, it is. Cassandra needs seed(s), because the topology properties have no info about token ranges.

2013/5/17 aaron morton aa...@thelastpickle.com

You should configure the seeds as recommended regardless of the snitch used. You need to update the yaml file to start using the GossipingPropertyFileSnitch, but after that it reads the cassandra-rackdc.properties file to get information about the node. It uses the information in gossip to get information about the other nodes in the cluster. If there is no info in gossip about a remote node, because say it has not been upgraded, it will fall back to using cassandra-topology.properties. Hope that helps.

- Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com

On 15/05/2013, at 8:10 PM, Sergey Naumov sknau...@gmail.com wrote:

As far as I understand, GossipingPropertyFileSnitch is supposed to provide more flexibility in node addition/removal. But what about the addition of a DC?
In the datastax documentation (http://www.datastax.com/docs/1.2/operations/add_replace_nodes#add-dc) it is said that cassandra-topology.properties can be updated without a restart for PropertyFileSnitch. But here (http://www.datastax.com/docs/1.0/initialize/cluster_init_multi_dc) it is said that you MUST include at least one node from EACH data center, that it is a best practice to have more than one seed node per data center, and that the seed list should be the same for each node. At first glance it seems that PropertyFileSnitch will get the necessary info from cassandra-topology.properties, but for GossipingPropertyFileSnitch a modification of cassandra.yaml and a restart of all nodes in all DCs will be required. Could somebody clarify this topic? Thanks in advance, Sergey Naumov.
Re: How to add new DC to cluster when GossipingPropertyFileSnitch is used
But I've read in some sources (for example http://www.datastax.com/docs/1.0/initialize/cluster_init_multi_dc) that the seed list MUST include at least one seed from each DC and that seed lists should be the same on every node. Or is it fine if nodes in the new DC have all seeds specified and nodes in the old DCs have all seeds specified except those from the new DC? In that interpretation the rules have to be modified a bit:

1. Nodes in the same DC should have identical seed lists.
2. Nodes in at least one DC MUST have the seeds from all other DCs in their seed lists.

2013/5/17 Igor i...@4friends.od.ua

I see no reason to restart all nodes. You can continue to use a seed from the first DC - the seed is used for loading the ring configuration (locations, token ranges, etc.), not data.

On 05/17/2013 10:34 AM, Sergey Naumov wrote:

If I understand you correctly, GossipingPropertyFileSnitch is useful for manipulations with nodes within a single DC, but to add a new DC without having to restart every node in all DCs (because seeds are specified in cassandra.yaml and I need to restart a node after adding a seed from the newly created DC), I anyway have to use cassandra-topology.properties and edit it on every node of the cluster. By the way, is it necessary to specify seeds if I use PropertyFileSnitch and there is info in cassandra-topology.properties about all nodes of the cluster?

Yes, it is. Cassandra needs seed(s), because the topology properties have no info about token ranges.

2013/5/17 aaron morton aa...@thelastpickle.com

You should configure the seeds as recommended regardless of the snitch used. You need to update the yaml file to start using the GossipingPropertyFileSnitch, but after that it reads the cassandra-rackdc.properties file to get information about the node. It uses the information in gossip to get information about the other nodes in the cluster. If there is no info in gossip about a remote node, because say it has not been upgraded, it will fall back to using cassandra-topology.properties.
Hope that helps. - Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com

On 15/05/2013, at 8:10 PM, Sergey Naumov sknau...@gmail.com wrote:

As far as I understand, GossipingPropertyFileSnitch is supposed to provide more flexibility in node addition/removal. But what about the addition of a DC? In the datastax documentation (http://www.datastax.com/docs/1.2/operations/add_replace_nodes#add-dc) it is said that cassandra-topology.properties can be updated without a restart for PropertyFileSnitch. But here (http://www.datastax.com/docs/1.0/initialize/cluster_init_multi_dc) it is said that you MUST include at least one node from EACH data center, that it is a best practice to have more than one seed node per data center, and that the seed list should be the same for each node. At first glance it seems that PropertyFileSnitch will get the necessary info from cassandra-topology.properties, but for GossipingPropertyFileSnitch a modification of cassandra.yaml and a restart of all nodes in all DCs will be required. Could somebody clarify this topic? Thanks in advance, Sergey Naumov.
Re: How to add new DC to cluster when GossipingPropertyFileSnitch is used
On 05/17/2013 11:19 AM, Sergey Naumov wrote:

But I've read in some sources (for example http://www.datastax.com/docs/1.0/initialize/cluster_init_multi_dc) that the seed list MUST include at least one seed from each DC and that seed lists should be the same on every node. Or is it fine if nodes in the new DC have all seeds specified and nodes in the old DCs have all seeds specified except those from the new DC? In that interpretation the rules have to be modified a bit:

I have never had problems adding new nodes or a new DC while having a single seed per cluster, located in one of the old DCs.

1. Nodes in the same DC should have identical seed lists.
2. Nodes in at least one DC MUST have the seeds from all other DCs in their seed lists.

2013/5/17 Igor i...@4friends.od.ua

I see no reason to restart all nodes. You can continue to use a seed from the first DC - the seed is used for loading the ring configuration (locations, token ranges, etc.), not data.

On 05/17/2013 10:34 AM, Sergey Naumov wrote:

If I understand you correctly, GossipingPropertyFileSnitch is useful for manipulations with nodes within a single DC, but to add a new DC without having to restart every node in all DCs (because seeds are specified in cassandra.yaml and I need to restart a node after adding a seed from the newly created DC), I anyway have to use cassandra-topology.properties and edit it on every node of the cluster. By the way, is it necessary to specify seeds if I use PropertyFileSnitch and there is info in cassandra-topology.properties about all nodes of the cluster?

Yes, it is. Cassandra needs seed(s), because the topology properties have no info about token ranges.

2013/5/17 aaron morton aa...@thelastpickle.com

You should configure the seeds as recommended regardless of the snitch used. You need to update the yaml file to start using the GossipingPropertyFileSnitch, but after that it reads the cassandra-rackdc.properties file to get information about the node.
It uses the information in gossip to get information about the other nodes in the cluster. If there is no info in gossip about a remote node, because say it has not been upgraded, it will fall back to using cassandra-topology.properties. Hope that helps.

- Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com

On 15/05/2013, at 8:10 PM, Sergey Naumov sknau...@gmail.com wrote:

As far as I understand, GossipingPropertyFileSnitch is supposed to provide more flexibility in node addition/removal. But what about the addition of a DC? In the datastax documentation (http://www.datastax.com/docs/1.2/operations/add_replace_nodes#add-dc) it is said that cassandra-topology.properties can be updated without a restart for PropertyFileSnitch. But here (http://www.datastax.com/docs/1.0/initialize/cluster_init_multi_dc) it is said that you MUST include at least one node from EACH data center, that it is a best practice to have more than one seed node per data center, and that the seed list should be the same for each node. At first glance it seems that PropertyFileSnitch will get the necessary info from cassandra-topology.properties, but for GossipingPropertyFileSnitch a modification of cassandra.yaml and a restart of all nodes in all DCs will be required. Could somebody clarify this topic? Thanks in advance, Sergey Naumov.
update does not apply to any replica if consistency = ALL and one replica is down
As described here (http://maxgrinev.com/2010/07/12/update-idempotency-why-it-is-important-in-cassandra-applications-2/), if the consistency level couldn't be met, updates are applied anyway on the functional replicas, and they can be propagated later to the other replicas using repair mechanisms or by issuing the same request later, as update operations are idempotent in Cassandra.

But... on my configuration (Cassandra 1.2.4, python CQL 1.0.4, DC1 - 3 nodes, DC2 - 3 nodes, DC3 - 1 node, RF={DC1:3, DC2:2, DC3:1}, Random Partitioner, GossipingPropertyFileSnitch, one node in DC1 deliberately down - and, as the RF for DC1 is 3, this down node is a replica for 100% of records), when I try to insert one record with a consistency level of ALL, the insert does not appear on any replica (-s30 is a series of UUID1: 001e--1000--x (30 is 1e in hex); -n1 means that we will insert/update a single record with the first id from this series - 001e--1000--):

write with consistency ALL:
cassandra@host11:~/Cassandra$ ./insert.sh -s30 -n1 -cALL
Traceback (most recent call last):
  File ./aux/fastinsert.py, line 54, in insert
    curs.execute(cmd, consistency_level=p.conlvl)
OperationalError: Unable to complete request: one or more nodes were unavailable.
Last record UUID is 001e--1000--

... about 10 seconds passed ...

read with consistency ONE:
cassandra@host11:~/Cassandra$ ./select.sh -s30 -n1 -cONE
Total records read: 0
Last record UUID is 001e--1000--

read with consistency QUORUM:
cassandra@host11:~/Cassandra$ ./select.sh -s30 -n1 -cQUORUM
Total records read: 0
Last record UUID is 001e--1000--

write with consistency QUORUM:
cassandra@host11:~/Cassandra$ ./insert.sh -s30 -n1 -cQUORUM
Last record UUID is 001e--1000--

read with consistency QUORUM:
cassandra@host11:~/Cassandra$ ./select.sh -s30 -n1 -cQUORUM
Total records read: 1
Last record UUID is 001e--1000--

Is it a new feature of Cassandra that it does not perform the write on any replica if the consistency level couldn't be satisfied?
If so, then is it true for all cases, for example when returned error is OperationalError: Request did not complete within rpc_timeout? Thanks in advance, Sergey Naumov.
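The behavior in the transcript is consistent with the coordinator checking replica availability up front: if too few replicas are alive to ever satisfy the requested consistency level, it rejects the request (the "one or more nodes were unavailable" error) without sending the write to any replica. A simplified sketch of that decision, using the thread's numbers (this is illustrative logic, not Cassandra's actual code; QUORUM here is computed over the total RF across DCs, as in Cassandra 1.2):

```python
# Sketch (simplified, not Cassandra's actual implementation) of the
# coordinator's up-front availability check: if the live replica count
# can never meet the consistency level, the write is rejected before
# any replica is touched.

def required_replicas(cl, total_rf):
    if cl == "ONE":
        return 1
    if cl == "QUORUM":
        return total_rf // 2 + 1
    if cl == "ALL":
        return total_rf
    raise ValueError("unknown consistency level: %s" % cl)

def write_attempted(cl, total_rf, live_replicas):
    """True if the coordinator would attempt the write at all."""
    return live_replicas >= required_replicas(cl, total_rf)

# Thread's setup: RF = {DC1:3, DC2:2, DC3:1} -> 6 replicas, one down.
```

With 5 of 6 replicas live, ALL (needs 6) is rejected outright while QUORUM (needs 4) proceeds, matching the transcript. An rpc_timeout, by contrast, happens after the write has been attempted, so replicas may have applied it.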
Re: Announcing Mutagen
Hi Blair--

Thanks for digging into the code. I did indeed experiment with longer timeouts, and the result was that trying to obtain the lock hung for whatever amount of time I set the timeout to. I am not an expert on Astyanax and haven't debugged my use of that recipe yet; I don't even know if I've configured it correctly. Perhaps you have some guidance?

(Funny you mention your own migration framework--Mutagen is the second one I've done for Cassandra. The first one, a plugin for Mokol, also had schema rollbacks and some other features, but was only command-line.)

On Thu, May 16, 2013 at 11:06 PM, Blair Zajac bl...@orcaware.com wrote:

On 5/16/13 10:22 PM, Todd Fast wrote:

Mutagen Cassandra is a framework providing schema versioning and mutation for Apache Cassandra. It is similar to Flyway for SQL databases. https://github.com/toddfast/mutagen-cassandra

Mutagen is a lightweight framework for applying versioned changes (known as mutations) to a resource, in this case a Cassandra schema. Mutagen takes into account the resource's existing state and only applies changes that haven't yet been applied.

Hi Todd,

Looking at your code, I see you have the ColumnPrefixDistributedRowLock commented out. Could it be that the mutation is taking longer than a second to run? Are the failures only happening when testing simultaneous updates? Maybe the locks aren't being cleaned up?

Funny timing: I'm working on porting Scala Migrations [1] to Cassandra and have a working implementation. It's not as fancy as Scala Migrations (it doesn't scan a package for migration subclasses and it currently doesn't do rollbacks), but it gets the basics done. Hoping to release the code in the near future.

Differences from Mutagen:
1) Mutations are written only in Scala.
2) Since it's a new project, it uses a Java Driver session instead of an Astyanax connection, since I only intend to use CQL3 tables.
Blair

[1] http://code.google.com/p/scala-migrations/
Re: Announcing Mutagen
Now that comparators can be changed, I am internally wondering if every column, row key, and value in C* should be a dynamic composite, and then everything can evolve.

On Fri, May 17, 2013 at 5:35 AM, Todd Fast t...@toddfast.com wrote:

Hi Blair--

Thanks for digging into the code. I did indeed experiment with longer timeouts, and the result was that trying to obtain the lock hung for whatever amount of time I set the timeout to. I am not an expert on Astyanax and haven't debugged my use of that recipe yet; I don't even know if I've configured it correctly. Perhaps you have some guidance?

(Funny you mention your own migration framework--Mutagen is the second one I've done for Cassandra. The first one, a plugin for Mokol, also had schema rollbacks and some other features, but was only command-line.)

On Thu, May 16, 2013 at 11:06 PM, Blair Zajac bl...@orcaware.com wrote:

On 5/16/13 10:22 PM, Todd Fast wrote:

Mutagen Cassandra is a framework providing schema versioning and mutation for Apache Cassandra. It is similar to Flyway for SQL databases. https://github.com/toddfast/mutagen-cassandra

Mutagen is a lightweight framework for applying versioned changes (known as mutations) to a resource, in this case a Cassandra schema. Mutagen takes into account the resource's existing state and only applies changes that haven't yet been applied.

Hi Todd,

Looking at your code, I see you have the ColumnPrefixDistributedRowLock commented out. Could it be that the mutation is taking longer than a second to run? Are the failures only happening when testing simultaneous updates? Maybe the locks aren't being cleaned up?

Funny timing: I'm working on porting Scala Migrations [1] to Cassandra and have a working implementation. It's not as fancy as Scala Migrations (it doesn't scan a package for migration subclasses and it currently doesn't do rollbacks), but it gets the basics done. Hoping to release the code in the near future.
Differences from Mutagen:
1) Mutations are written only in Scala.
2) Since it's a new project, it uses a Java Driver session instead of an Astyanax connection, since I only intend to use CQL3 tables.

Blair

[1] http://code.google.com/p/scala-migrations/
C language - cassandra
Hello, new here. What are my options for using cassandra from a program written in C?

A) Thrift has no documentation, so it will take me time to understand. Thrift also doesn't have a balancing pool that asks different nodes each time, which is a big problem.
B) Should I use the hector (java) client and then send the data to my program with my own protocol? That seems like a lot of unnecessary work.

Any other suggestions? -- Sincerely yours, Apostolis Xekoukoulotakis
Logging Cassandra queries
Hi! For quite some time I've been seeing some unexpected loadavg on the cassandra servers. I suspect there are lots of uncontrolled queries to the cassandra servers causing this load, but the developers say that there are none and that the load is due to cassandra internal processes.

Trying to get to the bottom of it, I've been looking into completed ReadStage and MutationStage operations through JMX, and the numbers seem to confirm my theory, but I'd like to go one step further and, if possible, list all the queries from the webservers to the cassandra cluster (just one node would be enough).

I've been playing with cassandra loglevels, and I can see when a Read or a Write is done, but it would be better if I could know the CF of the query. For my tests I've put log4j.rootLogger=DEBUG,stdout,R in the log4j server properties, written to and read from a test CF, and I can't see its name anywhere. For the tests I'm using Cassandra 0.8.4 (yes, still), same as my production servers, and also 1.0.11. Maybe this changed in 1.1? Maybe I'm doing something wrong? Any hint?

And... could I be more precise when enabling logging? Because right now, with log4j.rootLogger=DEBUG,stdout,R, I'm getting a lot of information I won't ever use, and I'd like to enable just what I need to see the gets and sets.

Thanks in advance, Tomàs
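To be more precise than DEBUG on the root logger, log4j lets you raise the level for a single logger instead. A sketch for log4j-server.properties; the class that handles client requests in these Thrift-era releases is assumed to be org.apache.cassandra.thrift.CassandraServer, but verify the name against your version's source:

```properties
# log4j-server.properties: keep the root at INFO and enable DEBUG only
# for the class handling client requests (class name may differ between
# Cassandra versions; check your release).
log4j.rootLogger=INFO,stdout,R
log4j.logger.org.apache.cassandra.thrift.CassandraServer=DEBUG
```

This keeps the log focused on client gets/sets rather than every internal subsystem.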
Re: C language - cassandra
Hi Apostolis

I'm the author of libcassie, a C library for cassandra that wraps the C++ libcassandra library. It's in use in production where I work; however, it has not received much traction elsewhere as far as I know. You can get it here: https://github.com/minaguib/libcassandra/tree/kickstart-libcassie-0.7

It has not been updated for a while (for example, no CQL support, no pooling support). I've been waiting for either the Thrift C-glib interface to mature, or the thriftless CQL binary protocol to mature, before putting effort into updating/rewriting it. It might however satisfy your needs with its current functionality.

On 2013-05-17, at 10:42 AM, Apostolis Xekoukoulotakis xekou...@gmail.com wrote:

Hello, new here. What are my options for using cassandra from a program written in C? A) Thrift has no documentation, so it will take me time to understand. Thrift also doesn't have a balancing pool that asks different nodes each time, which is a big problem. B) Should I use the hector (java) client and then send the data to my program with my own protocol? That seems like a lot of unnecessary work. Any other suggestions? -- Sincerely yours, Apostolis Xekoukoulotakis
Re: best practices on EC2 question
b) do people skip backups altogether except for huge outages and just let rebooted server instances come up empty to repopulate via C*?

This one. Bootstrapping a new node into the cluster has a small impact on the existing nodes, and the new nodes have all the data they need when they finish the process.

Cheers - Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com

On 17/05/2013, at 3:17 AM, Janne Jalkanen janne.jalka...@ecyrd.com wrote:

On May 16, 2013, at 17:05, Brian Tarbox tar...@cabotresearch.com wrote:

An alternative that we had explored for a while was to do a two-stage backup: 1) copy a C* snapshot from the ephemeral drive to an EBS drive 2) do an EBS snapshot to S3. The idea being that EBS is quite reliable, S3 is still the emergency backup, and copying back from EBS to ephemeral is likely much faster than the 15 MB/sec we get from S3.

Yup, this is what we do. We use rsync with --bwlimit=4000 to copy the snapshots from the eph drive to EBS; this is intentionally very low so that the backup process does not eat our I/O. This is on m1.xlarge instances; YMMV so measure :). The EBS drives are then snapshot with ec2-consistent-snapshot and old snapshots are expired using ec2-expire-snapshots (I believe these scripts are from Alestic). /Janne
Re: update does not apply to any replica if consistency = ALL and one replica is down
I think you're conflating may with must. That article says that updates may still be applied to some replicas when there is a failure, and I believe that is still the case. However, if the coordinator knows that the CL can't be met before even attempting the write, I don't think it will attempt the write. -Bryan

On Fri, May 17, 2013 at 1:48 AM, Sergey Naumov sknau...@gmail.com wrote:

As described here (http://maxgrinev.com/2010/07/12/update-idempotency-why-it-is-important-in-cassandra-applications-2/), if the consistency level couldn't be met, updates are applied anyway on the functional replicas, and they can be propagated later to the other replicas using repair mechanisms or by issuing the same request later, as update operations are idempotent in Cassandra.

But... on my configuration (Cassandra 1.2.4, python CQL 1.0.4, DC1 - 3 nodes, DC2 - 3 nodes, DC3 - 1 node, RF={DC1:3, DC2:2, DC3:1}, Random Partitioner, GossipingPropertyFileSnitch, one node in DC1 deliberately down - and, as the RF for DC1 is 3, this down node is a replica for 100% of records), when I try to insert one record with a consistency level of ALL, the insert does not appear on any replica (-s30 is a series of UUID1: 001e--1000--x (30 is 1e in hex); -n1 means that we will insert/update a single record with the first id from this series - 001e--1000--):

write with consistency ALL:
cassandra@host11:~/Cassandra$ ./insert.sh -s30 -n1 -cALL
Traceback (most recent call last):
  File ./aux/fastinsert.py, line 54, in insert
    curs.execute(cmd, consistency_level=p.conlvl)
OperationalError: Unable to complete request: one or more nodes were unavailable.
Last record UUID is 001e--1000--

... about 10 seconds passed ...
read with consistency ONE:
cassandra@host11:~/Cassandra$ ./select.sh -s30 -n1 -cONE
Total records read: 0
Last record UUID is 001e--1000--

read with consistency QUORUM:
cassandra@host11:~/Cassandra$ ./select.sh -s30 -n1 -cQUORUM
Total records read: 0
Last record UUID is 001e--1000--

write with consistency QUORUM:
cassandra@host11:~/Cassandra$ ./insert.sh -s30 -n1 -cQUORUM
Last record UUID is 001e--1000--

read with consistency QUORUM:
cassandra@host11:~/Cassandra$ ./select.sh -s30 -n1 -cQUORUM
Total records read: 1
Last record UUID is 001e--1000--

Is it a new feature of Cassandra that it does not perform the write on any replica if the consistency level couldn't be satisfied? If so, then is it true for all cases, for example when the returned error is OperationalError: Request did not complete within rpc_timeout? Thanks in advance, Sergey Naumov.
Re: C language - cassandra
Thanks Mina for your work. Another option could be to use pycassa and link that code with my C program, but I have no experience with python at all. Maybe this would be better, since pycassa seems to have a strong community.

2013/5/17 Mina Naguib mina.nag...@adgear.com

Hi Apostolis

I'm the author of libcassie, a C library for cassandra that wraps the C++ libcassandra library. It's in use in production where I work; however, it has not received much traction elsewhere as far as I know. You can get it here: https://github.com/minaguib/libcassandra/tree/kickstart-libcassie-0.7

It has not been updated for a while (for example, no CQL support, no pooling support). I've been waiting for either the Thrift C-glib interface to mature, or the thriftless CQL binary protocol to mature, before putting effort into updating/rewriting it. It might however satisfy your needs with its current functionality.

On 2013-05-17, at 10:42 AM, Apostolis Xekoukoulotakis xekou...@gmail.com wrote:

Hello, new here. What are my options for using cassandra from a program written in C? A) Thrift has no documentation, so it will take me time to understand. Thrift also doesn't have a balancing pool that asks different nodes each time, which is a big problem. B) Should I use the hector (java) client and then send the data to my program with my own protocol? That seems like a lot of unnecessary work. Any other suggestions? -- Sincerely yours, Apostolis Xekoukoulotakis

-- Sincerely yours, Apostolis Xekoukoulotakis
Re: best practices on EC2 question
On Fri, May 17, 2013 at 11:13 AM, aaron morton aa...@thelastpickle.com wrote:

Bootstrapping a new node into the cluster has a small impact on the existing nodes, and the new nodes have all the data they need when they finish the process.

Sorry for the pedantry, but bootstrapping from existing replicas cannot guarantee that the new nodes have all the data they need when they finish the process. There is a non-zero chance that the failed node contained the single under-replicated copy of a given datum. In practice, if your RF is >= 2, you are unlikely to experience this type of data loss. But restore-a-backup-then-repair protects you against this unlikely case. =Rob
Re: pycassa failures in large batch cycling
IMHO you are going to have more success breaking up your workload to work with the current settings. The buffers created by thrift are going to eat up the server-side memory. They grow dynamically but persist for the life of the connection.

Cheers - Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com

On 17/05/2013, at 3:09 PM, John R. Frank j...@mit.edu wrote:

On Tue, 14 May 2013, aaron morton wrote:

After several cycles, pycassa starts getting connection failures.

Do you have the error stack? Are they TimedOutExceptions, socket timeouts, or something else?

I figured out the problem here and made this ticket in jira: https://issues.apache.org/jira/browse/CASSANDRA-5575

Summary: the Thrift interfaces to Cassandra are simply not able to load large batches without putting the client into an infinite retry loop. It seems that the only robust solutions involve either features added to Thrift and all Cassandra clients, or a new interface mechanism. jrf
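Breaking up the workload, as suggested above, just means sending many small batches instead of one huge Thrift request, so each request's frame buffer stays small. A generic sketch (pycassa's own batch interface with its queue_size parameter achieves something similar; the helper names here are made up):

```python
# Sketch: split a large workload into fixed-size batches so that each
# Thrift request, and therefore each server-side frame buffer, stays small.

def chunked(items, batch_size):
    """Yield successive fixed-size batches from a list of items."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]

def insert_all(rows, send_batch, batch_size=100):
    """send_batch: callback that writes one small batch (e.g. via pycassa).

    Returns the total number of rows sent.
    """
    sent = 0
    for batch in chunked(rows, batch_size):
        send_batch(batch)
        sent += len(batch)
    return sent
```

Because the buffers persist for the life of the connection, periodically recycling connections in the pool also helps keep server-side memory bounded.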
Re: how to access data only on specific node
And about range scan - as far as I understand, range scan could be done only with Order Preserving Partitioner, but not with Random Partitioner. Range scan can be used with any partitioner. If you use it with the RP the order of the rows will be random (token order). Cheers - Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com On 17/05/2013, at 7:19 PM, Sergey Naumov sknau...@gmail.com wrote: Oh, I finally understand. As I read records one by one they aren't necessarily read from a single node, so if I got 965 records out of 1000, some of them could be read from other nodes which have all 1000 records. And about range scan - as far as I understand, range scan could be done only with Order Preserving Partitioner, but not with Random Partitioner... It would be cool to have a consistency level of LOCAL to examine the content of a local node for test purposes. 2013/5/17 aaron morton aa...@thelastpickle.com Are you using a multi get or a range slice? Read Repair does not run for range slice queries. Cheers - Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com On 15/05/2013, at 6:51 PM, Sergey Naumov sknau...@gmail.com wrote: I see that RR works, but sometimes the number of records read degrades. RR is enabled on a random 10% of requests, see the read_repair_chance setting for the CF. OK, but I forgot to mention the main thing - each node in my config is a standalone datacenter and distribution is DC1:1, DC2:1, DC3:1. So when I try to read 1000 records with consistency ONE multiple times while connected to a node that has just been turned on, I get the following counts of records read (approximately): 120 220 310 390 950 960 965 !! 955 !! 970 ... If all other nodes contain 1000 records and read repair already delivered 965 records to the local DC (and so - the local node), why do I sometimes see degradation of total records read?
2013/5/15 aaron morton aa...@thelastpickle.com I see that RR works, but sometimes the number of records read degrades. RR is enabled on a random 10% of requests, see the read_repair_chance setting for the CF. If so, then the question is: how to perform local reads to examine the content of a specific node? You can check which nodes are replicas for a key using nodetool getendpoints. If you want to read all the rows held by a particular node you need to use a range scan and limit it by the token ranges assigned to the node. Cheers - Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com On 14/05/2013, at 10:29 PM, Sergey Naumov sknau...@gmail.com wrote: Hello. I'm playing with a demo cassandra cluster and decided to test read repair + hinted handoff. One node of the cluster was put down deliberately, and on the other nodes I inserted some records (say 1000). HH is off on all nodes. Then I turned on the node, connected to it with cql (locally, so to localhost) and performed 1000 reads by row key (with consistency ONE). I see that RR works, but sometimes the number of records read degrades. Is it because consistency ONE and local reads are not the same thing? If so, then the question is: how to perform local reads to examine the content of a specific node? Thanks in advance, Sergey Naumov.
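Aaron's suggestion above (a range scan limited by the token ranges assigned to the node) can be sketched in Python. This is an illustration of the token math only, not a driver API: to my understanding, RandomPartitioner tokens are the MD5 of the row key taken as a non-negative 128-bit integer, and a node owns the ring interval (previous_token, own_token]. The key names and ring tokens below are hypothetical.

```python
import hashlib

def rp_token(key: bytes) -> int:
    """RandomPartitioner token: abs() of the signed 128-bit MD5 of the key."""
    return abs(int.from_bytes(hashlib.md5(key).digest(), byteorder="big", signed=True))

def owned_by(token: int, start: int, end: int) -> bool:
    """True if token falls in the range (start, end], wrapping at the ring's edge."""
    if start < end:
        return start < token <= end
    return token > start or token <= end  # range wraps past zero

# Hypothetical ring slice: keep only the keys a node owning (t1, t2] holds
keys = [b"user:1", b"user:2", b"user:3"]
t1, t2 = 0, 2**126
local = [k for k in keys if owned_by(rp_token(k), t1, t2)]
```

A real client would get t1 and t2 from the ring (e.g. nodetool ring) and issue a range slice bounded by those tokens.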
[BLOG] : Cassandra as a Deep Storage Mechanism for Druid Real-Time Analytics Engine
FWIW, we were able to integrate Druid and Cassandra. It's only a PoC right now, but it seems like a powerful combination: http://brianoneill.blogspot.com/2013/05/cassandra-as-deep-storage-mechanism-for.html -brian -- Brian ONeill Lead Architect, Health Market Science (http://healthmarketscience.com) mobile:215.588.6024 blog: http://brianoneill.blogspot.com/ twitter: @boneill42
Re: How to add new DC to cluster when GossipingPropertyFileSnitch is used
But I've read in some sources (for example http://www.datastax.com/docs/1.0/initialize/cluster_init_multi_dc) that the seed list MUST include at least one seed from each DC and seed lists should be the same for each node. That article is about creating a new cluster; to add a new DC to an existing cluster do this: * set the seed list in the new DC to have seeds from both DCs * update the seed list in the old DC to have seeds from both DCs later. Adding a new DC will normally not happen as often as adding nodes. Using the GossipingPropertyFileSnitch means you do not have to update all nodes when adding a new one. Cheers - Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com On 17/05/2013, at 8:42 PM, Igor i...@4friends.od.ua wrote: On 05/17/2013 11:19 AM, Sergey Naumov wrote: But I've read in some sources (for example http://www.datastax.com/docs/1.0/initialize/cluster_init_multi_dc) that the seed list MUST include at least one seed from each DC and seed lists should be the same for each node. Or is it fine if nodes from the new DC have all seeds specified and nodes from old DCs have all seeds specified except seeds from the new DC? In such an interpretation the rules have to be modified a bit: I never have problems with adding new nodes and new DCs having a single seed per cluster in one old DC. 1. Nodes from the same DC should have identical seed lists. 2. At least in one DC, nodes MUST have in their seed lists seeds from all other DCs. 2013/5/17 Igor i...@4friends.od.ua I see no reason to restart all nodes. You can continue to use a seed from the first DC - the seed is used for loading ring configuration (locations, token ranges, etc), not data.
On 05/17/2013 10:34 AM, Sergey Naumov wrote: If I understand you correctly, GossipingPropertyFileSnitch is useful for manipulations with nodes within a single DC, but to add a new DC without having to restart every node in all DCs (because seeds are specified in cassandra.yaml and I need to restart a node after addition of a new seed from the newly created DC), I anyway have to use cassandra-topology.properties and edit it on every node of the cluster. By the way, is it necessary to specify seeds if I use PropertyFileSnitch and there is info in cassandra-topology.properties about all nodes of the cluster? Yes, it is. Cassandra needs seed(s), because the topology properties have no info about token ranges. 2013/5/17 aaron morton aa...@thelastpickle.com You should configure the seeds as recommended regardless of the snitch used. You need to update the yaml file to start using the GossipingPropertyFileSnitch, but after that it reads the cassandra-rackdc.properties file to get information about the node. It uses the information in gossip to get information about the other nodes in the cluster. If there is no info in gossip about a remote node, because say it has not been upgraded, it will fall back to using cassandra-topology.properties. Hope that helps. - Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com On 15/05/2013, at 8:10 PM, Sergey Naumov sknau...@gmail.com wrote: As far as I understand, GossipingPropertyFileSnitch is supposed to provide more flexibility in node addition/removal. But what about addition of a DC? In the datastax documentation (http://www.datastax.com/docs/1.2/operations/add_replace_nodes#add-dc) it is said that cassandra-topology.properties could be updated without restart for PropertyFileSnitch. But here (http://www.datastax.com/docs/1.0/initialize/cluster_init_multi_dc) it is said that you MUST include at least one node from EACH data center.
It is a best practice to have more than one seed node per data center, and the seed list should be the same for each node. At first glance it seems that PropertyFileSnitch will get the necessary info from cassandra-topology.properties, but for GossipingPropertyFileSnitch modification of cassandra.yaml and a restart of all nodes in all DCs will be required. Could somebody clarify this topic? Thanks in advance, Sergey Naumov.
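Putting Aaron's advice together, a rough sketch of the two files involved when adding a DC under GossipingPropertyFileSnitch (the DC/rack names and addresses below are hypothetical examples, not values from this thread):

```
# cassandra-rackdc.properties on each node in the new DC
dc=DC2
rack=RAC1

# cassandra.yaml seed list, eventually the same in both DCs
# (one seed per DC shown; restart is needed when this changes):
#   seed_provider:
#     ...
#       - seeds: "10.0.1.1,10.0.2.1"
```

Only the rackdc file is per-node; the seed list update in the old DC can be rolled out later, as described above.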
Re: Logging Cassandra queries
And... could I be more precise when enabling logging? Because right now, with log4j.rootLogger=DEBUG,stdout,R I'm getting a lot of information I won't ever use, and I'd like to enable just what I need to see gets and sets. See the example at the bottom of this file about setting the log level for a single class: https://github.com/apache/cassandra/blob/trunk/conf/log4j-server.properties You probably want to set it for the org.apache.cassandra.thrift.CassandraServer class. But I cannot remember what the logging is like in 0.8. Cassandra gets faster in the later versions, which normally means doing less work. Upgrading to 1.1 would be the first step I would take in improving performance. Cheers - Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com On 18/05/2013, at 4:00 AM, Tomàs Núnez tomas.nu...@groupalia.com wrote: Hi! For quite some time I've been having some unexpected loadavg on the cassandra servers. I suspect there are lots of uncontrolled queries to the cassandra servers causing this load, but the developers say that there are none, and the load is due to cassandra internal processes. Trying to get to the bottom of it, I've been looking into completed ReadStage and MutationStage through JMX, and the numbers seem to confirm my theory, but I'd like to go one step further and, if possible, list all the queries from the webservers to the cassandra cluster (just one node would be enough). I've been playing with cassandra log levels, and I can see when a Read or a Write is done, but it would be better if I could know the CF of the query. For my tests I've put log4j.rootLogger=DEBUG,stdout,R in the log4j server config, writing and reading a test CF, and I can't see the name of it anywhere. For the tests I'm using Cassandra 0.8.4 (yes, still), as on my production servers, and also 1.0.11. Maybe this changes in 1.1? Maybe I'm doing something wrong? Any hint? And... could I be more precise when enabling logging?
Because right now, with log4j.rootLogger=DEBUG,stdout,R I'm getting a lot of information I won't ever use, and I'd like to enable just what I need to see gets and sets. Thanks in advance, Tomàs
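Following Aaron's pointer, a sketch of what the per-class override could look like in log4j-server.properties (the class name is taken from the reply above; whether 0.8 actually logs the CF name at this level is uncertain):

```
# Keep the root logger quiet...
log4j.rootLogger=INFO,stdout,R
# ...but log thrift-level gets/sets from CassandraServer only
log4j.logger.org.apache.cassandra.thrift.CassandraServer=DEBUG
```

This avoids the firehose of a DEBUG root logger while still surfacing the read/write calls of interest.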
Re: update does not apply to any replica if consistency = ALL and one replica is down
one node in DC1 is deliberately down - and, as RF for DC1 is 3, this down node is a replica node for 100% of records), when I try to insert one record with consistency level of ALL, this insert does not appear on any replica This insert will fail to start and the client will get an UnavailableException. You are asking for ALL replicas to be available but have disabled one. It's easier to write and read at QUORUM. Cheers - Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com On 18/05/2013, at 6:30 AM, Bryan Talbot btal...@aeriagames.com wrote: I think you're conflating may with must. That article says that updates may still be applied to some replicas when there is a failure, and I believe that is still the case. However, if the coordinator knows that the CL can't be met before even attempting the write, I don't think it will attempt the write. -Bryan On Fri, May 17, 2013 at 1:48 AM, Sergey Naumov sknau...@gmail.com wrote: As described here (http://maxgrinev.com/2010/07/12/update-idempotency-why-it-is-important-in-cassandra-applications-2/), if the consistency level couldn't be met, updates are applied anyway on functional replicas, and they could be propagated later to other replicas using repair mechanisms or by issuing the same request later, as update operations are idempotent in Cassandra. But...
on my configuration (Cassandra 1.2.4, python CQL 1.0.4, DC1 - 3 nodes, DC2 - 3 nodes, DC3 - 1 node, RF={DC1:3, DC2:2, DC3:1}, Random Partitioner, GossipingPropertyFileSnitch, one node in DC1 is deliberately down - and, as RF for DC1 is 3, this down node is a replica node for 100% of records), when I try to insert one record with consistency level of ALL, this insert does not appear on any replica (-s30 is a series of UUID1: 001e--1000--x (30 is 1e in hex), -n1 means that we will insert/update a single record with the first id from this series - 001e--1000--): write with consistency ALL: cassandra@host11:~/Cassandra$ ./insert.sh -s30 -n1 -cALL Traceback (most recent call last): File ./aux/fastinsert.py, line 54, in insert curs.execute(cmd, consistency_level=p.conlvl) OperationalError: Unable to complete request: one or more nodes were unavailable. Last record UUID is 001e--1000-- about 10 seconds passed... read with consistency ONE: cassandra@host11:~/Cassandra$ ./select.sh -s30 -n1 -cONE Total records read: 0 Last record UUID is 001e--1000-- read with consistency QUORUM: cassandra@host11:~/Cassandra$ ./select.sh -s30 -n1 -cQUORUM Total records read: 0 Last record UUID is 001e--1000-- write with consistency QUORUM: cassandra@host11:~/Cassandra$ ./insert.sh -s30 -n1 -cQUORUM Last record UUID is 001e--1000-- read with consistency QUORUM: cassandra@host11:~/Cassandra$ ./select.sh -s30 -n1 -cQUORUM Total records read: 1 Last record UUID is 001e--1000-- Is it a new feature of Cassandra that it does not perform a write to any replica if consistency couldn't be satisfied? If so, then is it true for all cases, for example when the returned error is OperationalError: Request did not complete within rpc_timeout? Thanks in advance, Sergey Naumov.
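The behaviour in the transcript follows from simple replica arithmetic. A small sketch (plain Python, function name mine) of how many live replicas the coordinator must reach for the RF={DC1:3, DC2:2, DC3:1} layout above, assuming plain QUORUM means a majority of all replicas across all DCs:

```python
def replicas_needed(consistency: str, rf_by_dc: dict) -> int:
    """Live replicas a coordinator must reach before it will attempt the write."""
    total_rf = sum(rf_by_dc.values())
    if consistency == "ALL":
        return total_rf
    if consistency == "QUORUM":
        return total_rf // 2 + 1  # majority of all replicas across DCs
    if consistency == "ONE":
        return 1
    raise ValueError(consistency)

rf = {"DC1": 3, "DC2": 2, "DC3": 1}   # total RF = 6
live = sum(rf.values()) - 1           # one DC1 replica is down -> 5 live

# ALL needs 6 live replicas but only 5 are up, so the coordinator raises
# UnavailableException and rejects the write before applying it anywhere.
print(replicas_needed("ALL", rf) <= live)     # False
print(replicas_needed("QUORUM", rf) <= live)  # True: 4 <= 5
```

This matches the transcript: the ALL write fails up front with "one or more nodes were unavailable" and lands on no replica, while the QUORUM write succeeds.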
Re: C language - cassandra
Mina, Could you update this page with your client library? https://wiki.apache.org/cassandra/ClientOptions Thanks Aaron - Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com On 18/05/2013, at 6:00 AM, Mina Naguib mina.nag...@adgear.com wrote: Hi Apostolis I'm the author of libcassie, a C library for Cassandra that wraps the C++ libcassandra library. It's in use in production where I work, however it has not received much traction elsewhere as far as I know. You can get it here: https://github.com/minaguib/libcassandra/tree/kickstart-libcassie-0.7 It has not been updated for a while (for example no CQL support, no pooling support). I've been waiting for either the Thrift C (glib) interface to mature, or the thriftless CQL binary protocol to mature, before putting effort into updating/rewriting it. It might however satisfy your needs with its current functionality. On 2013-05-17, at 10:42 AM, Apostolis Xekoukoulotakis xekou...@gmail.com wrote: Hello, new here, What are my options in using Cassandra from a program written in C? A) Thrift has no documentation, so it will take me time to understand. Thrift also doesn't have a balancing pool, asking different nodes every time, which is a big problem. B) Should I use the Hector (Java) client and then send the data to my program with my own protocol? Seems like a lot of unnecessary work. Any other suggestions? -- Sincerely yours, Apostolis Xekoukoulotakis
Re: best practices on EC2 question
I was considering that when bootstrapping starts the nodes receive writes, so that when the process is complete they have both the data from the streaming process and all writes from the time they started. So a repair is not needed. Compare that to bootstrapping a node from a backup, where a (non-pr) repair is needed on the node to achieve consistency. In that sense the node has all its data when the bootstrap has finished. If there is data that is replicated to a single node there is always a risk of data loss. The data could have been written in the time between the last backup and the node failing. Cheers - Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com On 18/05/2013, at 6:32 AM, Robert Coli rc...@eventbrite.com wrote: On Fri, May 17, 2013 at 11:13 AM, aaron morton aa...@thelastpickle.com wrote: Bootstrapping a new node into the cluster has a small impact on the existing nodes, and the new nodes have all the data they need when they finish the process. Sorry for the pedantry, but bootstrapping from existing replicas cannot guarantee that the new nodes have all the data they need when they finish the process. There is a non-zero chance that the failed node contained the single under-replicated copy of a given datum. In practice, if your RF is >= 2, you are unlikely to experience this type of data loss. But restore-a-backup-then-repair protects you against this unlikely case. =Rob
Re: C++ Thrift client
Aaron, whenever I get a GCInspector event log, does it mean that I'm having a GC pause? *Best regards,* *Víctor Hugo Molinar - *@vhmolinar http://twitter.com/#!/vhmolinar On Thu, May 16, 2013 at 8:53 PM, aaron morton aa...@thelastpickle.com wrote: (Assuming you have enabled tcp_nodelay on the client socket) Check the server-side latency, using nodetool cfstats or nodetool cfhistograms. Check the logs for messages from the GCInspector about ParNew pauses. Cheers - Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com On 16/05/2013, at 12:58 PM, Bill Hastings bllhasti...@gmail.com wrote: Hi All I am doing very small inserts into Cassandra in the range of say 64 bytes. I use a C++ Thrift client and seem to consistently get latencies anywhere between 35-45 ms. Could someone please advise as to what might be happening? thanks
Re: pycassa failures in large batch cycling
IMHO you are going to have more success breaking up your workload to work with the current settings. The buffers created by Thrift are going to eat up the server-side memory. They grow dynamically but persist for the life of the connection. Amen to that. We are already refactoring our workload to minimize record sizes. Smaller fields mean more of them, so batched inserts are even more useful compared to many unbatched inserts. IMO there is still a serious bug: even with smaller individual records, it is trivially easy to put too many small records into a batch_mutate. Right now, clients like pycassa, and I imagine others, are forced into an infinite retry loop under the hood because the Thrift exception is indistinguishable from the server crashing --- the application layer has no recourse. I'd love to see a workaround that still has the benefit of grouping together many inserts. John
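One way to act on the advice to break up the workload without giving up batching entirely is to bound the batch size client-side. A sketch: the chunking helper is plain Python; the commented-out pycassa call is hypothetical usage and assumes an already-open ColumnFamily named cf.

```python
from itertools import islice

def chunked(items, max_rows=100):
    """Yield successive bounded-size chunks so no single batch_mutate is huge."""
    it = iter(items)
    while True:
        chunk = list(islice(it, max_rows))
        if not chunk:
            return
        yield chunk

# Hypothetical workload: 1000 rows, sent as 10 batches of 100
rows = {("row%d" % i): {"col": "v"} for i in range(1000)}

for chunk in chunked(rows.items(), max_rows=100):
    batch = dict(chunk)
    # cf.batch_insert(batch)  # pycassa call; uncomment against a live cluster
    pass
```

The right max_rows depends on row size and the server's thrift framed-transport limits, so it is a tuning knob rather than a fixed constant.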
Re: C++ Thrift client
On 2013-05-16 02:58, Bill Hastings wrote: Hi All I am doing very small inserts into Cassandra in the range of say 64 bytes. I use a C++ Thrift client and seem to consistently get latencies anywhere between 35-45 ms. Could someone please advise as to what might be happening? Sniff the network traffic in order to check whether you reuse the same connection or open a new connection for each insert. Also check if the client does a set_keyspace (or use keyspace) before every insert. That would be wasteful too. In the worst case, the client would perform an authentication too. Inspect the timestamps of the network packets in the capture file in order to determine which part takes too long: the connection phase? The authentication? The interval between sending the request and getting the response? I do something similar (C++ Thrift, small inserts of roughly the same size as yours) and I get response times of 100ms for the first request when opening the connection, authenticating, and setting the keyspace. But subsequent requests on the same connection have response times in the range of 8-11ms. Sorin
Re: C language - cassandra
On 2013-05-17 16:42, Apostolis Xekoukoulotakis wrote: Hello, new here, What are my options in using Cassandra from a program written in C? A) Thrift has no documentation, so it will take me time to understand. Thrift also doesn't have a balancing pool, asking different nodes every time, which is a big problem. Thrift has a sort of documentation. Check interface/cassandra.thrift in Cassandra's source files. The file contains quite thorough comments for each method and data structure. Once you've read this file, it is quite easy to browse through the Cassandra.h and cassandra_types.h that are generated from cassandra.thrift by the thrift compiler. Sending requests is quite straightforward. Setting up a connection is more verbose and, imo, relatively complex. About pools, you're right. I guess you'll have to write your own. B) Should I use the Hector (Java) client and then send the data to my program with my own protocol? Seems like a lot of unnecessary work. Any other suggestions? I would go for Thrift. After digging for one or two days you'll have it working. Sorin -- Sincerely yours, Apostolis Xekoukoulotakis
Re:
On Thu, May 16, 2013 at 8:49 PM, almeida...@yahoo.com wrote: hi [attack_url] Is there anyone taking care of removing these attack spammers from this list? This is the second such mail in two days. =Rob