Re: All writes fail with ONE consistency level when adding second node to cluster?
Besides the obviously confusing error message, this particular case could simply be that the hash value of the primary key belonged to the other node that wasn’t up, so even though one node was up, it didn’t own that particular hash value or token, so CL=ONE could not succeed. What was RF set to for this two node cluster?

-- Jack Krupansky

From: Andrew
Sent: Wednesday, July 23, 2014 1:02 AM
To: graham sanderson ; user@cassandra.apache.org
Cc: Kevin Burton
Subject: Re: All writes fail with ONE consistency level when adding second node to cluster?

I looked into this; ONE means it must be written to one replica, i.e., a node the data is supposed to be written to. ANY means a hinted handoff will “count”. So as long as it writes to any node on the cluster, even one that it’s not supposed to be on, it will be a success. Good to know.

Andrew

On July 22, 2014 at 8:13:57 PM, graham sanderson (gra...@vast.com) wrote:

Incorrect, ONE does not refer to the number of “other” nodes, it just refers to the number of nodes. So ONE under normal circumstances would only require one node to acknowledge the write.

The confusing error message you are getting is related to https://issues.apache.org/jira/browse/CASSANDRA-833…

Kevin, you are correct in that normally that error message would make no sense. I don’t have much experience adding/removing nodes, but I think what is happening is that your new node is in the middle of taking over ownership of a token range. While that happens, C* is trying to write to both the old owner (your original node) AND the new owner (the new node), hence the 2 rather than 1 in the error message, so that once the bootstrapping of the new node is complete, it is immediately safe to delete the (no longer owned) data from the old node. For whatever reason the write to the new node is timing out, causing the exception, and the error message is exposing the “2”, which happens to be how many C* thinks it is waiting for at the time (i.e. how many it should be waiting for based on the consistency level (1) plus this extra node).

On Jul 22, 2014, at 9:46 PM, Andrew redmu...@gmail.com wrote:

ONE means write to one replica (in addition to the original). If you want to write to any of them, use ANY. Is that the right understanding?

http://www.datastax.com/docs/1.0/dml/data_consistency

Andrew

On July 22, 2014 at 7:43:43 PM, Kevin Burton (bur...@spinn3r.com) wrote:

I'm super confused by this.. and disturbed that this was my failure scenario :-(

I had one cassandra node for the alpha of my app… and now we're moving into beta… which means three replicas. So I added the second node… but my app immediately broke with:

Cassandra timeout during write query at consistency ONE (2 replica were required but only 1 acknowledged the write)

… but that makes no sense… if I'm at ONE and I have one acknowledged write, why does it matter that the second one hasn't ack'd yet… ?

--
Founder/CEO Spinn3r.com
Location: San Francisco, CA
blog: http://burtonator.wordpress.com
… or check out my Google+ profile
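The arithmetic graham describes above can be sketched roughly as follows. This is a toy model, not Cassandra's actual code: the function and parameter names are invented for illustration, and the only assumption taken from the thread is that the coordinator waits for the acknowledgements the consistency level asks for plus one for each joining (pending) replica of the token range.

```python
def required_acks(cl_acks: int, pending_replicas: int) -> int:
    """Acks the coordinator waits for: the consistency level's count
    plus one per joining (pending) replica. Hypothetical helper."""
    return cl_acks + pending_replicas


def write_outcome(cl_name: str, cl_acks: int, pending_replicas: int,
                  acks_received: int) -> str:
    """Describe a write's result in the style of the timeout message
    quoted in this thread."""
    needed = required_acks(cl_acks, pending_replicas)
    if acks_received >= needed:
        return "ok"
    return (f"timeout at consistency {cl_name} "
            f"({needed} replica were required but only "
            f"{acks_received} acknowledged the write)")


# Stable single-node cluster: ONE needs one ack and gets it.
print(write_outcome("ONE", 1, 0, 1))
# Second node is joining but not answering: ONE now waits for two acks.
print(write_outcome("ONE", 1, 1, 1))
```

Under this model the "2" in the message is not a change in what ONE means; it is the consistency level's 1 plus the one joining node.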
Re: All writes fail with ONE consistency level when adding second node to cluster?
On Tue, Jul 22, 2014 at 7:46 PM, Andrew redmu...@gmail.com wrote:

ONE means write to one replica (in addition to the original). If you want to write to any of them, use ANY. Is that the right understanding?

This has come up a few times, so let me be unambiguous about when to use CL.ANY:

NEVER EVER USE CL.ANY. IT ALMOST CERTAINLY SHOULD NOT EVEN EXIST. IF YOU THINK YOU NEED TO USE IT, YOU ARE ALMOST CERTAINLY WRONG. ;D

=Rob
Re: All writes fail with ONE consistency level when adding second node to cluster?
Interesting.. it was unclear what it does… ONE sounds right to me so I was curious what was up with ANY. We just set it to ANY so that we could track down what was causing this bug.
Re: All writes fail with ONE consistency level when adding second node to cluster?
Hey now; it is GREAT for a 100% write only use case ;-)
Re: All writes fail with ONE consistency level when adding second node to cluster?
On Wed, Jul 23, 2014 at 12:01 PM, graham sanderson gra...@vast.com wrote:

Hey now; it is GREAT for a 100% write only use case ;-)

A well WORN [1] path in databases, for sure.

=Rob

[1] Write Once Read Never
Re: All writes fail with ONE consistency level when adding second node to cluster?
Why is that? In the worst case, CL.ANY will write hints for replicas that are down. It would be extraordinarily unlucky to have all replicas down at the same time.
Re: All writes fail with ONE consistency level when adding second node to cluster?
Granted, for “normal” apps it is unlikely to be appropriate but...

From an old post by Jonathan:

---
Extreme write availability

For applications that want Cassandra to accept writes even when all the normal replicas are down (so even ConsistencyLevel.ONE cannot be satisfied), Cassandra provides ConsistencyLevel.ANY. ConsistencyLevel.ANY guarantees that the write is durable and will be readable once an appropriate replica target becomes available and receives the hint replay.
---

See: http://www.datastax.com/dev/blog/understanding-hinted-handoff

I can think of a couple of use cases: sensor data where the devices are streaming frequently, so losing a reading is not a big deal because another reading is coming soon anyway, and a Twitter firehose where you are after a robust sample rather than absolute consistency. Minimizing network latency may be a bigger deal than whether immediate queries can see the data. And as the description notes, hinted handoff will eventually propagate the data (unless it times out and drops the hint.)

-- Jack Krupansky
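The difference between ONE and ANY described in the passage quoted above can be put as a small toy model. This is only a sketch for illustration: the function and its parameters are made up, and it ignores timeouts, replication factor, and everything else a real coordinator considers.

```python
def write_succeeds(consistency: str, live_replicas: int, hint_stored: bool) -> bool:
    """Toy model: ONE needs an ack from at least one real replica for the key;
    ANY is also satisfied by a hint stored on some other node."""
    if consistency == "ONE":
        return live_replicas >= 1
    if consistency == "ANY":
        return live_replicas >= 1 or hint_stored
    raise ValueError(f"unsupported consistency level in this sketch: {consistency}")


# All replicas for the key are down, but the coordinator stored a hint.
print(write_succeeds("ONE", live_replicas=0, hint_stored=True))
print(write_succeeds("ANY", live_replicas=0, hint_stored=True))
```

Note that in the ANY case the data is not readable until a replica comes back and the hint is replayed, which is the trade-off the thread is arguing about.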
Re: All writes fail with ONE consistency level when adding second node to cluster?
I was being a little tongue in cheek!
Re: All writes fail with ONE consistency level when adding second node to cluster?
On Wed, Jul 23, 2014 at 1:18 PM, DuyHai Doan doanduy...@gmail.com wrote:

Why is that? In the worst case, CL.ANY will write hints for replicas that are down. It would be extraordinarily unlucky to have all replicas down at the same time.

Hints are not writes for the purposes of consistency or durability, so your write hasn't actually succeeded. Most people don't have applications which need a database to potentially persist a write.

In addition, the implementation details of Hinted Handoff can make ANY a meaningful contributor to a cascading failure mode when nodes are actually hard down, because instead of failing with an unavailable exception (which gives your app a chance to back off), you write hints. There is some throttling in terms of how many hints can be in flight at once, but ones over the threshold are dropped on the floor. I've seen nodes with more hints data than actual data, and which were completely unable to ever deliver and purge these hints, though they uselessly compacted them for weeks on end. In most configs, you will end up discarding some subset of these hints in the course of your cascading failure, but you will probably not know which ones. You will also discard 100% of hints after three hours in the default config. You might be happier to just get an exception at the start of the incident, back off your application access a bit, and fix the small subset of affected nodes?

In the future, when hints are not handled via Column Families, ANY probably gets a lot less risky in terms of overload-with-undelivered-hints, but probably still doesn't actually provide what I consider worthwhile benefit. It is of course possible that I have just never had or heard of a case for which it was appropriate or necessary.

tl;dr - CL.ANY creates more risk of cases where you will write a bunch of hints, and cases where you write a bunch of hints are almost never the solution to any actual problem, because hints are not writes.

If you really really need extreme availability and can't do it via increasing RF, maybe you might want to consider using CL.ANY. But probably not.

=Rob
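The "get an exception and back off" approach Rob suggests could look something like the sketch below. It is driver-agnostic and makes several assumptions: `UnavailableError` is a stand-in for whatever "not enough replicas" exception your client library raises, and `write_with_backoff`, its parameters, and the delay schedule are invented for illustration.

```python
import time


class UnavailableError(Exception):
    """Hypothetical stand-in for a driver's 'not enough replicas' error."""


def write_with_backoff(do_write, max_attempts=5, base_delay=0.1, sleep=time.sleep):
    """Call do_write(); on UnavailableError wait base_delay * 2**attempt
    seconds and retry, giving up after max_attempts tries."""
    for attempt in range(max_attempts):
        try:
            return do_write()
        except UnavailableError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: let the caller see the failure
            sleep(base_delay * (2 ** attempt))
```

The point of the design is that the application learns about the outage immediately and slows down, rather than having the cluster silently accumulate hints that may later be dropped.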
Re: All writes fail with ONE consistency level when adding second node to cluster?
ONE means write to one replica (in addition to the original). If you want to write to any of them, use ANY. Is that the right understanding?

http://www.datastax.com/docs/1.0/dml/data_consistency

Andrew
Re: All writes fail with ONE consistency level when adding second node to cluster?
WEIRD that it was working before… with one node. Granted that this is a rare config (one cassandra node) but it shouldn't work then. If you attempt to write ONE to a single cassandra node, there is no (in addition to) additional node to write to… So this should have failed. Bug?

… and I know why this is failing… my cassandra node is joining the cluster now, but none of the ports are open. So all writes will fail… I have NO idea why the ports aren't open yet .. but it's not a firewall issue.
Re: All writes fail with ONE consistency level when adding second node to cluster?
Yeah.. that's fascinating … so now I get something that's even worse:

Cassandra timeout during write query at consistency ANY (2 replica were required but only 1 acknowledged the write)

… the issue is that the new cassandra node has all its ports closed. Only the storage port is open. So obviously writes are going to fail to it.

… is this by design? Perhaps it's not going to open the ports until the node joins the ring? It's currently joining … so… basically, my entire cluster is offline during this join? I assume this is either a bug or some weird state based on growing from 1-2 nodes?

frustrating :-(
Re: All writes fail with ONE consistency level when adding second node to cluster?
and there are literally zero google hits on the query:

Cassandra timeout during write query at consistency ANY (2 replica were required but only 1 acknowledged the write)

.. so I imagine I'm the first to find this bug! Aren't I lucky!
Re: All writes fail with ONE consistency level when adding second node to cluster?
I assumed you must have now switched to ANY, which you probably didn’t want to do, and which likely won’t help. (Very few people use ANY, which may explain the lack of google hits, plus this particular “Cassandra timeout during write query at consistency” error message comes from the datastax CQL java driver, not C* itself.)

In any case… my original response was just to explain to you that your understanding of what ONE means in general was correct, and that this incorrect-looking error message was a weird case during adding a node. I have no idea what is going on with your bootstrapping node; others may be able to help. In the meanwhile I’d look for errors in the server log and google those, and/or google for instructions on how to add nodes to a cassandra cluster on whatever version you are running.
Re: All writes fail with ONE consistency level when adding second node to cluster?
Thanks for the feedback…

In hindsight.. I think what happened was that the new node started up… and the driver wanted to write records to it… but the ports weren't up. So I wonder if this is a bug in the datastax driver.

On bootstrap, and when joining, does cassandra always keep the ports offline and then only open them up when the node has joined?
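To check which ports a joining node is actually listening on, as discussed above, a plain TCP probe is enough. The sketch below uses only the Python standard library; the helper name and the host address in the usage comment are hypothetical, and the port numbers assume Cassandra's stock defaults (7000 for inter-node storage traffic, 9042 for the native CQL transport), which your configuration may override.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Example usage (hypothetical address, default ports assumed):
#   port_is_open("10.0.0.2", 7000)   # storage port
#   port_is_open("10.0.0.2", 9042)   # native transport (CQL clients)
```

If the storage port answers but the client port does not, the node is reachable by its peers but not yet by drivers, which matches the symptom Kevin reports for the joining node.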
Re: All writes fail with ONE consistency level when adding second node to cluster?
I looked into this; ONE means it must be written to one replica, i.e., a node the data is supposed to be written to. ANY means a hinted handoff will “count”. So as long as it writes to any node on the cluster, even one that it’s not supposed to be on, it will be a success. Good to know.

Andrew