Re: Nodes Dying in 2.1.2
Looks like someone else is experiencing almost exactly what we are seeing: https://issues.apache.org/jira/browse/CASSANDRA-8552

On Mon, Dec 29, 2014 at 5:14 PM, Robert Coli rc...@eventbrite.com wrote:
Might be https://issues.apache.org/jira/browse/CASSANDRA-8061 or one of the linked/duplicate tickets.
=Rob

On Mon, Dec 29, 2014 at 1:40 PM, Robert Coli rc...@eventbrite.com wrote:
On Wed, Dec 24, 2014 at 9:41 AM, Phil Burress philtburr...@gmail.com wrote:
Just upgraded our cluster from 2.1.1 to 2.1.2 and our nodes keep dying. The kernel is killing the process due to out of memory: kernel: Out of memory: Kill process 6267 (java) score 998 or sacrifice child. It appears to only occur during compactions. We've tried playing with the heap settings, but nothing has worked thus far. We did not have this issue until we upgraded. Has anyone else run into this, or have suggestions?

I would:
1) see if downgrade is possible (while unsupported, it probably is possible) and downgrade if so
2) search JIRA 2.1-era issues for related tickets
3) examine changes from 2.1.1 to 2.1.2 which relate to compaction
4) file a JIRA describing your experience if no prior one exists
See also: https://engineering.eventbrite.com/what-version-of-cassandra-should-i-run/
=Rob
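For step 3, one way to do that inspection (a sketch, assuming a local checkout of the Apache Cassandra git repo with the release tags present):

    git clone https://github.com/apache/cassandra.git && cd cassandra
    # Commits between the two releases that touch compaction code
    git log --oneline cassandra-2.1.1..cassandra-2.1.2 -- \
        src/java/org/apache/cassandra/db/compaction/
    # Diff the shipped change log between the two tags
    git diff cassandra-2.1.1 cassandra-2.1.2 -- CHANGES.txt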
Re: Re: Downgrade from 2.1.2 to 2.1.1
Why don't you use incremental repairs? Is there a known issue with incremental repairs in 2.1.x?

On Tue, Dec 30, 2014 at 10:22 PM, 李建奇 lijia...@jd.com wrote:
We have also had some problems with 2.1.2, but I think we can deal with them. First, we don't use incremental repair. Second, we restart nodes after repair; that releases the tmplink sstables. Third, we don't use the stop COMPACTION command. If you read the 2.1.2 release notes, you'll find it resolves some issues in 2.1.1. We are waiting for 2.1.3.
-- 李建奇, JD.com (Operations R&D Architect)

From: Phil Burress [mailto:philtburr...@gmail.com]
Sent: December 31, 2014 2:53
To: user@cassandra.apache.org
Subject: Re: Downgrade from 2.1.2 to 2.1.1

Thanks Rob.

On Tue, Dec 30, 2014 at 1:38 PM, Robert Coli rc...@eventbrite.com wrote:
On Tue, Dec 30, 2014 at 9:42 AM, Phil Burress philtburr...@gmail.com wrote:
We are having a lot of problems with release 2.1.2. It was suggested here that we should downgrade to 2.1.1 if possible. For the experts out there, do you foresee any issues in doing this?

Not sure if advice from the person who suggested the downgrade is what you're looking for, but... The classes of risk are:
- Incompatible changes to system keyspace values (unlikely, but possible in a minor version)
- File format changes (very unlikely in a minor version)
- Network protocol changes (very unlikely in a minor version)
- Unexpected exceptions of other classes (monitor in logs)

Really, read the CHANGES.txt for 2.1.2 and look for the above classes of risk. If you have any questions about specific tickets, feel free to ask on-thread. It's also worth pointing out that you can just downgrade a single node and see if it still works. If it does, and doesn't except, you're probably fine.
=Rob
PS - Pro-forma disclaimer that downgrading is officially unsupported.
Re: Downgrade from 2.1.2 to 2.1.1
Thanks Rob.

On Tue, Dec 30, 2014 at 1:38 PM, Robert Coli rc...@eventbrite.com wrote:
On Tue, Dec 30, 2014 at 9:42 AM, Phil Burress philtburr...@gmail.com wrote:
We are having a lot of problems with release 2.1.2. It was suggested here that we should downgrade to 2.1.1 if possible. For the experts out there, do you foresee any issues in doing this?

Not sure if advice from the person who suggested the downgrade is what you're looking for, but... The classes of risk are:
- Incompatible changes to system keyspace values (unlikely, but possible in a minor version)
- File format changes (very unlikely in a minor version)
- Network protocol changes (very unlikely in a minor version)
- Unexpected exceptions of other classes (monitor in logs)

Really, read the CHANGES.txt for 2.1.2 and look for the above classes of risk. If you have any questions about specific tickets, feel free to ask on-thread. It's also worth pointing out that you can just downgrade a single node and see if it still works. If it does, and doesn't except, you're probably fine.
=Rob
PS - Pro-forma disclaimer that downgrading is officially unsupported.
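A rough sketch of the single-node downgrade test Rob describes, assuming a Debian/Ubuntu package install (the service manager and log path are assumptions; downgrading is unsupported, so snapshot first):

    nodetool snapshot                      # safety net before touching anything
    nodetool drain                         # flush memtables, stop accepting writes
    sudo service cassandra stop
    sudo apt-get install cassandra=2.1.1   # pin the package back one minor version
    sudo service cassandra start
    tail -f /var/log/cassandra/system.log  # watch for exceptions as it rejoins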
Nodes Dying in 2.1.2
Just upgraded our cluster from 2.1.1 to 2.1.2 and our nodes keep dying. The kernel is killing the process due to out of memory:

    kernel: Out of memory: Kill process 6267 (java) score 998 or sacrifice child

It appears to only occur during compactions. We've tried playing with the heap settings, but nothing has worked thus far. We did not have this issue until we upgraded. Has anyone else run into this, or have suggestions? Thanks!

Phil
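For anyone comparing notes: the heap knobs in question live in conf/cassandra-env.sh. A sketch of the relevant lines (values illustrative, not recommendations):

    # From conf/cassandra-env.sh; if left unset, Cassandra auto-sizes from system RAM
    MAX_HEAP_SIZE="8G"      # total JVM heap
    HEAP_NEWSIZE="800M"     # young-generation size
    # Note the kernel OOM killer acts on total process memory, so off-heap
    # usage (memtables, bloom filters, compression metadata) counts too.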
Re: nodetool repair -snapshot option?
Thanks! We retrieved all the ranges and started running repair on them. We ran through all of them but found one single range which brought the ENTIRE cluster down. All of the other ranges ran quickly and smoothly. This one problematic range reliably brings the cluster down every time we try to run repair on it. Any thoughts on why one specific range would be a troublemaker?

On Tue, Jul 1, 2014 at 11:44 AM, Ken Hancock ken.hanc...@schange.com wrote:
I also expanded on a script originally written by Matt Stump @ Datastax. The readme has the reasoning behind requiring sub-range repairs. https://github.com/hancockks/cassandra_range_repair

On Mon, Jun 30, 2014 at 10:20 PM, Phil Burress philburress...@gmail.com wrote:
@Paulo, this is very cool! Thanks very much for the link!

On Mon, Jun 30, 2014 at 9:37 PM, Paulo Ricardo Motta Gomes paulo.mo...@chaordicsystems.com wrote:
If you find it useful, I created a tool where you input the node IP, keyspace, column family, and optionally the number of partitions (default: 32K), and it outputs the list of subranges for that node, CF, and partition size: https://github.com/pauloricardomg/cassandra-list-subranges So you can basically iterate over the output of that and do subrange repair for each node and CF, maybe in parallel. :)

On Mon, Jun 30, 2014 at 10:26 PM, Phil Burress philburress...@gmail.com wrote:
One last question. Any tips on scripting a subrange repair?

On Mon, Jun 30, 2014 at 7:12 PM, Phil Burress philburress...@gmail.com wrote:
We are running repair -pr. We've tried subrange repair manually and that seems to work OK. I guess we'll go with that going forward. Thanks for all the info!

On Mon, Jun 30, 2014 at 6:52 PM, Jaydeep Chovatia chovatia.jayd...@gmail.com wrote:
Are you running full repair or on a subset? If you are running full repair, then try running on a subset of ranges, which means less data to worry about during repair, and that would help the Java heap in general. You will have to do multiple iterations to cover the entire range, but at least it will work. -jaydeep

On Mon, Jun 30, 2014 at 3:22 PM, Robert Coli rc...@eventbrite.com wrote:
On Mon, Jun 30, 2014 at 3:08 PM, Yuki Morishita mor.y...@gmail.com wrote:
Repair uses the snapshot option by default since 2.0.2 (see NEWS.txt).

As a general meta comment, the process by which operationally important defaults change in Cassandra seems ad hoc and sub-optimal. For the record, my view was that this change, which makes repair even slower than it previously was, was probably overly optimistic. It's also weird in that it changes default behavior which had been unchanged since the start of Cassandra time and is therefore probably automated against. Why was it so critically important to switch to snapshot repair that it needed to be shotgunned in as a new default in 2.0.2?
=Rob

-- Paulo Motta, Chaordic | Platform, www.chaordic.com.br
-- Ken Hancock | System Architect, Advanced Advertising, SeaChange International, ken.hanc...@schange.com
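For later readers scripting this: a minimal sketch of a subrange repair loop, assuming a file of start/end token pairs (e.g. generated by one of the tools above) and a hypothetical keyspace name:

    # subranges.txt contains lines of the form: <start_token> <end_token>
    while read -r start end; do
        echo "Repairing range ($start, $end]"
        nodetool repair -st "$start" -et "$end" my_keyspace || exit 1
    done < subranges.txt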
nodetool repair -snapshot option?
We are running into an issue with nodetool repair. One or more of our nodes will die with OOM errors when running nodetool repair on a single node. I was reading this post: http://www.datastax.com/dev/blog/advanced-repair-techniques It mentioned using the -snapshot option; however, that doesn't appear to be an option in the version we have. We are running 2.0.7 with vnodes. Any insight into what might be causing these OOMs and/or what version this -snapshot option is available in? Thanks! Phil
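A quick way to check what your installed version actually supports (a sketch; the exact usage output varies by version):

    nodetool version
    # On 2.0.x, running nodetool with no arguments prints usage for every
    # command; look at the repair section for the supported flags:
    nodetool 2>&1 | grep -i -A 3 repair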
Re: nodetool repair -snapshot option?
We are running repair -pr. We've tried subrange repair manually and that seems to work OK. I guess we'll go with that going forward. Thanks for all the info!

On Mon, Jun 30, 2014 at 6:52 PM, Jaydeep Chovatia chovatia.jayd...@gmail.com wrote:
Are you running full repair or on a subset? If you are running full repair, then try running on a subset of ranges, which means less data to worry about during repair, and that would help the Java heap in general. You will have to do multiple iterations to cover the entire range, but at least it will work. -jaydeep

On Mon, Jun 30, 2014 at 3:22 PM, Robert Coli rc...@eventbrite.com wrote:
On Mon, Jun 30, 2014 at 3:08 PM, Yuki Morishita mor.y...@gmail.com wrote:
Repair uses the snapshot option by default since 2.0.2 (see NEWS.txt).

As a general meta comment, the process by which operationally important defaults change in Cassandra seems ad hoc and sub-optimal. For the record, my view was that this change, which makes repair even slower than it previously was, was probably overly optimistic. It's also weird in that it changes default behavior which had been unchanged since the start of Cassandra time and is therefore probably automated against. Why was it so critically important to switch to snapshot repair that it needed to be shotgunned in as a new default in 2.0.2?
=Rob
Re: nodetool repair -snapshot option?
One last question. Any tips on scripting a subrange repair?

On Mon, Jun 30, 2014 at 7:12 PM, Phil Burress philburress...@gmail.com wrote:
We are running repair -pr. We've tried subrange repair manually and that seems to work OK. I guess we'll go with that going forward. Thanks for all the info!

On Mon, Jun 30, 2014 at 6:52 PM, Jaydeep Chovatia chovatia.jayd...@gmail.com wrote:
Are you running full repair or on a subset? If you are running full repair, then try running on a subset of ranges, which means less data to worry about during repair, and that would help the Java heap in general. You will have to do multiple iterations to cover the entire range, but at least it will work. -jaydeep

On Mon, Jun 30, 2014 at 3:22 PM, Robert Coli rc...@eventbrite.com wrote:
On Mon, Jun 30, 2014 at 3:08 PM, Yuki Morishita mor.y...@gmail.com wrote:
Repair uses the snapshot option by default since 2.0.2 (see NEWS.txt).

As a general meta comment, the process by which operationally important defaults change in Cassandra seems ad hoc and sub-optimal. For the record, my view was that this change, which makes repair even slower than it previously was, was probably overly optimistic. It's also weird in that it changes default behavior which had been unchanged since the start of Cassandra time and is therefore probably automated against. Why was it so critically important to switch to snapshot repair that it needed to be shotgunned in as a new default in 2.0.2?
=Rob
Re: nodetool repair -snapshot option?
@Paulo, this is very cool! Thanks very much for the link!

On Mon, Jun 30, 2014 at 9:37 PM, Paulo Ricardo Motta Gomes paulo.mo...@chaordicsystems.com wrote:
If you find it useful, I created a tool where you input the node IP, keyspace, column family, and optionally the number of partitions (default: 32K), and it outputs the list of subranges for that node, CF, and partition size: https://github.com/pauloricardomg/cassandra-list-subranges So you can basically iterate over the output of that and do subrange repair for each node and CF, maybe in parallel. :)

On Mon, Jun 30, 2014 at 10:26 PM, Phil Burress philburress...@gmail.com wrote:
One last question. Any tips on scripting a subrange repair?

On Mon, Jun 30, 2014 at 7:12 PM, Phil Burress philburress...@gmail.com wrote:
We are running repair -pr. We've tried subrange repair manually and that seems to work OK. I guess we'll go with that going forward. Thanks for all the info!

On Mon, Jun 30, 2014 at 6:52 PM, Jaydeep Chovatia chovatia.jayd...@gmail.com wrote:
Are you running full repair or on a subset? If you are running full repair, then try running on a subset of ranges, which means less data to worry about during repair, and that would help the Java heap in general. You will have to do multiple iterations to cover the entire range, but at least it will work. -jaydeep

On Mon, Jun 30, 2014 at 3:22 PM, Robert Coli rc...@eventbrite.com wrote:
On Mon, Jun 30, 2014 at 3:08 PM, Yuki Morishita mor.y...@gmail.com wrote:
Repair uses the snapshot option by default since 2.0.2 (see NEWS.txt).

As a general meta comment, the process by which operationally important defaults change in Cassandra seems ad hoc and sub-optimal. For the record, my view was that this change, which makes repair even slower than it previously was, was probably overly optimistic. It's also weird in that it changes default behavior which had been unchanged since the start of Cassandra time and is therefore probably automated against. Why was it so critically important to switch to snapshot repair that it needed to be shotgunned in as a new default in 2.0.2?
=Rob

-- Paulo Motta, Chaordic | Platform, www.chaordic.com.br
Ec2 Network I/O
Has anyone experienced network I/O issues with EC2? We are seeing a lot of these in our logs:

    HintedHandOffManager.java (line 477) Timed out replaying hints to /10.0.x.xxx; aborting (15 delivered)

and these...

    Cannot handshake version with /10.0.x.xxx

and these...

    java.io.IOException: Cannot proceed on repair because a neighbor (/10.0.x.xxx) is dead: session failed

This occurs on all of our nodes, even though in all cases the host being reported as down or unavailable is up and readily pingable. We are using shared tenancy on all our nodes (instance type m1.xlarge) with Cassandra 2.0.7. Any suggestions on how to debug these errors? Is there a recommendation to move to placement groups for Cassandra? Thanks! Phil
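Some first steps that might help separate network problems from node-side pauses (a sketch; 10.0.x.xxx stands in for a real peer IP, and the ports are the 2.0 defaults):

    nc -vz 10.0.x.xxx 7000    # is the internode/gossip port reachable?
    nc -vz 10.0.x.xxx 9160    # thrift client port, if you use it
    # ICMP ping succeeding doesn't prove TCP 7000 is open; EC2 security
    # groups filter per port. Also check for long GC pauses, which make
    # gossip mark a live peer as dead:
    grep -i GCInspector /var/log/cassandra/system.log | tail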
Re: Recommended Approach for Config Changes
Thanks for all the good info. We have found that running drain first before restarting should always be done, even if there is not much data or I/O. Also, we've found that nodetool drain often returns before it's finished, so it's important to watch the logs (or OpsCenter) for it and any compaction tasks to complete before restarting.

On Apr 25, 2014 1:09 PM, Jon Haddad j...@jonhaddad.com wrote:
You might want to take a peek at what's happening in the process via strace -p or tcpdump. I can't remember ever waiting an hour for a node to rejoin.

On Apr 25, 2014, at 8:59 AM, Tyler Hobbs ty...@datastax.com wrote:
On Fri, Apr 25, 2014 at 10:43 AM, Phil Burress philburress...@gmail.com wrote:
Thanks. I made a change to a single node and it took almost an hour to rejoin the cluster (go from DN to UP in nodetool status). The cluster is pretty much idle right now and has a very small dataset. Is that normal?

Not unless it had to replay a lot of commitlogs on startup. If you look at your logs and see that that's the case, you may want to run 'nodetool drain' before stopping the node.
-- Tyler Hobbs, DataStax http://datastax.com/
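Putting that together, a per-node restart sketch (the service manager is an assumption; adjust to your install):

    nodetool drain                  # flush memtables; node stops taking writes
    # drain can return before background work is done, so wait for quiet:
    while nodetool compactionstats | grep -q "pending tasks: [1-9]"; do
        sleep 10
    done
    sudo service cassandra restart
    nodetool status                 # wait for UN before touching the next node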
Recommended Approach for Config Changes
If I wanted to make a configuration change to a single node in a cluster, what is the recommended approach for doing that? Is it ok to just stop that instance, make the change and then restart it?
Re: Bootstrap Timing
Just a follow-up on this for any interested parties. Ultimately we've determined that the bootstrap/join process is broken in Cassandra. We ended up creating an entirely new cluster and migrating the data.

On Mon, Apr 21, 2014 at 10:32 AM, Phil Burress philburress...@gmail.com wrote:
The new node has managed to stay up without dying for about 24 hours now... but it is still in JOINING state. A new concern has popped up: disk usage is at 500GB on the new node. The three original nodes have about 40GB each. Any ideas why this is happening?

On Sat, Apr 19, 2014 at 9:19 PM, Phil Burress philburress...@gmail.com wrote:
Thank you all for your advice and good info. The node has died a couple of times with out-of-memory errors. I've restarted it each time, but it starts re-running compaction and then dies again. Is there a better way to do this?

On Apr 18, 2014 6:06 PM, Steven A Robenalt srobe...@stanford.edu wrote:
That's what I'd be doing, but I wouldn't expect it to run for 3 days this time. My guess is that whatever was going wrong with the bootstrap when you had 3 nodes starting at once was interfering with the completion of the 1 remaining node of those 3. A clean bootstrap of a single node should complete eventually, and I would think it'll be a lot less than 3 days. Our database is much smaller than yours at the moment, so I can't really guide you on how long it should take, but I'd think that others on the list with similar database sizes might be able to give you a better idea. Steve

On Fri, Apr 18, 2014 at 1:43 PM, Phil Burress philburress...@gmail.com wrote:
First, I just stopped 2 of the nodes and left one running. But this morning, I stopped that third node, cleared out the data, restarted, and let it rejoin again. It appears streaming is done (according to netstats); right now it appears to be running compaction and building a secondary index (according to compactionstats). Just sit and wait, I guess?

On Fri, Apr 18, 2014 at 2:23 PM, Steven A Robenalt srobe...@stanford.edu wrote:
Looking back through this email chain, it looks like Phil said he wasn't using vnodes. For the record, we have been using vnodes since we brought up our first cluster, and have not seen any issues with bootstrapping new nodes either to replace existing nodes or to grow/shrink the cluster. We did adhere to the caveats that new nodes should not be seed nodes, and that we should allow each node to join the cluster completely before making any other changes. Phil, when you dropped to adding just the single node to your cluster, did you start over with the newly added node (blowing away the database created on the previous startup), or did you shut down the other 2 added nodes and leave the remaining one in progress to continue? Steve

On Fri, Apr 18, 2014 at 10:40 AM, Robert Coli rc...@eventbrite.com wrote:
On Fri, Apr 18, 2014 at 5:05 AM, Phil Burress philburress...@gmail.com wrote:
nodetool netstats shows 84 files. They are all at 100%. Nothing is showing in Pending or Active for Read Repair Stats. I'm assuming this means it's done, but it still shows JOINING. Is there an undocumented step I'm missing here? This whole process seems broken to me.

Lately it seems like a lot more people than usual are:
1) using vnodes
2) unable to bootstrap new nodes
If I were you, I would likely file a JIRA detailing your negative experience with this core functionality.
=Rob

-- Steve Robenalt, Software Architect, HighWire | Stanford University, srobe...@stanford.edu, http://highwire.stanford.edu
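For anyone else stuck in JOINING: things worth watching while it sits there (a sketch; paths are the package-install defaults):

    nodetool netstats                   # stream progress
    nodetool compactionstats            # post-stream compaction / index builds
    du -sh /var/lib/cassandra/data/*    # per-keyspace disk, to spot blowups
    # Leftover temporary sstables from failed compactions can explain a
    # data directory far larger than the source nodes:
    find /var/lib/cassandra/data -name '*tmp*' | head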
Re: Bootstrap Timing
Cassandra 2.0.6

On Fri, Apr 25, 2014 at 10:31 AM, James Rothering jrother...@codojo.me wrote:
What version of C* is this?

On Fri, Apr 25, 2014 at 6:55 AM, Phil Burress philburress...@gmail.com wrote:
Just a follow-up on this for any interested parties. Ultimately we've determined that the bootstrap/join process is broken in Cassandra. We ended up creating an entirely new cluster and migrating the data.

On Mon, Apr 21, 2014 at 10:32 AM, Phil Burress philburress...@gmail.com wrote:
The new node has managed to stay up without dying for about 24 hours now... but it is still in JOINING state. A new concern has popped up: disk usage is at 500GB on the new node. The three original nodes have about 40GB each. Any ideas why this is happening?

On Sat, Apr 19, 2014 at 9:19 PM, Phil Burress philburress...@gmail.com wrote:
Thank you all for your advice and good info. The node has died a couple of times with out-of-memory errors. I've restarted it each time, but it starts re-running compaction and then dies again. Is there a better way to do this?

On Apr 18, 2014 6:06 PM, Steven A Robenalt srobe...@stanford.edu wrote:
That's what I'd be doing, but I wouldn't expect it to run for 3 days this time. My guess is that whatever was going wrong with the bootstrap when you had 3 nodes starting at once was interfering with the completion of the 1 remaining node of those 3. A clean bootstrap of a single node should complete eventually, and I would think it'll be a lot less than 3 days. Our database is much smaller than yours at the moment, so I can't really guide you on how long it should take, but I'd think that others on the list with similar database sizes might be able to give you a better idea. Steve

On Fri, Apr 18, 2014 at 1:43 PM, Phil Burress philburress...@gmail.com wrote:
First, I just stopped 2 of the nodes and left one running. But this morning, I stopped that third node, cleared out the data, restarted, and let it rejoin again. It appears streaming is done (according to netstats); right now it appears to be running compaction and building a secondary index (according to compactionstats). Just sit and wait, I guess?

On Fri, Apr 18, 2014 at 2:23 PM, Steven A Robenalt srobe...@stanford.edu wrote:
Looking back through this email chain, it looks like Phil said he wasn't using vnodes. For the record, we have been using vnodes since we brought up our first cluster, and have not seen any issues with bootstrapping new nodes either to replace existing nodes or to grow/shrink the cluster. We did adhere to the caveats that new nodes should not be seed nodes, and that we should allow each node to join the cluster completely before making any other changes. Phil, when you dropped to adding just the single node to your cluster, did you start over with the newly added node (blowing away the database created on the previous startup), or did you shut down the other 2 added nodes and leave the remaining one in progress to continue? Steve

On Fri, Apr 18, 2014 at 10:40 AM, Robert Coli rc...@eventbrite.com wrote:
On Fri, Apr 18, 2014 at 5:05 AM, Phil Burress philburress...@gmail.com wrote:
nodetool netstats shows 84 files. They are all at 100%. Nothing is showing in Pending or Active for Read Repair Stats. I'm assuming this means it's done, but it still shows JOINING. Is there an undocumented step I'm missing here? This whole process seems broken to me.

Lately it seems like a lot more people than usual are:
1) using vnodes
2) unable to bootstrap new nodes
If I were you, I would likely file a JIRA detailing your negative experience with this core functionality.
=Rob

-- Steve Robenalt, Software Architect, HighWire | Stanford University, srobe...@stanford.edu, http://highwire.stanford.edu
Re: Recommended Approach for Config Changes
Thanks. I made a change to a single node and it took almost an hour to rejoin the cluster (go from DN to UP in nodetool status). The cluster is pretty much idle right now and has a very small dataset. Is that normal?

On Fri, Apr 25, 2014 at 10:08 AM, Chris Lohfink clohf...@blackbirdit.com wrote:
Yes. Some changes you can manually make take effect without a restart (i.e., compaction throughput and other things settable from JMX). There are also config changes you can't really make, like switching the snitch, without a big to-do.
--- Chris

On Apr 25, 2014, at 8:53 AM, Phil Burress philburress...@gmail.com wrote:
If I wanted to make a configuration change to a single node in a cluster, what is the recommended approach for doing that? Is it ok to just stop that instance, make the change and then restart it?
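Examples of the live-settable kind Chris mentions, via nodetool (values illustrative):

    nodetool setcompactionthroughput 16   # MB/s; 0 disables throttling
    nodetool setstreamthroughput 200      # MB/s for streaming between nodes
    # By contrast, changing the snitch or partitioner means editing
    # cassandra.yaml and restarting (or more).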
Re: Bootstrap Timing
The new node has managed to stay up without dying for about 24 hours now... but it is still in JOINING state. A new concern has popped up: disk usage is at 500GB on the new node. The three original nodes have about 40GB each. Any ideas why this is happening?

On Sat, Apr 19, 2014 at 9:19 PM, Phil Burress philburress...@gmail.com wrote:
Thank you all for your advice and good info. The node has died a couple of times with out-of-memory errors. I've restarted it each time, but it starts re-running compaction and then dies again. Is there a better way to do this?

On Apr 18, 2014 6:06 PM, Steven A Robenalt srobe...@stanford.edu wrote:
That's what I'd be doing, but I wouldn't expect it to run for 3 days this time. My guess is that whatever was going wrong with the bootstrap when you had 3 nodes starting at once was interfering with the completion of the 1 remaining node of those 3. A clean bootstrap of a single node should complete eventually, and I would think it'll be a lot less than 3 days. Our database is much smaller than yours at the moment, so I can't really guide you on how long it should take, but I'd think that others on the list with similar database sizes might be able to give you a better idea. Steve

On Fri, Apr 18, 2014 at 1:43 PM, Phil Burress philburress...@gmail.com wrote:
First, I just stopped 2 of the nodes and left one running. But this morning, I stopped that third node, cleared out the data, restarted, and let it rejoin again. It appears streaming is done (according to netstats); right now it appears to be running compaction and building a secondary index (according to compactionstats). Just sit and wait, I guess?

On Fri, Apr 18, 2014 at 2:23 PM, Steven A Robenalt srobe...@stanford.edu wrote:
Looking back through this email chain, it looks like Phil said he wasn't using vnodes. For the record, we have been using vnodes since we brought up our first cluster, and have not seen any issues with bootstrapping new nodes either to replace existing nodes or to grow/shrink the cluster. We did adhere to the caveats that new nodes should not be seed nodes, and that we should allow each node to join the cluster completely before making any other changes. Phil, when you dropped to adding just the single node to your cluster, did you start over with the newly added node (blowing away the database created on the previous startup), or did you shut down the other 2 added nodes and leave the remaining one in progress to continue? Steve

On Fri, Apr 18, 2014 at 10:40 AM, Robert Coli rc...@eventbrite.com wrote:
On Fri, Apr 18, 2014 at 5:05 AM, Phil Burress philburress...@gmail.com wrote:
nodetool netstats shows 84 files. They are all at 100%. Nothing is showing in Pending or Active for Read Repair Stats. I'm assuming this means it's done, but it still shows JOINING. Is there an undocumented step I'm missing here? This whole process seems broken to me.

Lately it seems like a lot more people than usual are:
1) using vnodes
2) unable to bootstrap new nodes
If I were you, I would likely file a JIRA detailing your negative experience with this core functionality.
=Rob

-- Steve Robenalt, Software Architect, HighWire | Stanford University, srobe...@stanford.edu, http://highwire.stanford.edu
Re: Bootstrap Timing
Thank you all for your advice and good info. The node has died a couple of times with out-of-memory errors. I've restarted it each time, but it starts re-running compaction and then dies again. Is there a better way to do this?

On Apr 18, 2014 6:06 PM, Steven A Robenalt srobe...@stanford.edu wrote:
That's what I'd be doing, but I wouldn't expect it to run for 3 days this time. My guess is that whatever was going wrong with the bootstrap when you had 3 nodes starting at once was interfering with the completion of the 1 remaining node of those 3. A clean bootstrap of a single node should complete eventually, and I would think it'll be a lot less than 3 days. Our database is much smaller than yours at the moment, so I can't really guide you on how long it should take, but I'd think that others on the list with similar database sizes might be able to give you a better idea. Steve

On Fri, Apr 18, 2014 at 1:43 PM, Phil Burress philburress...@gmail.com wrote:
First, I just stopped 2 of the nodes and left one running. But this morning, I stopped that third node, cleared out the data, restarted, and let it rejoin again. It appears streaming is done (according to netstats); right now it appears to be running compaction and building a secondary index (according to compactionstats). Just sit and wait, I guess?

On Fri, Apr 18, 2014 at 2:23 PM, Steven A Robenalt srobe...@stanford.edu wrote:
Looking back through this email chain, it looks like Phil said he wasn't using vnodes. For the record, we have been using vnodes since we brought up our first cluster, and have not seen any issues with bootstrapping new nodes either to replace existing nodes or to grow/shrink the cluster. We did adhere to the caveats that new nodes should not be seed nodes, and that we should allow each node to join the cluster completely before making any other changes. Phil, when you dropped to adding just the single node to your cluster, did you start over with the newly added node (blowing away the database created on the previous startup), or did you shut down the other 2 added nodes and leave the remaining one in progress to continue? Steve

On Fri, Apr 18, 2014 at 10:40 AM, Robert Coli rc...@eventbrite.com wrote:
On Fri, Apr 18, 2014 at 5:05 AM, Phil Burress philburress...@gmail.com wrote:
nodetool netstats shows 84 files. They are all at 100%. Nothing is showing in Pending or Active for Read Repair Stats. I'm assuming this means it's done, but it still shows JOINING. Is there an undocumented step I'm missing here? This whole process seems broken to me.

Lately it seems like a lot more people than usual are:
1) using vnodes
2) unable to bootstrap new nodes
If I were you, I would likely file a JIRA detailing your negative experience with this core functionality.
=Rob

-- Steve Robenalt, Software Architect, HighWire | Stanford University, srobe...@stanford.edu, http://highwire.stanford.edu
Re: Bootstrap Timing
nodetool netstats shows 84 files. They are all at 100%. Nothing is showing in Pending or Active for Read Repair Stats. I'm assuming this means it's done, but it still shows JOINING. Is there an undocumented step I'm missing here? This whole process seems broken to me.

On Thu, Apr 17, 2014 at 4:32 PM, Robert Coli rc...@eventbrite.com wrote:
On Wed, Apr 16, 2014 at 1:56 PM, Phil Burress philburress...@gmail.com wrote:
I've shut down two of the nodes and am bootstrapping one right now. Is there any way to tell when it will finish bootstrapping?

nodetool netstats will show the progress of the streams involved, which could help you estimate.
=Rob
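One rough way to watch only the unfinished streams (completed lines at 100% drop out of the output):

    watch -n 30 'nodetool netstats | grep -v "100%"'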
Re: Bootstrap Timing
First, I just stopped 2 of the nodes and left one running. But this morning, I stopped that third node, cleared out the data, restarted, and let it rejoin again. It appears streaming is done (according to netstats); right now it appears to be running compaction and building a secondary index (according to compactionstats). Just sit and wait, I guess?

On Fri, Apr 18, 2014 at 2:23 PM, Steven A Robenalt srobe...@stanford.edu wrote:
Looking back through this email chain, it looks like Phil said he wasn't using vnodes. For the record, we have been using vnodes since we brought up our first cluster, and have not seen any issues with bootstrapping new nodes either to replace existing nodes or to grow/shrink the cluster. We did adhere to the caveats that new nodes should not be seed nodes, and that we should allow each node to join the cluster completely before making any other changes. Phil, when you dropped to adding just the single node to your cluster, did you start over with the newly added node (blowing away the database created on the previous startup), or did you shut down the other 2 added nodes and leave the remaining one in progress to continue? Steve

On Fri, Apr 18, 2014 at 10:40 AM, Robert Coli rc...@eventbrite.com wrote:
On Fri, Apr 18, 2014 at 5:05 AM, Phil Burress philburress...@gmail.com wrote:
nodetool netstats shows 84 files. They are all at 100%. Nothing is showing in Pending or Active for Read Repair Stats. I'm assuming this means it's done, but it still shows JOINING. Is there an undocumented step I'm missing here? This whole process seems broken to me.

Lately it seems like a lot more people than usual are:
1) using vnodes
2) unable to bootstrap new nodes
If I were you, I would likely file a JIRA detailing your negative experience with this core functionality.
=Rob

-- Steve Robenalt, Software Architect, HighWire | Stanford University, srobe...@stanford.edu, http://highwire.stanford.edu
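The "cleared out the data and let it rejoin" step, roughly, for a package install with default paths (this destroys local data on that node, so it's only for a node that never finished joining):

    sudo service cassandra stop
    sudo rm -rf /var/lib/cassandra/data/* \
                /var/lib/cassandra/commitlog/* \
                /var/lib/cassandra/saved_caches/*
    sudo service cassandra start    # the node bootstraps again from the ring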
Bootstrap Timing
Greetings, How long does bootstrapping typically take? I have 3 existing nodes in our cluster with about 40GB each. I've added three new nodes to the cluster. They have been in bootstrap mode for a little over 3 days now. Should I be concerned? Is there a way to tell how long it will take to finish? Running Cassandra version 2.0.6 on Ubuntu 12.04. Thanks very much! Phil
Re: Bootstrap Timing
Thanks very much for the response. I'm not using vnodes; does that matter?

On Wed, Apr 16, 2014 at 2:13 PM, Robert Coli rc...@eventbrite.com wrote:
On Wed, Apr 16, 2014 at 11:10 AM, Phil Burress philburress...@gmail.com wrote:
How long does bootstrapping typically take? I have 3 existing nodes in our cluster with about 40GB each. I've added three new nodes to the cluster. They have been in bootstrap mode for a little over 3 days now. Should I be concerned? Is there a way to tell how long it will take to finish?

Adding more than one node at a time to a cluster (especially with vnodes) is Not Supported. If I were you, I would stop all 3 bootstraps and then do one at a time.
=Rob
Re: Bootstrap Timing
Also, one more quick question. For the new nodes, do I add all three existing nodes as seeds? Or just add one?

On Wed, Apr 16, 2014 at 2:16 PM, Phil Burress philburress...@gmail.com wrote:
Thanks very much for the response. I'm not using vnodes; does that matter?

On Wed, Apr 16, 2014 at 2:13 PM, Robert Coli rc...@eventbrite.com wrote:
On Wed, Apr 16, 2014 at 11:10 AM, Phil Burress philburress...@gmail.com wrote:
How long does bootstrapping typically take? I have 3 existing nodes in our cluster with about 40GB each. I've added three new nodes to the cluster. They have been in bootstrap mode for a little over 3 days now. Should I be concerned? Is there a way to tell how long it will take to finish?

Adding more than one node at a time to a cluster (especially with vnodes) is Not Supported. If I were you, I would stop all 3 bootstraps and then do one at a time.
=Rob
Re: Bootstrap Timing
Thanks!

On Wed, Apr 16, 2014 at 2:50 PM, Robert Coli rc...@eventbrite.com wrote:
On Wed, Apr 16, 2014 at 11:16 AM, Phil Burress philburress...@gmail.com wrote:
Thanks very much for the response. I'm not using vnodes; does that matter?

Not in your case. In some cases it is safe to bootstrap multiple nodes into a cluster at once AT SPECIFIC TOKENS, because there is more than one replica set to bootstrap them into safely. Even in this case, it is not recommended.

For the new nodes, do I add all three existing nodes as seeds? Or just add one?

One should be sufficient, but all three could not hurt.
=Rob
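The seed list lives in cassandra.yaml under seed_provider; a quick sanity check of what a node is actually using (default config path assumed, IPs hypothetical):

    grep -A 4 seed_provider /etc/cassandra/cassandra.yaml
    # expect output containing a line like:
    #     - seeds: "10.0.0.1,10.0.0.2,10.0.0.3"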
Re: Bootstrap Timing
I've shut down two of the nodes and am bootstrapping one right now. Is there any way to tell when it will finish bootstrapping?

On Wed, Apr 16, 2014 at 2:56 PM, Phil Burress philburress...@gmail.com wrote:
Thanks!

On Wed, Apr 16, 2014 at 2:50 PM, Robert Coli rc...@eventbrite.com wrote:
On Wed, Apr 16, 2014 at 11:16 AM, Phil Burress philburress...@gmail.com wrote:
Thanks very much for the response. I'm not using vnodes; does that matter?

Not in your case. In some cases it is safe to bootstrap multiple nodes into a cluster at once AT SPECIFIC TOKENS, because there is more than one replica set to bootstrap them into safely. Even in this case, it is not recommended.

For the new nodes, do I add all three existing nodes as seeds? Or just add one?

One should be sufficient, but all three could not hurt.
=Rob