Re: Re-adding Decommissioned node

2017-06-27 Thread Anuj Wadehra
Hi Mark,
Please ensure that the node is not defined as a seed node in cassandra.yaml. Seed 
nodes don't bootstrap.
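As a quick check, the node's own IP must not appear in the seed list of its own 
cassandra.yaml. A minimal sketch (the IPs are placeholders):

  # cassandra.yaml on the rejoining node -- this node's own IP must NOT be in "seeds"
  seed_provider:
      - class_name: org.apache.cassandra.locator.SimpleSeedProvider
        parameters:
            - seeds: "10.0.0.1,10.0.0.2"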
Thanks
Anuj

 
 
  On Tue, Jun 27, 2017 at 9:56 PM, Mark Furlong wrote:   
 
I have a node that was decommissioned (it showed ‘UL’), and its data volume and 
commitlogs have been removed. I now want to add that node back into my ring. When I 
add this node (auto_bootstrap=true, start the Cassandra service), it comes back up 
in the ring as an existing node and shows as ‘UN’ instead of ‘UJ’. Why is this? It 
has no data.
 
  
 
Mark Furlong
Sr. Database Administrator
mfurl...@ancestry.com
M: 801-859-7427
O: 801-705-7115
1300 W Traverse Pkwy
Lehi, UT 84043


Re: Hints files are not getting truncated

2017-06-27 Thread Anuj Wadehra
Hi Meg,
max_hint_window_in_ms = 3 hrs means that if a node is down/unresponsive for more 
than 3 hrs, no further hints are stored for it until it becomes responsive again. It 
does not mean that already stored hints are truncated after 3 hours.
Regarding connection timeouts between DCs, please check your firewall settings and 
TCP settings on the nodes. The firewall between the DCs must not kill an idle 
connection which Cassandra still considers usable. Please see 
http://docs.datastax.com/en/archived/cassandra/2.0/cassandra/troubleshooting/trblshootIdleFirewall.html
 .
In a multi-DC setup, the documentation recommends increasing the number of hint 
delivery threads. You can try increasing it and check whether it improves the situation.
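A rough sketch of the relevant knobs (the keepalive values follow the linked 
DataStax page, so double-check them there; the thread count is an assumption to tune 
for your cluster):

  # Linux TCP keepalives, so the firewall sees traffic on idle inter-DC connections
  sysctl -w net.ipv4.tcp_keepalive_time=60      # seconds of idle before the first probe
  sysctl -w net.ipv4.tcp_keepalive_probes=3     # failed probes before the peer is declared dead
  sysctl -w net.ipv4.tcp_keepalive_intvl=10     # seconds between probes

  # cassandra.yaml -- more threads to drain the hint backlog toward DC2
  max_hints_delivery_threads: 8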
Thanks
Anuj


 
 
  On Tue, Jun 27, 2017 at 9:47 PM, Meg Mara wrote:

Hello,
 
  
 
I am facing an issue with hinted handoff files in Cassandra v3.0.10. A DC1 node is 
storing a large number of hints for DC2 nodes (we are facing connection timeout 
issues). The problem is that the hint files created on DC1 are not getting deleted 
after the 3-hour window. Hints are stored as flat files in the Cassandra home 
directory, and I can see that old hints are being deleted, but at a very slow pace; 
the directory still contains hints from May.
 
max_hint_window_in_ms: 10800000
 
max_hints_delivery_threads: 2
 
  
 
Why do you suppose this is happening? Any suggestions or recommendations would 
be much appreciated.
 
  
 
Thanks for your time.
 
Meg Mara
 
  
   


Re: Linux version update on DSE

2017-06-27 Thread Anuj Wadehra
Hi Nitan,
I think it would be simpler to take one node down at a time and replace it: bring 
the new node up after the Linux upgrade, do the same Cassandra setup, use the 
replace_address option, and set auto_bootstrap=false (as the data is already there). 
No downtime, as it would be a rolling upgrade. No streaming, as the same tokens 
would work.
If you have a recent C*, use replace_address_first_boot. If that option is not 
available, use replace_address and make sure you remove it once the new node is up.
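A minimal sketch of passing the flag (assuming a package install with 
cassandra-env.sh; the IP is a placeholder for the node being replaced):

  # cassandra-env.sh on the replacement node -- remove this line after the first successful start
  JVM_OPTS="$JVM_OPTS -Dcassandra.replace_address_first_boot=10.0.0.12"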
Try it and let us know if it works for you.
Thanks
Anuj

 
 
  On Tue, Jun 27, 2017 at 4:56 AM, Nitan Kainth wrote:   
Right, we are just upgrading Linux on AWS. C* will remain at the same version.


On Jun 26, 2017, at 6:05 PM, Hannu Kröger  wrote:
I understood he is updating Linux, not C*.
Hannu
On 27 June 2017 at 02:04:34, Jonathan Haddad (j...@jonhaddad.com) wrote:

It sounds like you're suggesting adding new nodes in to replace existing ones.  
You can't do that because it requires streaming between versions, which isn't 
supported. 
You need to take a node down, upgrade the C* version, then start it back up.  
Jon
On Mon, Jun 26, 2017 at 3:56 PM Nitan Kainth  wrote:

It's vnodes. We will also update the yaml to replace the old IP with the new one.

Thank you.

Sent from my iPhone

> On Jun 26, 2017, at 4:47 PM, Hannu Kröger  wrote:
>
> Looks OK. Step 1.5 would be to stop Cassandra on the existing node, but apart from 
> that it looks fine. Assuming you are using the same configs, and if you have 
> hard-coded the token(s), you use the same ones.
>
> Hannu
>
>> On 26 Jun 2017, at 23.24, Nitan Kainth  wrote:
>>
>> Hi,
>>
>> We are planning to update Linux for C* nodes on version 3.0. Does anybody have 
>> steps from doing it in the recent past?
>>
>> Here are draft steps, we are thinking:
>> 1. Create new node. It might have a different IP address.
>> 2. Detach mounts from existing node
>> 3. Attach mounts to new Node
>> 4. Start C*
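A rough per-node sketch of those steps on AWS (the volume/instance IDs, device name, 
and service name are placeholders; step 1.5 is Hannu's suggestion above):

  nodetool drain && sudo service cassandra stop            # step 1.5: flush and stop C* cleanly
  aws ec2 detach-volume --volume-id vol-0abc123            # step 2: detach data volume from old node
  aws ec2 attach-volume --volume-id vol-0abc123 \
      --instance-id i-0newnode --device /dev/xvdf          # step 3: attach to the upgraded node
  sudo service cassandra start                             # step 4: start C* with the old data/tokens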




  


Re: Partition range incremental repairs

2017-06-06 Thread Anuj Wadehra
Hi Chris,
Using pr with incremental repairs does not make sense. Primary range repair is an 
optimization over full repair. If you run full repair on an n-node cluster with 
RF=3, you would be repairing each piece of data three times. E.g., in a 5-node 
cluster with RF=3, a range may exist on nodes A, B, and C. When full repair is run 
on node A, the entire data in that range gets synced with the replicas on nodes B 
and C. Now, when you run full repair on nodes B and C, you are wasting resources 
repairing data which is already repaired.
Primary range repair ensures that when you run repair on a node, it ONLY repairs the 
data which is owned by that node. Thus, no node repairs data which is not owned by 
it and must be repaired by another node. Redundant work is eliminated.
Still, each time you run pr on all nodes, you repair 100% of the data. Why repair 
the complete data set in each cycle, even data which has not changed since the last 
repair cycle?
This is where incremental repair comes in as an improvement. Once repaired, data is 
marked as repaired so that the next repair cycle can focus on just the delta. Now, 
let's go back to the example of the 5-node cluster with RF=3, this time running 
incremental repair on all nodes. When you repair the entire data on node A, all 3 
replicas are marked as repaired. Even if you run incremental repair on all ranges on 
the second node, you would not re-repair the already repaired data. Thus, there is 
no advantage in repairing only the data owned by the node (its primary range). You 
can run incremental repair on all the data present on a node, and Cassandra will 
make sure that when you repair data on the 
other nodes, you only repair unrepaired data.
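Concretely, the two styles look like this (a sketch; option spellings are from 
2.2-era nodetool and the keyspace name is a placeholder, so check nodetool help 
repair on your version):

  # full repair of only this node's primary range -- run on EVERY node, once per cycle
  nodetool repair -full -pr my_keyspace

  # incremental repair (the 2.2+ default) -- no -pr, covers all ranges the node holds
  nodetool repair my_keyspace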
Thanks
Anuj


Sent from Yahoo Mail on Android 
 
  On Tue, Jun 6, 2017 at 4:27 PM, Chris Stokesmore wrote:
Hi all,

Wondering if anyone had any thoughts on this? At the moment the long-running repairs 
cause us to be running them on two nodes at once for a bit of time, which obviously 
increases the cluster load.

On 2017-05-25 16:18 (+0100), Chris Stokesmore  wrote: 
> Hi,
> 
> We are running a 7-node Cassandra 2.2.8 cluster, RF=3, and had been running 
> repairs with the -pr option, via a cron job that runs on each node once per 
> week.
> 
> We changed that as some advice on the Cassandra IRC channel said it would 
> cause more anticompaction, and 
> http://docs.datastax.com/en/archived/cassandra/2.2/cassandra/tools/toolsRepair.html
> says 'Performing partitioner range repairs by using the -pr option is 
> generally considered a good choice for doing manual repairs. However, this 
> option cannot be used with incremental repairs (default for Cassandra 2.2 and 
> later)'.
> 
> The only problem is our -pr repairs were taking about 8 hours, and now the non-pr 
> repairs are taking 24+ - I guess this makes sense, repairing 1/7 of the data 
> increased to 3/7, except I was hoping to see a speed-up after the first loop 
> through the cluster, as each repair will be marking much more data as 
> repaired, right?
> 
> Is running -pr with incremental repairs really that bad?
  


Re: Partition range incremental repairs

2017-06-06 Thread Anuj Wadehra
Hi Chris,
Can you share the following info:
1. The exact repair commands you use for incremental repair and pr repair.
2. Repair time should be measured at the cluster level for incremental repair. So, 
what is the total time it takes to run repair on all nodes for incremental vs. pr 
repairs?
3. You are repairing one DC, DC3. How many DCs are there in total, and what is the 
RF for the keyspaces? Running pr on a specific DC would not repair the entire data 
set.
4. 885 ranges - from where did you get this number, the logs? Can you share the 
number of ranges printed in the logs for both the incremental and pr cases?

Thanks
Anuj

Sent from Yahoo Mail on Android 
 
  On Tue, Jun 6, 2017 at 9:33 PM, Chris Stokesmore <chris.elsm...@demandlogic.co> wrote:
Thank you for the excellent and clear description of the different versions of 
repair, Anuj; that has cleared up what I expect to be happening.
The problem now is that in our cluster we are running repairs with options 
(parallelism: parallel, primary range: false, incremental: true, job threads: 1, 
ColumnFamilies: [], dataCenters: [DC3], hosts: [], # of ranges: 885), and our 
repairs are taking over a day to complete, whereas previously, running with the 
partition range option, they were taking more like 8-9 hours.
As I understand it, using incremental repair should have sped this process up, as 
all three sets of data on each repair job should be marked as repaired; however, 
this does not seem to be the case. Any ideas?
Chris



  


Re: LWT and non-LWT mixed

2017-10-10 Thread Anuj Wadehra
Hi Daniel,
What is the RF and CL for the DELETE? Are you using asynchronous writes? Are you 
firing both statements from the same node sequentially? Are you firing these queries 
in a loop such that more than one DELETE and LWT is fired for the same partition?
I think that if the same client executes both statements sequentially in the same 
thread, i.e. one after another, and the DELETE is synchronous, it should work fine. 
The LWT will execute after Cassandra has written the delete to a quorum of nodes, 
and so it will see the data. The Paxos round for the LWT is only initiated once the 
DELETE completes. I think LWT should not be mixed with normal writes when such 
writes are fired from multiple nodes/threads on the same partition.
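If you do need the two operations ordered across clients, one commonly suggested 
pattern is to make both of them LWTs, so Paxos serializes them on the partition. A 
sketch against a hypothetical table:

  -- both statements go through Paxos, so they are linearized per partition
  DELETE FROM ks.events WHERE id = 42 IF EXISTS;
  INSERT INTO ks.events (id, payload) VALUES (42, 'retry-1') IF NOT EXISTS;

Reads that must observe in-flight Paxos writes would also use SERIAL/LOCAL_SERIAL 
consistency.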

Thanks
Anuj
Sent from Yahoo Mail on Android 
 
  On Tue, 10 Oct 2017 at 14:10, Daniel Woo wrote:
The document explains that you cannot mix them:
http://docs.datastax.com/en/archived/cassandra/2.2/cassandra/dml/dmlLtwtTransactions.html

But what happens under the hood if I do? E.g.:

DELETE 
INSERT ... IF NOT EXISTS

The coordinator has 4 steps to do the second statement (INSERT):
1. prepare/promise a ballot
2. read current row from replicas
3. propose new value along with the ballot to replicas
4. commit and wait for ack from replicas
My question is: once the row is DELETEd, the next INSERT LWT should be able to see 
that row's tombstone in step 2 and then successfully insert the new value. But my 
tests show that this often fails - does anybody know why?
-- 
Thanks & Regards,
Daniel  


Cassandra Upgrade with Different Protocol Version

2018-07-05 Thread Anuj Wadehra
Hi,
I woud like to know how people are doing rolling upgrade of Casandra clustes 
when there is a change in native protocol version say from 2.1 to 3.11. During 
rolling upgrade, if client application is restarted on nodes, the client driver 
may first contact an upgraded Cassandra node with v4 and permanently mark all 
old Casandra nodes on v3 as down. This may lead to request failures. Datastax 
recommends two ways to deal with this:
1. Before upgrade, set protocol version to lower protocol version. And move to 
higher version once entire cluster is upgraded.2. Make sure driver only 
contacts upraded Cassandra nodes during rolling upgrade.
Second workaround will lead to failures as you may not be able to meet required 
consistency for some time.
Lets consider first workaround. Now imagine an application where protocol 
version is not configurable and code uses default protocol version. You can not 
apply first workaroud because you have to upgrade your application on all nodes 
to first make the protocol version configurable. How would you upgrade such a 
cluster without downtime? Thoughts?
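For reference, pinning the version in the DataStax Java driver 3.x looks roughly 
like this (a sketch; the contact point is a placeholder, and V3 is the highest 
protocol version a 2.1 node speaks):

  import com.datastax.driver.core.Cluster;
  import com.datastax.driver.core.ProtocolVersion;

  // Force v3 so the driver can talk to both 2.1 and 3.11 nodes during the rolling
  // upgrade, then remove the pin (or move to V4) once every node runs 3.11.
  Cluster cluster = Cluster.builder()
      .addContactPoint("10.0.0.1")
      .withProtocolVersion(ProtocolVersion.V3)
      .build();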
Thanks
Anuj



Re: [External] Re: Which version is the best version to run now?

2018-03-06 Thread Anuj Wadehra
We evaluated both 3.0.x and 3.11.x. +1 for 3.11.2 as we faced major performance 
issues with 3.0.x. We have NOT evaluated new features on 3.11.x.
Anuj

Sent from Yahoo Mail on Android 
 
  On Tue, 6 Mar 2018 at 19:35, Alain RODRIGUEZ wrote:   
Hello Tom,

It's good to hear this kind of feedback,
Thanks for sharing.

3.11.x seems to get more love from the community wrt patches. This is why I'd 
recommend 3.11.x for new projects.


I also agree with this analysis.

Stay away from any of the 2.x series, they're going EOL soonish and the newer 
versions are very stable.


+1 here as well. Maybe add that 3.11.x, described as 'very stable' above, aims at 
stabilizing Cassandra after the tick-tock releases: it is a 'bug fix' series that 
also brings the features developed during that period, even though one needs to be 
careful with some of the new features, even in the latest 3.11.x versions.
I did not work that much with it yet, but I think I would pick 3.11.2 as well 
for a new cluster at the moment.
C*heers,
---
Alain Rodriguez - @arodream - alain@thelastpickle.com
France / Spain

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

2018-03-05 12:39 GMT+00:00 Tom van der Woerdt :

We run on the order of a thousand Cassandra nodes in production. Most of that 
is 3.0.16, but new clusters are defaulting to 3.11.2 and some older clusters 
have been upgraded to it as well.
All of the bugs I encountered in 3.11.x were also seen in 3.0.x, but 3.11.x 
seems to get more love from the community wrt patches. This is why I'd 
recommend 3.11.x for new projects.
Stay away from any of the 2.x series, they're going EOL soonish and the newer 
versions are very stable.

Tom van der Woerdt
Site Reliability Engineer

Booking.com B.V.
Vijzelstraat 66-80 Amsterdam 1017HL Netherlands
The world's #1 accommodation site 
43 languages, 198+ offices worldwide, 120,000+ global destinations, 1,550,000+ 
room nights booked every day 
No booking fees, best price always guaranteed 
Subsidiary of Booking Holdings Inc. (NASDAQ: BKNG)

On Sat, Mar 3, 2018 at 12:25 AM, Jeff Jirsa  wrote:

I’d personally be willing to run 3.0.16.
3.11.2 or .3 or whatever should also be similar, but I haven’t personally tested it 
at any meaningful scale.

-- Jeff Jirsa

On Mar 2, 2018, at 2:37 PM, Kenneth Brotman  
wrote:



Seems like a lot of people are running old versions of Cassandra. What is the best, 
most reliable, stable version to use now?

 

Kenneth Brotman





  


Re: Upgrade to v3.11.3

2019-01-16 Thread Anuj Wadehra
Hi Shalom,
Just a suggestion: before upgrading to 3.11.3, make sure you are not impacted by any 
open critical defects, especially those related to range tombstones (RT) which may 
cause data loss, e.g. CASSANDRA-14861.
Please find my responses below:

> The upgrade process that I know of is from 2.0.14 to 2.1.x (higher than 2.1.9, I 
> think) and then from 2.1.x to 3.x. Do I need to upgrade first to 3.0.x, or can I 
> upgrade directly from 2.1.x to 3.11.3?

Response: Yes, you can upgrade from 2.0.14 to a recent stable 2.1.x version (2.1.9+ 
only) and then upgrade directly to 3.11.3.

> Can I run upgradesstables on several nodes in parallel? Is it crucial to run it 
> one node at a time?

Response: Yes, you can run it in parallel.

> When running upgradesstables on a node, does that node still serve writes and 
> reads?

Response: Yes.

> Can I use OpenJDK 8 (instead of Oracle JDK) with C* 3.11.3?

Response: We have not tried it, but it should be okay. See 
https://issues.apache.org/jira/plugins/servlet/mobile#issue/CASSANDRA-13916.

> Is there a way to speed up the upgradesstables process? (besides 
> compaction_throughput)

Response: If clearing the pending compactions caused by rewriting sstables is a 
concern, you can also try increasing concurrent_compactors.

Disclaimer: The information provided in the above responses is my personal opinion 
based on the best of my knowledge and experience. We do not take any responsibility 
and are not liable for any damage caused by actions taken based on the above 
information.

Thanks
Anuj
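As a footnote to the last answer, a rough sketch of the knobs involved (the values 
and keyspace name are assumptions to tune for your hardware, and flag availability 
varies by version - check nodetool help upgradesstables first):

  # cassandra.yaml -- allow more parallel compaction work while sstables are rewritten
  concurrent_compactors: 4

  # rewrite sstables with several parallel jobs on this node, if your nodetool supports -j
  nodetool upgradesstables -j 4 my_keyspace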
 
 
  On Wed, 16 Jan 2019 at 19:15, shalom sagges wrote:
Hi All, 

I'm about to start a rolling upgrade process from version 2.0.14 to version 3.11.3. 
I have a few small questions:
   - The upgrade process that I know of is from 2.0.14 to 2.1.x (higher than 2.1.9 I 
think) and then from 2.1.x to 3.x. Do I need to upgrade first to 3.0.x, or can I 
upgrade directly from 2.1.x to 3.11.3?
   - Can I run upgradesstables on several nodes in parallel? Is it crucial to run it 
one node at a time?
   - When running upgradesstables on a node, does that node still serve writes and 
reads?
   - Can I use OpenJDK 8 (instead of Oracle JDK) with C* 3.11.3?
   - Is there a way to speed up the upgradesstables process? (besides 
compaction_throughput)

Thanks!


ApacheCon Europe 2019

2019-05-13 Thread Anuj Wadehra
Hi,
Do we have any plans for a dedicated Apache Cassandra track or sessions at ApacheCon 
Berlin in October 2019?
The CFP closes on 26 May 2019.

Thanks
Anuj Wadehra

