Re: Adding a new node with the double of disk space

2017-08-19 Thread Jeff Jirsa
You'd use different num_tokens only if you wanted an imbalance (e.g. new 
hardware specs where you wanted to use fewer, larger machines).
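
As a rough illustration of the imbalance being described, here is a minimal sketch that estimates each node's expected share of the ring as its num_tokens divided by the cluster total. The node names and the three-plus-one layout are hypothetical, and the calculation assumes evenly spread tokens and ignores replication factor, racks, and multiple DCs, so real ownership will differ.

```python
from typing import Dict


def expected_ownership(num_tokens: Dict[str, int]) -> Dict[str, float]:
    """Approximate each node's share of the ring as num_tokens / total tokens.

    Assumption: tokens are spread evenly, so a node's share is proportional
    to how many tokens it holds. Replication is not modelled.
    """
    total = sum(num_tokens.values())
    if total <= 0:
        raise ValueError("cluster must have at least one token")
    return {node: count / total for node, count in num_tokens.items()}


# Hypothetical cluster: three existing nodes at the default 256 tokens,
# plus one new node configured with 512.
cluster = {"node1": 256, "node2": 256, "node3": 256, "new_node": 512}
shares = expected_ownership(cluster)
for node, share in shares.items():
    print(f"{node}: {share:.1%}")
```

Under these assumptions the new node is expected to own about twice the share of each existing node, which is the deliberate imbalance referred to above.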

-- 
Jeff Jirsa


> On Aug 19, 2017, at 6:04 PM, Subroto Barua  
> wrote:
> 
> Jeff,
> 
> Is it OK to have different values of num_tokens per node in a cluster? Won't 
> it create cluster imbalance? Or is it better to initiate it on a separate DC?
> 
> Subroto
> 
> 
> On Friday, August 18, 2017, 5:34:11 AM PDT, Durity, Sean R 
>  wrote:
> 
> 
> I am doing some on-the-job-learning on this newer feature of the 3.x line, 
> where the token generation algorithm will compensate for different size nodes 
> in a cluster. In fact, it is one of the main reasons I upgraded to 3.0.13, 
> because I have a number of original nodes in a cluster that are about half 
> the size of the newer nodes. With the same number of vnodes, they can get 
> overwhelmed with too much data and have to be rebuilt, etc.
> 
>  
> 
> So, I am cutting vnodes in half on those original nodes and rebuilding them. 
> So far, it is working as designed. The data size is about half on the smaller 
> nodes.
> 
>  
> 
> With the more current advice being to use fewer vnodes, for the original 
> question below, I might consider adding the new node in at 256 vnodes and 
> then rebuilding all the other nodes at 128. Of course the cluster size and 
> amount of data would be important factors, as well as the future growth of 
> the cluster and the expected size of any additional nodes.
> 
>  
> 
>  
> 
> Sean Durity
> 
>  
> 
> From: Jeff Jirsa [mailto:jji...@gmail.com] 
> Sent: Thursday, August 17, 2017 4:20 PM
> To: cassandra 
> Subject: Re: Adding a new node with the double of disk space
> 
>  
> 
> If you really double the hardware in every way, it's PROBABLY reasonable to 
> double num_tokens. It won't be quite the same as doubling all-the-things, 
> because you still have a single JVM, and you'll still have to deal with GC as 
> you're now reading twice as much and generating twice as much garbage, but 
> you can probably adjust the tuning of the heap to compensate.
> 
>  
> 
>  
> 
>  
> 
> On Thu, Aug 17, 2017 at 1:00 PM, Kevin O'Connor  
> wrote:
> 
> Are you saying if a node had double the hardware capacity in every way it 
> would be a bad idea to up num_tokens? I thought that was the whole idea of 
> that setting though?
> 
>  
> 
> On Thu, Aug 17, 2017 at 9:52 AM, Carlos Rolo  wrote:
> 
> No.
> 
>  
> 
> Even if you doubled all the hardware on that node compared with the others, 
> it would still be a bad idea.
> 
> Keep the cluster uniform vnode-wise.
> 
> 
> Regards,
> 
>  
> 
> Carlos Juzarte Rolo
> 
> Cassandra Consultant / Datastax Certified Architect / Cassandra MVP
> 
>  
> 
> Pythian - Love your data
> 
>  
> 
> rolo@pythian | Twitter: @cjrolo | Skype: cjr2k3 | Linkedin: 
> linkedin.com/in/carlosjuzarterolo 
> 
> Mobile: +351 918 918 100
> 
> www.pythian.com
> 
>  
> 
> On Thu, Aug 17, 2017 at 5:47 PM, Cogumelos Maravilha 
>  wrote:
> 
> Hi all,
> 
> I need to add a new node to my cluster, but this time the new node will
> have double the disk space compared to the other nodes.
> 
> I'm using the default vnodes (num_tokens: 256). To fully use the disk
> space on the new node, do I just have to configure num_tokens: 512?
> 
> Thanks in advance.
> 
> 
> 
> -
> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: user-h...@cassandra.apache.org
>  
> 
>  
> 
> --
> 
>  
> 
>  
> 
>  
> 
> 
> 
> The information in this Internet Email is confidential and may be legally 
> privileged. It is intended solely for the addressee. Access to this Email by 
> anyone else is unauthorized. If you are not the intended recipient, any 
> disclosure, copying, distribution or any action taken or omitted to be taken 
> in reliance on it, is prohibited and may be unlawful. When addressed to our 
> clients any opinions or advice contained in this Email are subject to the 
> terms and conditions expressed in any applicable governing The  Home Depot 
> terms of business or client engagement letter. The Home Depot disclaims all 
> responsibility and liability for the accuracy and content of this attachment 
> and for any damages or losses arising from any inaccuracies, errors, viruses, 
> e.g., worms, trojan horses, etc., or other items of a destructive nature, 
> which may be contained in this attachment and shall not be liable for direct, 
> indirect, consequential or special damages in connection with this e-mail 
> message or its attachment.
> 


Re: RE: Adding a new node with the double of disk space

2017-08-19 Thread Subroto Barua
Jeff,
Is it OK to have different values of num_tokens per node in a cluster? Won't it 
create cluster imbalance? Or is it better to initiate it on a separate DC?
Subroto

On Friday, August 18, 2017, 5:34:11 AM PDT, Durity, Sean R 
 wrote:

I am doing some on-the-job-learning on this newer feature of the 3.x line, 
where the token generation algorithm will compensate for different size nodes 
in a cluster. In fact, it is one of the main reasons I upgraded to 3.0.13, 
because I have a number of original nodes in a cluster that are about half the 
size of the newer nodes. With the same number of vnodes, they can get 
overwhelmed with too much data and have to be rebuilt, etc. 
 
  
 
So, I am cutting vnodes in half on those original nodes and rebuilding them. So 
far, it is working as designed. The data size is about half on the smaller 
nodes.
 
  
 
With the more current advice being to use fewer vnodes, for the original 
question below, I might consider adding the new node in at 256 vnodes and then 
rebuilding all the other nodes at 128. Of course the cluster size and amount of 
data would be important factors, as well as the future growth of the cluster 
and the expected size of any additional nodes.
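
To make the 256-versus-128 idea concrete, the following is a small simulation sketch, not Cassandra's own allocation code. It assumes purely random token assignment on a Murmur3-sized ring (the 3.x allocation algorithm mentioned above is not modelled), and the three "old" nodes plus one "new" node are hypothetical. Each token owns the range from the previous token on the ring up to itself.

```python
import random
from typing import Dict

MIN_TOKEN = -2**63   # lower bound of the Murmur3 token range
RING_SIZE = 2**64    # number of distinct tokens on the ring


def simulate_ownership(num_tokens: Dict[str, int], seed: int = 0) -> Dict[str, float]:
    """Assign random tokens per node and return each node's fraction of the ring."""
    rng = random.Random(seed)
    ring = []
    for node, count in num_tokens.items():
        for _ in range(count):
            token = rng.randrange(MIN_TOKEN, MIN_TOKEN + RING_SIZE)
            ring.append((token, node))
    ring.sort()

    owned = {node: 0 for node in num_tokens}
    for i, (token, node) in enumerate(ring):
        # ring[i - 1] wraps to the last token when i == 0, closing the ring.
        prev_token = ring[i - 1][0]
        owned[node] += (token - prev_token) % RING_SIZE
    return {node: width / RING_SIZE for node, width in owned.items()}


# Hypothetical layout: three smaller nodes rebuilt at 128, one larger node at 256.
layout = {"old1": 128, "old2": 128, "old3": 128, "new": 256}
result = simulate_ownership(layout, seed=1)
for node, share in result.items():
    print(f"{node}: {share:.1%}")
```

With random tokens the larger node should land near twice the share of each smaller node, with some run-to-run variation; fewer tokens per node generally means more variation around that target.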
 
  
 
  
 
Sean Durity
 
  
 
From: Jeff Jirsa [mailto:jji...@gmail.com]
Sent: Thursday, August 17, 2017 4:20 PM
To: cassandra 
Subject: Re: Adding a new node with the double of disk space
 
  
 
If you really double the hardware in every way, it's PROBABLY reasonable to 
double num_tokens. It won't be quite the same as doubling all-the-things, 
because you still have a single JVM, and you'll still have to deal with GC as 
you're now reading twice as much and generating twice as much garbage, but you 
can probably adjust the tuning of the heap to compensate.
 
  
 
  
 
  
 
On Thu, Aug 17, 2017 at 1:00 PM, Kevin O'Connor  
wrote:
 

Are you saying if a node had double the hardware capacity in every way it would 
be a bad idea to up num_tokens? I thought that was the whole idea of that 
setting though?
 
  
 
On Thu, Aug 17, 2017 at 9:52 AM, Carlos Rolo  wrote:
 

No.
 
  
 
Even if you doubled all the hardware on that node compared with the others, it 
would still be a bad idea.
 
Keep the cluster uniform vnode-wise.
 


 
Regards,
 
  
 
Carlos Juzarte Rolo
 
Cassandra Consultant / Datastax Certified Architect / Cassandra MVP
 
 
 
Pythian - Love your data
 
  
 
rolo@pythian | Twitter: @cjrolo | Skype: cjr2k3 | Linkedin: 
linkedin.com/in/carlosjuzarterolo

 
Mobile: +351 918 918 100 
 
www.pythian.com
 
  
 
On Thu, Aug 17, 2017 at 5:47 PM, Cogumelos Maravilha 
 wrote:
 


Hi all,

I need to add a new node to my cluster, but this time the new node will
have double the disk space compared to the other nodes.

I'm using the default vnodes (num_tokens: 256). To fully use the disk
space on the new node, do I just have to configure num_tokens: 512?

Thanks in advance.




Re: Moving all LCS SSTables to a repaired state

2017-08-19 Thread Sotirios Delimanolis
That's the only way to get this done then, break writes and fix them with 
incremental repairs?

On Friday, August 18, 2017, 5:17:38 PM PDT, kurt greaves  
wrote:

You need to run an incremental repair for SSTables to be marked repaired. 
However, an SSTable ends up marked repaired only if all of the data in it is 
repaired during the repair; otherwise an anticompaction will occur and split 
the unrepaired data into its own SSTable. It's pretty unlikely you will get 
all SSTables marked as repaired unless you stop writing data and run 
incremental repair multiple times.
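
The split described above can be pictured with a toy model. This is only a sketch under simplifying assumptions, not how Cassandra implements anticompaction: an SSTable is reduced to a single half-open integer token interval, and the function name anticompact and the Range type are made up for illustration.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # half-open [start, end) token interval (toy model)


def anticompact(sstable: Range, repaired: List[Range]) -> Tuple[List[Range], List[Range]]:
    """Split a toy SSTable interval into repaired and unrepaired parts."""
    start, end = sstable
    repaired_parts: List[Range] = []
    unrepaired_parts: List[Range] = []
    cursor = start
    for r_start, r_end in sorted(repaired):
        # Clip the repaired range to the part of the SSTable not yet handled.
        lo, hi = max(r_start, cursor), min(r_end, end)
        if lo >= hi:
            continue
        if lo > cursor:
            unrepaired_parts.append((cursor, lo))
        repaired_parts.append((lo, hi))
        cursor = hi
    if cursor < end:
        unrepaired_parts.append((cursor, end))
    return repaired_parts, unrepaired_parts


# Fully covered by the repair: nothing is left unrepaired.
print(anticompact((0, 100), [(0, 100)]))   # ([(0, 100)], [])
# Partially covered: the remainder is split out as unrepaired.
print(anticompact((0, 100), [(0, 60)]))    # ([(0, 60)], [(60, 100)])
```

In this model, data written while a repair is running would show up as new unrepaired intervals, which is why continuous writes keep some SSTables unrepaired.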