Re: Tuning bootstrap new node

2017-10-31 Thread Peng Xiao
We noticed that streaming takes about 10 hours; all that remains afterwards is 
compaction, so we will increase concurrent_compactors first.
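For reference, a minimal sketch of the setting being changed (the value 8 is only an example, size it to your CPU and disks):

    # cassandra.yaml -- concurrent_compactors takes effect after a node restart
    concurrent_compactors: 8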


Thanks all for your reply.




-- Original --
From: "Jon Haddad";
Date: Wed, Nov 1, 2017 4:06
To: "user";

Subject: Re: Tuning bootstrap new node



Of all the settings you could change, why one that's related to memtables?  
Streaming doesn't go through the write path, memtables aren't involved unless 
you're using materialized views or CDC.

On Oct 31, 2017, at 11:44 AM, Anubhav Kale  
wrote:

You can change YAML setting of memtable_cleanup_threshold to 0.7 (from the 
default of 0.3). This will push SSTables to disk less often and will reduce the 
compaction time.
 
While this won't change the streaming time, it will reduce the overall time 
for your node to be healthy.
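For anyone who wants to experiment with this despite Jon's caveat above (streaming bypasses memtables), a minimal sketch of the change being suggested; 0.7 is the value from this thread, not a general recommendation:

    # cassandra.yaml -- requires a node restart;
    # the default is roughly 1 / (memtable_flush_writers + 1)
    memtable_cleanup_threshold: 0.7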
 
From: Harikrishnan Pillai [mailto:hpil...@walmartlabs.com] 
Sent: Tuesday, October 31, 2017 11:28 AM
To: user@cassandra.apache.org
Subject: Re: Re: Tuning bootstrap new node


 

There is no magic in speeding up the node addition other than increasing stream 
throughput and compaction throughput.

It has been noticed that with heavy compactions the latency may go up if the 
node also starts serving data.

If you really don't want this node to serve traffic until all compactions 
settle down, you can disable gossip and the binary protocol using nodetool. This 
will allow compactions to continue, but it requires a repair later to fix the 
stale data.
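A rough sketch of the nodetool commands being referred to (the numbers are placeholders; streaming throughput is capped in megabits/s, compaction throughput in MB/s):

    # raise the caps at runtime
    nodetool setstreamthroughput 400
    nodetool setcompactionthroughput 64

    # keep the node from serving client traffic while compactions catch up
    nodetool disablebinary
    nodetool disablegossip

    # watch progress, then re-enable (a repair is needed afterwards, as noted above)
    nodetool compactionstats
    nodetool enablegossip
    nodetool enablebinary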

Regards

Hari

 


From: Nitan Kainth 
Sent: Tuesday, October 31, 2017 5:47 AM
To: user@cassandra.apache.org
Subject: EXT: Re: Tuning bootstrap new node
 


Do not stop compaction, you will end up with thousands of sstables.
 


You can increase stream throughput from the default of 200 to a higher value if 
your network can handle it.
Sent from my iPhone



On Oct 31, 2017, at 6:35 AM, Peng Xiao <2535...@qq.com> wrote:

Can we stop compaction while the new node is bootstrapping and enable it after 
the new node has joined?

 

Thanks

-- Original --

From:  "我自己的邮箱";<2535...@qq.com>;

Date:  Tue, Oct 31, 2017 07:18 PM

To:  "user";

Subject:  Tuning bootstrap new node


 

Dear All,
 

Can we do some tuning to make bootstrapping a new node quicker? We have a three-DC 
cluster (RF=3 in two DCs, RF=1 in the other; 48 nodes in the DC with RF=3). As 
the cluster becomes larger and larger, we need to spend more than 24 hours 
to bootstrap a new node.

Could you please advise how to tune this?

 

Many Thanks,

Peng Xiao

Re: Snapshot verification

2017-10-31 Thread Pradeep Chhetri
Hi Varun,

Thank you for the reply. I was looking for some kind of automated way (e.g.,
if I can get some kind of md5 per table while taking the snapshot and compare
it with the md5 after restoring that snapshot).
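In case it helps, a rough sketch along the lines of Varun's COPY suggestion below, with checksums instead of diff (keyspace, table and host names are placeholders; COPY does not guarantee row order, so sort before hashing):

    # export the same table from the source and the restored cluster
    cqlsh source-host -e "COPY myks.mytable TO '/tmp/source.csv'"
    cqlsh restore-host -e "COPY myks.mytable TO '/tmp/restored.csv'"

    # compare checksums of the sorted exports
    sort /tmp/source.csv | md5sum
    sort /tmp/restored.csv | md5sum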

Regards.

On Tue, Oct 31, 2017 at 10:47 PM, Varun Gupta  wrote:

> We use the COPY command to generate a file from both source and destination. After
> that you can use a diff tool.
>
> On Mon, Oct 30, 2017 at 10:11 PM Pradeep Chhetri 
> wrote:
>
>> Hi,
>>
>> We are taking daily snapshots for backing up our cassandra data and then
>> use our backups to restore in a different environment. I would like to
>> verify that the data is consistent and all the data during the time backup
>> was taken is actually restored.
>>
>> Currently I just count the number of rows in each table. Was wondering if
>> there is any inbuilt way to accomplish this.
>>
>> Thank you.
>> Pradeep
>>
>


Error 1609 while installing Cassandra on Windows 2012 R2

2017-10-31 Thread varun bhatnagar
Hi,

I am new to Cassandra and am trying to install Cassandra 2.2.8 using
SaltStack on Windows 2012 R2, but when I do I get the error below:

Action start 1:17:58: CreateFolders.
MSI (s) (94:A8) [01:17:58:072]:
Error 1609. An error occurred while applying security settings.
WORKGROUP\SYSTEM is not a valid user or group. This could be a problem with
the package, or a problem connecting to a domain controller on the network.
Check your network connection and click Retry, or Cancel to end the
install. Unable to locate the user's SID, system error 1332
Action ended 1:17:58: CreateFolders. Return value 2.
Action ended 1:17:58: INSTALL. Return value 2.

I have pasted detailed log here https://pastebin.com/2JdFcLyU
Can anyone please suggest a way to fix this?

BR,
Varun




Re: Tuning bootstrap new node

2017-10-31 Thread Jon Haddad
Of all the settings you could change, why one that’s related to memtables?  
Streaming doesn’t go through the write path, memtables aren’t involved unless 
you’re using materialized views or CDC.

> On Oct 31, 2017, at 11:44 AM, Anubhav Kale 
>  wrote:
> 
> You can change YAML setting of memtable_cleanup_threshold to 0.7 (from the 
> default of 0.3). This will push SSTables to disk less often and will reduce 
> the compaction time.
>  
> While this won’t change the streaming time, it will reduce the overall time 
> for your node to be healthy.
>  
> From: Harikrishnan Pillai [mailto:hpil...@walmartlabs.com] 
> Sent: Tuesday, October 31, 2017 11:28 AM
> To: user@cassandra.apache.org
> Subject: Re: Re: Tuning bootstrap new node
>  
> There is no magic in speeding up the node addition other than increasing 
> stream throughput and compaction throughput.
> 
> It has been noticed that with heavy compactions the latency may go up if the 
> node also starts serving data.
> 
> If you really don't want this node to serve traffic until all compactions 
> settle down, you can disable gossip and the binary protocol using nodetool. This 
> will allow compactions to continue, but it requires a repair later to fix the 
> stale data.
> 
> Regards
> 
> Hari
> 
>  
> 
> From: Nitan Kainth
> Sent: Tuesday, October 31, 2017 5:47 AM
> To: user@cassandra.apache.org 
> Subject: EXT: Re: Tuning bootstrap new node
>  
> Do not stop compaction, you will end up with thousands of sstables.
>  
> You can increase stream throughput from the default of 200 to a higher value if 
> your network can handle it.
> 
> Sent from my iPhone
> 
> On Oct 31, 2017, at 6:35 AM, Peng Xiao <2535...@qq.com> wrote:
> 
> Can we stop compaction while the new node is bootstrapping and enable it 
> after the new node has joined?
>  
> Thanks
> -- Original --
> From:  "我自己的邮箱";<2535...@qq.com>;
> Date:  Tue, Oct 31, 2017 07:18 PM
> To:  "user";
> Subject:  Tuning bootstrap new node
>  
> Dear All,
>  
> Can we do some tuning to make bootstrapping a new node quicker? We have a three-DC 
> cluster (RF=3 in two DCs, RF=1 in the other; 48 nodes in the DC with RF=3). As 
> the cluster becomes larger and larger, we need to spend more than 24 hours 
> to bootstrap a new node.
> Could you please advise how to tune this?
>  
> Many Thanks,
> Peng Xiao



RE: Re: Tuning bootstrap new node

2017-10-31 Thread Anubhav Kale
You can change YAML setting of memtable_cleanup_threshold to 0.7 (from the 
default of 0.3). This will push SSTables to disk less often and will reduce the 
compaction time.

While this won’t change the streaming time, it will reduce the overall time for 
your node to be healthy.

From: Harikrishnan Pillai [mailto:hpil...@walmartlabs.com]
Sent: Tuesday, October 31, 2017 11:28 AM
To: user@cassandra.apache.org
Subject: Re: Re: Tuning bootstrap new node


There is no magic in speeding up the node addition other than increasing stream 
throughput and compaction throughput.

> It has been noticed that with heavy compactions the latency may go up if the 
> node also starts serving data.
> 
> If you really don't want this node to serve traffic until all compactions 
> settle down, you can disable gossip and the binary protocol using nodetool. This 
> will allow compactions to continue, but it requires a repair later to fix the 
> stale data.

Regards

Hari


From: Nitan Kainth
Sent: Tuesday, October 31, 2017 5:47 AM
To: user@cassandra.apache.org
Subject: EXT: Re: Tuning bootstrap new node

Do not stop compaction, you will end up with thousands of sstables.

You can increase stream throughput from the default of 200 to a higher value if 
your network can handle it.
Sent from my iPhone

On Oct 31, 2017, at 6:35 AM, Peng Xiao <2535...@qq.com> 
wrote:
Can we stop compaction while the new node is bootstrapping and enable it after 
the new node has joined?

Thanks
-- Original --
From:  "我自己的邮箱";<2535...@qq.com>;
Date:  Tue, Oct 31, 2017 07:18 PM
To:  "user";
Subject:  Tuning bootstrap new node

Dear All,

Can we do some tuning to make bootstrapping a new node quicker? We have a three-DC 
cluster (RF=3 in two DCs, RF=1 in the other; 48 nodes in the DC with RF=3). As 
the cluster becomes larger and larger, we need to spend more than 24 hours 
to bootstrap a new node.
Could you please advise how to tune this?

Many Thanks,
Peng Xiao


Re: Re: Tuning bootstrap new node

2017-10-31 Thread Harikrishnan Pillai
There is no magic in speeding up the node addition other than increasing stream 
throughput and compaction throughput.

It has been noticed that with heavy compactions the latency may go up if the 
node also starts serving data.

If you really don't want this node to serve traffic until all compactions 
settle down, you can disable gossip and the binary protocol using nodetool. This 
will allow compactions to continue, but it requires a repair later to fix the 
stale data.

Regards

Hari



From: Nitan Kainth 
Sent: Tuesday, October 31, 2017 5:47 AM
To: user@cassandra.apache.org
Subject: EXT: Re: Tuning bootstrap new node

Do not stop compaction, you will end up with thousands of sstables.

You can increase stream throughput from the default of 200 to a higher value if 
your network can handle it.

Sent from my iPhone

On Oct 31, 2017, at 6:35 AM, Peng Xiao <2535...@qq.com> 
wrote:

Can we stop compaction while the new node is bootstrapping and enable it after 
the new node has joined?

Thanks
-- Original --
From:  "我自己的邮箱";<2535...@qq.com>;
Date:  Tue, Oct 31, 2017 07:18 PM
To:  "user";
Subject:  Tuning bootstrap new node

Dear All,

Can we do some tuning to make bootstrapping a new node quicker? We have a three-DC 
cluster (RF=3 in two DCs, RF=1 in the other; 48 nodes in the DC with RF=3). As 
the cluster becomes larger and larger, we need to spend more than 24 hours 
to bootstrap a new node.
Could you please advise how to tune this?

Many Thanks,
Peng Xiao


Re: Snapshot verification

2017-10-31 Thread Varun Gupta
We use the COPY command to generate a file from both source and destination. After
that you can use a diff tool.
On Mon, Oct 30, 2017 at 10:11 PM Pradeep Chhetri 
wrote:

> Hi,
>
> We are taking daily snapshots for backing up our cassandra data and then
> use our backups to restore in a different environment. I would like to
> verify that the data is consistent and all the data during the time backup
> was taken is actually restored.
>
> Currently I just count the number of rows in each table. Was wondering if
> there is any inbuilt way to accomplish this.
>
> Thank you.
> Pradeep
>


RE: Cassandra proxy to control read/write throughput

2017-10-31 Thread Anubhav Kale
There are some caveats with coordinator only nodes. You can read about our 
experience in detail 
here.

From: Nate McCall [mailto:n...@thelastpickle.com]
Sent: Sunday, October 29, 2017 2:12 PM
To: Cassandra Users 
Subject: Re: Cassandra proxy to control read/write throughput

The following presentation describes in detail a technique for using 
coordinator-only nodes which will give you similar behavior (particularly 
slides 12 to 14):
https://www.slideshare.net/DataStax/optimizing-your-cluster-with-coordinator-nodes-eric-lubow-simplereach-cassandra-summit-2016

On Thu, Oct 26, 2017 at 12:07 PM, AI Rumman wrote:
Hi,

I am using different versions of Cassandra in my environment, where I have 60 
nodes running for different applications. Each application connects to 
its own cluster. I am thinking about abstracting the Cassandra IP from the app 
drivers.
The app will communicate with one proxy IP, which will redirect traffic to the 
appropriate Cassandra cluster. The reason behind this is to merge multiple 
clusters and control the read/write throughput from the proxy based on the 
application.
If anyone knows pgbouncer for PostgreSQL, I am thinking of something 
similar to that.
Has anyone worked on such a project? Can you please share some ideas?

Thanks.



--
-
Nate McCall
Wellington, NZ
@zznate

CTO
Apache Cassandra Consulting
http://www.thelastpickle.com


Re: Cassandra Compaction Metrics - CompletedTasks vs TotalCompactionCompleted

2017-10-31 Thread Chris Lohfink
CompactionMetrics is a combination of the compaction executor (sstable
compactions, secondary index builds, view building, relocate,
garbagecollect, cleanup, scrub, etc.) and the validation executor (repairs). Keep
in mind that not all jobs execute one task per operation; things that use the
parallelAllSSTableOperation, like cleanup, will create one task per sstable.

The "CompletedTasks" metric is a measure of how many tasks ran on these two
executors combined.
The "TotalCompactionsCompleted" metric is a measure of how many compactions
issued from the compaction manager ran (normal compactions, cache writes,
scrub, 2i and MVs).  So while they may be close, depending on what's
happening on the system, there's no assurance that they will be within any
bounds of each other.

So I would suspect validation compactions from repairs would be one major
difference. If you run other operational tasks there will likely be more.
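If it helps to compare the two on a live node, both values are exposed over JMX; a sketch assuming the jmxterm CLI and the default JMX port 7199 (MBean names as given in the compaction metrics documentation):

    # CompletedTasks is a gauge (attribute: Value)
    echo 'get -b org.apache.cassandra.metrics:type=Compaction,name=CompletedTasks Value' \
        | java -jar jmxterm.jar -l localhost:7199 -n

    # TotalCompactionsCompleted is a meter (attribute: Count)
    echo 'get -b org.apache.cassandra.metrics:type=Compaction,name=TotalCompactionsCompleted Count' \
        | java -jar jmxterm.jar -l localhost:7199 -n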


On Mon, Oct 30, 2017 at 12:22 PM, Lucas Benevides <
lu...@maurobenevides.com.br> wrote:

> Kurt,
>
> I appreciate your answer, but I don't believe CompletedTasks counts the
> "validation compactions". These are compactions that occur from repair
> operations. I am running tests on 10 cluster nodes in the same physical
> rack, with Cassandra Stress Tool and I didn't make any Repair commands. The
> tables only last for seven hours, so it is not reasonable that tens of
> thousands of these validation compactions occur per node.
>
> I tried to look at the code, and the CompletedTasks counter seems to be
> populated by a method from the class
> java.util.concurrent.ThreadPoolExecutor.
> So I really don't know what it is, but it surely is not the number of
> completed compaction tasks.
>
> Thank you
> Lucas Benevides
>
>-
>
>
> 2017-10-30 8:05 GMT-02:00 kurt greaves :
>
>> I believe (may be wrong) that CompletedTasks counts Validation
>> compactions while TotalCompactionsCompleted does not. Considering a lot of
>> validation compactions can be created every repair it might explain the
>> difference. I'm not sure why they are named that way or work the way they
>> do. There appears to be no documentation around this in the code (what a
>> surprise), and it looks like it was last touched in CASSANDRA-4009, which
>> also has no useful info.
>>
>> On 27 October 2017 at 13:48, Lucas Benevides > > wrote:
>>
>>> Dear community,
>>>
>>> I am studying the behaviour of the Cassandra
>>> TimeWindowCompactionStrategy. To do so I am watching some metrics. Two of
>>> these metrics are important: Compaction.CompletedTasks, a gauge, and the
>>> TotalCompactionsCompleted, a Meter.
>>>
>>> According to the documentation (http://cassandra.apache.org/doc/latest/operating/metrics.html#table-metrics):
>>> Completed Tasks = Number of completed compactions since server [re]start.
>>> TotalCompactionsCompleted = Throughput of completed compactions since
>>> server [re]start.
>>>
>>> As I realized, the TotalCompactionsCompleted, in the Meter object, has a
>>> counter, which I supposed would be numerically close to the CompletedTasks
>>> gauge. But they are very different, with the Completed Tasks being much
>>> higher than the TotalCompactionsCompleted.
>>>
>>> According to the code, in github (class metrics.CompactionMetrics.java):
>>> Completed Tasks - Number of completed compactions since server [re]start
>>> TotalCompactionsCompleted - Total number of compactions since server
>>> [re]start
>>>
>>> Can you help me and explain the difference between these two metrics, as
>>> they seem to have very distinct values, with the Completed Tasks being
>>> around 1000 times the value of the counter in TotalCompactionsCompleted.
>>>
>>> Thanks in Advance,
>>> Lucas Benevides
>>>
>>>
>>
>


Database connections based on Cassandra users

2017-10-31 Thread Chuck Reynolds
Is there a way to see who is connected to Cassandra based on a Cassandra user?


Re: Tuning bootstrap new node

2017-10-31 Thread Nitan Kainth
Do not stop compaction, you will end up with thousands of sstables.

You can increase stream throughput from the default of 200 to a higher value if 
your network can handle it.

Sent from my iPhone

> On Oct 31, 2017, at 6:35 AM, Peng Xiao <2535...@qq.com> wrote:
> 
> Can we stop compaction while the new node is bootstrapping and enable it 
> after the new node has joined?
> 
> Thanks
> -- Original --
> From:  "我自己的邮箱";<2535...@qq.com>;
> Date:  Tue, Oct 31, 2017 07:18 PM
> To:  "user";
> Subject:  Tuning bootstrap new node
> 
> Dear All,
> 
> Can we do some tuning to make bootstrapping a new node quicker? We have a three-DC 
> cluster (RF=3 in two DCs, RF=1 in the other; 48 nodes in the DC with RF=3). As 
> the cluster becomes larger and larger, we need to spend more than 24 hours 
> to bootstrap a new node.
> Could you please advise how to tune this?
> 
> Many Thanks,
> Peng Xiao


Re: How do TTLs generate tombstones

2017-10-31 Thread eugene miretsky
Thanks,

We have turned off read repair, and read with consistency = one. This
leaves repairs and old timestamps (generated by the client) as possible
causes for the overlap. We are writing from Spark, and don't have NTP set
up on the cluster - I think that was causing some of the issues, but we
have fixed it, and the problem remains.

It is hard for me to believe that C* repair has a bug, so before creating a
JIRA, I would appreciate it if you could take a look at the attached sstable
metadata (produced using sstablemetadata) from two different points in time
over the last 2 weeks (we ran compaction in between).

In both cases, there are sstables generated around 8 pm that span over very
long time periods (sometimes over a day). We run repair daily at 8 pm.
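In case anyone wants to reproduce the check, a rough sketch of pulling the min/max timestamps per sstable with sstablemetadata (the data path and file-name pattern are examples and depend on your version and layout):

    # min/max timestamps per sstable, to spot ones spanning a whole day
    for f in /var/lib/cassandra/data/myks/mytable-*/mc-*-big-Data.db; do
      echo "$f"
      sstablemetadata "$f" | grep -E 'Minimum timestamp|Maximum timestamp'
    done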

Cheers,
Eugene

On Wed, Oct 11, 2017 at 12:53 PM, Jeff Jirsa  wrote:

> Anti-entropy repairs ("nodetool repair") and bootstrap/decom/removenode
> should stream sections of (and/or possibly entire) sstables from one
> replica to another. Assuming the original sstable was entirely contained in
> a single time window, the resulting sstable fragment streamed to the
> neighbor node will similarly be entirely contained within a single time
> window, and will be joined with the sstables in that window. If you find
> this isn't the case, open a JIRA, that's a bug (it was explicitly a design
> goal of TWCS, as it was one of my biggest gripes with early versions of
> DTCS).
>
> Read repairs, however, will pollute the memtable and cause overlaps. There
> are two types of read repairs:
> - Blocking read repair due to consistency level (read at quorum, and one
> of the replicas is missing data, the coordinator will issue mutations to
> the missing replica, which will go into the memtable and flush into the
> newest time window). This can not be disabled (period), and is probably the
> reason most people have overlaps (because people tend to read their writes
> pretty quickly after writes in time series use cases, often before hints or
> normal repair can be successful, especially in environments where nodes are
> bounced often).
> - Background read repair (tunable with the read_repair_chance and
> dclocal_read_repair_chance table options), which is like blocking read
> repair, but happens probabilistically (ie: there's a 1% chance on any read
> that the coordinator will scan the partition and copy any missing data to
> the replicas missing that data. Again, this goes to the memtable, and will
> flush into the newest time window).
>
> There's a pretty good argument to be made against manual repairs if (and
> only if) you only use TTLs, never explicitly delete data, and can tolerate
> the business risk of losing two machines at a time (that is: in the very
> very rare case that you somehow lose 2 machines before you can rebuild,
> you'll lose some subset of data that never made it to the sole remaining
> replica; is your business going to lose millions of dollars, or will you
> just have a gap in an analytics dashboard somewhere that nobody's going to
> worry about).
>
> - Jeff
>
>
> On Wed, Oct 11, 2017 at 9:24 AM, Sumanth Pasupuleti <
> spasupul...@netflix.com.invalid> wrote:
>
>> Hi Eugene,
>>
>> Common contributors to overlapping SSTables are
>> 1. Hints
>> 2. Repairs
>> 3. New writes with old timestamps (should be rare but technically
>> possible)
>>
>> I would not run repairs with TWCS - as you indicated, it is going to
>> result in overlapping SSTables which impacts disk space and read latency
>> since reads now have to encompass multiple SSTables.
>>
>> As for https://issues.apache.org/jira/browse/CASSANDRA-13418, I would
>> not worry about data resurrection as long as all the writes carry TTL with
>> them.
>>
>> We faced similar overlapping issues with TWCS (it was due to
>> dclocal_read_repair_chance) - we developed an SSTable tool that would give
>> topN or bottomN keys in an SSTable based on writetime/deletion time - we
>> used this to identify the specific keys responsible for overlap between
>> SSTables.
>>
>> Thanks,
>> Sumanth
>>
>>
>> On Mon, Oct 9, 2017 at 6:36 PM, eugene miretsky <
>> eugene.miret...@gmail.com> wrote:
>>
>>> Thanks Alain!
>>>
>>> We are using TWCS compaction, and I read your blog multiple times - it
>>> was very useful, thanks!
>>>
>>> We are seeing a lot of overlapping SSTables, leading to a lot of
>>> problems: (a) large number of tombstones read in queries, (b) high CPU
>>> usage, (c) fairly long Young Gen GC collection (300ms)
>>>
>>> We have read_repair_chance = 0, and unchecked_tombstone_compaction =
>>> true, gc_grace_seconds = 3h,  but we read and write with consistency =
>>> 1.
>>>
>>> I'm suspecting the overlap is coming from either hinted handoff or a
>>> repair job we run nightly.
>>>
>>> 1) Is running repair with TWCS recommended? It seems like it will
>>> always create a neverending overlap (the repair SSTable will have data from
>>> all 24 hours), an effect that seems to get amplified with anti-compaction.

Re: Tuning bootstrap new node

2017-10-31 Thread Peng Xiao
Can we stop compaction while the new node is bootstrapping and enable it after 
the new node has joined?


Thanks
-- Original --
From:  "我自己的邮箱";<2535...@qq.com>;
Date:  Tue, Oct 31, 2017 07:18 PM
To:  "user";

Subject:  Tuning bootstrap new node



Dear All,

Can we do some tuning to make bootstrapping a new node quicker? We have a three-DC 
cluster (RF=3 in two DCs, RF=1 in the other; 48 nodes in the DC with RF=3). As 
the cluster becomes larger and larger, we need to spend more than 24 hours 
to bootstrap a new node.
Could you please advise how to tune this?


Many Thanks,
Peng Xiao