Re: Alter table gc_grace_seconds

2017-10-01 Thread Gábor Auth
Hi,

On Mon, Oct 2, 2017 at 5:55 AM Varun Barala  wrote:

> *select gc_grace_seconds from system_schema.tables where keyspace_name =
> 'keyspace' and table_name = 'number_item';*
>

cassandra@cqlsh:mat> DESCRIBE TABLE mat.number_item;


CREATE TABLE mat.number_item (
   nodeid uuid,
   type text,
   created timeuuid,
   value float,
   PRIMARY KEY (nodeid, type, created)
) WITH CLUSTERING ORDER BY (type ASC, created ASC)
   AND bloom_filter_fp_chance = 0.01
   AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
   AND cdc = false
   AND comment = ''
   AND compaction = {'class':
'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy',
'max_threshold': '32', 'min_threshold': '4'}
   AND compression = {'chunk_length_in_kb': '64', 'class':
'org.apache.cassandra.io.compress.LZ4Compressor'}
   AND crc_check_chance = 1.0
   AND dclocal_read_repair_chance = 0.1
   AND default_time_to_live = 0
   AND gc_grace_seconds = 3600
   AND max_index_interval = 2048
   AND memtable_flush_period_in_ms = 0
   AND min_index_interval = 128
   AND read_repair_chance = 0.0
   AND speculative_retry = '99PERCENTILE';

cassandra@cqlsh:mat> select gc_grace_seconds from system_schema.tables
where keyspace_name = 'mat' and table_name = 'number_item';

 gc_grace_seconds
------------------
             3600

(1 rows)

Bye,
Gábor Auth


Re: Alter table gc_grace_seconds

2017-10-01 Thread Varun Barala
Can you share result of query:-

*select gc_grace_seconds from system_schema.tables where keyspace_name =
'keyspace' and table_name = 'number_item';*

Thanks!!

On Mon, Oct 2, 2017 at 3:42 AM, Gábor Auth  wrote:

> Hi,
>
> On Sun, Oct 1, 2017 at 9:36 PM Varun Barala 
> wrote:
>
>> * You should not try on real clusters directly.
>>
>
> Why not? :)
>
> Did you change gc_grace for all column families?
>>
>
> No, only on the `number_item` CF.
>
> > But not in the `number_item` CF... :(
>> Could you please explain?
>>
>
> I've tried the test case that you described and it works (the compaction
> removed the marked_deleted rows) on a newly created CF. But the same
> gc_grace_seconds setting has no effect in the `number_item` CF (millions
> of rows were deleted during a migration last week).
>
> Bye,
> Gábor Auth
>
>


Re:

2017-10-01 Thread daemeon reiydelle
What specifically are you looking to monitor? As per above, Datadog has
superb components for monitoring, and no need to develop and support
anything, for a price of course. I have found management sometimes sees
devops resources as pretty low cost (pay for 40, get 70 hours of work per
week). It depends on how big your clusters are, whether they are Hadoop MR;
add Hive, add Spark, add Ignite, etc.

Same sort of questions apply to your etl/ingest: Kafka/NiFi, Streaming, etc.

We like to say that we don’t get to choose our parents, that they were
given by chance – yet, we can truly choose whose children we wish to be. -
Seneca the Younger



*Daemeon C.M. Reiydelle*
San Francisco 1.415.501.0198
London 44 020 8144 9872


On Sun, Oct 1, 2017 at 9:57 AM, Jeff Jirsa  wrote:

> I've seen successful AWS deployments in the past with Datadog and
> Graphite+Seyren
>
>
>
> On Sun, Oct 1, 2017 at 9:14 AM, Bill Walters 
> wrote:
>
>> Hi All,
>>
>> I need some help with deploying a monitoring and alerting system for our
>> new Cassandra 3.0.4 cluster that we are setting up in the AWS East region.
>> I have good experience with Cassandra, as we are running some 2.0.16
>> clusters in production on our on-prem servers. We use the Nagios tool to
>> monitor and alert our on-call people if any of the nodes on our on-prem
>> servers go down. (Nagios is the default monitoring and alerting system
>> used by our company.)
>> Since our leadership started a plan to migrate our infrastructure to the
>> cloud, we have chosen AWS as our public cloud.
>> We are planning to use the same Nagios setup as our monitoring and
>> alerting system even for our cloud servers.
>> But I'm not sure if this is the ideal approach; I have seen use cases
>> where Yelp used Sensu and Netflix wrote their own tool for monitoring
>> their cloud Cassandra clusters.
>>
>> Please let me know if there are any cloud-native monitoring systems that
>> work well with Cassandra; we will review them for our setup.
>>
>>
>>
>> Thank You,
>> Bill Walters.
>>
>
>


Re: space left for compaction

2017-10-01 Thread Justin Cameron
Hi Avi,

Actually, in Thomas' example you would need an additional 100G of free disk
space to complete the compaction in the worst case (the worst case being
that neither input SSTable contains any overlapping data or tombstones, so
the output SSTable would also be roughly 100G).

STCS progressively compacts SSTables of similar size together, with the
output being a single SSTable containing the data of the input SSTables.

Eventually you may end up with some very large SSTables that combined will
take up 50% of your total disk space. In order to compact those SSTables
together, STCS requires an equal amount of free disk space, which would be
the other (unused) 50% of your total disk space.
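Put as a rough calculation (a sketch of the arithmetic only, not Cassandra's
actual accounting), the headroom rule looks like this:

```python
def worst_case_extra_space_gb(input_sstable_sizes_gb):
    # The merged output SSTable is written in full before the input
    # SSTables are deleted; with no overlapping data or droppable
    # tombstones it is as large as all inputs combined.
    return sum(input_sstable_sizes_gb)

def major_compaction_fits(total_disk_gb, used_gb):
    # A major compaction may rewrite all live data into one SSTable,
    # so in the worst case you need as much free space as you currently
    # use, i.e. utilization at or below 50% of the disk.
    return total_disk_gb - used_gb >= used_gb
```

So compacting two ~50G SSTables needs ~100G of headroom, and a node more
than half full cannot guarantee a worst-case major compaction.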

Cheers,
Justin

On Mon, 2 Oct 2017 at 12:42 Avi Levi  wrote:

> Hi Thomas ,
> So IIUC in this case you should leave at least 50G for compaction (half
> of the sstables' size). Does that make sense?
> Cheers
> Avi
>
>
> On Oct 1, 2017 11:39 AM, "Steinmaurer, Thomas" <
> thomas.steinmau...@dynatrace.com> wrote:
>
> Hi,
>
>
>
> half of free space does not make sense. Imagine your SSTables need 100G
> space and you have 20G free disk. Compaction won’t be able to do its job
> with 10G (half of the free space).
>
>
>
> Keeping half of the total disk free makes more sense and is what you need
> for a major compaction in the worst case.
>
>
>
> Thomas
>
>
>
> *From:* Peng Xiao [mailto:2535...@qq.com]
> *Sent:* Samstag, 30. September 2017 10:21
> *To:* user 
> *Subject:* space left for compaction
>
>
>
> Dear All,
>
>
>
> As for STCS, DataStax suggests keeping half of the disk space free for
> compaction. This is not strict; could anyone advise how much space we
> should leave free on one node?
>
>
>
> Thanks,
>
> Peng Xiao
> The contents of this e-mail are intended for the named addressee only. It
> contains information that may be confidential. Unless you are the named
> addressee or an authorized designee, you may not copy or use it, or
> disclose it to anyone else. If you received it in error please notify us
> immediately and then destroy it. Dynatrace Austria GmbH (registration
> number FN 91482h) is a company registered in Linz whose registered office
> is at 4040 Linz, Austria, Freistädterstraße 313
>
>
> --


*Justin Cameron*Senior Software Engineer





This email has been sent on behalf of Instaclustr Pty. Limited (Australia)
and Instaclustr Inc (USA).

This email and any attachments may contain confidential and legally
privileged information.  If you are not the intended recipient, do not copy
or disclose its content, but please reply to this email immediately and
highlight the error to the sender and then immediately delete the message.


RE: space left for compaction

2017-10-01 Thread Avi Levi
Hi Thomas ,
So IIUC in this case you should leave at least 50G for compaction (half of
the sstables' size). Does that make sense?
Cheers
Avi


On Oct 1, 2017 11:39 AM, "Steinmaurer, Thomas" <
thomas.steinmau...@dynatrace.com> wrote:

Hi,



half of free space does not make sense. Imagine your SSTables need 100G
space and you have 20G free disk. Compaction won’t be able to do its job
with 10G (half of the free space).



Keeping half of the total disk free makes more sense and is what you need
for a major compaction in the worst case.



Thomas



*From:* Peng Xiao [mailto:2535...@qq.com]
*Sent:* Samstag, 30. September 2017 10:21
*To:* user 
*Subject:* space left for compaction



Dear All,



As for STCS, DataStax suggests keeping half of the disk space free for
compaction. This is not strict; could anyone advise how much space we
should leave free on one node?



Thanks,

Peng Xiao



Re: Alter table gc_grace_seconds

2017-10-01 Thread Gábor Auth
Hi,

On Sun, Oct 1, 2017 at 9:36 PM Varun Barala  wrote:

> * You should not try on real clusters directly.
>

Why not? :)

Did you change gc_grace for all column families?
>

No, only on the `number_item` CF.

> But not in the `number_item` CF... :(
> Could you please explain?
>

I've tried the test case that you described and it works (the compaction
removed the marked_deleted rows) on a newly created CF. But the same
gc_grace_seconds setting has no effect in the `number_item` CF (millions
of rows were deleted during a migration last week).

Bye,
Gábor Auth


Re: Alter table gc_grace_seconds

2017-10-01 Thread Varun Barala
* You should not try on real clusters directly.

Did you change gc_grace for all column families?

> But not in the `number_item` CF... :(
Could you please explain?

Thanks!!

On Mon, Oct 2, 2017 at 2:24 AM, Gábor Auth  wrote:

> Hi,
>
> On Sun, Oct 1, 2017 at 7:44 PM Varun Barala 
> wrote:
>
>> Sorry If I misunderstood the situation.
>>
>
> Ok, I'm confused... :/
>
> I've just tested it on the same cluster and the compaction removed the
> marked_deleted rows. But not in the `number_item` CF... :(
>
> Cassandra 3.11.0, two DC (with 4-4 nodes).
>
> Bye,
> Gábor Auth
>


Re: Alter table gc_grace_seconds

2017-10-01 Thread Gábor Auth
Hi,

On Sun, Oct 1, 2017 at 7:44 PM Varun Barala  wrote:

> Sorry If I misunderstood the situation.
>

Ok, I'm confused... :/

I've just tested it on the same cluster and the compaction removed the
marked_deleted rows. But not in the `number_item` CF... :(

Cassandra 3.11.0, two DC (with 4-4 nodes).

Bye,
Gábor Auth


Re: Alter table gc_grace_seconds

2017-10-01 Thread Varun Barala
Sorry If I misunderstood the situation.


{
  "type" : "row",
  "position" : 146160,
  "clustering" : [ "humidity", "97781fd0-9dab-11e7-a3d5-7f6ef9a844c7" ],
  "deletion_info" : { "marked_deleted" : "2017-09-25T11:51:19.165276Z",
"local_delete_time" : "2017-09-25T11:51:19Z" },
  "cells" : [ ]
}


How do we know that a TTL was applied just by looking at this data?


I performed a basic test locally:-


*# C* version:* 3.0.14

*# Table schema:-*
CREATE TABLE test.table1 (
    id int,
    ck1 text,
    ck2 text,
    nk1 text,
    PRIMARY KEY (id, ck1, ck2)
);


*# insert statement:-*
insert into test.table1 (id, ck1, ck2, nk1) VALUES (1,'1','1','1');


*# performed flush to generate the sstable (json format):-*
  {
"partition" : {
  "key" : [ "1" ],
  "position" : 0
},
"rows" : [
  {
"type" : "row",
"position" : 30,
"clustering" : [ "1", "1" ],
"liveness_info" : { "tstamp" : "2017-10-01T17:28:56.889Z" },
"cells" : [
  { "name" : "nk1", "value" : "1" }
]
  }
]
  }

*# delete statement:-*
delete from test.table1 where id=1 and ck1='1' and ck2='1';

*# performed flush to generate new sstable (json format):-*
 {
"partition" : {
  "key" : [ "1" ],
  "position" : 0
},
"rows" : [
  {
"type" : "row",
"position" : 32,
"clustering" : [ "1", "1" ],
"deletion_info" : { "marked_deleted" : "2017-10-01T17:30:15.397Z",
"local_delete_time" : "2017-10-01T17:30:15Z" },
"cells" : [ ]
  }
]
  }


*# performed compaction to compact the above two sstables (latest won,
tombstone has not been purged):-*
  {
"partition" : {
  "key" : [ "1" ],
  "position" : 0
},
"rows" : [
  {
"type" : "row",
"position" : 32,
"clustering" : [ "1", "1" ],
"deletion_info" : { "marked_deleted" : "2017-10-01T17:30:15.397Z",
"local_delete_time" : "2017-10-01T17:30:15Z" },
"cells" : [ ]
  }
]
  }

*# changed gc_grace:-*
ALTER TABLE test.table1 WITH gc_grace_seconds = 0;

*# performed compaction again (no sstable exists, since it has purged the
tombstones older than gc_grace):-*
no sstables exist in the data dir


Please let me know If I'm missing some fact.

Thanks!!

On Mon, Oct 2, 2017 at 1:11 AM, Gábor Auth  wrote:

> Hi,
>
> On Sun, Oct 1, 2017 at 6:53 PM Jonathan Haddad  wrote:
>
>> The TTL is applied to the cells on insert. Changing it doesn't change the
>> TTL on data that was inserted previously.
>>
>
> Is there any way to purge out these tombstoned data?
>
> Bye,
> Gábor Auth
>


Re: Alter table gc_grace_seconds

2017-10-01 Thread Gábor Auth
Hi,

On Sun, Oct 1, 2017 at 6:53 PM Jonathan Haddad  wrote:

> The TTL is applied to the cells on insert. Changing it doesn't change the
> TTL on data that was inserted previously.
>

Is there any way to purge out these tombstoned data?

Bye,
Gábor Auth


Re: Alter table gc_grace_seconds

2017-10-01 Thread Gábor Auth
Hi,

On Sun, Oct 1, 2017 at 6:53 PM Jonathan Haddad  wrote:

> The TTL is applied to the cells on insert. Changing it doesn't change the
> TTL on data that was inserted previously.
>

Oh! So the tombstoned cell's TTL equals the CF's gc_grace_seconds value and
the repair will remove it. Am I right?

Bye,
Gábor Auth


Re:

2017-10-01 Thread Jeff Jirsa
I've seen successful AWS deployments in the past with Datadog and
Graphite+Seyren



On Sun, Oct 1, 2017 at 9:14 AM, Bill Walters 
wrote:

> Hi All,
>
> I need some help with deploying a monitoring and alerting system for our
> new Cassandra 3.0.4 cluster that we are setting up in the AWS East region.
> I have good experience with Cassandra, as we are running some 2.0.16
> clusters in production on our on-prem servers. We use the Nagios tool to
> monitor and alert our on-call people if any of the nodes on our on-prem
> servers go down. (Nagios is the default monitoring and alerting system
> used by our company.)
> Since our leadership started a plan to migrate our infrastructure to the
> cloud, we have chosen AWS as our public cloud.
> We are planning to use the same Nagios setup as our monitoring and
> alerting system even for our cloud servers.
> But I'm not sure if this is the ideal approach; I have seen use cases
> where Yelp used Sensu and Netflix wrote their own tool for monitoring
> their cloud Cassandra clusters.
>
> Please let me know if there are any cloud-native monitoring systems that
> work well with Cassandra; we will review them for our setup.
>
>
>
> Thank You,
> Bill Walters.
>


Re: Alter table gc_grace_seconds

2017-10-01 Thread Jonathan Haddad
The TTL is applied to the cells on insert. Changing it doesn't change the
TTL on data that was inserted previously.

On Sun, Oct 1, 2017 at 6:23 AM Gábor Auth  wrote:

> Hi,
>
> Does `alter table number_item with gc_grace_seconds = 3600;` set the grace
> seconds of tombstones only for future modifications of the number_item
> column family, or does it affect all existing data?
>
> Bye,
> Gábor Auth
>
>


Re: Alter table gc_grace_seconds

2017-10-01 Thread Varun Barala
* Which C* version are you using?
* How many nodes are there in this cluster?

These tombstones will not be deleted if they are not older than
gc_grace_seconds.
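The condition above can be stated as a tiny predicate (a sketch; in
practice a compaction must also actually include the SSTable holding the
tombstone before it is purged):

```python
def tombstone_droppable(marked_deleted_ts, gc_grace_seconds, now_ts):
    # A tombstone may only be purged by compaction once it is older
    # than the table's *current* gc_grace_seconds value (timestamps
    # here are seconds since epoch).
    return now_ts >= marked_deleted_ts + gc_grace_seconds
```

E.g. a tombstone written at t=0 with gc_grace_seconds=3600 becomes
droppable from t=3600 onward.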


On Sun, Oct 1, 2017 at 10:14 PM, Gábor Auth  wrote:

> Hi,
>
> On Sun, Oct 1, 2017 at 3:44 PM Varun Barala 
> wrote:
>
>> This is a property of the table and it's not written in the sstables. If
>> you change gc_grace, it'll get applied to all the data.
>>
>
> Hm... I've migrated a lot of data from `number_item` to the `measurement`
> CF because of a schema upgrade. During the migration, the script created
> rows in the `measurement` CF and deleted the migrated rows in the
> `number_item` CF, one by one.
>
> I've just taken a look at the sstables of `number_item` and they are full
> of deleted rows:
> {
>   "type" : "row",
>   "position" : 146160,
>   "clustering" : [ "humidity", "97781fd0-9dab-11e7-a3d5-7f6ef9a844c7" ],
>   "deletion_info" : { "marked_deleted" : "2017-09-25T11:51:19.165276Z",
> "local_delete_time" : "2017-09-25T11:51:19Z" },
>   "cells" : [ ]
> }
>
> How can I purge these old rows? :)
>
> I've tried: compact, scrub, cleanup, clearsnapshot, flush and full repair.
>
> Bye,
> Gábor Auth
>
>


Re:

2017-10-01 Thread Lutaya Shafiq Holmes
AWS is a good choice; you will have to use Amazon Web Services EC2.

Regards

On Sunday, October 1, 2017, Bill Walters  wrote:

> Hi All,
>
> I need some help with deploying a monitoring and alerting system for our
> new Cassandra 3.0.4 cluster that we are setting up in the AWS East region.
> I have good experience with Cassandra, as we are running some 2.0.16
> clusters in production on our on-prem servers. We use the Nagios tool to
> monitor and alert our on-call people if any of the nodes on our on-prem
> servers go down. (Nagios is the default monitoring and alerting system
> used by our company.)
> Since our leadership started a plan to migrate our infrastructure to the
> cloud, we have chosen AWS as our public cloud.
> We are planning to use the same Nagios setup as our monitoring and
> alerting system even for our cloud servers.
> But I'm not sure if this is the ideal approach; I have seen use cases
> where Yelp used Sensu and Netflix wrote their own tool for monitoring
> their cloud Cassandra clusters.
>
> Please let me know if there are any cloud-native monitoring systems that
> work well with Cassandra; we will review them for our setup.
>
>
>
> Thank You,
> Bill Walters.
>


-- 
Lutaaya Shafiq
Web: www.ronzag.com | i...@ronzag.com
Mobile: +256702772721 | +256783564130
Twitter: @lutayashafiq
Skype: lutaya5
Blog: lutayashafiq.com
http://www.fourcornersalliancegroup.com/?a=shafiqholmes

"The most beautiful people we have known are those who have known defeat,
known suffering, known struggle, known loss and have found their way out of
the depths. These persons have an appreciation, a sensitivity and an
understanding of life that fills them with compassion, gentleness and a
deep loving concern. Beautiful people do not just happen." - *Elisabeth
Kubler-Ross*


[no subject]

2017-10-01 Thread Bill Walters
Hi All,

I need some help with deploying a monitoring and alerting system for our
new Cassandra 3.0.4 cluster that we are setting up in the AWS East region.
I have good experience with Cassandra, as we are running some 2.0.16
clusters in production on our on-prem servers. We use the Nagios tool to
monitor and alert our on-call people if any of the nodes on our on-prem
servers go down. (Nagios is the default monitoring and alerting system used
by our company.)
Since our leadership started a plan to migrate our infrastructure to the
cloud, we have chosen AWS as our public cloud.
We are planning to use the same Nagios setup as our monitoring and alerting
system even for our cloud servers.
But I'm not sure if this is the ideal approach; I have seen use cases where
Yelp used Sensu and Netflix wrote their own tool for monitoring their cloud
Cassandra clusters.

Please let me know if there are any cloud-native monitoring systems that
work well with Cassandra; we will review them for our setup.



Thank You,
Bill Walters.


Re: Alter table gc_grace_seconds

2017-10-01 Thread Gábor Auth
Hi,

On Sun, Oct 1, 2017 at 3:44 PM Varun Barala  wrote:

> This is a property of the table and it's not written in the sstables. If
> you change gc_grace, it'll get applied to all the data.
>

Hm... I've migrated a lot of data from `number_item` to the `measurement`
CF because of a schema upgrade. During the migration, the script created
rows in the `measurement` CF and deleted the migrated rows in the
`number_item` CF, one by one.

I've just taken a look at the sstables of `number_item` and they are full
of deleted rows:
{
  "type" : "row",
  "position" : 146160,
  "clustering" : [ "humidity", "97781fd0-9dab-11e7-a3d5-7f6ef9a844c7" ],
  "deletion_info" : { "marked_deleted" : "2017-09-25T11:51:19.165276Z",
"local_delete_time" : "2017-09-25T11:51:19Z" },
  "cells" : [ ]
}

How can I purge these old rows? :)

I've tried: compact, scrub, cleanup, clearsnapshot, flush and full repair.
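One more avenue worth trying, sketched below as operational commands
(assumptions: Cassandra 3.10+, where `nodetool garbagecollect` was
introduced, and a default data directory layout; the paths are
illustrative, not taken from this thread):

```shell
# Rewrite this table's SSTables, dropping deleted data that is past
# gc_grace_seconds (available since Cassandra 3.10)
nodetool garbagecollect mat number_item

# Then run a major compaction of just this table
nodetool compact mat number_item

# Inspect the resulting SSTables' droppable-tombstone estimates
# (path is illustrative for a default install)
sstablemetadata /var/lib/cassandra/data/mat/number_item-*/mc-*-big-Data.db
```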

Bye,
Gábor Auth


Re: Alter table gc_grace_seconds

2017-10-01 Thread Varun Barala
Hi,
This is a property of the table and it's not written in the sstables. If you
change gc_grace, it'll get applied to all the data. Thanks!!

C* stores this info inside `system_schema.tables` (it was
`system.schema_columnfamilies` in 2.x)

Regards,
Varun Barala

On Sun, Oct 1, 2017 at 9:23 PM, Gábor Auth  wrote:

> Hi,
>
> Does `alter table number_item with gc_grace_seconds = 3600;` set the grace
> seconds of tombstones only for future modifications of the number_item
> column family, or does it affect all existing data?
>
> Bye,
> Gábor Auth
>
>


Alter table gc_grace_seconds

2017-10-01 Thread Gábor Auth
Hi,

Does `alter table number_item with gc_grace_seconds = 3600;` set the grace
seconds of tombstones only for future modifications of the number_item
column family, or does it affect all existing data?

Bye,
Gábor Auth


Cassandra 3.11.1 (snapshot build) - io.netty.util.Recycler$Stack memory leak

2017-10-01 Thread Steinmaurer, Thomas
Hello,

we were facing a memory leak with 3.11.0
(https://issues.apache.org/jira/browse/CASSANDRA-13754), thus upgraded our
loadtest environment to a snapshot build of 3.11.1. Having it running for
more than 48 hrs now, we still see a steady increase in heap utilization.

Eclipse Memory Analyzer shows 147 instances of io.netty.util.Recycler$Stack
with a total retained heap usage of ~1.8G, growing over time.

Should this already be fixed by CASSANDRA-13754, or is this something new?

Thanks,
Thomas



RE: space left for compaction

2017-10-01 Thread Steinmaurer, Thomas
Hi,

half of free space does not make sense. Imagine your SSTables need 100G
space and you have 20G free disk. Compaction won't be able to do its job
with 10G (half of the free space).

Keeping half of the total disk free makes more sense and is what you need
for a major compaction in the worst case.

Thomas

From: Peng Xiao [mailto:2535...@qq.com]
Sent: Samstag, 30. September 2017 10:21
To: user 
Subject: space left for compaction

Dear All,

As for STCS, DataStax suggests keeping half of the disk space free for
compaction. This is not strict; could anyone advise how much space we
should leave free on one node?

Thanks,
Peng Xiao