Re: Cassandra nightly process

2023-01-16 Thread Gábor Auth
Hi,

On Mon, Jan 16, 2023 at 3:07 PM Loïc CHANEL via user <
user@cassandra.apache.org> wrote:

> So my question here is : am I missing a Cassandra internal process that is
> triggered on a daily basis at 0:00 and 2:00 ?
>

I bet it's not a Cassandra issue. Do you have any other metrics about your
VPSs (CPU, memory, load, I/O stat, disk throughput, network traffic, etc.)?
I think some process (on another virtual machine or on the host) is stealing
your resources, so your Cassandra node cannot process the requests in time
and the other instances need to store the data as hints.

-- 
Bye,
Gábor Auth


Re: Change IP address (on 3.11.14)

2022-12-06 Thread Gábor Auth
Hi,

On Tue, Dec 6, 2022 at 12:41 PM Lapo Luchini  wrote:

> I'm trying to change IP address of an existing live node (possibly
> without deleting data and streaming terabytes all over again) following
> these steps:

> https://stackoverflow.com/a/57455035/166524
> 1. echo 'auto_bootstrap: false' >> cassandra.yaml
> 2. add "-Dcassandra.replace_address=oldAddress" in cassandra-env.sh
> 3. restart node
>

As far as I know, you only need to change the IP address this way when the
node's data has vanished and you need to tell the cluster which IP replaces
the vanished node.

> Or should I delete all the DB on disk and bootstrap from scratch?
>

No! Just start it and the other nodes in the cluster will pick up the new
IP; they recognize the node by its host ID, which is stored in the node's
data folder.
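For reference, you can check the IDs from cqlsh (just a quick sketch):

-- the node's own host ID and address
SELECT host_id, listen_address FROM system.local;
-- the host IDs and addresses this node knows for its peers
SELECT peer, host_id FROM system.peers;

The host_id stays the same after the address change, only the address changes.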

-- 
Bye,
Auth Gábor (https://iotguru.cloud)


Re: TWCS repair and compact help

2021-06-29 Thread Gábor Auth
Hi,

On Tue, Jun 29, 2021 at 12:34 PM Erick Ramirez 
wrote:

> You definitely shouldn't perform manual compactions -- you should let the
> normal compaction tasks take care of it. It is unnecessary to manually run
> compactions since it creates more problems than it solves as I've explained
> in this post -- https://community.datastax.com/questions/6396/. Cheers!
>

Same issue here... I want to replace SizeTieredCompactionStrategy with
TimeWindowCompactionStrategy but I haven't managed to split the existing
SSTables into daily SSTables. Any idea about it? :)
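For what it's worth, the strategy change itself is just an ALTER TABLE (a
sketch; keyspace/table names are placeholders and the daily window is only an
example):

-- my_keyspace.my_table is a placeholder
ALTER TABLE my_keyspace.my_table WITH compaction = {
    'class': 'org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy',
    'compaction_window_unit': 'DAYS',
    'compaction_window_size': '1'
};

As far as I can tell this only affects how newly written SSTables are
grouped; it does not re-split the already existing SSTables, which is exactly
my problem.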

-- 
Bye,
Auth Gábor (https://iotguru.cloud)


Re: Last stored value metadata table

2020-11-10 Thread Gábor Auth
Hi,

On Tue, Nov 10, 2020 at 6:29 PM Alex Ott  wrote:

> What about using  "per partition limit 1" on that table?
>

Oh, it is almost a good solution, but the key is actually ((epoch_day,
name), timestamp), to spread the partitions more evenly, so... it is not
good for this... :/
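To illustrate (a simplified sketch, column names are illustrative): with this
key the partition is (epoch_day, name), so "per partition limit 1" returns
one row per day per name instead of one row per name:

CREATE TABLE measurement (
    epoch_day int,
    name text,
    ts timestamp,
    value float,
    PRIMARY KEY ((epoch_day, name), ts)
) WITH CLUSTERING ORDER BY (ts DESC);

-- newest row of every (epoch_day, name) partition, not of every name
SELECT * FROM measurement PER PARTITION LIMIT 1;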

-- 
Bye,
Auth Gábor (https://iotguru.cloud)


Re: Last stored value metadata table

2020-11-10 Thread Gábor Auth
Hi,

On Tue, Nov 10, 2020 at 5:29 PM Durity, Sean R 
wrote:

> Updates do not create tombstones. Deletes create tombstones. The above
> scenario would not create any tombstones. For a full solution, though, I
> would probably suggest a TTL on the data so that old/unchanged data
> eventually gets removed (if that is desirable). TTLs can create tombstones,
> but should not be a major problem if expired data is relatively infrequent.
>

Okay, there are no tombstones (I misused the term), but every updated
`value` is sitting in memory and on disk until the next compaction... Does
that degrade the read performance?

-- 
Bye,
Auth Gábor (https://iotguru.cloud)


Re: Last stored value metadata table

2020-11-10 Thread Gábor Auth
Hi,

On Tue, Nov 10, 2020 at 3:18 PM Durity, Sean R 
wrote:

> My answer would depend on how many “names” you expect. If it is a
> relatively small and constrained list (under a few hundred thousand), I
> would start with something like:
>

At the moment, the number of names is more than 10,000 but not more than 100,000.

>
> Create table last_values (
>   arbitrary_partition text, -- use an app name or something static to
>     define the partition
>   name text,
>   value text,
>   last_upd_ts timestamp,
>   primary key (arbitrary_partition, name));
>

What is the purpose of the partition key?

> (NOTE: every insert would just overwrite the last value. You only keep the
> last one.)
>

This is the behavior that I want. :)


> I’m assuming that your data arrives in time series order, so that it is
> easy to just insert the last value into last_values. If you have to read
> before write, that would be a Cassandra anti-pattern that needs a different
> solution. (Based on how regular the data points are, I would look at
> something time-series related with a short TTL.)
>

Okay, but as far as I knew, this is the scenario where every update of
`last_values` generates two tombstones, because of the update of the `value`
and `last_upd_ts` fields. Maybe I had it wrong?

-- 
Bye,
Auth Gábor (https://iotguru.cloud)


Last stored value metadata table

2020-11-09 Thread Gábor Auth
Hi,

Short story: storing time series of measurements (key(name, timestamp),
value).

The problem: get the list of the last `value` of every `name`.

Is there a Cassandra-friendly solution to store the last value of every
`name` in a separate metadata table? It will come with a lot of
tombstones... or is there another solution? :)
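A sketch of what I have in mind (table and column names are only
illustrative): the raw series plus an upsert-only metadata table, where every
insert overwrites the previous last value:

-- raw time series, as described above
CREATE TABLE measurement (
    name text,
    ts timestamp,
    value float,
    PRIMARY KEY (name, ts)
) WITH CLUSTERING ORDER BY (ts DESC);

-- candidate metadata table: one row per name, overwritten on every write
CREATE TABLE last_value (
    name text PRIMARY KEY,
    ts timestamp,
    value float
);

-- every measurement goes into both tables; the second write is just an upsert
INSERT INTO last_value (name, ts, value) VALUES ('temperature', toTimestamp(now()), 21.5);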

-- 
Bye,
Auth Gábor


Re: Cassandra Delete vs Update

2020-05-23 Thread Gábor Auth
Hi,

On Sat, May 23, 2020 at 6:26 PM Laxmikant Upadhyay 
wrote:

> Thanks you so much  for quick response. I completely agree with Jeff and
> Gabor that it is an anti-pattern to build queue in Cassandra. But plan is
> to reuse the existing Cassandra infrastructure without any additional cost
> (like kafka).
> So even if the data is partioned properly (max 10mb per date ) ..so still
> it will be an issue if I read the partition only once a day ? Even with
> update status and don't delete the row?
>

Both options generate unnecessary records; there is no big difference
between them. But if the load isn't too high - and 10 MByte per day isn't
too much - it doesn't really matter.

I also have a lot of little tables (oh, column families) that ideally
wouldn't live in Cassandra, but since they have a very minimal load, I don't
give a shit... :)

-- 
Bye,
Auth Gábor (https://iotguru.cloud)


Re: Cassandra Delete vs Update

2020-05-23 Thread Gábor Auth
Hi,

On Sat, May 23, 2020 at 4:09 PM Laxmikant Upadhyay 
wrote:

> I think that we should avoid tombstones specially row-level so should go
> with option-1. Kindly suggest on above or any other better approach ?
>

Why don't you use a queue implementation like ActiveMQ, Kafka or something
similar? Cassandra is not suitable for this at all; it is an anti-pattern in
the Cassandra world.

-- 
Bye,
Auth Gábor (https://iotguru.cloud)


Re: Schema disagreement

2018-05-01 Thread Gábor Auth
Hi,

On Tue, May 1, 2018 at 10:27 PM Gábor Auth <auth.ga...@gmail.com> wrote:

> One or two years ago I tried the CDC feature but then switched it off...
> maybe this is a side effect of the switched-off CDC? How can I fix it? :)
>

Okay, I've worked it out. I updated the schema of the affected keyspaces on
the new nodes with 'cdc = false' and everything is okay now.
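For the record, the change per affected table was essentially this (a
sketch; keyspace/table names are placeholders), and the result can be checked
in the schema tables:

-- my_keyspace.my_table is a placeholder
ALTER TABLE my_keyspace.my_table WITH cdc = false;

SELECT keyspace_name, table_name, cdc
  FROM system_schema.tables
 WHERE keyspace_name = 'my_keyspace';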

I think it is a strange bug around CDC...

Bye,
Gábor Auth


Re: Schema disagreement

2018-05-01 Thread Gábor Auth
Hi,

On Tue, May 1, 2018 at 7:40 PM Gábor Auth <auth.ga...@gmail.com> wrote:

> What can I do? Any suggestion? :(
>

Okay, I've diffed the good and the bad system_schema tables. The only
difference is the `cdc` field in three keyspaces (in `tables` and `views`):
- the value of `cdc` field on the good node is `False`
- the value of `cdc` field on the bad node is `null`

The value of `cdc` field on the other keyspaces is `null`.

One or two years ago I tried the CDC feature but then switched it off...
maybe this is a side effect of the switched-off CDC? How can I fix it? :)

Bye,
Gábor Auth


Re: Schema disagreement

2018-05-01 Thread Gábor Auth
Hi,

On Mon, Apr 30, 2018 at 11:11 PM Gábor Auth <auth.ga...@gmail.com> wrote:

> On Mon, Apr 30, 2018 at 11:03 PM Ali Hubail <ali.hub...@petrolink.com>
> wrote:
>
>> What steps have you performed to add the new DC? Have you tried to follow
>> certain procedures like this?
>>
>> https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsAddDCToCluster.html
>>
>
> Yes, exactly. :/
>

Okay, I removed all the new nodes (with `removenode`) and cleared all the
new nodes (removed data and logs).

I did all the steps described in the link (again).

Same result:

Cluster Information:
   Name: cluster
   Snitch: org.apache.cassandra.locator.DynamicEndpointSnitch
   Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
   Schema versions:
   5de14758-887d-38c1-9105-fc60649b0edf: [new, new, ...]

   f4ed784a-174a-38dd-a7e5-55ff6f3002b2: [old, old, ...]

The old nodes try to gossip their own schema:
DEBUG [InternalResponseStage:1] 2018-05-01 17:36:36,266
MigrationManager.java:572 - Gossiping my schema version
f4ed784a-174a-38dd-a7e5-55ff6f3002b2
DEBUG [InternalResponseStage:1] 2018-05-01 17:36:36,863
MigrationManager.java:572 - Gossiping my schema version
f4ed784a-174a-38dd-a7e5-55ff6f3002b2

The new nodes try to gossip their own schema:
DEBUG [InternalResponseStage:4] 2018-05-01 17:36:26,329
MigrationManager.java:572 - Gossiping my schema version
5de14758-887d-38c1-9105-fc60649b0edf
DEBUG [InternalResponseStage:4] 2018-05-01 17:36:27,595
MigrationManager.java:572 - Gossiping my schema version
5de14758-887d-38c1-9105-fc60649b0edf

What can I do? Any suggestion? :(

Bye,
Gábor Auth


Re: Schema disagreement

2018-04-30 Thread Gábor Auth
Hi,

On Mon, Apr 30, 2018 at 11:03 PM Ali Hubail <ali.hub...@petrolink.com>
wrote:

> What steps have you performed to add the new DC? Have you tried to follow
> certain procedures like this?
>
> https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsAddDCToCluster.html
>

Yes, exactly. :/

Bye,
Gábor Auth


Re: Schema disagreement

2018-04-30 Thread Gábor Auth
Hi,

On Mon, Apr 30, 2018 at 11:39 AM Gábor Auth <auth.ga...@gmail.com> wrote:

> I've just tried to add a new DC and new node to my cluster (3 DCs and 10
> nodes) and the new node has a different schema version:
>

Is this normal? The node is marked as down, but it completes a repair successfully?

WARN  [MigrationStage:1] 2018-04-30 20:36:56,579 MigrationTask.java:67 -
Can't send schema pull request: node /x.x.216.121 is down.
INFO  [AntiEntropyStage:1] 2018-04-30 20:36:56,611 Validator.java:281 -
[repair #323bf873-4cb6-11e8-bdd5-5feb84046dc9] Sending completed merkle
tree to /x.x.216.121 for keyspace.table

The `nodetool status` is looking good:
UN  x.x.216.121  959.29 MiB  32   ?  322e4e9b-4d9e-43e3-94a3-bbe012058516  RACK01

Bye,
Gábor Auth


Schema disagreement

2018-04-30 Thread Gábor Auth
Hi,

I've just tried to add a new DC and new node to my cluster (3 DCs and 10
nodes) and the new node has a different schema version:

Cluster Information:
Name: cluster
Snitch: org.apache.cassandra.locator.DynamicEndpointSnitch
Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
Schema versions:
7e12a13e-dcca-301b-a5ce-b1ad29fbbacb: [x.x.x.x, ..., ...]
bb186922-82b5-3a61-9c12-bf4eb87b9155: [new.new.new.new]

I've tried:
- node decommission and node re-addition
- resetlocalschema
- rebuild
- replace node
- repair
- cluster restart (node-by-node)

The MigrationManager is constantly running on the new node and keeps trying
to migrate the schema:
DEBUG [NonPeriodicTasks:1] 2018-04-30 09:33:22,405
MigrationManager.java:125 - submitting migration task for /x.x.x.x

What else can I do? :(

Bye,
Gábor Auth


Re: Cassandra vs MySQL

2018-03-12 Thread Gábor Auth
Hi,

On Mon, Mar 12, 2018 at 8:58 PM Oliver Ruebenacker <cur...@gmail.com> wrote:

> We have a project currently using MySQL single-node with 5-6TB of data and
> some performance issues, and we plan to add data up to a total size of
> maybe 25-30TB.
>

There is no 'silver bullet'; Cassandra is not a 'drop-in' replacement for
MySQL. Maybe it will be faster, maybe it will be totally unusable, depending
on your use case and database schema.

Is there some good more recent material?
>

Are you able to completely redesign your database schema? :)

Bye,
Gábor Auth


Re: Materialized Views marked experimental

2017-10-27 Thread Gábor Auth
Hi,

On Thu, Oct 26, 2017 at 11:10 PM Blake Eggleston <beggles...@apple.com>
wrote:

> Following a discussion on dev@, the materialized view feature is being
> retroactively classified as experimental, and not recommended for new
> production uses. The next patch releases of 3.0, 3.11, and 4.0 will include
> CASSANDRA-13959, which will log warnings when materialized views are
> created, and introduce a yaml setting that will allow operators to disable
> their creation.
>

Will the experimental classification later be withdrawn (if the issues turn
out to be fixable)? Or will the whole MV feature later be withdrawn (if the
issues can't be fixed)? :)

Bye,
Gábor Auth


Re: Alter table gc_grace_seconds

2017-10-05 Thread Gábor Auth
Hi,

On Wed, Oct 4, 2017 at 8:39 AM Oleksandr Shulgin <
oleksandr.shul...@zalando.de> wrote:

> If you have migrated ALL the data from the old CF, you could just use
> TRUNCATE or DROP TABLE, followed by "nodetool clearsnapshot" to reclaim the
> disk space (this step has to be done per-node).
>

Unfortunately not all of the data was migrated; two CFs came from the one
old CF, and "only" 80% of the data was migrated to the other CF.

At the moment, the compaction works with:
compaction = {'class':
'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy',
'max_threshold': '32', 'min_threshold': '2'} AND gc_grace_seconds = 172800

But I don't know why it did not work before and why it works now... :)

Bye,
Auth Gábor


Re: Alter table gc_grace_seconds

2017-10-02 Thread Gábor Auth
Hi,

On Mon, Oct 2, 2017 at 1:43 PM Varun Barala <varunbaral...@gmail.com> wrote:

> Either you can change min_threshold to three in your case or you can
> change compaction strategy for this table.
>

I've changed:
alter table number_item with compaction = {'class':
'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy',
'max_threshold': '32', 'min_threshold': '2'};

The list of database files of `number_item` on one node:
-rw-r--r-- 1 cassandra cassandra 71353717 Oct  2 11:53 mc-48399-big-Data.db
-rw-r--r-- 1 cassandra cassandra 32843468 Oct  2 19:15 mc-48435-big-Data.db
-rw-r--r-- 1 cassandra cassandra 24734857 Oct  2 19:49 mc-48439-big-Data.db

I've initiated a compaction on the `number_item` CF. After the compaction:
-rw-r--r-- 1 cassandra cassandra 71353717 Oct  2 11:53 mc-48399-big-Data.db
-rw-r--r-- 1 cassandra cassandra 32843468 Oct  2 19:15 mc-48435-big-Data.db
-rw-r--r-- 1 cassandra cassandra 24734857 Oct  2 19:53 mc-48440-big-Data.db

Two of them are untouched and one was rewritten with the same content. :/

Bye,
Gábor Auth


Re: Alter table gc_grace_seconds

2017-10-02 Thread Gábor Auth
Hi,

On Mon, Oct 2, 2017 at 8:41 AM Varun Barala <varunbaral...@gmail.com> wrote:

> Might be possible C* is not compacting the sstables [
> https://stackoverflow.com/questions/28437301/cassandra-not-compacting-sstables
> ]
>

Oh, the other CFs in the same keyspace are compacted, but `number_item` is
not.

[cassandra@dc02-rack01-cass01 number_item-524bf49001d911e798503511c5f98764]$ ls -l
total 172704
drwxr-xr-x 4 cassandra cassandra     4096 Oct  1 10:43 backups
-rw-r--r-- 1 cassandra cassandra    15227 Oct  2 01:15 mc-48278-big-CompressionInfo.db
-rw-r--r-- 1 cassandra cassandra 46562318 Oct  2 01:15 mc-48278-big-Data.db
-rw-r--r-- 1 cassandra cassandra       10 Oct  2 01:15 mc-48278-big-Digest.crc32
-rw-r--r-- 1 cassandra cassandra      176 Oct  2 01:15 mc-48278-big-Filter.db
-rw-r--r-- 1 cassandra cassandra   119665 Oct  2 01:15 mc-48278-big-Index.db
-rw-r--r-- 1 cassandra cassandra     6368 Oct  2 01:15 mc-48278-big-Statistics.db
-rw-r--r-- 1 cassandra cassandra       92 Oct  2 01:15 mc-48278-big-Summary.db
-rw-r--r-- 1 cassandra cassandra       92 Oct  2 01:15 mc-48278-big-TOC.txt
-rw-r--r-- 1 cassandra cassandra    20643 Oct  2 01:16 mc-48279-big-CompressionInfo.db
-rw-r--r-- 1 cassandra cassandra 62681705 Oct  2 01:16 mc-48279-big-Data.db
-rw-r--r-- 1 cassandra cassandra       10 Oct  2 01:16 mc-48279-big-Digest.crc32
-rw-r--r-- 1 cassandra cassandra      176 Oct  2 01:16 mc-48279-big-Filter.db
-rw-r--r-- 1 cassandra cassandra   162571 Oct  2 01:16 mc-48279-big-Index.db
-rw-r--r-- 1 cassandra cassandra     6375 Oct  2 01:16 mc-48279-big-Statistics.db
-rw-r--r-- 1 cassandra cassandra       92 Oct  2 01:16 mc-48279-big-Summary.db
-rw-r--r-- 1 cassandra cassandra       92 Oct  2 01:16 mc-48279-big-TOC.txt
-rw-r--r-- 1 cassandra cassandra    20099 Oct  2 01:16 mc-48280-big-CompressionInfo.db
-rw-r--r-- 1 cassandra cassandra 62587865 Oct  2 01:16 mc-48280-big-Data.db
-rw-r--r-- 1 cassandra cassandra       10 Oct  2 01:16 mc-48280-big-Digest.crc32
-rw-r--r-- 1 cassandra cassandra      176 Oct  2 01:16 mc-48280-big-Filter.db
-rw-r--r-- 1 cassandra cassandra   158379 Oct  2 01:16 mc-48280-big-Index.db
-rw-r--r-- 1 cassandra cassandra     6375 Oct  2 01:16 mc-48280-big-Statistics.db
-rw-r--r-- 1 cassandra cassandra       92 Oct  2 01:16 mc-48280-big-Summary.db
-rw-r--r-- 1 cassandra cassandra       92 Oct  2 01:16 mc-48280-big-TOC.txt
-rw-r--r-- 1 cassandra cassandra       51 Oct  2 01:16 mc-48281-big-CompressionInfo.db
-rw-r--r-- 1 cassandra cassandra       50 Oct  2 01:16 mc-48281-big-Data.db
-rw-r--r-- 1 cassandra cassandra       10 Oct  2 01:16 mc-48281-big-Digest.crc32
-rw-r--r-- 1 cassandra cassandra      176 Oct  2 01:16 mc-48281-big-Filter.db
-rw-r--r-- 1 cassandra cassandra       20 Oct  2 01:16 mc-48281-big-Index.db
-rw-r--r-- 1 cassandra cassandra     4640 Oct  2 01:16 mc-48281-big-Statistics.db
-rw-r--r-- 1 cassandra cassandra       92 Oct  2 01:16 mc-48281-big-Summary.db
-rw-r--r-- 1 cassandra cassandra       92 Oct  2 01:16 mc-48281-big-TOC.txt

> Now check both the list results. If they have some common sstables then we
> can say that C* is not compacting sstables.
>

Yes, exactly. How can I fix it?

Bye,
Gábor Auth


Re: Alter table gc_grace_seconds

2017-10-02 Thread Gábor Auth
Hi,

On Mon, Oct 2, 2017 at 8:32 AM Justin Cameron <jus...@instaclustr.com>
wrote:

> >> * You should not try on real clusters directly.
> >Why not? :)
>
> It's highly recommended that you complete a full repair before the GC
> grace period expires, otherwise it's possible you could experience zombie
> data (i.e. data that was previously deleted coming back to life)
>

It is a test cluster with test keyspaces. :)

Bye,
Gábor Auth



Re: Alter table gc_grace_seconds

2017-10-01 Thread Gábor Auth
Hi,

On Mon, Oct 2, 2017 at 5:55 AM Varun Barala <varunbaral...@gmail.com> wrote:

> *select gc_grace_seconds from system_schema.tables where keyspace_name =
> 'keyspace' and table_name = 'number_item';*
>

cassandra@cqlsh:mat> DESCRIBE TABLE mat.number_item;


CREATE TABLE mat.number_item (
   nodeid uuid,
   type text,
   created timeuuid,
   value float,
   PRIMARY KEY (nodeid, type, created)
) WITH CLUSTERING ORDER BY (type ASC, created ASC)
   AND bloom_filter_fp_chance = 0.01
   AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
   AND cdc = false
   AND comment = ''
   AND compaction = {'class':
'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy',
'max_threshold': '32', 'min_threshold': '4'}
   AND compression = {'chunk_length_in_kb': '64', 'class':
'org.apache.cassandra.io.compress.LZ4Compressor'}
   AND crc_check_chance = 1.0
   AND dclocal_read_repair_chance = 0.1
   AND default_time_to_live = 0
   AND gc_grace_seconds = 3600
   AND max_index_interval = 2048
   AND memtable_flush_period_in_ms = 0
   AND min_index_interval = 128
   AND read_repair_chance = 0.0
   AND speculative_retry = '99PERCENTILE';

cassandra@cqlsh:mat> select gc_grace_seconds from system_schema.tables
where keyspace_name = 'mat' and table_name = 'number_item';

gc_grace_seconds
--
3600

(1 rows)

Bye,
Gábor Auth


Re: Alter table gc_grace_seconds

2017-10-01 Thread Gábor Auth
Hi,

On Sun, Oct 1, 2017 at 9:36 PM Varun Barala <varunbaral...@gmail.com> wrote:

> * You should not try on real clusters directly.
>

Why not? :)

Did you change gc_grace for all column families?
>

No, only on the `number_item` CF.

> But not in the `number_item` CF... :(
> Could you please explain?
>

I've tried the test case that you described and it works (the compaction
removed the marked_deleted rows) on a newly created CF. But the same
gc_grace_seconds setting has no effect on the `number_item` CF (millions of
rows have been deleted during a migration last week).

Bye,
Gábor Auth


Re: Alter table gc_grace_seconds

2017-10-01 Thread Gábor Auth
Hi,

On Sun, Oct 1, 2017 at 7:44 PM Varun Barala <varunbaral...@gmail.com> wrote:

> Sorry If I misunderstood the situation.
>

Ok, I'm confused... :/

I've just tested it on the same cluster and the compaction removed the
marked_deleted rows. But not in the `number_item` CF... :(

Cassandra 3.11.0, two DCs (with 4 nodes each).

Bye,
Gábor Auth


Re: Alter table gc_grace_seconds

2017-10-01 Thread Gábor Auth
Hi,

On Sun, Oct 1, 2017 at 6:53 PM Jonathan Haddad <j...@jonhaddad.com> wrote:

> The TTL is applied to the cells on insert. Changing it doesn't change the
> TTL on data that was inserted previously.
>

Is there any way to purge this tombstoned data?

Bye,
Gábor Auth


Re: Alter table gc_grace_seconds

2017-10-01 Thread Gábor Auth
Hi,

On Sun, Oct 1, 2017 at 6:53 PM Jonathan Haddad <j...@jonhaddad.com> wrote:

> The TTL is applied to the cells on insert. Changing it doesn't change the
> TTL on data that was inserted previously.
>

Oh! So the tombstoned cell's TTL equals the CF's gc_grace_seconds value and
the repair will remove it. Am I right?

Bye,
Gábor Auth


Re: Alter table gc_grace_seconds

2017-10-01 Thread Gábor Auth
Hi,

On Sun, Oct 1, 2017 at 3:44 PM Varun Barala <varunbaral...@gmail.com> wrote:

> This is the property of table and It's not written in sstables. If you
> change gc_grace, It'll get applied for all the data.
>

Hm... I've migrated a lot of data from the `number_item` CF to the
`measurement` CF because of a schema upgrade. During the migration, the
script created rows in the `measurement` CF and deleted the migrated rows in
the `number_item` CF, one by one.

I've just taken a look at the SSTables of `number_item` and they are full of
deleted rows:
{
  "type" : "row",
  "position" : 146160,
  "clustering" : [ "humidity", "97781fd0-9dab-11e7-a3d5-7f6ef9a844c7" ],
  "deletion_info" : { "marked_deleted" : "2017-09-25T11:51:19.165276Z",
"local_delete_time" : "2017-09-25T11:51:19Z" },
  "cells" : [ ]
}

How can I purge these old rows? :)

I've tried: compact, scrub, cleanup, clearsnapshot, flush and full repair.

Bye,
Gábor Auth


Alter table gc_grace_seconds

2017-10-01 Thread Gábor Auth
Hi,

Does `alter table number_item with gc_grace_seconds = 3600;` set the grace
period only for tombstones created by future modifications of the
number_item column family, or does it affect all existing data?

Bye,
Gábor Auth


Re: Purge data from repair_history table?

2017-03-20 Thread Gábor Auth
Hi,

On Fri, Mar 17, 2017 at 2:22 PM Paulo Motta <pauloricard...@gmail.com>
wrote:

> It's safe to truncate this table since it's just used to inspect repairs
> for troubleshooting. You may also set a default TTL to avoid it from
> growing unbounded (this is going to be done by default on CASSANDRA-12701).
>

I've altered the repair_history and the parent_repair_history tables:
ALTER TABLE system_distributed.repair_history WITH compaction =
{'class':'org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy',
'compaction_window_unit':'DAYS', 'compaction_window_size':'1'
} AND default_time_to_live = 2592000;

Does it affect the previous contents of the table or do I need to truncate
manually? Is the 'TRUNCATE' safe? :)

Bye,
Gábor Auth


Re: Purge data from repair_history table?

2017-03-17 Thread Gábor Auth
Oh, thanks! :)

On Fri, 17 Mar 2017, 14:22 Paulo Motta, <pauloricard...@gmail.com> wrote:

> It's safe to truncate this table since it's just used to inspect repairs
> for troubleshooting. You may also set a default TTL to avoid it from
> growing unbounded (this is going to be done by default on CASSANDRA-12701).
>
> 2017-03-17 8:36 GMT-03:00 Gábor Auth <auth.ga...@gmail.com>:
>
> Hi,
>
> I've discovered a relatively huge amount of data in the system_distributed
> keyspace's repair_history table:
>Table: repair_history
>Space used (live): 389409804
>Space used (total): 389409804
>
> What is the purpose of this data? Is there any safe method to purge it? :)
>
> Bye,
> Gábor Auth
>
>
>


Re: Slow repair

2017-03-17 Thread Gábor Auth
Hi,

On Wed, Mar 15, 2017 at 11:35 AM Ben Slater <ben.sla...@instaclustr.com>
wrote:

> When you say you’re running repair to “rebalance” do you mean to populate
> the new DC? If so, the normal/correct procedure is to use nodetool rebuild
> rather than repair.
>

Oh, thank you! :)

Bye,
Gábor Auth



Purge data from repair_history table?

2017-03-17 Thread Gábor Auth
Hi,

I've discovered a relatively huge amount of data in the system_distributed
keyspace's repair_history table:
   Table: repair_history
   Space used (live): 389409804
   Space used (total): 389409804

What is the purpose of this data? Is there any safe method to purge it? :)

Bye,
Gábor Auth


Slow repair

2017-03-15 Thread Gábor Auth
Hi,

We are working with a two-DC Cassandra cluster (EU and US), with a latency
of over 160 ms between them. I've added a new DC to this cluster, modified
the keyspaces' replication factor and I'm trying to rebalance it with repair,
but the repair is very slow (over 10-15 minutes per node per keyspace with
~40 column families). Is this normal with this network latency, or is
something wrong with the cluster or the network connection? :)

[2017-03-15 05:52:38,255] Starting repair command #4, repairing keyspace
test20151222 with repair options (parallelism: parallel, primary range:
true, incremental: false, job threads: 1, ColumnFamilies: [], dataCenters:
[], hosts: [], # of ranges: 32)
[2017-03-15 05:54:11,913] Repair session
988bd850-0943-11e7-9c1f-f5ba092c6aea for range
[(-3328182031191101706,-3263206086630594139],
(-449681117114180865,-426983008087217811],
(-4940101276128910421,-4726878962587262390],
(-4999008077542282524,-4940101276128910421]] finished (progress: 11%)
[2017-03-15 05:55:39,721] Repair session
9a6fda92-0943-11e7-9c1f-f5ba092c6aea for range
[(7538662821591320245,7564364667721298414],
(8095771383100385537,8112071444788258953],
(-1625703837190283897,-1600176580612824092],
(-1075557915997532230,-1072724867906442440],
(-9152563942239372475,-9123254980705325471],
(7485905313674392326,7513617239634230698]] finished (progress: 14%)
[2017-03-15 05:57:05,718] Repair session
9de181b1-0943-11e7-9c1f-f5ba092c6aea for range
[(-6471953894734787784,-6420063839816736750],
(1372322727565611879,1480899944406172322],
(1176263633569625668,1177285361971054591],
(440549646067640682,491840653569315468],
(-4312829975221321282,-4177428401237878410]] finished (progress: 17%)
[2017-03-15 05:58:39,997] Repair session
a18bc500-0943-11e7-9c1f-f5ba092c6aea for range
[(5327651902976749177,5359189884199963589],
(-5362946313988105342,-5348008210198062914],
(-5756557262823877856,-5652851311492822149],
(-5400778420101537991,-5362946313988105342],
(6682536072120412021,6904193483670147322]] finished (progress: 20%)
[2017-03-15 05:59:11,791] Repair session
a44f2ac2-0943-11e7-9c1f-f5ba092c6aea for range
[(952873612468870228,1042958763135655298],
(558544893991295379,572114658167804730]] finished (progress: 22%)
[2017-03-15 05:59:56,197] Repair session
a5e13c71-0943-11e7-9c1f-f5ba092c6aea for range
[(1914238614647876002,1961526714897144472],
(3610056520286573718,3619622957324752442],
(-3506227577233676363,-3504718440405535976],
(-4120686433235827731,-4098515820338981500],
(5651594158011135924,5668698324546997949]] finished (progress: 25%)
[2017-03-15 06:00:45,610] Repair session
a897a9e1-0943-11e7-9c1f-f5ba092c6aea for range
[(-9007733666337543056,-8979974976044921941]] finished (progress: 28%)
[2017-03-15 06:01:58,826] Repair session
a927b4e1-0943-11e7-9c1f-f5ba092c6aea for range
[(3599745202434925817,3608662806723095677],
(3390003128426746316,3391135639180043521],
(3391135639180043521,3529019003015169892]] finished (progress: 31%)
[2017-03-15 06:03:15,440] Repair session
aae06160-0943-11e7-9c1f-f5ba092c6aea for range
[(-7542303048667795773,-7300899534947316960]] finished (progress: 34%)
[2017-03-15 06:03:17,786] Repair completed successfully
[2017-03-15 06:03:17,787] Repair command #4 finished in 10 minutes 39
seconds

Bye,
Gábor Auth


Re: Archive node

2017-03-06 Thread Gábor Auth
Hi,

On Mon, Mar 6, 2017 at 12:46 PM Carlos Rolo <r...@pythian.com> wrote:

> I would not suggest to do that, because the new "Archive" node would be a
> new DC that you would need to build (Operational wise).
>

Yes, but in our case it is a simple copy of an existing Puppet script and it
works... and the automated clean, cleanup and repair job will move the old
keyspaces to the 'Archive' DC without any operational overhead... hm.

You could also snapshot the old one once it finishes and use SSTableloader
> to push it into your Development DC. This way you have isolation from
> Production. Plus no operational overhead.
>

I think this is also operational overhead... :)

Bye,
Gábor Auth


Archive node

2017-03-06 Thread Gábor Auth
Hi,

The background story: we are developing an MMO strategy game and every two
weeks the game world ends and we start a new one with a slightly different
database schema. Because of this, we have over ~100 keyspaces in our cluster
and we want to archive the old schemas onto a separate Cassandra node (or
something else) to keep them available online to support development. The
archived keyspaces are mostly read-only and rarely used (~once a year or
less often).

We have a two-DC Cassandra cluster with 4 nodes in each DC; the idea is the
following: we add a new Cassandra node with the DC name 'Archive', change the
replication of the old keyspaces from {'class': 'NetworkTopologyStrategy',
'DC01': '3', 'DC02': '3'} to {'class': 'NetworkTopologyStrategy', 'Archive':
'1'}, and repair the keyspace.
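In CQL the per-keyspace change would be something like this (a sketch; the
keyspace name is a placeholder, the DC names match the example above):

-- old_world_keyspace is a placeholder for one of the archived keyspaces
ALTER KEYSPACE old_world_keyspace
  WITH replication = {'class': 'NetworkTopologyStrategy', 'Archive': '1'};
-- followed by a repair of the keyspace, as described above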

What do you think? Any other idea? :)

Bye,
Gábor Auth


Upgrade from 3.6 to 3.9

2016-10-28 Thread Gábor Auth
Hi,

Has anyone updated Cassandra to version 3.9 on a live system?

I've updated our two-DC Cassandra cluster from 3.6 to 3.9 and on the new
nodes I experienced a relatively huge CPU and memory footprint during the
handshake with the old nodes (sometimes running out of memory:
OutOfMemoryError: Java heap space).

Bye,
Gábor Auth