Re: Remove folders of deleted tables

2023-12-07 Thread Bowen Song via user

The same table name with two different CF IDs is not just
"temporary schema disagreements", it's much worse than that.
This breaks the eventual consistency guarantee, and leads to
silent data corruption. It's silently happening in the
background, and you don't realise it until you suddenly do,
and then everything seems to blow up at the same time. You
need to sort this out ASAP.


On 05/12/2023 19:57, Sébastien Rebecchi wrote:

Hi Bowen,

Thanks for your answer.

I was thinking of extreme use cases, but as far as I am
concerned I can deal with creation and deletion of 2 tables
every 6 hours for a keyspace. So it lets around 8 folders of
deleted tables per day - sometimes more cause I can see
sometimes 2 folders created for a same table name, with 2
different ids, caused by temporary schema disagreements I guess.
Basically it means 20 years before the KS folder has 65K
subfolders, so I would say I have time to think of
redesigning the data model ^^
Nevertheless, does it sound too much in terms of thombstones
in the systems tables (with the default GC grace period of
10 days)?

Sébastien.

On Tue, 5 Dec 2023 at 12:19, Bowen Song via user wrote:

Please rethink your use case. Create and delete tables
concurrently often lead to schema disagreement. Even
doing so on a single node sequentially will lead to a
large number of tombstones in the system tables.

On 04/12/2023 19:55, Sébastien Rebecchi wrote:

Thank you Dipan.

Do you know if there is a good reason for Cassandra to
let tables folder even when there is no snapshot?

I'm thinking of use cases where there is the need to
create and delete small tables at a high rate. You
could quickly end with more than 65K (limit of ext4)
subdirectories in the KS directory, while 99.9.. % of
them are residual of deleted tables.

That looks quite dirty from Cassandra to not clean its
own "garbage" by itself, and quite dangerous for the
end user to have to do it alone, don't you think so?

Thanks,

Sébastien.

On Mon, 4 Dec 2023 at 11:28, Dipan Shah wrote:

Hello Sebastien,

    There are no inbuilt tools that will automatically
remove folders of deleted tables.

Thanks,

Dipan Shah



*From:* Sébastien Rebecchi 
*Sent:* 04 December 2023 13:54
*To:* user@cassandra.apache.org

*Subject:* Remove folders of deleted tables
Hello,

When we delete a table with Cassandra, it lets the
folder of that table on file system, even if there
is no snapshot (auto snapshots disabled).
So we end with the empty folder {data
folder}/{keyspace name}/{table name-table id}
containing only 1 subfolder, backups, which is
    itself empty.
Is there a way to automatically remove folders of
deleted tables?

Sébastien.


Re: Remove folders of deleted tables

2023-12-07 Thread Sébastien Rebecchi
>> So it lets around 8 folders of deleted tables per day - sometimes more
>> cause I can see sometimes 2 folders created for a same table name, with 2
>> different ids, caused by temporary schema disagreements I guess.
>> Basically it means 20 years before the KS folder has 65K subfolders, so I
>> would say I have time to think of redesigning the data model ^^
>> Nevertheless, does it sound too much in terms of thombstones in the
>> systems tables (with the default GC grace period of 10 days)?
>>
>> Sébastien.
>>
>> On Tue, 5 Dec 2023 at 12:19, Bowen Song via user <user@cassandra.apache.org> wrote:
>>
>>> Please rethink your use case. Create and delete tables concurrently
>>> often lead to schema disagreement. Even doing so on a single node
>>> sequentially will lead to a large number of tombstones in the system tables.
>>> On 04/12/2023 19:55, Sébastien Rebecchi wrote:
>>>
>>> Thank you Dipan.
>>>
>>> Do you know if there is a good reason for Cassandra to let tables folder
>>> even when there is no snapshot?
>>>
>>> I'm thinking of use cases where there is the need to create and delete
>>> small tables at a high rate. You could quickly end with more than 65K
>>> (limit of ext4) subdirectories in the KS directory, while 99.9.. % of them
>>> are residual of deleted tables.
>>>
>>> That looks quite dirty from Cassandra to not clean its own "garbage" by
>>> itself, and quite dangerous for the end user to have to do it alone, don't
>>> you think so?
>>>
>>> Thanks,
>>>
>>> Sébastien.
>>>
>>> On Mon, 4 Dec 2023 at 11:28, Dipan Shah wrote:
>>>
>>>> Hello Sebastien,
>>>>
>>>> There are no inbuilt tools that will automatically remove folders of
>>>> deleted tables.
>>>>
>>>> Thanks,
>>>>
>>>> Dipan Shah
>>>> --
>>>> *From:* Sébastien Rebecchi 
>>>> *Sent:* 04 December 2023 13:54
>>>> *To:* user@cassandra.apache.org 
>>>> *Subject:* Remove folders of deleted tables
>>>>
>>>> Hello,
>>>>
>>>> When we delete a table with Cassandra, it lets the folder of that table
>>>> on file system, even if there is no snapshot (auto snapshots disabled).
>>>> So we end with the empty folder {data folder}/{keyspace name}/{table
>>>> name-table id} containing only 1  subfolder, backups, which is itself 
>>>> empty.
>>>> Is there a way to automatically remove folders of deleted tables?
>>>>
>>>> Sébastien.
>>>>
>>>


Re: Remove folders of deleted tables

2023-12-06 Thread Bowen Song via user
There are many different ways to avoid or minimise the chance of schema 
disagreements. The easiest is to always send DDL queries to the same 
node in the cluster; this is very easy to implement and avoids schema 
disagreements at the cost of creating a single point of failure for DDL 
queries. More sophisticated methods also exist, such as locking and 
centralised schema modification, and you should consider which one is 
more suitable for your use case. Ignoring the schema disagreement 
problem is not recommended, as this is not a tested state for the 
cluster, and you are likely to run into some known and unknown (and 
possibly severe) issues later.
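
For illustration, a minimal sketch of the "same node for DDL" approach, 
with placeholder host/keyspace/datacenter/table names (not a drop-in 
solution, and it still leaves that node as a single point of failure 
for DDL):

DDL_HOST=schema-node.example.com   # placeholder: the one node all DDL goes through

cqlsh "$DDL_HOST" -e "
  CREATE KEYSPACE IF NOT EXISTS my_ks
    WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': 3};
  CREATE TABLE IF NOT EXISTS my_ks.events_20231207 (
    id       uuid PRIMARY KEY,
    payload  text
  );"

# Afterwards, check that the whole cluster agrees on a single schema version:
nodetool describecluster | grep -A 3 'Schema versions'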


The system_schema.columns table will almost certainly have more 
tombstones created than the number of tables deleted, unless each 
deleted table had only one column. I doubt creating and deleting 8 
tables per day will be a problem, but I would recommend you find a way 
to test it before doing that on a production system, because I don't 
know of anyone else using Cassandra in this way.
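
If you do test it, one rough way to watch the tombstone build-up 
(untested sketch; assumes a package install with data under 
/var/lib/cassandra/data - adjust to your data_file_directories):

# Per-table stats, including tombstone counters, for the schema tables:
nodetool tablestats system_schema.columns system_schema.tables \
    | egrep -i 'Table:|tombstones|SSTable count'

# Estimated droppable tombstones per SSTable of system_schema.columns:
for f in /var/lib/cassandra/data/system_schema/columns-*/*-Data.db; do
    echo "== $f"
    sstablemetadata "$f" | grep -i droppable
done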


On the surface, it does sound like TWCS with the date in the 
partition key may fit your use case better than creating and deleting 
tables every day.
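
For illustration only, a rough sketch of such a table, with assumed 
names, window size and TTL (tune all three to your retention needs):

cqlsh some-node -e "
  CREATE TABLE IF NOT EXISTS my_ks.events (
    day      date,
    id       timeuuid,
    payload  text,
    PRIMARY KEY ((day), id)
  ) WITH compaction = {'class': 'TimeWindowCompactionStrategy',
                       'compaction_window_unit': 'DAYS',
                       'compaction_window_size': '1'}
    AND default_time_to_live = 604800;"   # 7 days; expired data ages out with its window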



On 06/12/2023 08:26, Sébastien Rebecchi wrote:

Hello Jeff, Bowen

Thanks for your answer.
Now I understand that there is a bug in Cassandra that can not handle 
concurrent schema modifications, I was not aware of that severity, I 
thought that temporary schema mismatches were eventually resolved 
smartly, by a kind of "merge" mechanism.
For my use cases, keyspaces and tables are created "on-demand", when 
receiving exceptions for invalid KS or table on insert (then the KS 
and table are created and the insert is retried). I can not afford to 
centralize schema modifications in a bottleneck, but I can afford the 
data inconsistencies, waiting for the fix in Cassandra.
I'm more worried about tombstones in system tables, I assume that 8 
tombstones per day (or even more, but in the order of no more than 
some dozens) is reasonable, can you confirm (or invalidate) that please?


Sébastien.

On Wed, 6 Dec 2023 at 03:00, Bowen Song via user wrote:


The same table name with two different CF IDs is not just
"temporary schema disagreements", it's much worse than that. This
breaks the eventual consistency guarantee, and leads to silent
data corruption. It's silently happening in the background, and
you don't realise it until you suddenly do, and then everything
seems to blow up at the same time. You need to sort this out ASAP.


On 05/12/2023 19:57, Sébastien Rebecchi wrote:

Hi Bowen,

Thanks for your answer.

I was thinking of extreme use cases, but as far as I am concerned
I can deal with creation and deletion of 2 tables every 6 hours
for a keyspace. So it lets around 8 folders of deleted tables per
day - sometimes more cause I can see sometimes 2 folders created
for a same table name, with 2 different ids, caused by temporary
schema disagreements I guess.
Basically it means 20 years before the KS folder has 65K
subfolders, so I would say I have time to think of redesigning
the data model ^^
Nevertheless, does it sound too much in terms of thombstones in
the systems tables (with the default GC grace period of 10 days)?

Sébastien.

On Tue, 5 Dec 2023 at 12:19, Bowen Song via user wrote:

Please rethink your use case. Create and delete tables
concurrently often lead to schema disagreement. Even doing so
on a single node sequentially will lead to a large number of
tombstones in the system tables.

On 04/12/2023 19:55, Sébastien Rebecchi wrote:

Thank you Dipan.

Do you know if there is a good reason for Cassandra to let
tables folder even when there is no snapshot?

I'm thinking of use cases where there is the need to create
and delete small tables at a high rate. You could quickly
end with more than 65K (limit of ext4) subdirectories in the
KS directory, while 99.9.. % of them are residual of deleted
tables.

That looks quite dirty from Cassandra to not clean its own
"garbage" by itself, and quite dangerous for the end user to
have to do it alone, don't you think so?

Thanks,

Sébastien.

On Mon, 4 Dec 2023 at 11:28, Dipan Shah wrote:

Hello Sebastien,

There are no inbuilt tools that will automatically
remove folders of deleted tables.

Thanks,

Dipan Shah



*From:* Sébastien Rebecchi 
*Sent:* 04 December 2023 13:54
*To:* user@cassandra.apache.org 
*Subject:* Remove folders of deleted tables
Hello,

When we delete a table with Cassandra, it lets the folder of that
table on file system, even if there is no snapshot (auto snapshots disabled).

Re: Remove folders of deleted tables

2023-12-06 Thread Sébastien Rebecchi
Hello Jeff, Bowen

Thanks for your answer.
Now I understand that there is a bug in Cassandra in handling concurrent
schema modifications. I was not aware of that severity; I thought that
temporary schema mismatches were eventually resolved smartly, by a kind
of "merge" mechanism.
For my use cases, keyspaces and tables are created "on-demand", when
receiving exceptions for an invalid KS or table on insert (then the KS and
table are created and the insert is retried). I cannot afford to
centralize schema modifications in a bottleneck, but I can afford the data
inconsistencies while waiting for the fix in Cassandra.
I'm more worried about tombstones in the system tables. I assume that 8
tombstones per day (or even more, but on the order of no more than a few
dozen) is reasonable - can you confirm (or refute) that, please?

Sébastien.

On Wed, 6 Dec 2023 at 03:00, Bowen Song via user wrote:

> The same table name with two different CF IDs is not just "temporary
> schema disagreements", it's much worse than that. This breaks the eventual
> consistency guarantee, and leads to silent data corruption. It's silently
> happening in the background, and you don't realise it until you suddenly
> do, and then everything seems to blow up at the same time. You need to sort
> this out ASAP.
>
>
> On 05/12/2023 19:57, Sébastien Rebecchi wrote:
>
> Hi Bowen,
>
> Thanks for your answer.
>
> I was thinking of extreme use cases, but as far as I am concerned I can
> deal with creation and deletion of 2 tables every 6 hours for a keyspace.
> So it lets around 8 folders of deleted tables per day - sometimes more
> cause I can see sometimes 2 folders created for a same table name, with 2
> different ids, caused by temporary schema disagreements I guess.
> Basically it means 20 years before the KS folder has 65K subfolders, so I
> would say I have time to think of redesigning the data model ^^
> Nevertheless, does it sound too much in terms of thombstones in the
> systems tables (with the default GC grace period of 10 days)?
>
> Sébastien.
>
> On Tue, 5 Dec 2023 at 12:19, Bowen Song via user wrote:
>
>> Please rethink your use case. Create and delete tables concurrently often
>> lead to schema disagreement. Even doing so on a single node sequentially
>> will lead to a large number of tombstones in the system tables.
>> On 04/12/2023 19:55, Sébastien Rebecchi wrote:
>>
>> Thank you Dipan.
>>
>> Do you know if there is a good reason for Cassandra to let tables folder
>> even when there is no snapshot?
>>
>> I'm thinking of use cases where there is the need to create and delete
>> small tables at a high rate. You could quickly end with more than 65K
>> (limit of ext4) subdirectories in the KS directory, while 99.9.. % of them
>> are residual of deleted tables.
>>
>> That looks quite dirty from Cassandra to not clean its own "garbage" by
>> itself, and quite dangerous for the end user to have to do it alone, don't
>> you think so?
>>
>> Thanks,
>>
>> Sébastien.
>>
>> On Mon, 4 Dec 2023 at 11:28, Dipan Shah wrote:
>>
>>> Hello Sebastien,
>>>
>>> There are no inbuilt tools that will automatically remove folders of
>>> deleted tables.
>>>
>>> Thanks,
>>>
>>> Dipan Shah
>>> --
>>> *From:* Sébastien Rebecchi 
>>> *Sent:* 04 December 2023 13:54
>>> *To:* user@cassandra.apache.org 
>>> *Subject:* Remove folders of deleted tables
>>>
>>> Hello,
>>>
>>> When we delete a table with Cassandra, it lets the folder of that table
>>> on file system, even if there is no snapshot (auto snapshots disabled).
>>> So we end with the empty folder {data folder}/{keyspace name}/{table
>>> name-table id} containing only 1  subfolder, backups, which is itself empty.
>>> Is there a way to automatically remove folders of deleted tables?
>>>
>>> Sébastien.
>>>
>>


Re: Remove folders of deleted tables

2023-12-05 Thread Bowen Song via user
The same table name with two different CF IDs is not just "temporary 
schema disagreements", it's much worse than that. This breaks the 
eventual consistency guarantee, and leads to silent data corruption. 
It's silently happening in the background, and you don't realise it 
until you suddenly do, and then everything seems to blow up at the same 
time. You need to sort this out ASAP.



On 05/12/2023 19:57, Sébastien Rebecchi wrote:

Hi Bowen,

Thanks for your answer.

I was thinking of extreme use cases, but as far as I am concerned I 
can deal with creation and deletion of 2 tables every 6 hours for a 
keyspace. So it lets around 8 folders of deleted tables per day - 
sometimes more cause I can see sometimes 2 folders created for a same 
table name, with 2 different ids, caused by temporary schema 
disagreements I guess.
Basically it means 20 years before the KS folder has 65K subfolders, 
so I would say I have time to think of redesigning the data model ^^
Nevertheless, does it sound too much in terms of thombstones in the 
systems tables (with the default GC grace period of 10 days)?


Sébastien.

On Tue, 5 Dec 2023 at 12:19, Bowen Song via user wrote:


Please rethink your use case. Create and delete tables
concurrently often lead to schema disagreement. Even doing so on a
single node sequentially will lead to a large number of tombstones
in the system tables.

On 04/12/2023 19:55, Sébastien Rebecchi wrote:

Thank you Dipan.

Do you know if there is a good reason for Cassandra to let tables
folder even when there is no snapshot?

I'm thinking of use cases where there is the need to create and
delete small tables at a high rate. You could quickly end with
more than 65K (limit of ext4) subdirectories in the KS directory,
while 99.9.. % of them are residual of deleted tables.

That looks quite dirty from Cassandra to not clean its own
"garbage" by itself, and quite dangerous for the end user to have
to do it alone, don't you think so?

Thanks,

Sébastien.

On Mon, 4 Dec 2023 at 11:28, Dipan Shah wrote:

Hello Sebastien,

There are no inbuilt tools that will automatically remove
folders of deleted tables.

Thanks,

Dipan Shah


*From:* Sébastien Rebecchi 
*Sent:* 04 December 2023 13:54
*To:* user@cassandra.apache.org 
    *Subject:* Remove folders of deleted tables
Hello,

When we delete a table with Cassandra, it lets the folder of
that table on file system, even if there is no snapshot (auto
snapshots disabled).
So we end with the empty folder {data folder}/{keyspace
name}/{table name-table id} containing only 1  subfolder,
backups, which is itself empty.
Is there a way to automatically remove folders of deleted tables?

Sébastien.


Re: Remove folders of deleted tables

2023-12-05 Thread Jeff Jirsa
The last time you mentioned this:

On Tue, Dec 5, 2023 at 11:57 AM Sébastien Rebecchi 
wrote:

> Hi Bowen,
>
> Thanks for your answer.
>
> I was thinking of extreme use cases, but as far as I am concerned I can
> deal with creation and deletion of 2 tables every 6 hours for a keyspace.
> So it lets around 8 folders of deleted tables per day - sometimes more
> cause I can see sometimes 2 folders created for a same table name, with 2
> different ids, caused by temporary schema disagreements I guess.
>

I told you it's much worse than you're assuming it is:
https://lists.apache.org/thread/fzkn3vqjyfjslcv97wcycb6w0wn5ltk2

Here's a more detailed explanation:
https://www.mail-archive.com/user@cassandra.apache.org/msg62206.html

(This is fixed and strictly safe in the version of cassandra with
transactional cluster metadata, which just got merged to trunk in the past
month, so "will be safe soon").


Re: Remove folders of deleted tables

2023-12-05 Thread Sébastien Rebecchi
Hi Bowen,

Thanks for your answer.

I was thinking of extreme use cases, but as far as I am concerned I can
deal with the creation and deletion of 2 tables every 6 hours for a keyspace.
So that leaves around 8 folders of deleted tables per day - sometimes more,
because I can sometimes see 2 folders created for the same table name, with 2
different ids, caused by temporary schema disagreements I guess.
Basically it means 20 years before the KS folder has 65K subfolders, so I
would say I have time to think about redesigning the data model ^^
Nevertheless, does it sound like too much in terms of tombstones in the system
tables (with the default GC grace period of 10 days)?

Sébastien.

On Tue, 5 Dec 2023 at 12:19, Bowen Song via user wrote:

> Please rethink your use case. Create and delete tables concurrently often
> lead to schema disagreement. Even doing so on a single node sequentially
> will lead to a large number of tombstones in the system tables.
> On 04/12/2023 19:55, Sébastien Rebecchi wrote:
>
> Thank you Dipan.
>
> Do you know if there is a good reason for Cassandra to let tables folder
> even when there is no snapshot?
>
> I'm thinking of use cases where there is the need to create and delete
> small tables at a high rate. You could quickly end with more than 65K
> (limit of ext4) subdirectories in the KS directory, while 99.9.. % of them
> are residual of deleted tables.
>
> That looks quite dirty from Cassandra to not clean its own "garbage" by
> itself, and quite dangerous for the end user to have to do it alone, don't
> you think so?
>
> Thanks,
>
> Sébastien.
>
> On Mon, 4 Dec 2023 at 11:28, Dipan Shah wrote:
>
>> Hello Sebastien,
>>
>> There are no inbuilt tools that will automatically remove folders of
>> deleted tables.
>>
>> Thanks,
>>
>> Dipan Shah
>> ------
>> *From:* Sébastien Rebecchi 
>> *Sent:* 04 December 2023 13:54
>> *To:* user@cassandra.apache.org 
>> *Subject:* Remove folders of deleted tables
>>
>> Hello,
>>
>> When we delete a table with Cassandra, it lets the folder of that table
>> on file system, even if there is no snapshot (auto snapshots disabled).
>> So we end with the empty folder {data folder}/{keyspace name}/{table
>> name-table id} containing only 1  subfolder, backups, which is itself empty.
>> Is there a way to automatically remove folders of deleted tables?
>>
>> Sébastien.
>>
>


Re: Remove folders of deleted tables

2023-12-05 Thread Jon Haddad
I can't think of a reason to keep empty directories around; it seems like a 
reasonable change. But I don't think you're butting up against a thing that 
most people would run into, as snapshots are enabled by default (auto_snapshot: 
true) and almost nobody changes it.

The use case you described isn't handled well by Cassandra for a host of other 
reasons, and I would *never* do that in a production environment with any 
released version.  The folder thing is the least of the issues you'll run into, 
so even if you contribute a patch and address it, I still wouldn't do it 
until transactional cluster metadata gets released and I've had a chance to 
kick the tires to see what issues you run into besides schema inconsistencies.  
I suspect the drivers won't love it either.

Assuming you're running into an issue now:

find . -type d -empty -exec rmdir {} \;

rmdir only removes empty directories, and you'll need to run it twice (once for 
the backups subfolder, once for the then-empty table folder). It will remove all 
empty directories under that folder, so if you've got unused tables you'd be 
better off using the find command to get the list, removing the active tables 
from it, and explicitly running the rmdir command on the directories you want 
cleaned up.
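
Something along those lines, as an untested sketch (assumes GNU find, a 
default data directory of /var/lib/cassandra/data, and a placeholder 
keyspace name; review the list carefully before deleting anything):

cd /var/lib/cassandra/data/my_keyspace   # placeholder keyspace directory

# Collect table directories that contain no SSTables at all.
find . -mindepth 1 -maxdepth 1 -type d | while read -r dir; do
    [ -z "$(find "$dir" -name '*-Data.db' -print -quit)" ] && echo "$dir"
done > /tmp/candidate_dirs.txt

# Review the list, remove anything still in use, then delete the rest, e.g.:
#   while read -r d; do rm -r "$d"; done < /tmp/candidate_dirs.txt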

Jon

On 2023/12/04 19:55:06 Sébastien Rebecchi wrote:
> Thank you Dipan.
> 
> Do you know if there is a good reason for Cassandra to let tables folder
> even when there is no snapshot?
> 
> I'm thinking of use cases where there is the need to create and delete
> small tables at a high rate. You could quickly end with more than 65K
> (limit of ext4) subdirectories in the KS directory, while 99.9.. % of them
> are residual of deleted tables.
> 
> That looks quite dirty from Cassandra to not clean its own "garbage" by
> itself, and quite dangerous for the end user to have to do it alone, don't
> you think so?
> 
> Thanks,
> 
> Sébastien.
> 
> On Mon, 4 Dec 2023 at 11:28, Dipan Shah wrote:
> 
> > Hello Sebastien,
> >
> > There are no inbuilt tools that will automatically remove folders of
> > deleted tables.
> >
> > Thanks,
> >
> > Dipan Shah
> > --
> > *From:* Sébastien Rebecchi 
> > *Sent:* 04 December 2023 13:54
> > *To:* user@cassandra.apache.org 
> > *Subject:* Remove folders of deleted tables
> >
> > Hello,
> >
> > When we delete a table with Cassandra, it lets the folder of that table on
> > file system, even if there is no snapshot (auto snapshots disabled).
> > So we end with the empty folder {data folder}/{keyspace name}/{table
> > name-table id} containing only 1  subfolder, backups, which is itself empty.
> > Is there a way to automatically remove folders of deleted tables?
> >
> > Sébastien.
> >
> 


Re: Remove folders of deleted tables

2023-12-05 Thread Bowen Song via user
Please rethink your use case. Creating and deleting tables concurrently 
often leads to schema disagreement. Even doing so on a single node 
sequentially will lead to a large number of tombstones in the system tables.


On 04/12/2023 19:55, Sébastien Rebecchi wrote:

Thank you Dipan.

Do you know if there is a good reason for Cassandra to let tables 
folder even when there is no snapshot?


I'm thinking of use cases where there is the need to create and delete 
small tables at a high rate. You could quickly end with more than 65K 
(limit of ext4) subdirectories in the KS directory, while 99.9.. % of 
them are residual of deleted tables.


That looks quite dirty from Cassandra to not clean its own "garbage" 
by itself, and quite dangerous for the end user to have to do it 
alone, don't you think so?


Thanks,

Sébastien.

On Mon, 4 Dec 2023 at 11:28, Dipan Shah wrote:

Hello Sebastien,

There are no inbuilt tools that will automatically remove folders
of deleted tables.

Thanks,

Dipan Shah


*From:* Sébastien Rebecchi 
*Sent:* 04 December 2023 13:54
*To:* user@cassandra.apache.org 
*Subject:* Remove folders of deleted tables
Hello,

When we delete a table with Cassandra, it lets the folder of that
table on file system, even if there is no snapshot (auto snapshots
disabled).
So we end with the empty folder {data folder}/{keyspace
name}/{table name-table id} containing only 1  subfolder, backups,
which is itself empty.
Is there a way to automatically remove folders of deleted tables?

Sébastien.


Re: Remove folders of deleted tables

2023-12-04 Thread Sébastien Rebecchi
Thank you Dipan.

Do you know if there is a good reason for Cassandra to leave the table folder
behind even when there is no snapshot?

I'm thinking of use cases where there is a need to create and delete
small tables at a high rate. You could quickly end up with more than 65K
(the limit of ext4) subdirectories in the KS directory, while 99.9..% of them
are residue of deleted tables.

It looks quite dirty for Cassandra not to clean up its own "garbage" by
itself, and quite dangerous for the end user to have to do it alone, don't
you think?

Thanks,

Sébastien.

On Mon, 4 Dec 2023 at 11:28, Dipan Shah wrote:

> Hello Sebastien,
>
> There are no inbuilt tools that will automatically remove folders of
> deleted tables.
>
> Thanks,
>
> Dipan Shah
> --
> *From:* Sébastien Rebecchi 
> *Sent:* 04 December 2023 13:54
> *To:* user@cassandra.apache.org 
> *Subject:* Remove folders of deleted tables
>
> Hello,
>
> When we delete a table with Cassandra, it lets the folder of that table on
> file system, even if there is no snapshot (auto snapshots disabled).
> So we end with the empty folder {data folder}/{keyspace name}/{table
> name-table id} containing only 1  subfolder, backups, which is itself empty.
> Is there a way to automatically remove folders of deleted tables?
>
> Sébastien.
>


Re: Remove folders of deleted tables

2023-12-04 Thread Dipan Shah
Hello Sebastien,

There are no inbuilt tools that will automatically remove folders of deleted 
tables.


Thanks,

Dipan Shah


From: Sébastien Rebecchi 
Sent: 04 December 2023 13:54
To: user@cassandra.apache.org 
Subject: Remove folders of deleted tables

Hello,

When we delete a table with Cassandra, it lets the folder of that table on file 
system, even if there is no snapshot (auto snapshots disabled).
So we end with the empty folder {data folder}/{keyspace name}/{table name-table 
id} containing only 1  subfolder, backups, which is itself empty.
Is there a way to automatically remove folders of deleted tables?

Sébastien.


Remove folders of deleted tables

2023-12-04 Thread Sébastien Rebecchi
Hello,

When we delete a table with Cassandra, it leaves the folder of that table on
the file system, even if there is no snapshot (auto snapshots disabled).
So we end up with the empty folder {data folder}/{keyspace name}/{table
name-table id}, containing only 1 subfolder, backups, which is itself empty.
Is there a way to automatically remove folders of deleted tables?

Sébastien.


Re: Issues during Install/Remove Cassandra ver 4.0.x

2023-04-05 Thread Bowen Song via user
Since you have already downloaded the RPM file, you may install it with 
the "yum install cassandra-4.0.7-1.noarch.rpm" command. This will install 
the package with all of its dependencies.


BTW, you can even run "yum install 
https://redhat.cassandra.apache.org/40x/cassandra-4.0.7-1.noarch.rpm" to 
download and install the package with just one command.
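
Putting the two suggestions into command form (sketch; run as root or 
with sudo, and assuming the RPM sits in the current directory):

# Install the already-downloaded package, letting yum resolve the
# "(jre-1.8.0 or jre-11)" dependency that plain "rpm -ivh" choked on:
sudo yum install ./cassandra-4.0.7-1.noarch.rpm

# Or download and install in one step:
sudo yum install https://redhat.cassandra.apache.org/40x/cassandra-4.0.7-1.noarch.rpm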



On 05/04/2023 14:45, MyWorld wrote:

Hi all,
We are facing one issue in installing cassandra-4.0.7.

### We started with*yum installation.* We setup repo "cassandra.repo" 
as below:

[cassandra]
name=Apache Cassandra
baseurl=https://redhat.cassandra.apache.org/40x/noboolean/
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://downloads.apache.org/cassandra/KEYS

On doing "yum list cassandra", *it shows ver 4.0.8 but not 4.0.7*.
Further using showduplicates "yum --showduplicates list cassandra", 
*still it shows ver 4.0.8 but not 4.0.7*.


*How can we get earlier versions here ??*

###Next, we tried *using rpm,*
sudo curl -OL 
https://redhat.cassandra.apache.org/40x/cassandra-4.0.7-1.noarch.rpm


On running "sudo rpm -ivh cassandra-4.0.7-1.noarch.rpm",
It gives below error,
error: Failed dependencies:         (jre-1.8.0 or jre-11) is needed by 
cassandra-4.0.7-1.noarch         rpmlib(RichDependencies) <= 4.12.0-1 
is needed by cassandra-4.0.7-1.noarch
Then, i solve this by using "sudo rpm --nodeps -ivh 
cassandra-4.0.7-1.noarch.rpm"

and version was installed successfully

*Is skipping dependencies with  --nodeps a right approach ??*

###Next, i tried to *uninstall the version* using
"yum remove cassandra"
It gives error: *Invalid version flag: or*

Refer complete trace below:
# yum remove cassandra
Loaded plugins: fastestmirror
Resolving Dependencies
--> Running transaction check
---> Package cassandra.noarch 0:4.0.7-1 will be erased
--> Finished Dependency Resolution

Dependencies Resolved
===
 Package                     Arch                     Version         
           Repository                   Size

===
Removing:
 cassandra                   noarch                   4.0.7-1         
           installed                    55 M


Transaction Summary
===
Remove  1 Package

Installed size: 55 M
Is this ok [y/N]: y
Downloading packages:
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction

Invalid version flag: or

*How to solve this issue ??*

Regards,
Ashish Gupta

Issues during Install/Remove Cassandra ver 4.0.x

2023-04-05 Thread MyWorld
Hi all,
We are facing one issue in installing cassandra-4.0.7.

### We started with a *yum installation*. We set up the repo "cassandra.repo" as
below:
[cassandra]
name=Apache Cassandra
baseurl=https://redhat.cassandra.apache.org/40x/noboolean/
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://downloads.apache.org/cassandra/KEYS

On doing "yum list cassandra", *it shows ver 4.0.8 but not 4.0.7*.
Further, using "yum --showduplicates list cassandra", *it still
shows ver 4.0.8 but not 4.0.7*.

*How can we get earlier versions here ??*

###Next, we tried *using rpm,*
sudo curl -OL
https://redhat.cassandra.apache.org/40x/cassandra-4.0.7-1.noarch.rpm

On running "sudo rpm -ivh cassandra-4.0.7-1.noarch.rpm",
it gives the error below:
error: Failed dependencies:
    (jre-1.8.0 or jre-11) is needed by cassandra-4.0.7-1.noarch
    rpmlib(RichDependencies) <= 4.12.0-1 is needed by cassandra-4.0.7-1.noarch

Then I worked around this by using "sudo rpm --nodeps -ivh
cassandra-4.0.7-1.noarch.rpm"
and the version was installed successfully.

*Is skipping dependencies with --nodeps the right approach??*

###Next, I tried to *uninstall the version* using
"yum remove cassandra".
It gives the error: *Invalid version flag: or*

Refer complete trace below:
# yum remove cassandra
Loaded plugins: fastestmirror
Resolving Dependencies
--> Running transaction check
---> Package cassandra.noarch 0:4.0.7-1 will be erased
--> Finished Dependency Resolution

Dependencies Resolved
===
 Package Arch Version
 Repository   Size
===
Removing:
 cassandra   noarch   4.0.7-1
 installed55 M

Transaction Summary
===
Remove  1 Package

Installed size: 55 M
Is this ok [y/N]: y
Downloading packages:
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction

Invalid version flag: or

*How to solve this issue ??*

Regards,
Ashish Gupta


Re: How to remove tombstones in a levelled compaction table in Cassandra 2.1.16?

2021-07-06 Thread manish khandelwal
Thanks Jeff and  Vytenis.

Jeff, could you explain what you mean by

If you just pipe all of your sstables to user defined compaction jmx
> endpoint one at a time you’ll purge many of the tombstones as long as you
> don’t have a horrific data model.


Regards
Manish

On Wed, Jul 7, 2021 at 4:21 AM Jeff Jirsa  wrote:

> In 2.1 the only option is enable auto compaction or queue up manual user
> defined compaction
>
> If you just pipe all of your sstables to user defined compaction jmx
> endpoint one at a time you’ll purge many of the tombstones as long as you
> don’t have a horrific data model.
>
>
>
> On Jul 6, 2021, at 3:03 PM, vytenis silgalis  wrote:
>
> 
> You might want to take a look at `unchecked_tombstone_compaction` table
> setting. The best way to see if this is affecting you is to look at the
> sstablemetadata for the sstables and see if your tombstone ratio is higher
> than the configured tombstone_threshold ratio (0.2 be default) for the
> table.
>
> For example the sstable has a tombstone_threshold of 0.2 but you see
> sstables OLDER than 10 days (LCS has a tombstone compaction interval of 10
> days, it won't run a tombstone compaction until a sstable is at least 10
> days old).
>
> > sstablemetadata example-ka-1233-Data.db | grep droppable
> Estimated droppable tombstones: 1.0
> ^ this is an extreme example but anything greater than .2 on a 10+ day
> sstable is a problem.
>
> By default the unchecked_tombstone_compaction setting is false which will
> lead to tombstones staying around if a partition spans multiple sstables
> (which may happen with LCS over a long period).
>
> Try setting `unchecked_tombstone_compaction` to true, note: that when you
> first run this IF any sstables are above the tombstone_ratio setting for
> that table they will be compacted, this may cause extra load on the cluster.
>
> Vytenis
> ... always do your own research and verify what people say. :)
>
> On Mon, Jul 5, 2021 at 10:11 PM manish khandelwal <
> manishkhandelwa...@gmail.com> wrote:
>
>> Thanks Kane for the suggestion.
>>
>> Regards
>> Manish
>>
>> On Tue, Jul 6, 2021 at 6:19 AM Kane Wilson  wrote:
>>
>>>
>>> In one of our LCS table auto compaction was disabled. Now after years of
 run, range queries using spark-cassandra-connector are failing. Cassandra
 version is 2.1.16.

 I suspect due to disabling of autocompaction lots of tombstones got
 created. And now while reading those are creating issues and queries are
 getting timed out. Am I right in my thinking? What is the possible way to
 get out of this?

 I thought of using major compaction but for LCS that was introduced in
 Cassandra 2.2. Also user defined compactions dont work on LCS tables.



 Regards

 Manish Khandelwal

>>>
>>> If it's tombstones specifically you'll be able to see errors in the logs
>>> regarding passing the tombstone limit. However, disabling compactions could
>>> cause lots of problems (especially over years). I wouldn't be surprised if
>>> your reads are slow purely because of the number of SSTables you're hitting
>>> on each read. Given you've been running without compactions for so long you
>>> might want to look at just switching to STCS and re-enabling compactions.
>>> Note this should be done with care, as it could cause performance/storage
>>> issues.
>>>
>>> Cheers,
>>> Kane
>>>
>>> --
>>> raft.so - Cassandra consulting, support, and managed services
>>>
>>


Re: How to remove tombstones in a levelled compaction table in Cassandra 2.1.16?

2021-07-06 Thread Jeff Jirsa
In 2.1 the only option is to enable auto compaction or queue up manual 
user-defined compactions.

If you just pipe all of your sstables to user defined compaction jmx endpoint 
one at a time you’ll purge many of the tombstones as long as you don’t have a 
horrific data model.
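
As an illustration of what that could look like (untested sketch; assumes 
a JMX CLI such as jmxterm, JMX on the default port 7199, and placeholder 
data paths - the exact argument format of forceUserDefinedCompaction can 
vary by version, so check CompactionManagerMBean for 2.1 first):

for f in /var/lib/cassandra/data/my_ks/my_table-*/*-Data.db; do
    # Submit one SSTable at a time to the CompactionManager MBean.
    echo "run -b org.apache.cassandra.db:type=CompactionManager forceUserDefinedCompaction $f" \
        | java -jar jmxterm.jar -l localhost:7199 -n
done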



> On Jul 6, 2021, at 3:03 PM, vytenis silgalis  wrote:
> 
> 
> You might want to take a look at `unchecked_tombstone_compaction` table 
> setting. The best way to see if this is affecting you is to look at the 
> sstablemetadata for the sstables and see if your tombstone ratio is higher 
> than the configured tombstone_threshold ratio (0.2 be default) for the table.
> 
> For example the sstable has a tombstone_threshold of 0.2 but you see sstables 
> OLDER than 10 days (LCS has a tombstone compaction interval of 10 days, it 
> won't run a tombstone compaction until a sstable is at least 10 days old).
> 
> > sstablemetadata example-ka-1233-Data.db | grep droppable
> Estimated droppable tombstones: 1.0
> ^ this is an extreme example but anything greater than .2 on a 10+ day 
> sstable is a problem.
> 
> By default the unchecked_tombstone_compaction setting is false which will 
> lead to tombstones staying around if a partition spans multiple sstables 
> (which may happen with LCS over a long period).
> 
> Try setting `unchecked_tombstone_compaction` to true, note: that when you 
> first run this IF any sstables are above the tombstone_ratio setting for that 
> table they will be compacted, this may cause extra load on the cluster.
> 
> Vytenis
> ... always do your own research and verify what people say. :)
> 
>> On Mon, Jul 5, 2021 at 10:11 PM manish khandelwal 
>>  wrote:
>> Thanks Kane for the suggestion.
>> 
>> Regards
>> Manish
>> 
>>> On Tue, Jul 6, 2021 at 6:19 AM Kane Wilson  wrote:
>>> 
 In one of our LCS table auto compaction was disabled. Now after years of 
 run, range queries using spark-cassandra-connector are failing. Cassandra 
 version is 2.1.16.
 
 I suspect due to disabling of autocompaction lots of tombstones got 
 created. And now while reading those are creating issues and queries are 
 getting timed out. Am I right in my thinking? What is the possible way to 
 get out of this?
 
 I thought of using major compaction but for LCS that was introduced in 
 Cassandra 2.2. Also user defined compactions dont work on LCS tables.
 
 
 Regards
 Manish Khandelwal
>>> 
>>> 
>>> If it's tombstones specifically you'll be able to see errors in the logs 
>>> regarding passing the tombstone limit. However, disabling compactions could 
>>> cause lots of problems (especially over years). I wouldn't be surprised if 
>>> your reads are slow purely because of the number of SSTables you're hitting 
>>> on each read. Given you've been running without compactions for so long you 
>>> might want to look at just switching to STCS and re-enabling compactions. 
>>> Note this should be done with care, as it could cause performance/storage 
>>> issues.
>>> 
>>> Cheers,
>>> Kane
>>> 
>>> -- 
>>> raft.so - Cassandra consulting, support, and managed services


Re: How to remove tombstones in a levelled compaction table in Cassandra 2.1.16?

2021-07-06 Thread vytenis silgalis
You might want to take a look at `unchecked_tombstone_compaction` table
setting. The best way to see if this is affecting you is to look at the
sstablemetadata for the sstables and see if your tombstone ratio is higher
than the configured tombstone_threshold ratio (0.2 by default) for the
table.

For example, the table has a tombstone_threshold of 0.2 but you see
sstables OLDER than 10 days (LCS has a tombstone compaction interval of 10
days; it won't run a tombstone compaction until an sstable is at least 10
days old).

> sstablemetadata example-ka-1233-Data.db | grep droppable
Estimated droppable tombstones: 1.0
^ this is an extreme example but anything greater than .2 on a 10+ day
sstable is a problem.

By default the unchecked_tombstone_compaction setting is false which will
lead to tombstones staying around if a partition spans multiple sstables
(which may happen with LCS over a long period).

Try setting `unchecked_tombstone_compaction` to true. Note that when you
first run this, IF any sstables are above the tombstone_threshold setting for
that table they will be compacted, which may cause extra load on the cluster.
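
For reference, a sketch of how that might look (keyspace/table names are
placeholders; the whole compaction map has to be restated, since setting
it replaces the previous options):

cqlsh some-node -e "
  ALTER TABLE my_ks.my_table
  WITH compaction = {'class': 'LeveledCompactionStrategy',
                     'unchecked_tombstone_compaction': 'true',
                     'tombstone_threshold': '0.2'};"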

Vytenis
... always do your own research and verify what people say. :)

On Mon, Jul 5, 2021 at 10:11 PM manish khandelwal <
manishkhandelwa...@gmail.com> wrote:

> Thanks Kane for the suggestion.
>
> Regards
> Manish
>
> On Tue, Jul 6, 2021 at 6:19 AM Kane Wilson  wrote:
>
>>
>> In one of our LCS table auto compaction was disabled. Now after years of
>>> run, range queries using spark-cassandra-connector are failing. Cassandra
>>> version is 2.1.16.
>>>
>>> I suspect due to disabling of autocompaction lots of tombstones got
>>> created. And now while reading those are creating issues and queries are
>>> getting timed out. Am I right in my thinking? What is the possible way to
>>> get out of this?
>>>
>>> I thought of using major compaction but for LCS that was introduced in
>>> Cassandra 2.2. Also user defined compactions dont work on LCS tables.
>>>
>>>
>>>
>>> Regards
>>>
>>> Manish Khandelwal
>>>
>>
>> If it's tombstones specifically you'll be able to see errors in the logs
>> regarding passing the tombstone limit. However, disabling compactions could
>> cause lots of problems (especially over years). I wouldn't be surprised if
>> your reads are slow purely because of the number of SSTables you're hitting
>> on each read. Given you've been running without compactions for so long you
>> might want to look at just switching to STCS and re-enabling compactions.
>> Note this should be done with care, as it could cause performance/storage
>> issues.
>>
>> Cheers,
>> Kane
>>
>> --
>> raft.so - Cassandra consulting, support, and managed services
>>
>


Re: How to remove tombstones in a levelled compaction table in Cassandra 2.1.16?

2021-07-05 Thread manish khandelwal
Thanks Kane for the suggestion.

Regards
Manish

On Tue, Jul 6, 2021 at 6:19 AM Kane Wilson  wrote:

>
> In one of our LCS table auto compaction was disabled. Now after years of
>> run, range queries using spark-cassandra-connector are failing. Cassandra
>> version is 2.1.16.
>>
>> I suspect due to disabling of autocompaction lots of tombstones got
>> created. And now while reading those are creating issues and queries are
>> getting timed out. Am I right in my thinking? What is the possible way to
>> get out of this?
>>
>> I thought of using major compaction but for LCS that was introduced in
>> Cassandra 2.2. Also user defined compactions dont work on LCS tables.
>>
>>
>>
>> Regards
>>
>> Manish Khandelwal
>>
>
> If it's tombstones specifically you'll be able to see errors in the logs
> regarding passing the tombstone limit. However, disabling compactions could
> cause lots of problems (especially over years). I wouldn't be surprised if
> your reads are slow purely because of the number of SSTables you're hitting
> on each read. Given you've been running without compactions for so long you
> might want to look at just switching to STCS and re-enabling compactions.
> Note this should be done with care, as it could cause performance/storage
> issues.
>
> Cheers,
> Kane
>
> --
> raft.so - Cassandra consulting, support, and managed services
>


Re: How to remove tombstones in a levelled compaction table in Cassandra 2.1.16?

2021-07-05 Thread Kane Wilson
In one of our LCS table auto compaction was disabled. Now after years of
> run, range queries using spark-cassandra-connector are failing. Cassandra
> version is 2.1.16.
>
> I suspect due to disabling of autocompaction lots of tombstones got
> created. And now while reading those are creating issues and queries are
> getting timed out. Am I right in my thinking? What is the possible way to
> get out of this?
>
> I thought of using major compaction but for LCS that was introduced in
> Cassandra 2.2. Also user defined compactions dont work on LCS tables.
>
>
>
> Regards
>
> Manish Khandelwal
>

If it's tombstones specifically you'll be able to see errors in the logs
regarding passing the tombstone limit. However, disabling compactions could
cause lots of problems (especially over years). I wouldn't be surprised if
your reads are slow purely because of the number of SSTables you're hitting
on each read. Given you've been running without compactions for so long you
might want to look at just switching to STCS and re-enabling compactions.
Note this should be done with care, as it could cause performance/storage
issues.
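
If you do go down that path, the switch itself is roughly a compaction 
change plus re-enabling autocompaction (sketch with placeholder names; do 
it gradually and watch disk space):

cqlsh some-node -e "
  ALTER TABLE my_ks.my_table
  WITH compaction = {'class': 'SizeTieredCompactionStrategy'};"

# Then re-enable automatic compactions for that table, on every node:
nodetool enableautocompaction my_ks my_table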

Cheers,
Kane

-- 
raft.so - Cassandra consulting, support, and managed services


How to remove tombstones in a levelled compaction table in Cassandra 2.1.16?

2021-07-05 Thread manish khandelwal
In one of our LCS tables, auto compaction was disabled. Now, after years of
running, range queries using spark-cassandra-connector are failing. The
Cassandra version is 2.1.16.

I suspect that, due to autocompaction being disabled, lots of tombstones got
created, and now those are creating issues while reading and queries are
getting timed out. Am I right in my thinking? What is the possible way to
get out of this?

I thought of using major compaction, but for LCS that was introduced in
Cassandra 2.2. Also, user defined compactions don't work on LCS tables.



Regards

Manish Khandelwal


Re: remove dead node without streaming

2021-03-25 Thread Jeff Jirsa
It is strictly unsafe and you shouldn't do it, but "nodetool assassinate" does 
what you're asking (you may leave data behind that you won't be able to read 
again later).

You'd be better off adding a new host on top of the old one (replace address).

> On Mar 25, 2021, at 7:43 PM, Eunsu Kim  wrote:
> 
> 
> Hi all,
> 
> Is it possible to remove dead node directly from the cluster without 
> streaming?
> 
> My Cassandra cluster is quite large and takes too long to stream. (nodetool 
> removenode)
> 
> It's okay if my data is temporarily inconsistent.
> 
> Thanks in advance.


remove dead node without streaming

2021-03-25 Thread Eunsu Kim
Hi all,

Is it possible to remove a dead node directly from the cluster, without streaming?

My Cassandra cluster is quite large and takes too long to stream. (nodetool 
removenode)

It's okay if my data is temporarily inconsistent.

Thanks in advance.

Remove from listserv

2020-07-13 Thread Marie-Anne
Remove me from Cassandra listserv.

maharkn...@comcast.net

 



Re: Does node add/remove requires all cluster nodes be present?

2018-05-01 Thread Jinhua Luo
Could you explain more?

When we add a new node, it should migrate data from other nodes, right?
What happens if other nodes are absent? For example, if the cluster
consists of 3 nodes but 2 nodes are down, and we then add a fourth new node,
what happens?

2018-05-01 12:01 GMT+08:00 Jeff Jirsa :
> nodetool decommission streams data from the losing replica, so only that 
> instance has to be online (and decom should be preferred to removenode)
>
> If that instance is offline, you can use removenode, but you risk violating 
> consistency guarantees
>
> Adding nodes is similar - bootstrap streams from the losing range
>
> --
> Jeff Jirsa
>
>
>> On Apr 30, 2018, at 8:57 PM, Jinhua Luo  wrote:
>>
>> Hi All,
>>
>> When a new node added, due to the even distribution of the new tokens,
>> the current nodes of the ring should migrate data to this new node.
>>
>> So, does it requires all nodes be present? If not, then if some nodes
>> are down, then it will miss the data migration of those parts, how and
>> when to fix it? When those nodes come back?
>>
>> Similarly, the node removal would migrate its data to other nodes, so
>> it seems that all other nodes must be present, otherwise it would lost
>> data?
>>
>> -
>> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
>> For additional commands, e-mail: user-h...@cassandra.apache.org
>>
>
> -
> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: user-h...@cassandra.apache.org
>

-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org



Re: Does node add/remove requires all cluster nodes be present?

2018-04-30 Thread Jeff Jirsa
nodetool decommission streams data from the losing replica, so only that 
instance has to be online (and decom should be preferred to removenode)

If that instance is offline, you can use removenode, but you risk violating 
consistency guarantees

Adding nodes is similar - bootstrap streams from the losing range
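
In command form, roughly (the host ID is a placeholder; take it from the 
nodetool status output):

# Preferred: run on the node that is leaving, while it is still online:
nodetool decommission

# If the node is already dead, from any live node:
nodetool status                    # note the Host ID of the DN node
nodetool removenode <host-id>      # placeholder: the Host ID from the line above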

-- 
Jeff Jirsa


> On Apr 30, 2018, at 8:57 PM, Jinhua Luo  wrote:
> 
> Hi All,
> 
> When a new node added, due to the even distribution of the new tokens,
> the current nodes of the ring should migrate data to this new node.
> 
> So, does it requires all nodes be present? If not, then if some nodes
> are down, then it will miss the data migration of those parts, how and
> when to fix it? When those nodes come back?
> 
> Similarly, the node removal would migrate its data to other nodes, so
> it seems that all other nodes must be present, otherwise it would lost
> data?
> 
> -
> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: user-h...@cassandra.apache.org
> 

-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org



Does node add/remove requires all cluster nodes be present?

2018-04-30 Thread Jinhua Luo
Hi All,

When a new node is added, due to the even distribution of the new tokens,
the current nodes of the ring should migrate data to this new node.

So, does it require all nodes to be present? If not, then if some nodes
are down, it will miss the data migration for those parts; how and
when is that fixed? When those nodes come back?

Similarly, a node removal would migrate its data to other nodes, so
it seems that all other nodes must be present, otherwise it would lose
data?

-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org



Re: replace dead node vs remove node

2018-03-25 Thread kurt greaves
Didn't read the blog, but it's worth noting that if you replace the node and
give it a *different* IP address, repairs will not be necessary, as it will
receive writes during replacement. This works as long as you start up the
replacement node before the HH window ends.

https://issues.apache.org/jira/browse/CASSANDRA-12344 and
https://issues.apache.org/jira/browse/CASSANDRA-11559 fix this for
same-address replacements (hopefully in 4.0)

On Fri., 23 Mar. 2018, 15:11 Anthony Grasso, <anthony.gra...@gmail.com>
wrote:

> Hi Peng,
>
> Correct, you would want to repair in either case.
>
> Regards,
> Anthony
>
>
> On Fri, 23 Mar 2018 at 14:09, Peng Xiao <2535...@qq.com> wrote:
>
>> Hi Anthony,
>>
>> there is a problem with replacing dead node as per the blog,if the
>> replacement process takes longer than max_hint_window_in_ms,we must run
>> repair to make the replaced node consistent again, since it missed ongoing
>> writes during bootstrapping.but for a great cluster,repair is a painful
>> process.
>>
>> Thanks,
>> Peng Xiao
>>
>>
>>
>> -- Original message --
>> *From:* "Anthony Grasso"<anthony.gra...@gmail.com>;
>> *Sent:* Thursday, 22 March 2018, 7:13 PM
>> *To:* "user"<user@cassandra.apache.org>;
>> *Subject:* Re: replace dead node vs remove node
>>
>> Hi Peng,
>>
>> Depending on the hardware failure you can do one of two things:
>>
>> 1. If the disks are intact and uncorrupted you could just use the disks
>> with the current data on them in the new node. Even if the IP address
>> changes for the new node that is fine. In that case all you need to do is
>> run repair on the new node. The repair will fix any writes the node missed
>> while it was down. This process is similar to the scenario in this blog
>> post:
>> http://thelastpickle.com/blog/2018/02/21/replace-node-without-bootstrapping.html
>>
>> 2. If the disks are inaccessible or corrupted, then use the method as
>> described in the blogpost you linked to. The operation is similar to
>> bootstrapping a new node. There is no need to perform any other remove or
>> join operation on the failed or new nodes. As per the blog post, you
>> definitely want to run repair on the new node as soon as it joins the
>> cluster. In this case here, the data on the failed node is effectively lost
>> and replaced with data from other nodes in the cluster.
>>
>> Hope this helps.
>>
>> Regards,
>> Anthony
>>
>>
>> On Thu, 22 Mar 2018 at 20:52, Peng Xiao <2535...@qq.com> wrote:
>>
>>> Dear All,
>>>
>>> when one node failure with hardware errors,it will be in DN status in
>>> the cluster.Then if we are not able to handle this error in three hours(max
>>> hints window),we will loss data,right?we have to run repair to keep the
>>> consistency.
>>> And as per
>>> https://blog.alteroot.org/articles/2014-03-12/replace-a-dead-node-in-cassandra.html,we
>>> can replace this dead node,is it the same as bootstrap new node?that means
>>> we don't need to remove node and rejoin?
>>> Could anyone please advise?
>>>
>>> Thanks,
>>> Peng Xiao
>>>
>>>
>>>
>>>
>>>


Re: replace dead node vs remove node

2018-03-22 Thread Anthony Grasso
Hi Peng,

Correct, you would want to repair in either case.
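
i.e. once the replacement is up and joined, something along these lines 
(keyspace name is a placeholder; whether you use -full or -pr depends on 
your usual repair strategy and version):

# On the replaced/new node, after it has joined the cluster:
nodetool repair -full my_keyspace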

Regards,
Anthony


On Fri, 23 Mar 2018 at 14:09, Peng Xiao <2535...@qq.com> wrote:

> Hi Anthony,
>
> there is a problem with replacing dead node as per the blog,if the
> replacement process takes longer than max_hint_window_in_ms,we must run
> repair to make the replaced node consistent again, since it missed ongoing
> writes during bootstrapping.but for a great cluster,repair is a painful
> process.
>
> Thanks,
> Peng Xiao
>
>
>
> -- Original message --
> *From:* "Anthony Grasso"<anthony.gra...@gmail.com>;
> *Sent:* Thursday, 22 March 2018, 7:13 PM
> *To:* "user"<user@cassandra.apache.org>;
> *Subject:* Re: replace dead node vs remove node
>
> Hi Peng,
>
> Depending on the hardware failure you can do one of two things:
>
> 1. If the disks are intact and uncorrupted you could just use the disks
> with the current data on them in the new node. Even if the IP address
> changes for the new node that is fine. In that case all you need to do is
> run repair on the new node. The repair will fix any writes the node missed
> while it was down. This process is similar to the scenario in this blog
> post:
> http://thelastpickle.com/blog/2018/02/21/replace-node-without-bootstrapping.html
>
> 2. If the disks are inaccessible or corrupted, then use the method as
> described in the blogpost you linked to. The operation is similar to
> bootstrapping a new node. There is no need to perform any other remove or
> join operation on the failed or new nodes. As per the blog post, you
> definitely want to run repair on the new node as soon as it joins the
> cluster. In this case here, the data on the failed node is effectively lost
> and replaced with data from other nodes in the cluster.
>
> Hope this helps.
>
> Regards,
> Anthony
>
>
> On Thu, 22 Mar 2018 at 20:52, Peng Xiao <2535...@qq.com> wrote:
>
>> Dear All,
>>
>> when one node failure with hardware errors,it will be in DN status in the
>> cluster.Then if we are not able to handle this error in three hours(max
>> hints window),we will loss data,right?we have to run repair to keep the
>> consistency.
>> And as per
>> https://blog.alteroot.org/articles/2014-03-12/replace-a-dead-node-in-cassandra.html,we
>> can replace this dead node,is it the same as bootstrap new node?that means
>> we don't need to remove node and rejoin?
>> Could anyone please advise?
>>
>> Thanks,
>> Peng Xiao
>>
>>
>>
>>
>>


Re: replace dead node vs remove node

2018-03-22 Thread Jonathan Haddad
Ah sorry - I misread the original post - for some reason I had it in my
head the question was about bootstrap.

Carry on.

On Thu, Mar 22, 2018 at 8:35 PM Jonathan Haddad <j...@jonhaddad.com> wrote:

> Under normal circumstances this is not true.
>
> Take a look at org.apache.cassandra.service.StorageProxy#performWrite, it
> grabs both the natural endpoints and the pending endpoints (new nodes).
> They're eventually passed through
> to 
> org.apache.cassandra.locator.AbstractReplicationStrategy#getWriteResponseHandler,
> which keeps track of both the current endpoints and the pending ones.
> Later, it gets to the actual work:
>
> performer.apply(mutation, Iterables.concat(naturalEndpoints, 
> pendingEndpoints), responseHandler, localDataCenter, consistency_level);
>
> The signature of this method is:
>
> public interface WritePerformer
> {
> public void apply(IMutation mutation,
>   Iterable targets,
>   AbstractWriteResponseHandler responseHandler,
>   String localDataCenter,
>   ConsistencyLevel consistencyLevel) throws 
> OverloadedException;
> }
>
> Notice the targets?  That's the list of all current owners and pending
> owners.  The list is a concatenation of the natural endpoints and the
> pending ones.
>
> Pending owners are listed in org.apache.cassandra.locator.TokenMetadata
>
> // this is a cache of the calculation from {tokenToEndpointMap, 
> bootstrapTokens, leavingEndpoints}
> private final ConcurrentMap<String, PendingRangeMaps> pendingRanges = new 
> ConcurrentHashMap<String, PendingRangeMaps>();
>
>
> TL;DR: mutations are sent to nodes being bootstrapped.
>
> Jon
>
>
> On Thu, Mar 22, 2018 at 8:09 PM Peng Xiao <2535...@qq.com> wrote:
>
>> Hi Anthony,
>>
>> there is a problem with replacing dead node as per the blog,if the
>> replacement process takes longer than max_hint_window_in_ms,we must run
>> repair to make the replaced node consistent again, since it missed ongoing
>> writes during bootstrapping.but for a great cluster,repair is a painful
>> process.
>>
>> Thanks,
>> Peng Xiao
>>
>>
>>
>> -- Original message --
>> *From:* "Anthony Grasso"<anthony.gra...@gmail.com>;
>> *Sent:* Thursday, 22 March 2018, 7:13 PM
>> *To:* "user"<user@cassandra.apache.org>;
>> *Subject:* Re: replace dead node vs remove node
>>
>> Hi Peng,
>>
>> Depending on the hardware failure you can do one of two things:
>>
>> 1. If the disks are intact and uncorrupted you could just use the disks
>> with the current data on them in the new node. Even if the IP address
>> changes for the new node that is fine. In that case all you need to do is
>> run repair on the new node. The repair will fix any writes the node missed
>> while it was down. This process is similar to the scenario in this blog
>> post:
>> http://thelastpickle.com/blog/2018/02/21/replace-node-without-bootstrapping.html
>>
>> 2. If the disks are inaccessible or corrupted, then use the method as
>> described in the blogpost you linked to. The operation is similar to
>> bootstrapping a new node. There is no need to perform any other remove or
>> join operation on the failed or new nodes. As per the blog post, you
>> definitely want to run repair on the new node as soon as it joins the
>> cluster. In this case here, the data on the failed node is effectively lost
>> and replaced with data from other nodes in the cluster.
>>
>> Hope this helps.
>>
>> Regards,
>> Anthony
>>
>>
>> On Thu, 22 Mar 2018 at 20:52, Peng Xiao <2535...@qq.com> wrote:
>>
>>> Dear All,
>>>
>>> when one node failure with hardware errors,it will be in DN status in
>>> the cluster.Then if we are not able to handle this error in three hours(max
>>> hints window),we will loss data,right?we have to run repair to keep the
>>> consistency.
>>> And as per
>>> https://blog.alteroot.org/articles/2014-03-12/replace-a-dead-node-in-cassandra.html,we
>>> can replace this dead node,is it the same as bootstrap new node?that means
>>> we don't need to remove node and rejoin?
>>> Could anyone please advise?
>>>
>>> Thanks,
>>> Peng Xiao
>>>
>>>
>>>
>>>
>>>


Re: replace dead node vs remove node

2018-03-22 Thread Jonathan Haddad
Under normal circumstances this is not true.

Take a look at org.apache.cassandra.service.StorageProxy#performWrite, it
grabs both the natural endpoints and the pending endpoints (new nodes).
They're eventually passed through
to 
org.apache.cassandra.locator.AbstractReplicationStrategy#getWriteResponseHandler,
which keeps track of both the current endpoints and the pending ones.
Later, it gets to the actual work:

performer.apply(mutation, Iterables.concat(naturalEndpoints,
pendingEndpoints), responseHandler, localDataCenter,
consistency_level);

The signature of this method is:

public interface WritePerformer
{
public void apply(IMutation mutation,
  Iterable targets,
  AbstractWriteResponseHandler responseHandler,
  String localDataCenter,
  ConsistencyLevel consistencyLevel) throws
OverloadedException;
}

Notice the targets?  That's the list of all current owners and pending
owners.  The list is a concatenation of the natural endpoints and the
pending ones.

Pending owners are listed in org.apache.cassandra.locator.TokenMetadata

// this is a cache of the calculation from {tokenToEndpointMap,
bootstrapTokens, leavingEndpoints}
private final ConcurrentMap<String, PendingRangeMaps> pendingRanges =
new ConcurrentHashMap<String, PendingRangeMaps>();


TL;DR: mutations are sent to nodes being bootstrapped.
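As a purely illustrative sketch (plain JDK with hypothetical addresses, not Cassandra's actual types), the effect of that concatenation is that a bootstrapping node ends up among the write targets:

import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class PendingEndpointsSketch {
    public static void main(String[] args) {
        // Current owners of the token range (hypothetical addresses).
        List<String> naturalEndpoints = Arrays.asList("10.0.0.1", "10.0.0.2", "10.0.0.3");
        // Node that is still bootstrapping into the range (hypothetical).
        List<String> pendingEndpoints = Arrays.asList("10.0.0.4");

        // Same idea as Iterables.concat(naturalEndpoints, pendingEndpoints):
        // the write targets are the union of current and pending owners.
        List<String> targets = Stream.concat(naturalEndpoints.stream(),
                                             pendingEndpoints.stream())
                                     .collect(Collectors.toList());

        // Prints all four addresses, i.e. the bootstrapping node also
        // receives the mutation even though it does not own the range yet.
        System.out.println("Write targets: " + targets);
    }
}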

Jon


On Thu, Mar 22, 2018 at 8:09 PM Peng Xiao <2535...@qq.com> wrote:

> Hi Anthony,
>
> there is a problem with replacing dead node as per the blog,if the
> replacement process takes longer than max_hint_window_in_ms,we must run
> repair to make the replaced node consistent again, since it missed ongoing
> writes during bootstrapping.but for a great cluster,repair is a painful
> process.
>
> Thanks,
> Peng Xiao
>
>
>
> -- Original Message --
> *From:* "Anthony Grasso"<anthony.gra...@gmail.com>;
> *Sent:* Thursday, 22 March 2018, 7:13 PM
> *To:* "user"<user@cassandra.apache.org>;
> *Subject:* Re: replace dead node vs remove node
>
> Hi Peng,
>
> Depending on the hardware failure you can do one of two things:
>
> 1. If the disks are intact and uncorrupted you could just use the disks
> with the current data on them in the new node. Even if the IP address
> changes for the new node that is fine. In that case all you need to do is
> run repair on the new node. The repair will fix any writes the node missed
> while it was down. This process is similar to the scenario in this blog
> post:
> http://thelastpickle.com/blog/2018/02/21/replace-node-without-bootstrapping.html
>
> 2. If the disks are inaccessible or corrupted, then use the method as
> described in the blogpost you linked to. The operation is similar to
> bootstrapping a new node. There is no need to perform any other remove or
> join operation on the failed or new nodes. As per the blog post, you
> definitely want to run repair on the new node as soon as it joins the
> cluster. In this case here, the data on the failed node is effectively lost
> and replaced with data from other nodes in the cluster.
>
> Hope this helps.
>
> Regards,
> Anthony
>
>
> On Thu, 22 Mar 2018 at 20:52, Peng Xiao <2535...@qq.com> wrote:
>
>> Dear All,
>>
>> when one node failure with hardware errors,it will be in DN status in the
>> cluster.Then if we are not able to handle this error in three hours(max
>> hints window),we will loss data,right?we have to run repair to keep the
>> consistency.
>> And as per
>> https://blog.alteroot.org/articles/2014-03-12/replace-a-dead-node-in-cassandra.html,we
>> can replace this dead node,is it the same as bootstrap new node?that means
>> we don't need to remove node and rejoin?
>> Could anyone please advise?
>>
>> Thanks,
>> Peng Xiao
>>
>>
>>
>>
>>


Re: Reply: replace dead node vs remove node

2018-03-22 Thread Jeff Jirsa
Subrange repair of only the neighbors is sufficient

Break the range covering the dead node into ~100 splits and repair those splits 
individually in sequence. You don’t have to repair the whole range all at once
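For illustration only, a minimal sketch of that splitting, assuming the Murmur3Partitioner ring and a hypothetical keyspace name and token range; it simply prints the subrange repair commands to run one after another:

import java.math.BigInteger;

public class SubrangeRepairSplits {
    public static void main(String[] args) {
        // Hypothetical token range to repair (Murmur3 tokens are 64-bit signed).
        BigInteger start = new BigInteger("-9223372036854775808");
        BigInteger end   = new BigInteger("-6148914691236517206");
        int splits = 100;

        BigInteger step = end.subtract(start).divide(BigInteger.valueOf(splits));

        BigInteger st = start;
        for (int i = 0; i < splits; i++) {
            // The last split absorbs the division remainder so the whole range is covered.
            BigInteger et = (i == splits - 1) ? end : st.add(step);
            System.out.printf("nodetool repair -st %s -et %s my_keyspace%n", st, et);
            st = et;
        }
    }
}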



-- 
Jeff Jirsa


> On Mar 22, 2018, at 8:08 PM, Peng Xiao <2535...@qq.com> wrote:
> 
> Hi Anthony,
> 
> there is a problem with replacing dead node as per the blog,if the 
> replacement process takes longer than max_hint_window_in_ms,we must run 
> repair to make the replaced node consistent again, since it missed ongoing 
> writes during bootstrapping.but for a great cluster,repair is a painful 
> process.
>  
> Thanks,
> Peng Xiao
> 
> 
> 
> -- Original Message --
> From: "Anthony Grasso"<anthony.gra...@gmail.com>;
> Sent: Thursday, 22 March 2018, 7:13 PM
> To: "user"<user@cassandra.apache.org>;
> Subject: Re: replace dead node vs remove node
> 
> Hi Peng,
> 
> Depending on the hardware failure you can do one of two things:
> 
> 1. If the disks are intact and uncorrupted you could just use the disks with 
> the current data on them in the new node. Even if the IP address changes for 
> the new node that is fine. In that case all you need to do is run repair on 
> the new node. The repair will fix any writes the node missed while it was 
> down. This process is similar to the scenario in this blog post: 
> http://thelastpickle.com/blog/2018/02/21/replace-node-without-bootstrapping.html
> 
> 2. If the disks are inaccessible or corrupted, then use the method as 
> described in the blogpost you linked to. The operation is similar to 
> bootstrapping a new node. There is no need to perform any other remove or 
> join operation on the failed or new nodes. As per the blog post, you 
> definitely want to run repair on the new node as soon as it joins the 
> cluster. In this case here, the data on the failed node is effectively lost 
> and replaced with data from other nodes in the cluster.
> 
> Hope this helps.
> 
> Regards,
> Anthony
> 
> 
>> On Thu, 22 Mar 2018 at 20:52, Peng Xiao <2535...@qq.com> wrote:
>> Dear All,
>> 
>> when one node failure with hardware errors,it will be in DN status in the 
>> cluster.Then if we are not able to handle this error in three hours(max 
>> hints window),we will loss data,right?we have to run repair to keep the 
>> consistency.
>> And as per 
>> https://blog.alteroot.org/articles/2014-03-12/replace-a-dead-node-in-cassandra.html,we
>>  can replace this dead node,is it the same as bootstrap new node?that means 
>> we don't need to remove node and rejoin?
>> Could anyone please advise?
>> 
>> Thanks,
>> Peng Xiao
>> 
>>  
>> 
>> 


Reply: replace dead node vs remove node

2018-03-22 Thread Peng Xiao
Hi Anthony,


There is a problem with replacing a dead node as per the blog: if the replacement 
process takes longer than max_hint_window_in_ms, we must run repair to make the 
replaced node consistent again, since it missed ongoing writes during 
bootstrapping. But for a large cluster, repair is a painful process.
 
Thanks,
Peng Xiao






-- Original Message --
From: "Anthony Grasso"<anthony.gra...@gmail.com>;
Sent: Thursday, 22 March 2018, 7:13 PM
To: "user"<user@cassandra.apache.org>;
Subject: Re: replace dead node vs remove node



Hi Peng,

Depending on the hardware failure you can do one of two things:



1. If the disks are intact and uncorrupted you could just use the disks with 
the current data on them in the new node. Even if the IP address changes for 
the new node that is fine. In that case all you need to do is run repair on the 
new node. The repair will fix any writes the node missed while it was down. 
This process is similar to the scenario in this blog post: 
http://thelastpickle.com/blog/2018/02/21/replace-node-without-bootstrapping.html


2. If the disks are inaccessible or corrupted, then use the method as described 
in the blogpost you linked to. The operation is similar to bootstrapping a new 
node. There is no need to perform any other remove or join operation on the 
failed or new nodes. As per the blog post, you definitely want to run repair on 
the new node as soon as it joins the cluster. In this case here, the data on 
the failed node is effectively lost and replaced with data from other nodes in 
the cluster.


Hope this helps.


Regards,
Anthony


On Thu, 22 Mar 2018 at 20:52, Peng Xiao <2535...@qq.com> wrote:

Dear All,


when one node failure with hardware errors,it will be in DN status in the 
cluster.Then if we are not able to handle this error in three hours(max hints 
window),we will loss data,right?we have to run repair to keep the consistency.
And as per 
https://blog.alteroot.org/articles/2014-03-12/replace-a-dead-node-in-cassandra.html,we
 can replace this dead node,is it the same as bootstrap new node?that means we 
don't need to remove node and rejoin?
Could anyone please advise?


Thanks,
Peng Xiao

Re: replace dead node vs remove node

2018-03-22 Thread Anthony Grasso
Hi Peng,

Depending on the hardware failure you can do one of two things:

1. If the disks are intact and uncorrupted you could just use the disks
with the current data on them in the new node. Even if the IP address
changes for the new node that is fine. In that case all you need to do is
run repair on the new node. The repair will fix any writes the node missed
while it was down. This process is similar to the scenario in this blog
post:
http://thelastpickle.com/blog/2018/02/21/replace-node-without-bootstrapping.html

2. If the disks are inaccessible or corrupted, then use the method as
described in the blogpost you linked to. The operation is similar to
bootstrapping a new node. There is no need to perform any other remove or
join operation on the failed or new nodes. As per the blog post, you
definitely want to run repair on the new node as soon as it joins the
cluster. In this case here, the data on the failed node is effectively lost
and replaced with data from other nodes in the cluster.

Hope this helps.

Regards,
Anthony


On Thu, 22 Mar 2018 at 20:52, Peng Xiao <2535...@qq.com> wrote:

> Dear All,
>
> when one node failure with hardware errors,it will be in DN status in the
> cluster.Then if we are not able to handle this error in three hours(max
> hints window),we will loss data,right?we have to run repair to keep the
> consistency.
> And as per
> https://blog.alteroot.org/articles/2014-03-12/replace-a-dead-node-in-cassandra.html,we
> can replace this dead node,is it the same as bootstrap new node?that means
> we don't need to remove node and rejoin?
> Could anyone please advise?
>
> Thanks,
> Peng Xiao
>
>
>
>
>


replace dead node vs remove node

2018-03-22 Thread Peng Xiao
Dear All,


When one node fails with hardware errors, it will be in DN status in the 
cluster. Then, if we are not able to handle this error within three hours (the max 
hints window), we will lose data, right? We have to run repair to keep the data consistent.
And as per 
https://blog.alteroot.org/articles/2014-03-12/replace-a-dead-node-in-cassandra.html,
we can replace this dead node. Is it the same as bootstrapping a new node? Does that mean we 
don't need to remove the node and rejoin?
Could anyone please advise?


Thanks,
Peng Xiao

RE: system.size_estimates - safe to remove sstables?

2018-03-12 Thread Kunal Gangakhedkar
No, this is a different cluster.

Kunal

On 13-Mar-2018 6:27 AM, "Kenneth Brotman" <kenbrot...@yahoo.com.invalid>
wrote:

Kunal,



Is  this the GCE cluster you are speaking of in the “Adding new DC?” thread?



Kenneth Brotman



*From:* Kunal Gangakhedkar [mailto:kgangakhed...@gmail.com]
*Sent:* Sunday, March 11, 2018 2:18 PM
*To:* user@cassandra.apache.org
*Subject:* Re: system.size_estimates - safe to remove sstables?



Finally, got a chance to work on it over the weekend.

It worked as advertised. :)



Thanks a lot, Chris.


Kunal



On 8 March 2018 at 10:47, Kunal Gangakhedkar <kgangakhed...@gmail.com>
wrote:

Thanks a lot, Chris.



Will try it today/tomorrow and update here.



Thanks,

Kunal



On 7 March 2018 at 00:25, Chris Lohfink <clohf...@apple.com> wrote:

While its off you can delete the files in the directory yeah



Chris





On Mar 6, 2018, at 2:35 AM, Kunal Gangakhedkar <kgangakhed...@gmail.com>
wrote:



Hi Chris,



I checked for snapshots and backups - none found.

Also, we're not using opscenter, hadoop or spark or any such tool.



So, do you think we can just remove the cf and restart the service?



Thanks,

Kunal



On 5 March 2018 at 21:52, Chris Lohfink <clohf...@apple.com> wrote:

Any chance space used by snapshots? What files exist there that are taking
up space?

> On Mar 5, 2018, at 1:02 AM, Kunal Gangakhedkar <kgangakhed...@gmail.com>
wrote:
>

> Hi all,
>
> I have a 2-node cluster running cassandra 2.1.18.
> One of the nodes has run out of disk space and died - almost all of it
shows up as occupied by size_estimates CF.
> Out of 296GiB, 288GiB shows up as consumed by size_estimates in 'du -sh'
output.
>
> This is while the other node is chugging along - shows only 25MiB
consumed by size_estimates (du -sh output).
>
> Any idea why this descripancy?
> Is it safe to remove the size_estimates sstables from the affected node
and restart the service?
>
> Thanks,
> Kunal

-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
<user-unsubscribe@cassandra.apache.org>
For additional commands, e-mail: user-h...@cassandra.apache.org


RE: system.size_estimates - safe to remove sstables?

2018-03-12 Thread Kenneth Brotman
Kunal,

 

Is  this the GCE cluster you are speaking of in the “Adding new DC?” thread?

 

Kenneth Brotman

 

From: Kunal Gangakhedkar [mailto:kgangakhed...@gmail.com] 
Sent: Sunday, March 11, 2018 2:18 PM
To: user@cassandra.apache.org
Subject: Re: system.size_estimates - safe to remove sstables?

 

Finally, got a chance to work on it over the weekend.

It worked as advertised. :)

 

Thanks a lot, Chris.




Kunal

 

On 8 March 2018 at 10:47, Kunal Gangakhedkar <kgangakhed...@gmail.com> wrote:

Thanks a lot, Chris.

 

Will try it today/tomorrow and update here.

 

Thanks,


Kunal

 

On 7 March 2018 at 00:25, Chris Lohfink <clohf...@apple.com> wrote:

While its off you can delete the files in the directory yeah

 

Chris

 





On Mar 6, 2018, at 2:35 AM, Kunal Gangakhedkar <kgangakhed...@gmail.com> wrote:

 

Hi Chris,

 

I checked for snapshots and backups - none found.

Also, we're not using opscenter, hadoop or spark or any such tool.

 

So, do you think we can just remove the cf and restart the service?

 

Thanks,


Kunal

 

On 5 March 2018 at 21:52, Chris Lohfink <clohf...@apple.com> wrote:

Any chance space used by snapshots? What files exist there that are taking up 
space?

> On Mar 5, 2018, at 1:02 AM, Kunal Gangakhedkar <kgangakhed...@gmail.com> 
> wrote:
>

> Hi all,
>
> I have a 2-node cluster running cassandra 2.1.18.
> One of the nodes has run out of disk space and died - almost all of it shows 
> up as occupied by size_estimates CF.
> Out of 296GiB, 288GiB shows up as consumed by size_estimates in 'du -sh' 
> output.
>
> This is while the other node is chugging along - shows only 25MiB consumed by 
> size_estimates (du -sh output).
>
> Any idea why this descripancy?
> Is it safe to remove the size_estimates sstables from the affected node and 
> restart the service?
>
> Thanks,
> Kunal



-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org 
<mailto:user-unsubscribe@cassandra.apache.org> 
For additional commands, e-mail: user-h...@cassandra.apache.org

 

 

 

 



Re: system.size_estimates - safe to remove sstables?

2018-03-11 Thread Kunal Gangakhedkar
Finally, got a chance to work on it over the weekend.
It worked as advertised. :)

Thanks a lot, Chris.

Kunal

On 8 March 2018 at 10:47, Kunal Gangakhedkar <kgangakhed...@gmail.com>
wrote:

> Thanks a lot, Chris.
>
> Will try it today/tomorrow and update here.
>
> Thanks,
> Kunal
>
> On 7 March 2018 at 00:25, Chris Lohfink <clohf...@apple.com> wrote:
>
>> While its off you can delete the files in the directory yeah
>>
>> Chris
>>
>>
>> On Mar 6, 2018, at 2:35 AM, Kunal Gangakhedkar <kgangakhed...@gmail.com>
>> wrote:
>>
>> Hi Chris,
>>
>> I checked for snapshots and backups - none found.
>> Also, we're not using opscenter, hadoop or spark or any such tool.
>>
>> So, do you think we can just remove the cf and restart the service?
>>
>> Thanks,
>> Kunal
>>
>> On 5 March 2018 at 21:52, Chris Lohfink <clohf...@apple.com> wrote:
>>
>>> Any chance space used by snapshots? What files exist there that are
>>> taking up space?
>>>
>>> > On Mar 5, 2018, at 1:02 AM, Kunal Gangakhedkar <
>>> kgangakhed...@gmail.com> wrote:
>>> >
>>> > Hi all,
>>> >
>>> > I have a 2-node cluster running cassandra 2.1.18.
>>> > One of the nodes has run out of disk space and died - almost all of it
>>> shows up as occupied by size_estimates CF.
>>> > Out of 296GiB, 288GiB shows up as consumed by size_estimates in 'du
>>> -sh' output.
>>> >
>>> > This is while the other node is chugging along - shows only 25MiB
>>> consumed by size_estimates (du -sh output).
>>> >
>>> > Any idea why this descripancy?
>>> > Is it safe to remove the size_estimates sstables from the affected
>>> node and restart the service?
>>> >
>>> > Thanks,
>>> > Kunal
>>>
>>>
>>> -
>>> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
>>> For additional commands, e-mail: user-h...@cassandra.apache.org
>>>
>>>
>>
>>
>


Re: system.size_estimates - safe to remove sstables?

2018-03-07 Thread Kunal Gangakhedkar
Thanks a lot, Chris.

Will try it today/tomorrow and update here.

Thanks,
Kunal

On 7 March 2018 at 00:25, Chris Lohfink <clohf...@apple.com> wrote:

> While its off you can delete the files in the directory yeah
>
> Chris
>
>
> On Mar 6, 2018, at 2:35 AM, Kunal Gangakhedkar <kgangakhed...@gmail.com>
> wrote:
>
> Hi Chris,
>
> I checked for snapshots and backups - none found.
> Also, we're not using opscenter, hadoop or spark or any such tool.
>
> So, do you think we can just remove the cf and restart the service?
>
> Thanks,
> Kunal
>
> On 5 March 2018 at 21:52, Chris Lohfink <clohf...@apple.com> wrote:
>
>> Any chance space used by snapshots? What files exist there that are
>> taking up space?
>>
>> > On Mar 5, 2018, at 1:02 AM, Kunal Gangakhedkar <kgangakhed...@gmail.com>
>> wrote:
>> >
>> > Hi all,
>> >
>> > I have a 2-node cluster running cassandra 2.1.18.
>> > One of the nodes has run out of disk space and died - almost all of it
>> shows up as occupied by size_estimates CF.
>> > Out of 296GiB, 288GiB shows up as consumed by size_estimates in 'du
>> -sh' output.
>> >
>> > This is while the other node is chugging along - shows only 25MiB
>> consumed by size_estimates (du -sh output).
>> >
>> > Any idea why this descripancy?
>> > Is it safe to remove the size_estimates sstables from the affected node
>> and restart the service?
>> >
>> > Thanks,
>> > Kunal
>>
>>
>> -
>> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
>> For additional commands, e-mail: user-h...@cassandra.apache.org
>>
>>
>
>


Re: system.size_estimates - safe to remove sstables?

2018-03-06 Thread Chris Lohfink
While it's off you can delete the files in the directory, yeah.

Chris

> On Mar 6, 2018, at 2:35 AM, Kunal Gangakhedkar <kgangakhed...@gmail.com> 
> wrote:
> 
> Hi Chris,
> 
> I checked for snapshots and backups - none found.
> Also, we're not using opscenter, hadoop or spark or any such tool.
> 
> So, do you think we can just remove the cf and restart the service?
> 
> Thanks,
> Kunal
> 
> On 5 March 2018 at 21:52, Chris Lohfink <clohf...@apple.com 
> <mailto:clohf...@apple.com>> wrote:
> Any chance space used by snapshots? What files exist there that are taking up 
> space?
> 
> > On Mar 5, 2018, at 1:02 AM, Kunal Gangakhedkar <kgangakhed...@gmail.com 
> > <mailto:kgangakhed...@gmail.com>> wrote:
> >
> > Hi all,
> >
> > I have a 2-node cluster running cassandra 2.1.18.
> > One of the nodes has run out of disk space and died - almost all of it 
> > shows up as occupied by size_estimates CF.
> > Out of 296GiB, 288GiB shows up as consumed by size_estimates in 'du -sh' 
> > output.
> >
> > This is while the other node is chugging along - shows only 25MiB consumed 
> > by size_estimates (du -sh output).
> >
> > Any idea why this descripancy?
> > Is it safe to remove the size_estimates sstables from the affected node and 
> > restart the service?
> >
> > Thanks,
> > Kunal
> 
> 
> -
> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org 
> <mailto:user-unsubscr...@cassandra.apache.org>
> For additional commands, e-mail: user-h...@cassandra.apache.org 
> <mailto:user-h...@cassandra.apache.org>
> 
> 



Re: system.size_estimates - safe to remove sstables?

2018-03-06 Thread Kunal Gangakhedkar
Hi Chris,

I checked for snapshots and backups - none found.
Also, we're not using opscenter, hadoop or spark or any such tool.

So, do you think we can just remove the cf and restart the service?

Thanks,
Kunal

On 5 March 2018 at 21:52, Chris Lohfink <clohf...@apple.com> wrote:

> Any chance space used by snapshots? What files exist there that are taking
> up space?
>
> > On Mar 5, 2018, at 1:02 AM, Kunal Gangakhedkar <kgangakhed...@gmail.com>
> wrote:
> >
> > Hi all,
> >
> > I have a 2-node cluster running cassandra 2.1.18.
> > One of the nodes has run out of disk space and died - almost all of it
> shows up as occupied by size_estimates CF.
> > Out of 296GiB, 288GiB shows up as consumed by size_estimates in 'du -sh'
> output.
> >
> > This is while the other node is chugging along - shows only 25MiB
> consumed by size_estimates (du -sh output).
> >
> > Any idea why this descripancy?
> > Is it safe to remove the size_estimates sstables from the affected node
> and restart the service?
> >
> > Thanks,
> > Kunal
>
>
> -
> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: user-h...@cassandra.apache.org
>
>


Re: system.size_estimates - safe to remove sstables?

2018-03-05 Thread Chris Lohfink
Any chance space used by snapshots? What files exist there that are taking up 
space?

> On Mar 5, 2018, at 1:02 AM, Kunal Gangakhedkar <kgangakhed...@gmail.com> 
> wrote:
> 
> Hi all,
> 
> I have a 2-node cluster running cassandra 2.1.18.
> One of the nodes has run out of disk space and died - almost all of it shows 
> up as occupied by size_estimates CF.
> Out of 296GiB, 288GiB shows up as consumed by size_estimates in 'du -sh' 
> output.
> 
> This is while the other node is chugging along - shows only 25MiB consumed by 
> size_estimates (du -sh output).
> 
> Any idea why this descripancy?
> Is it safe to remove the size_estimates sstables from the affected node and 
> restart the service?
> 
> Thanks,
> Kunal


-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org



Re: system.size_estimates - safe to remove sstables?

2018-03-05 Thread Chris Lohfink
Unless you are using Spark or Hadoop, nothing consumes the data in that table (unless 
you have tooling that may use it, like OpsCenter or something), so you're safe to 
just truncate it or rm the sstables while the instance is offline and you will be fine. If 
you do use that table, you can then do a `nodetool refreshsizeestimates` to 
re-add it, or just wait for it to re-run automatically (every 5 min).
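For what it's worth, the truncate route can be driven from any client; a minimal sketch with the DataStax Java driver (the contact point is hypothetical, and driver versions differ, so treat this as an outline):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class TruncateSizeEstimates {
    public static void main(String[] args) {
        // Connect to the affected node directly (hypothetical address).
        try (Cluster cluster = Cluster.builder().addContactPoint("10.0.0.1").build();
             Session session = cluster.connect()) {
            // size_estimates only holds derived estimates, so dropping its
            // contents just forces them to be recalculated later.
            session.execute("TRUNCATE system.size_estimates");
        }
    }
}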

Chris

> On Mar 5, 2018, at 1:02 AM, Kunal Gangakhedkar <kgangakhed...@gmail.com> 
> wrote:
> 
> Hi all,
> 
> I have a 2-node cluster running cassandra 2.1.18.
> One of the nodes has run out of disk space and died - almost all of it shows 
> up as occupied by size_estimates CF.
> Out of 296GiB, 288GiB shows up as consumed by size_estimates in 'du -sh' 
> output.
> 
> This is while the other node is chugging along - shows only 25MiB consumed by 
> size_estimates (du -sh output).
> 
> Any idea why this descripancy?
> Is it safe to remove the size_estimates sstables from the affected node and 
> restart the service?
> 
> Thanks,
> Kunal


-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org



system.size_estimates - safe to remove sstables?

2018-03-04 Thread Kunal Gangakhedkar
Hi all,

I have a 2-node cluster running cassandra 2.1.18.
One of the nodes has run out of disk space and died - almost all of it
shows up as occupied by size_estimates CF.
Out of 296GiB, 288GiB shows up as consumed by size_estimates in 'du -sh'
output.

This is while the other node is chugging along - shows only 25MiB consumed
by size_estimates (du -sh output).

Any idea why there is this discrepancy?
Is it safe to remove the size_estimates sstables from the affected node and
restart the service?

Thanks,
Kunal


RE: How to remove 'compact storage' attribute?

2016-06-12 Thread Lu, Boying
Thanks a lot for your suggestion.

Actually, what we want to do is to use 
https://github.com/Stratio/cassandra-lucene-index
to provide users with more powerful ‘searching/sorting’ functions on large data sets 
in our product.
But it doesn’t support ‘compact storage’ tables.

I wonder why ‘CREATE CUSTOM INDEX’ doesn’t support ‘compact storage’ 
tables?  My understanding is that
Cassandra uses the class provided by the user (through the ‘USING’ clause) to do 
the indexing, which should be unrelated to the ‘compact storage’ attribute.

Is there any way that I can register a hook with Cassandra that is invoked on 
all the related nodes when a record is upserted? We use Cassandra 2.1.x.

Thanks

Boying

From: Romain Hardouin [mailto:romainh...@yahoo.fr]
Sent: 8 June 2016, 20:12
To: user@cassandra.apache.org
Subject: Re: How to remove 'compact storage' attribute?


Hi,

You can't yet, see https://issues.apache.org/jira/browse/CASSANDRA-10857
Note that secondary indexes don't scale. Be aware of their limitations.
If you want to change the data model of a CF, a Spark job can do the trick.

Best,

Romain

On Tuesday, 7 June 2016 at 10:51, "Lu, Boying" 
<boying...@emc.com<mailto:boying...@emc.com>> wrote:

Hi, All,

Since the Astyanax client has been EOL, we are considering to migrate to 
Datastax java client in our product.

One thing I notice is that the CFs created  by Astyanax have ‘compact storage’ 
attribute which prevent us from
using some new features provided by CQL such as secondary index.

Does anyone know how to remove this attribute? “ALTER TABLE” seems doesn’t work 
according to the CQL document.

Thanks

Boying




Re: How to remove 'compact storage' attribute?

2016-06-08 Thread Romain Hardouin
 
Hi,
You can't yet, see https://issues.apache.org/jira/browse/CASSANDRA-10857
Note that secondary indexes don't scale. Be aware of their limitations.
If you want to change the data model of a CF, a Spark job can do the trick.
Best,
Romain
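To make the "change the data model" suggestion concrete: the usual pattern is to create a new table without COMPACT STORAGE and copy the rows across. Spark scales better, but for a small CF a single-threaded sketch with the DataStax Java driver shows the same idea (the keyspace, table and column names below are made up):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;

public class CopyToNonCompactTable {
    public static void main(String[] args) {
        try (Cluster cluster = Cluster.builder().addContactPoint("10.0.0.1").build();
             Session session = cluster.connect("my_keyspace")) {
            // New table created without COMPACT STORAGE (hypothetical schema).
            session.execute("CREATE TABLE IF NOT EXISTS events_v2 ("
                    + "id text, seq text, payload text, PRIMARY KEY (id, seq))");

            // Page through the old compact-storage table and re-insert the rows.
            for (Row row : session.execute("SELECT id, seq, payload FROM events_v1")) {
                session.execute(
                        "INSERT INTO events_v2 (id, seq, payload) VALUES (?, ?, ?)",
                        row.getString("id"), row.getString("seq"), row.getString("payload"));
            }
        }
    }
}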

 On Tuesday, 7 June 2016 at 10:51, "Lu, Boying" <boying...@emc.com> wrote:

Hi, All,

Since the Astyanax client has been EOL, we are considering to migrate to 
Datastax java client in our product.

One thing I notice is that the CFs created by Astyanax have ‘compact storage’ 
attribute which prevent us from using some new features provided by CQL such as 
secondary index.

Does anyone know how to remove this attribute? “ALTER TABLE” seems doesn’t work 
according to the CQL document.

Thanks

Boying

  

How to remove 'compact storage' attribute?

2016-06-07 Thread Lu, Boying
Hi, All,

Since the Astyanax client has been EOL, we are considering to migrate to 
Datastax java client in our product.

One thing I notice is that the CFs created by Astyanax have a 'compact storage' 
attribute, which prevents us from
using some new features provided by CQL such as secondary indexes.

Does anyone know how to remove this attribute? "ALTER TABLE" doesn't seem to work 
according to the CQL documentation.

Thanks

Boying



Re: How to remove huge files with all expired data sooner?

2015-09-28 Thread Dongfeng Lu
Thanks, Erick, Ken, and Jeff.

Erick,

I thought about min_threshold. The document says it "Sets the minimum number of 
SSTables to trigger a minor compaction." I thought removing those large files 
would be considered a major compaction, and this parameter may not help. Am I 
wrong?

I also wondered what side effect it may have by lowering min_threshold value. 
Will there be more compactions? I understand it is a balance sometimes to 
either have multiple small compactions or a single big compaction. 

About your comment "never run nodetool compact": isn't that what Cassandra does when 
it finally compacts those 4 files? I don't really see the difference between 
what Cassandra does programmatically and what happens if I run it once every two weeks 
to reclaim the disk space.

Ken,

Interesting way to do it. I will think about it.

Jeff,

That would be an ideal solution. Actually I am planning to migrate to the 
latest 2.1 version, and hopefully it will be solved then.

Thanks again, everyone, for your responses.

Dongfeng 


 On Monday, September 28, 2015 10:36 AM, Jeff Jirsa 
<jeff.ji...@crowdstrike.com> wrote:
   

 There’s a seldom discussed parameter called:
unchecked_tombstone_compaction
The documentation describes the option as follows:

True enables more aggressive than normal tombstone compactions. A single 
SSTable tombstone compaction runs without checking the likelihood of success. 
Cassandra 2.0.9 and later.


You’d need to upgrade to newer than 2.0.9, but by doing so, and enabling 
unchecked_tombstone_compaction, you could encourage cassandra to compact just 
one single large sstable to purge tombstones.


From:  <erickramirezonl...@gmail.com> on behalf of Erick Ramirez
Reply-To:  "user@cassandra.apache.org"
Date:  Sunday, September 27, 2015 at 11:59 PM
To:  "user@cassandra.apache.org", Dongfeng Lu
Subject:  Re: How to remove huge files with all expired data sooner?

Hello,
You should never run `nodetool compact` since this will result in a massive 
SSTable that will almost never get compacted out or take a very long time to 
get compacted out.
You are correct that there needs to be 4 similar-sized SSTables for them to get 
compacted. If you want the expired data to be deleted quicker, try lowering the 
STCS `min_threshold` to 3 or even 2. Good luck!

Cheers,
Erick

On Sat, Sep 26, 2015 at 4:40 AM, Dongfeng Lu <dlu66...@yahoo.com> wrote:

Hi I have a table where I set TTL to only 7 days for all records and we keep 
pumping records in every day. In general, I would expect all data files for 
that table to have timestamps less than, say 8 or 9 days old, giving the system 
some time to work its magic. However, I see some files more than 9 days old 
occationally. Last Friday, I saw 4 large files, each about 10G in size, with 
timestamps about 5, 4, 3, 2 weeks old. Interestingly they are all gone this 
Monday, leaving 1 new file 9 GB in size.

The compaction strategy is SizeTieredCompactionStrategy, and I can understand 
why the above happened. It seems we have 10G of data every week and when 
SizeTieredCompactionStrategy works to create various tiers, it just happened 
the file size for the next tier is 10G, and all the data is packed into this 
huge file. Then it starts the next cycle. Another week goes by, and another 10G 
file is created. This process continues until the minimum number of files of 
the same size is reached, which I think is 4 by default. Then it started to 
compact this set of 4 10G files. At this time, all data in these 4 files have 
expired so we end up with nothing or much smaller file if there is still some 
records with TTL left.

I have many tables like this, and I'd like to reclaim those spaces sooner. What 
would be the best way to do it? Should I run "nodetool compact" when I see two 
large files that are 2 weeks old? Is there configuration parameters I can tune 
to achieve the same effect? I looked through all the CQL Compaction 
Subproperties for STCS, but I am not sure how they can help here. Any 
suggestion is welcome.

BTW, I am using Cassandra 2.0.6.




  

Re: How to remove huge files with all expired data sooner?

2015-09-28 Thread Robert Coli
On Sun, Sep 27, 2015 at 11:59 PM, Erick Ramirez 
wrote:

> You should never run `nodetool compact` since this will result in a
> massive SSTable that will almost never get compacted out or take a very
> long time to get compacted out.
>

Respectfully disagree. There are various cases where nodetool compact will
result in a small SSTable.

There are other cases where one might wish to major compact and then stop
the node and run sstablesplit.

I agree that in modern Cassandra, if one has not made an error, one should
rarely wish to run nodetool compact, but "never" is too strong.

=Rob


Re: How to remove huge files with all expired data sooner?

2015-09-28 Thread Jeff Jirsa
There’s a seldom discussed parameter called:

unchecked_tombstone_compaction

The documentation describes the option as follows:

True enables more aggressive than normal tombstone compactions. A single 
SSTable tombstone compaction runs without checking the likelihood of success. 
Cassandra 2.0.9 and later.

You’d need to upgrade to newer than 2.0.9, but by doing so, and enabling 
unchecked_tombstone_compaction, you could encourage cassandra to compact just 
one single large sstable to purge tombstones.
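As a rough sketch of what enabling that might look like once on a newer version (the contact point and keyspace/table names are placeholders; the same statement can of course be run straight from cqlsh):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class EnableUncheckedTombstoneCompaction {
    public static void main(String[] args) {
        try (Cluster cluster = Cluster.builder().addContactPoint("10.0.0.1").build();
             Session session = cluster.connect()) {
            // 'class' has to be restated when altering compaction options.
            session.execute(
                "ALTER TABLE my_keyspace.my_table WITH compaction = {"
                + " 'class': 'SizeTieredCompactionStrategy',"
                + " 'unchecked_tombstone_compaction': 'true' }");
        }
    }
}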



From:  <erickramirezonl...@gmail.com> on behalf of Erick Ramirez
Reply-To:  "user@cassandra.apache.org"
Date:  Sunday, September 27, 2015 at 11:59 PM
To:  "user@cassandra.apache.org", Dongfeng Lu
Subject:  Re: How to remove huge files with all expired data sooner?

Hello, 

You should never run `nodetool compact` since this will result in a massive 
SSTable that will almost never get compacted out or take a very long time to 
get compacted out.

You are correct that there needs to be 4 similar-sized SSTables for them to get 
compacted. If you want the expired data to be deleted quicker, try lowering the 
STCS `min_threshold` to 3 or even 2. Good luck!

Cheers,
Erick 


On Sat, Sep 26, 2015 at 4:40 AM, Dongfeng Lu <dlu66...@yahoo.com> wrote:
Hi I have a table where I set TTL to only 7 days for all records and we keep 
pumping records in every day. In general, I would expect all data files for 
that table to have timestamps less than, say 8 or 9 days old, giving the system 
some time to work its magic. However, I see some files more than 9 days old 
occationally. Last Friday, I saw 4 large files, each about 10G in size, with 
timestamps about 5, 4, 3, 2 weeks old. Interestingly they are all gone this 
Monday, leaving 1 new file 9 GB in size.

The compaction strategy is SizeTieredCompactionStrategy, and I can understand 
why the above happened. It seems we have 10G of data every week and when 
SizeTieredCompactionStrategy works to create various tiers, it just happened 
the file size for the next tier is 10G, and all the data is packed into this 
huge file. Then it starts the next cycle. Another week goes by, and another 10G 
file is created. This process continues until the minimum number of files of 
the same size is reached, which I think is 4 by default. Then it started to 
compact this set of 4 10G files. At this time, all data in these 4 files have 
expired so we end up with nothing or much smaller file if there is still some 
records with TTL left.

I have many tables like this, and I'd like to reclaim those spaces sooner. What 
would be the best way to do it? Should I run "nodetool compact" when I see two 
large files that are 2 weeks old? Is there configuration parameters I can tune 
to achieve the same effect? I looked through all the CQL Compaction 
Subproperties for STCS, but I am not sure how they can help here. Any 
suggestion is welcome.

BTW, I am using Cassandra 2.0.6.






Re: How to remove huge files with all expired data sooner?

2015-09-28 Thread Erick Ramirez
Hello,

You should never run `nodetool compact` since this will result in a massive
SSTable that will almost never get compacted out or take a very long time
to get compacted out.

You are correct that there needs to be 4 similar-sized SSTables for them to
get compacted. If you want the expired data to be deleted quicker, try
lowering the STCS `min_threshold` to 3 or even 2. Good luck!

Cheers,
Erick


On Sat, Sep 26, 2015 at 4:40 AM, Dongfeng Lu  wrote:

> Hi I have a table where I set TTL to only 7 days for all records and we
> keep pumping records in every day. In general, I would expect all data
> files for that table to have timestamps less than, say 8 or 9 days old,
> giving the system some time to work its magic. However, I see some files
> more than 9 days old occationally. Last Friday, I saw 4 large files, each
> about 10G in size, with timestamps about 5, 4, 3, 2 weeks old.
> Interestingly they are all gone this Monday, leaving 1 new file 9 GB in
> size.
>
> The compaction strategy is SizeTieredCompactionStrategy, and I can
> understand why the above happened. It seems we have 10G of data every week
> and when SizeTieredCompactionStrategy works to create various tiers, it
> just happened the file size for the next tier is 10G, and all the data is
> packed into this huge file. Then it starts the next cycle. Another week
> goes by, and another 10G file is created. This process continues until the
> minimum number of files of the same size is reached, which I think is 4 by
> default. Then it started to compact this set of 4 10G files. At this time,
> all data in these 4 files have expired so we end up with nothing or much
> smaller file if there is still some records with TTL left.
>
> I have many tables like this, and I'd like to reclaim those spaces sooner.
> What would be the best way to do it? Should I run "nodetool compact" when I
> see two large files that are 2 weeks old? Is there configuration parameters
> I can tune to achieve the same effect? I looked through all the CQL
> Compaction Subproperties for STCS, but I am not sure how they can help
> here. Any suggestion is welcome.
>
> BTW, I am using Cassandra 2.0.6.
>


Re: How to remove huge files with all expired data sooner?

2015-09-28 Thread Ken Hancock
On Mon, Sep 28, 2015 at 2:59 AM, Erick Ramirez  wrote:

> have many tables like this, and I'd like to reclaim those spaces sooner.
> What would be the best way to do it? Should I run "nodetool compact" when I
> see two large files that are 2 weeks old? Is there configuration parameters
> I can tune to achieve the same effect? I looked through all the CQL
> Compaction Subproperties for STCS, but I am not sure how they can help
> here. Any suggestion is welcome.


You can use the JMX org.apache.cassandra.db:type=StorageService
forceTableCompaction to compact a single table.

Last time this came up, Robert Coli also indicated he thought nodetool
cleanup would trigger the same thing, but I never got a chance to confirm
that as I'd already done something with forceTableCompaction.  If you have
the data and try a cleanup, please report back your findings.
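For anyone who would rather script that JMX call than click through jconsole, a bare-bones sketch with the standard javax.management API follows; the host, keyspace and table names are placeholders, and the exact operation name and signature vary between Cassandra versions, so verify them against the MBean on your cluster first:

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class ForceCompactionViaJmx {
    public static void main(String[] args) throws Exception {
        // Default Cassandra JMX port is 7199; the host is hypothetical.
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://10.0.0.1:7199/jmxrmi");
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection mbs = connector.getMBeanServerConnection();
            ObjectName storageService =
                    new ObjectName("org.apache.cassandra.db:type=StorageService");

            // Operation name and signature differ between Cassandra versions
            // (e.g. forceTableCompaction vs forceKeyspaceCompaction); check
            // the MBean first. Keyspace/table names here are examples only.
            mbs.invoke(storageService,
                       "forceTableCompaction",
                       new Object[] { "my_keyspace", new String[] { "my_table" } },
                       new String[] { String.class.getName(), String[].class.getName() });
        }
    }
}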


Re: Unable to remove dead node from cluster.

2015-09-25 Thread Nate McCall
A few other folks have reported issues with lingering dead nodes on large
clusters - Jason Brown *just* gave an excellent gossip presentation at the
summit regarding gossip optimizations for large clusters.

Gossip is in the process of being refactored (here's at least one of the
issues: https://issues.apache.org/jira/browse/CASSANDRA-9667), but it would
be worth opening an issue with as much information as you can provide to,
at the very least, have information available for others.

On Fri, Sep 25, 2015 at 7:08 AM, Jeff Jirsa <jeff.ji...@crowdstrike.com>
wrote:

> The stack trace is one similar to one I recall seeing recently, but don’t
> have in front of me. This is an outside chance that is not at all certain
> to be the case.
>
> For EACH of the hundreds of nodes in your cluster, I suggest you run
>
> nodetool status | egrep “(^UN|^DN)" | wc -l
>
> and count to see if every node really has every other node in its ring
> properly.
>
> I suspect, but am not at all sure, that you have inconsistencies you’re
> not yet aware of (for example, if you expect that you have 100 nodes in the
> cluster, I’m betting that the query above returns 99 on at least one of the
> nodes).  If this is the case, please reply so that you and I can submit a
> Jira and compare our stack traces and we can find the underlying root cause
> of this together.
>
> - Jeff
>
> From: Dikang Gu
> Reply-To: "user@cassandra.apache.org"
> Date: Thursday, September 24, 2015 at 9:10 PM
> To: cassandra
>
> Subject: Re: Unable to remove dead node from cluster.
>
> @Jeff, I just use jmx connect to one node, run the
> unsafeAssainateEndpoint, and pass in the "10.210.165.55" ip address.
>
> Yes, we have hundreds of other nodes in the nodetool status output as well.
>
> On Tue, Sep 22, 2015 at 11:31 PM, Jeff Jirsa <jeff.ji...@crowdstrike.com>
> wrote:
>
>> When you run unsafeAssassinateEndpoint, to which host are you connected,
>> and what argument are you passing?
>>
>> Are there other nodes in the ring that you’re not including in the
>> ‘nodetool status’ output?
>>
>>
>> From: Dikang Gu
>> Reply-To: "user@cassandra.apache.org"
>> Date: Tuesday, September 22, 2015 at 10:09 PM
>> To: cassandra
>> Cc: "d...@cassandra.apache.org"
>> Subject: Re: Unable to remove dead node from cluster.
>>
>> ping.
>>
>> On Mon, Sep 21, 2015 at 11:51 AM, Dikang Gu <dikan...@gmail.com> wrote:
>>
>>> I have tried all of them, neither of them worked.
>>> 1. decommission: the host had hardware issue, and I can not connect to
>>> it.
>>> 2. remove, there is not HostID, so the removenode did not work.
>>> 3. unsafeAssassinateEndpoint, it will throw NPE as I pasted before, can
>>> we fix it?
>>>
>>> Thanks
>>> Dikang.
>>>
>>> On Mon, Sep 21, 2015 at 11:11 AM, Sebastian Estevez <
>>> sebastian.este...@datastax.com> wrote:
>>>
>>>> Order is decommission, remove, assassinate.
>>>>
>>>> Which have you tried?
>>>> On Sep 21, 2015 10:47 AM, "Dikang Gu" <dikan...@gmail.com> wrote:
>>>>
>>>>> Hi there,
>>>>>
>>>>> I have a dead node in our cluster, which is a wired state right now,
>>>>> and can not be removed from cluster.
>>>>>
>>>>> The nodestatus shows:
>>>>> Datacenter: DC1
>>>>> ===
>>>>> Status=Up/Down
>>>>> |/ State=Normal/Leaving/Joining/Moving
>>>>> --  Address  Load   Tokens  OwnsHost
>>>>> ID   Rack
>>>>> DN  10.210.165.55?  256 ?   null
>>>>>r1
>>>>>
>>>>> I tried the unsafeAssassinateEndpoint, but got exception like:
>>>>> 2015-09-18_23:21:40.79760 INFO  23:21:40 InetAddress /10.210.165.55
>>>>> is now DOWN
>>>>> 2015-09-18_23:21:40.80667 ERROR 23:21:40 Exception in thread
>>>>> Thread[GossipStage:1,5,main]
>>>>> 2015-09-18_23:21:40.80668 java.lang.NullPointerException: null
>>>>> 2015-09-18_23:21:40.80669   at
>>>>> org.apache.cassandra.service.StorageService.getApplicationStateValue(StorageService.java:1584)
>>>>> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
>>>>> 2015-09-18_23:21:40.80669   at
>>>>> org.apache.cassandra.service.Sto

How to remove huge files with all expired data sooner?

2015-09-25 Thread Dongfeng Lu
Hi I have a table where I set TTL to only 7 days for all records and we keep 
pumping records in every day. In general, I would expect all data files for 
that table to have timestamps less than, say 8 or 9 days old, giving the system 
some time to work its magic. However, I see some files more than 9 days old 
occasionally. Last Friday, I saw 4 large files, each about 10G in size, with 
timestamps about 5, 4, 3, 2 weeks old. Interestingly they are all gone this 
Monday, leaving 1 new file 9 GB in size.

The compaction strategy is SizeTieredCompactionStrategy, and I can understand 
why the above happened. It seems we have 10G of data every week and when 
SizeTieredCompactionStrategy works to create various tiers, it just happened 
the file size for the next tier is 10G, and all the data is packed into this 
huge file. Then it starts the next cycle. Another week goes by, and another 10G 
file is created. This process continues until the minimum number of files of 
the same size is reached, which I think is 4 by default. Then it started to 
compact this set of 4 10G files. At this time, all data in these 4 files have 
expired so we end up with nothing or much smaller file if there is still some 
records with TTL left.

I have many tables like this, and I'd like to reclaim those spaces sooner. What 
would be the best way to do it? Should I run "nodetool compact" when I see two 
large files that are 2 weeks old? Is there configuration parameters I can tune 
to achieve the same effect? I looked through all the CQL Compaction 
Subproperties for STCS, but I am not sure how they can help here. Any 
suggestion is welcome.

BTW, I am using Cassandra 2.0.6.


Re: Unable to remove dead node from cluster.

2015-09-25 Thread Dikang Gu
The NPE is thrown when the node tries to handleStateLeft, because it cannot find
the tokens associated with the node. Can we just ignore the NPE and
continue to remove the endpoint from the ring?

On Fri, Sep 25, 2015 at 10:52 AM, Dikang Gu <dikan...@gmail.com> wrote:

> @Jeff, yeah, I run the nodetool grep, and in my case, some nodes return
> "301", and some nodes return "300". And 300 is the correct number of nodes
> in my cluster.
>
> So it does look like an inconsistent issue, can you open a jira for this?
> Also, I'm looking for a quick fix/patch for this.
>
> On Fri, Sep 25, 2015 at 7:43 AM, Nate McCall <n...@thelastpickle.com>
> wrote:
>
>> A few other folks have reported issues with lingering dead nodes on large
>> clusters - Jason Brown *just* gave an excellent gossip presentation at the
>> summit regarding gossip optimizations for large clusters.
>>
>> Gossip is in the process of being refactored (here's at least one of the
>> issues: https://issues.apache.org/jira/browse/CASSANDRA-9667), but it
>> would be worth opening an issue with as much information as you can provide
>> to, at the very least, have information avaiable for others.
>>
>> On Fri, Sep 25, 2015 at 7:08 AM, Jeff Jirsa <jeff.ji...@crowdstrike.com>
>> wrote:
>>
>>> The stack trace is one similar to one I recall seeing recently, but
>>> don’t have in front of me. This is an outside chance that is not at all
>>> certain to be the case.
>>>
>>> For EACH of the hundreds of nodes in your cluster, I suggest you run
>>>
>>> nodetool status | egrep “(^UN|^DN)" | wc -l
>>>
>>> and count to see if every node really has every other node in its ring
>>> properly.
>>>
>>> I suspect, but am not at all sure, that you have inconsistencies you’re
>>> not yet aware of (for example, if you expect that you have 100 nodes in the
>>> cluster, I’m betting that the query above returns 99 on at least one of the
>>> nodes).  If this is the case, please reply so that you and I can submit a
>>> Jira and compare our stack traces and we can find the underlying root cause
>>> of this together.
>>>
>>> - Jeff
>>>
>>> From: Dikang Gu
>>> Reply-To: "user@cassandra.apache.org"
>>> Date: Thursday, September 24, 2015 at 9:10 PM
>>> To: cassandra
>>>
>>> Subject: Re: Unable to remove dead node from cluster.
>>>
>>> @Jeff, I just use jmx connect to one node, run the
>>> unsafeAssainateEndpoint, and pass in the "10.210.165.55" ip address.
>>>
>>> Yes, we have hundreds of other nodes in the nodetool status output as
>>> well.
>>>
>>> On Tue, Sep 22, 2015 at 11:31 PM, Jeff Jirsa <jeff.ji...@crowdstrike.com
>>> > wrote:
>>>
>>>> When you run unsafeAssassinateEndpoint, to which host are you
>>>> connected, and what argument are you passing?
>>>>
>>>> Are there other nodes in the ring that you’re not including in the
>>>> ‘nodetool status’ output?
>>>>
>>>>
>>>> From: Dikang Gu
>>>> Reply-To: "user@cassandra.apache.org"
>>>> Date: Tuesday, September 22, 2015 at 10:09 PM
>>>> To: cassandra
>>>> Cc: "d...@cassandra.apache.org"
>>>> Subject: Re: Unable to remove dead node from cluster.
>>>>
>>>> ping.
>>>>
>>>> On Mon, Sep 21, 2015 at 11:51 AM, Dikang Gu <dikan...@gmail.com> wrote:
>>>>
>>>>> I have tried all of them, neither of them worked.
>>>>> 1. decommission: the host had hardware issue, and I can not connect to
>>>>> it.
>>>>> 2. remove, there is not HostID, so the removenode did not work.
>>>>> 3. unsafeAssassinateEndpoint, it will throw NPE as I pasted before,
>>>>> can we fix it?
>>>>>
>>>>> Thanks
>>>>> Dikang.
>>>>>
>>>>> On Mon, Sep 21, 2015 at 11:11 AM, Sebastian Estevez <
>>>>> sebastian.este...@datastax.com> wrote:
>>>>>
>>>>>> Order is decommission, remove, assassinate.
>>>>>>
>>>>>> Which have you tried?
>>>>>> On Sep 21, 2015 10:47 AM, "Dikang Gu" <dikan...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi there,
>>>>>>>
>>>>>>> I have a dead node in 

Re: Unable to remove dead node from cluster.

2015-09-25 Thread Dikang Gu
@Jeff, yeah, I ran the nodetool | grep check, and in my case, some nodes return
"301", and some nodes return "300". And 300 is the correct number of nodes
in my cluster.

So it does look like an inconsistency issue; can you open a jira for this?
Also, I'm looking for a quick fix/patch for this.

On Fri, Sep 25, 2015 at 7:43 AM, Nate McCall <n...@thelastpickle.com> wrote:

> A few other folks have reported issues with lingering dead nodes on large
> clusters - Jason Brown *just* gave an excellent gossip presentation at the
> summit regarding gossip optimizations for large clusters.
>
> Gossip is in the process of being refactored (here's at least one of the
> issues: https://issues.apache.org/jira/browse/CASSANDRA-9667), but it
> would be worth opening an issue with as much information as you can provide
> to, at the very least, have information avaiable for others.
>
> On Fri, Sep 25, 2015 at 7:08 AM, Jeff Jirsa <jeff.ji...@crowdstrike.com>
> wrote:
>
>> The stack trace is one similar to one I recall seeing recently, but don’t
>> have in front of me. This is an outside chance that is not at all certain
>> to be the case.
>>
>> For EACH of the hundreds of nodes in your cluster, I suggest you run
>>
>> nodetool status | egrep “(^UN|^DN)" | wc -l
>>
>> and count to see if every node really has every other node in its ring
>> properly.
>>
>> I suspect, but am not at all sure, that you have inconsistencies you’re
>> not yet aware of (for example, if you expect that you have 100 nodes in the
>> cluster, I’m betting that the query above returns 99 on at least one of the
>> nodes).  If this is the case, please reply so that you and I can submit a
>> Jira and compare our stack traces and we can find the underlying root cause
>> of this together.
>>
>> - Jeff
>>
>> From: Dikang Gu
>> Reply-To: "user@cassandra.apache.org"
>> Date: Thursday, September 24, 2015 at 9:10 PM
>> To: cassandra
>>
>> Subject: Re: Unable to remove dead node from cluster.
>>
>> @Jeff, I just use jmx connect to one node, run the
>> unsafeAssainateEndpoint, and pass in the "10.210.165.55" ip address.
>>
>> Yes, we have hundreds of other nodes in the nodetool status output as
>> well.
>>
>> On Tue, Sep 22, 2015 at 11:31 PM, Jeff Jirsa <jeff.ji...@crowdstrike.com>
>> wrote:
>>
>>> When you run unsafeAssassinateEndpoint, to which host are you connected,
>>> and what argument are you passing?
>>>
>>> Are there other nodes in the ring that you’re not including in the
>>> ‘nodetool status’ output?
>>>
>>>
>>> From: Dikang Gu
>>> Reply-To: "user@cassandra.apache.org"
>>> Date: Tuesday, September 22, 2015 at 10:09 PM
>>> To: cassandra
>>> Cc: "d...@cassandra.apache.org"
>>> Subject: Re: Unable to remove dead node from cluster.
>>>
>>> ping.
>>>
>>> On Mon, Sep 21, 2015 at 11:51 AM, Dikang Gu <dikan...@gmail.com> wrote:
>>>
>>>> I have tried all of them, neither of them worked.
>>>> 1. decommission: the host had hardware issue, and I can not connect to
>>>> it.
>>>> 2. remove, there is not HostID, so the removenode did not work.
>>>> 3. unsafeAssassinateEndpoint, it will throw NPE as I pasted before, can
>>>> we fix it?
>>>>
>>>> Thanks
>>>> Dikang.
>>>>
>>>> On Mon, Sep 21, 2015 at 11:11 AM, Sebastian Estevez <
>>>> sebastian.este...@datastax.com> wrote:
>>>>
>>>>> Order is decommission, remove, assassinate.
>>>>>
>>>>> Which have you tried?
>>>>> On Sep 21, 2015 10:47 AM, "Dikang Gu" <dikan...@gmail.com> wrote:
>>>>>
>>>>>> Hi there,
>>>>>>
>>>>>> I have a dead node in our cluster, which is a wired state right now,
>>>>>> and can not be removed from cluster.
>>>>>>
>>>>>> The nodestatus shows:
>>>>>> Datacenter: DC1
>>>>>> ===
>>>>>> Status=Up/Down
>>>>>> |/ State=Normal/Leaving/Joining/Moving
>>>>>> --  Address  Load   Tokens  OwnsHost
>>>>>> ID   Rack
>>>>>> DN  10.210.165.55?  256 ?   null
>>>>>>  r1
>>>>>>
>

Re: Unable to remove dead node from cluster.

2015-09-25 Thread Jeff Jirsa
Apparently this was reported back in May: 
https://issues.apache.org/jira/browse/CASSANDRA-9510

- Jeff

From:  Dikang Gu
Reply-To:  "user@cassandra.apache.org"
Date:  Friday, September 25, 2015 at 11:31 AM
To:  cassandra
Subject:  Re: Unable to remove dead node from cluster.

The NPE throws when node tried to handleStateLeft, because it can not find the 
tokens associated with the node, can we just ignore the NPE and continue to 
remove the endpoint from the ring?

On Fri, Sep 25, 2015 at 10:52 AM, Dikang Gu <dikan...@gmail.com> wrote:
@Jeff, yeah, I run the nodetool grep, and in my case, some nodes return "301", 
and some nodes return "300". And 300 is the correct number of nodes in my 
cluster. 

So it does look like an inconsistent issue, can you open a jira for this? Also, 
I'm looking for a quick fix/patch for this.

On Fri, Sep 25, 2015 at 7:43 AM, Nate McCall <n...@thelastpickle.com> wrote:
A few other folks have reported issues with lingering dead nodes on large 
clusters - Jason Brown *just* gave an excellent gossip presentation at the 
summit regarding gossip optimizations for large clusters. 

Gossip is in the process of being refactored (here's at least one of the 
issues: https://issues.apache.org/jira/browse/CASSANDRA-9667), but it would be 
worth opening an issue with as much information as you can provide to, at the 
very least, have information avaiable for others. 

On Fri, Sep 25, 2015 at 7:08 AM, Jeff Jirsa <jeff.ji...@crowdstrike.com> wrote:
The stack trace is one similar to one I recall seeing recently, but don’t have 
in front of me. This is an outside chance that is not at all certain to be the 
case.

For EACH of the hundreds of nodes in your cluster, I suggest you run 

nodetool status | egrep “(^UN|^DN)" | wc -l 

and count to see if every node really has every other node in its ring 
properly. 

I suspect, but am not at all sure, that you have inconsistencies you’re not yet 
aware of (for example, if you expect that you have 100 nodes in the cluster, 
I’m betting that the query above returns 99 on at least one of the nodes).  If 
this is the case, please reply so that you and I can submit a Jira and compare 
our stack traces and we can find the underlying root cause of this together. 

- Jeff

From: Dikang Gu
Reply-To: "user@cassandra.apache.org"
Date: Thursday, September 24, 2015 at 9:10 PM
To: cassandra 

Subject: Re: Unable to remove dead node from cluster.

@Jeff, I just use jmx connect to one node, run the unsafeAssainateEndpoint, and 
pass in the "10.210.165.55" ip address.

Yes, we have hundreds of other nodes in the nodetool status output as well.

On Tue, Sep 22, 2015 at 11:31 PM, Jeff Jirsa <jeff.ji...@crowdstrike.com> wrote:
When you run unsafeAssassinateEndpoint, to which host are you connected, and 
what argument are you passing?

Are there other nodes in the ring that you’re not including in the ‘nodetool 
status’ output?


From: Dikang Gu
Reply-To: "user@cassandra.apache.org"
Date: Tuesday, September 22, 2015 at 10:09 PM
To: cassandra
Cc: "d...@cassandra.apache.org"
Subject: Re: Unable to remove dead node from cluster.

ping.

On Mon, Sep 21, 2015 at 11:51 AM, Dikang Gu <dikan...@gmail.com> wrote:
I have tried all of them, neither of them worked. 
1. decommission: the host had hardware issue, and I can not connect to it.
2. remove, there is not HostID, so the removenode did not work.
3. unsafeAssassinateEndpoint, it will throw NPE as I pasted before, can we fix 
it?

Thanks
Dikang.

On Mon, Sep 21, 2015 at 11:11 AM, Sebastian Estevez 
<sebastian.este...@datastax.com> wrote:

Order is decommission, remove, assassinate.

Which have you tried?

On Sep 21, 2015 10:47 AM, "Dikang Gu" <dikan...@gmail.com> wrote:
Hi there, 

I have a dead node in our cluster, which is a wired state right now, and can 
not be removed from cluster.

The nodestatus shows:
Datacenter: DC1
===
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address  Load   Tokens  OwnsHost ID 
  Rack
DN  10.210.165.55?  256 ?   null
  r1

I tried the unsafeAssassinateEndpoint, but got exception like:
2015-09-18_23:21:40.79760 INFO  23:21:40 InetAddress /10.210.165.55 is now DOWN
2015-09-18_23:21:40.80667 ERROR 23:21:40 Exception in thread 
Thread[GossipStage:1,5,main]
2015-09-18_23:21:40.80668 java.lang.NullPointerException: null
2015-09-18_23:21:40.80669   at 
org.apache.cassandra.service.StorageService.getApplicationStateValue(StorageService.java:1584)
 ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
2015-09-18_23:21:40.80669   at 
org.apache.cassandra.service.StorageService.getTokensFor(StorageService.java:1592)
 ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
2015

Re: Unable to remove dead node from cluster.

2015-09-24 Thread Dikang Gu
@Jeff, I just used JMX to connect to one node, ran unsafeAssassinateEndpoint,
and passed in the "10.210.165.55" IP address.

Yes, we have hundreds of other nodes in the nodetool status output as well.
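
For reference, a rough sketch of that JMX invocation from the command line, assuming
the jmxterm uber-jar is available locally (the jar name is an example) and JMX is
listening on the default port 7199:

    # call the Gossiper MBean's unsafeAssassinateEndpoint operation via a live node
    echo "run -b org.apache.cassandra.net:type=Gossiper unsafeAssassinateEndpoint 10.210.165.55" \
        | java -jar jmxterm-1.0-uber.jar -l <live-node>:7199 -n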

On Tue, Sep 22, 2015 at 11:31 PM, Jeff Jirsa <jeff.ji...@crowdstrike.com>
wrote:

> When you run unsafeAssassinateEndpoint, to which host are you connected,
> and what argument are you passing?
>
> Are there other nodes in the ring that you’re not including in the
> ‘nodetool status’ output?
>
>
> From: Dikang Gu
> Reply-To: "user@cassandra.apache.org"
> Date: Tuesday, September 22, 2015 at 10:09 PM
> To: cassandra
> Cc: "d...@cassandra.apache.org"
> Subject: Re: Unable to remove dead node from cluster.
>
> ping.
>
> On Mon, Sep 21, 2015 at 11:51 AM, Dikang Gu <dikan...@gmail.com> wrote:
>
>> I have tried all of them, neither of them worked.
>> 1. decommission: the host had hardware issue, and I can not connect to it.
>> 2. remove, there is not HostID, so the removenode did not work.
>> 3. unsafeAssassinateEndpoint, it will throw NPE as I pasted before, can
>> we fix it?
>>
>> Thanks
>> Dikang.
>>
>> On Mon, Sep 21, 2015 at 11:11 AM, Sebastian Estevez <
>> sebastian.este...@datastax.com> wrote:
>>
>>> Order is decommission, remove, assassinate.
>>>
>>> Which have you tried?
>>> On Sep 21, 2015 10:47 AM, "Dikang Gu" <dikan...@gmail.com> wrote:
>>>
>>>> Hi there,
>>>>
>>>> I have a dead node in our cluster, which is a wired state right now,
>>>> and can not be removed from cluster.
>>>>
>>>> The nodestatus shows:
>>>> Datacenter: DC1
>>>> ===
>>>> Status=Up/Down
>>>> |/ State=Normal/Leaving/Joining/Moving
>>>> --  Address  Load   Tokens  OwnsHost ID
>>>>   Rack
>>>> DN  10.210.165.55?  256 ?   null
>>>>r1
>>>>
>>>> I tried the unsafeAssassinateEndpoint, but got exception like:
>>>> 2015-09-18_23:21:40.79760 INFO  23:21:40 InetAddress /10.210.165.55 is
>>>> now DOWN
>>>> 2015-09-18_23:21:40.80667 ERROR 23:21:40 Exception in thread
>>>> Thread[GossipStage:1,5,main]
>>>> 2015-09-18_23:21:40.80668 java.lang.NullPointerException: null
>>>> 2015-09-18_23:21:40.80669   at
>>>> org.apache.cassandra.service.StorageService.getApplicationStateValue(StorageService.java:1584)
>>>> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
>>>> 2015-09-18_23:21:40.80669   at
>>>> org.apache.cassandra.service.StorageService.getTokensFor(StorageService.java:1592)
>>>> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
>>>> 2015-09-18_23:21:40.80670   at
>>>> org.apache.cassandra.service.StorageService.handleStateLeft(StorageService.java:1822)
>>>> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
>>>> 2015-09-18_23:21:40.80671   at
>>>> org.apache.cassandra.service.StorageService.onChange(StorageService.java:1495)
>>>> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
>>>> 2015-09-18_23:21:40.80671   at
>>>> org.apache.cassandra.service.StorageService.onJoin(StorageService.java:2121)
>>>> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
>>>> 2015-09-18_23:21:40.80672   at
>>>> org.apache.cassandra.gms.Gossiper.handleMajorStateChange(Gossiper.java:1009)
>>>> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
>>>> 2015-09-18_23:21:40.80673   at
>>>> org.apache.cassandra.gms.Gossiper.applyStateLocally(Gossiper.java:1113)
>>>> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
>>>> 2015-09-18_23:21:40.80673   at
>>>> org.apache.cassandra.gms.GossipDigestAck2VerbHandler.doVerb(GossipDigestAck2VerbHandler.java:49)
>>>> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
>>>> 2015-09-18_23:21:40.80673   at
>>>> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:62)
>>>> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
>>>> 2015-09-18_23:21:40.80674   at
>>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>> ~[na:1.7.0_45]
>>>> 2015-09-18_23:21:40.80674   at
>>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>> ~[na:1.7.0_45]
>>>> 2015-09-18_23:21:40.80674   at
>>>> java.lang.Thread.run(Thread.java:744) ~[na:1.7.0_45]
>>>> 2015-09-18_23:21:40.85812 WARN  23:21:40 Not marking nodes down due to
>>>> local pause of 10852378435 > 50
>>>>
>>>> Any suggestions about how to remove it?
>>>> Thanks.
>>>>
>>>> --
>>>> Dikang
>>>>
>>>>
>>
>>
>> --
>> Dikang
>>
>>
>
>
> --
> Dikang
>
>


-- 
Dikang


Re: Unable to remove dead node from cluster.

2015-09-23 Thread Jeff Jirsa
When you run unsafeAssassinateEndpoint, to which host are you connected, and 
what argument are you passing?

Are there other nodes in the ring that you’re not including in the ‘nodetool 
status’ output?


From:  Dikang Gu
Reply-To:  "user@cassandra.apache.org"
Date:  Tuesday, September 22, 2015 at 10:09 PM
To:  cassandra
Cc:  "d...@cassandra.apache.org"
Subject:  Re: Unable to remove dead node from cluster.

ping.

On Mon, Sep 21, 2015 at 11:51 AM, Dikang Gu <dikan...@gmail.com> wrote:
I have tried all of them, neither of them worked. 
1. decommission: the host had hardware issue, and I can not connect to it.
2. remove, there is not HostID, so the removenode did not work.
3. unsafeAssassinateEndpoint, it will throw NPE as I pasted before, can we fix 
it?

Thanks
Dikang.

On Mon, Sep 21, 2015 at 11:11 AM, Sebastian Estevez 
<sebastian.este...@datastax.com> wrote:

Order is decommission, remove, assassinate.

Which have you tried?

On Sep 21, 2015 10:47 AM, "Dikang Gu" <dikan...@gmail.com> wrote:
Hi there, 

I have a dead node in our cluster, which is a wired state right now, and can 
not be removed from cluster.

The nodestatus shows:
Datacenter: DC1
===
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address  Load   Tokens  OwnsHost ID 
  Rack
DN  10.210.165.55?  256 ?   null
  r1

I tried the unsafeAssassinateEndpoint, but got exception like:
2015-09-18_23:21:40.79760 INFO  23:21:40 InetAddress /10.210.165.55 is now DOWN
2015-09-18_23:21:40.80667 ERROR 23:21:40 Exception in thread 
Thread[GossipStage:1,5,main]
2015-09-18_23:21:40.80668 java.lang.NullPointerException: null
2015-09-18_23:21:40.80669   at 
org.apache.cassandra.service.StorageService.getApplicationStateValue(StorageService.java:1584)
 ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
2015-09-18_23:21:40.80669   at 
org.apache.cassandra.service.StorageService.getTokensFor(StorageService.java:1592)
 ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
2015-09-18_23:21:40.80670   at 
org.apache.cassandra.service.StorageService.handleStateLeft(StorageService.java:1822)
 ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
2015-09-18_23:21:40.80671   at 
org.apache.cassandra.service.StorageService.onChange(StorageService.java:1495) 
~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
2015-09-18_23:21:40.80671   at 
org.apache.cassandra.service.StorageService.onJoin(StorageService.java:2121) 
~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
2015-09-18_23:21:40.80672   at 
org.apache.cassandra.gms.Gossiper.handleMajorStateChange(Gossiper.java:1009) 
~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
2015-09-18_23:21:40.80673   at 
org.apache.cassandra.gms.Gossiper.applyStateLocally(Gossiper.java:1113) 
~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
2015-09-18_23:21:40.80673   at 
org.apache.cassandra.gms.GossipDigestAck2VerbHandler.doVerb(GossipDigestAck2VerbHandler.java:49)
 ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
2015-09-18_23:21:40.80673   at 
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:62) 
~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
2015-09-18_23:21:40.80674   at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
~[na:1.7.0_45]
2015-09-18_23:21:40.80674   at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
~[na:1.7.0_45]
2015-09-18_23:21:40.80674   at java.lang.Thread.run(Thread.java:744) 
~[na:1.7.0_45]
2015-09-18_23:21:40.85812 WARN  23:21:40 Not marking nodes down due to local 
pause of 10852378435 > 50

Any suggestions about how to remove it?
Thanks.

-- 
Dikang




-- 
Dikang




-- 
Dikang






Re: Unable to remove dead node from cluster.

2015-09-22 Thread Dikang Gu
ping.

On Mon, Sep 21, 2015 at 11:51 AM, Dikang Gu <dikan...@gmail.com> wrote:

> I have tried all of them, neither of them worked.
> 1. decommission: the host had hardware issue, and I can not connect to it.
> 2. remove, there is not HostID, so the removenode did not work.
> 3. unsafeAssassinateEndpoint, it will throw NPE as I pasted before, can we
> fix it?
>
> Thanks
> Dikang.
>
> On Mon, Sep 21, 2015 at 11:11 AM, Sebastian Estevez <
> sebastian.este...@datastax.com> wrote:
>
>> Order is decommission, remove, assassinate.
>>
>> Which have you tried?
>> On Sep 21, 2015 10:47 AM, "Dikang Gu" <dikan...@gmail.com> wrote:
>>
>>> Hi there,
>>>
>>> I have a dead node in our cluster, which is a wired state right now, and
>>> can not be removed from cluster.
>>>
>>> The nodestatus shows:
>>> Datacenter: DC1
>>> ===
>>> Status=Up/Down
>>> |/ State=Normal/Leaving/Joining/Moving
>>> --  Address  Load   Tokens  OwnsHost ID
>>>   Rack
>>> DN  10.210.165.55?  256 ?   null
>>>  r1
>>>
>>> I tried the unsafeAssassinateEndpoint, but got exception like:
>>> 2015-09-18_23:21:40.79760 INFO  23:21:40 InetAddress /10.210.165.55 is
>>> now DOWN
>>> 2015-09-18_23:21:40.80667 ERROR 23:21:40 Exception in thread
>>> Thread[GossipStage:1,5,main]
>>> 2015-09-18_23:21:40.80668 java.lang.NullPointerException: null
>>> 2015-09-18_23:21:40.80669   at
>>> org.apache.cassandra.service.StorageService.getApplicationStateValue(StorageService.java:1584)
>>> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
>>> 2015-09-18_23:21:40.80669   at
>>> org.apache.cassandra.service.StorageService.getTokensFor(StorageService.java:1592)
>>> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
>>> 2015-09-18_23:21:40.80670   at
>>> org.apache.cassandra.service.StorageService.handleStateLeft(StorageService.java:1822)
>>> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
>>> 2015-09-18_23:21:40.80671   at
>>> org.apache.cassandra.service.StorageService.onChange(StorageService.java:1495)
>>> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
>>> 2015-09-18_23:21:40.80671   at
>>> org.apache.cassandra.service.StorageService.onJoin(StorageService.java:2121)
>>> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
>>> 2015-09-18_23:21:40.80672   at
>>> org.apache.cassandra.gms.Gossiper.handleMajorStateChange(Gossiper.java:1009)
>>> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
>>> 2015-09-18_23:21:40.80673   at
>>> org.apache.cassandra.gms.Gossiper.applyStateLocally(Gossiper.java:1113)
>>> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
>>> 2015-09-18_23:21:40.80673   at
>>> org.apache.cassandra.gms.GossipDigestAck2VerbHandler.doVerb(GossipDigestAck2VerbHandler.java:49)
>>> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
>>> 2015-09-18_23:21:40.80673   at
>>> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:62)
>>> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
>>> 2015-09-18_23:21:40.80674   at
>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>> ~[na:1.7.0_45]
>>> 2015-09-18_23:21:40.80674   at
>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>> ~[na:1.7.0_45]
>>> 2015-09-18_23:21:40.80674   at java.lang.Thread.run(Thread.java:744)
>>> ~[na:1.7.0_45]
>>> 2015-09-18_23:21:40.85812 WARN  23:21:40 Not marking nodes down due to
>>> local pause of 10852378435 > 50
>>>
>>> Any suggestions about how to remove it?
>>> Thanks.
>>>
>>> --
>>> Dikang
>>>
>>>
>
>
> --
> Dikang
>
>


-- 
Dikang


Unable to remove dead node from cluster.

2015-09-21 Thread Dikang Gu
Hi there,

I have a dead node in our cluster, which is in a weird state right now and
cannot be removed from the cluster.

The nodetool status output shows:
Datacenter: DC1
===
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load  Tokens  Owns  Host ID  Rack
DN  10.210.165.55  ?     256     ?     null     r1

I tried the unsafeAssassinateEndpoint, but got exception like:
2015-09-18_23:21:40.79760 INFO  23:21:40 InetAddress /10.210.165.55 is now
DOWN
2015-09-18_23:21:40.80667 ERROR 23:21:40 Exception in thread
Thread[GossipStage:1,5,main]
2015-09-18_23:21:40.80668 java.lang.NullPointerException: null
2015-09-18_23:21:40.80669   at
org.apache.cassandra.service.StorageService.getApplicationStateValue(StorageService.java:1584)
~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
2015-09-18_23:21:40.80669   at
org.apache.cassandra.service.StorageService.getTokensFor(StorageService.java:1592)
~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
2015-09-18_23:21:40.80670   at
org.apache.cassandra.service.StorageService.handleStateLeft(StorageService.java:1822)
~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
2015-09-18_23:21:40.80671   at
org.apache.cassandra.service.StorageService.onChange(StorageService.java:1495)
~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
2015-09-18_23:21:40.80671   at
org.apache.cassandra.service.StorageService.onJoin(StorageService.java:2121)
~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
2015-09-18_23:21:40.80672   at
org.apache.cassandra.gms.Gossiper.handleMajorStateChange(Gossiper.java:1009)
~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
2015-09-18_23:21:40.80673   at
org.apache.cassandra.gms.Gossiper.applyStateLocally(Gossiper.java:1113)
~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
2015-09-18_23:21:40.80673   at
org.apache.cassandra.gms.GossipDigestAck2VerbHandler.doVerb(GossipDigestAck2VerbHandler.java:49)
~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
2015-09-18_23:21:40.80673   at
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:62)
~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
2015-09-18_23:21:40.80674   at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
~[na:1.7.0_45]
2015-09-18_23:21:40.80674   at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
~[na:1.7.0_45]
2015-09-18_23:21:40.80674   at java.lang.Thread.run(Thread.java:744)
~[na:1.7.0_45]
2015-09-18_23:21:40.85812 WARN  23:21:40 Not marking nodes down due to
local pause of 10852378435 > 50

Any suggestions about how to remove it?
Thanks.

-- 
Dikang


Re: Unable to remove dead node from cluster.

2015-09-21 Thread Sebastian Estevez
Order is decommission, remove, assassinate.

Which have you tried?
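
For reference, a minimal sketch of that order with the matching commands (the
placeholder in angle brackets is an example, not a value from this thread):

    # 1. preferred: run on the node that is leaving, while it is still up
    nodetool decommission

    # 2. if the node is already dead: from any live node, using the Host ID
    #    shown by 'nodetool status'
    nodetool removenode <host-id>
    nodetool removenode status        # check progress
    nodetool removenode force         # only if the removal hangs

    # 3. last resort: assassinate the endpoint over JMX via
    #    unsafeAssassinateEndpoint; later releases expose this as
    #    'nodetool assassinate'
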
On Sep 21, 2015 10:47 AM, "Dikang Gu" <dikan...@gmail.com> wrote:

> Hi there,
>
> I have a dead node in our cluster, which is a wired state right now, and
> can not be removed from cluster.
>
> The nodestatus shows:
> Datacenter: DC1
> ===
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address  Load   Tokens  OwnsHost ID
> Rack
> DN  10.210.165.55?  256 ?   null
>r1
>
> I tried the unsafeAssassinateEndpoint, but got exception like:
> 2015-09-18_23:21:40.79760 INFO  23:21:40 InetAddress /10.210.165.55 is
> now DOWN
> 2015-09-18_23:21:40.80667 ERROR 23:21:40 Exception in thread
> Thread[GossipStage:1,5,main]
> 2015-09-18_23:21:40.80668 java.lang.NullPointerException: null
> 2015-09-18_23:21:40.80669   at
> org.apache.cassandra.service.StorageService.getApplicationStateValue(StorageService.java:1584)
> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
> 2015-09-18_23:21:40.80669   at
> org.apache.cassandra.service.StorageService.getTokensFor(StorageService.java:1592)
> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
> 2015-09-18_23:21:40.80670   at
> org.apache.cassandra.service.StorageService.handleStateLeft(StorageService.java:1822)
> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
> 2015-09-18_23:21:40.80671   at
> org.apache.cassandra.service.StorageService.onChange(StorageService.java:1495)
> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
> 2015-09-18_23:21:40.80671   at
> org.apache.cassandra.service.StorageService.onJoin(StorageService.java:2121)
> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
> 2015-09-18_23:21:40.80672   at
> org.apache.cassandra.gms.Gossiper.handleMajorStateChange(Gossiper.java:1009)
> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
> 2015-09-18_23:21:40.80673   at
> org.apache.cassandra.gms.Gossiper.applyStateLocally(Gossiper.java:1113)
> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
> 2015-09-18_23:21:40.80673   at
> org.apache.cassandra.gms.GossipDigestAck2VerbHandler.doVerb(GossipDigestAck2VerbHandler.java:49)
> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
> 2015-09-18_23:21:40.80673   at
> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:62)
> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
> 2015-09-18_23:21:40.80674   at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> ~[na:1.7.0_45]
> 2015-09-18_23:21:40.80674   at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> ~[na:1.7.0_45]
> 2015-09-18_23:21:40.80674   at java.lang.Thread.run(Thread.java:744)
> ~[na:1.7.0_45]
> 2015-09-18_23:21:40.85812 WARN  23:21:40 Not marking nodes down due to
> local pause of 10852378435 > 50
>
> Any suggestions about how to remove it?
> Thanks.
>
> --
> Dikang
>
>


Re: Unable to remove dead node from cluster.

2015-09-21 Thread Dikang Gu
I have tried all of them; none of them worked.
1. decommission: the host had a hardware issue, and I cannot connect to it.
2. removenode: there is no Host ID, so removenode did not work.
3. unsafeAssassinateEndpoint: it throws the NPE I pasted before, can we
fix it?

Thanks
Dikang.

On Mon, Sep 21, 2015 at 11:11 AM, Sebastian Estevez <
sebastian.este...@datastax.com> wrote:

> Order is decommission, remove, assassinate.
>
> Which have you tried?
> On Sep 21, 2015 10:47 AM, "Dikang Gu" <dikan...@gmail.com> wrote:
>
>> Hi there,
>>
>> I have a dead node in our cluster, which is a wired state right now, and
>> can not be removed from cluster.
>>
>> The nodestatus shows:
>> Datacenter: DC1
>> ===
>> Status=Up/Down
>> |/ State=Normal/Leaving/Joining/Moving
>> --  Address  Load   Tokens  OwnsHost ID
>> Rack
>> DN  10.210.165.55?  256 ?   null
>>  r1
>>
>> I tried the unsafeAssassinateEndpoint, but got exception like:
>> 2015-09-18_23:21:40.79760 INFO  23:21:40 InetAddress /10.210.165.55 is
>> now DOWN
>> 2015-09-18_23:21:40.80667 ERROR 23:21:40 Exception in thread
>> Thread[GossipStage:1,5,main]
>> 2015-09-18_23:21:40.80668 java.lang.NullPointerException: null
>> 2015-09-18_23:21:40.80669   at
>> org.apache.cassandra.service.StorageService.getApplicationStateValue(StorageService.java:1584)
>> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
>> 2015-09-18_23:21:40.80669   at
>> org.apache.cassandra.service.StorageService.getTokensFor(StorageService.java:1592)
>> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
>> 2015-09-18_23:21:40.80670   at
>> org.apache.cassandra.service.StorageService.handleStateLeft(StorageService.java:1822)
>> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
>> 2015-09-18_23:21:40.80671   at
>> org.apache.cassandra.service.StorageService.onChange(StorageService.java:1495)
>> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
>> 2015-09-18_23:21:40.80671   at
>> org.apache.cassandra.service.StorageService.onJoin(StorageService.java:2121)
>> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
>> 2015-09-18_23:21:40.80672   at
>> org.apache.cassandra.gms.Gossiper.handleMajorStateChange(Gossiper.java:1009)
>> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
>> 2015-09-18_23:21:40.80673   at
>> org.apache.cassandra.gms.Gossiper.applyStateLocally(Gossiper.java:1113)
>> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
>> 2015-09-18_23:21:40.80673   at
>> org.apache.cassandra.gms.GossipDigestAck2VerbHandler.doVerb(GossipDigestAck2VerbHandler.java:49)
>> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
>> 2015-09-18_23:21:40.80673   at
>> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:62)
>> ~[apache-cassandra-2.1.8+git20150804.076b0b1.jar:2.1.8+git20150804.076b0b1]
>> 2015-09-18_23:21:40.80674   at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>> ~[na:1.7.0_45]
>> 2015-09-18_23:21:40.80674   at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>> ~[na:1.7.0_45]
>> 2015-09-18_23:21:40.80674   at java.lang.Thread.run(Thread.java:744)
>> ~[na:1.7.0_45]
>> 2015-09-18_23:21:40.85812 WARN  23:21:40 Not marking nodes down due to
>> local pause of 10852378435 > 50
>>
>> Any suggestions about how to remove it?
>> Thanks.
>>
>> --
>> Dikang
>>
>>


-- 
Dikang


Re: abnormal log after remove a node

2015-09-01 Thread Alain RODRIGUEZ
Hi,

I finally did the exact same thing before receiving the answer.

I guess this will remain FTR :).

Thanks though !

Alain


2015-09-01 16:22 GMT+02:00 曹志富 <cao.zh...@gmail.com>:

> Just restart all of the c* node
>
> --
> Ranger Tsao
>
> 2015-08-25 18:17 GMT+08:00 Alain RODRIGUEZ <arodr...@gmail.com>:
>
>> Hi, I am facing the same issue on 2.0.16.
>>
>> Did you solve this ? How ?
>>
>> I plan to try a rolling restart and see if gossip state recover from this.
>>
>> C*heers,
>>
>> Alain
>>
>> 2015-06-19 11:40 GMT+02:00 曹志富 <cao.zh...@gmail.com>:
>>
>>> I have a C* 2.1.5 with 24 nodes.A few days ago ,I have remove a node
>>> from this cluster using nodetool decommission.
>>>
>>> But tody I find some log like this:
>>>
>>> INFO  [GossipStage:1] 2015-06-19 17:38:05,616 Gossiper.java:968 -
>>> InetAddress /172.19.105.41 is now DOWN
>>> INFO  [GossipStage:1] 2015-06-19 17:38:05,617 StorageService.java:1885 -
>>> Removing tokens [-1014432261309809702, -1055322450438958612,
>>> -1120728727235087395, -1191392141261832305, -1203676771883970142,
>>> -1215563040745505837, -1215648909329054362, -1269531760567530381,
>>> -1278047879489577908, -1313427877031136549, -1342822572958042617,
>>> -1350792764922315814, -1383390744017639599, -139000372807970456,
>>> -140827955201469664, -1631551789771606023, -1633789813430312609,
>>> -1795528665156349205, -1836619444785023397, -1879127294549041822,
>>> -1962337787208890426, -2022309807234530256, -2033402140526360327,
>>> -2089413865145942100, -210961549458416802, -2148530352195763113,
>>> -2184481573787758786, -610790268720205, -2340762266634834427,
>>> -2513416003567685694, -2520971378752190013, -2596695976621541808,
>>> -2620636796023437199, -2640378596436678113, -2679143017361311011,
>>> -2721176590519112233, -2749213392354746126, -279267896827516626,
>>> -2872377759991294853, -2904711688111888325, -290489381926812623,
>>> -3000574339499272616, -301428600802598523, -3019280155316984595,
>>> -3024451041907074275, -3056898917375012425, -3161300347260716852,
>>> -3166392383659271772, -3327634380871627036, -3530685865340274372,
>>> -3563112657791369745, -366930313427781469, -3729582520450700795,
>>> -3901838244986519991, -4065326606010524312, -4174346928341550117,
>>> -4184239233207315432, -4204369933734181327, -4206479093137814808,
>>> -421410317165821100, -4311166118017934135, -4407123461118340117,
>>> -4466364858622123151, -4466939645485100087, -448955147512581975,
>>> -4587780638857304626, -4649897584350376674, -4674234125365755024
>>> , -4833801201210885896, -4857586579802212277, -4868896650650107463,
>>> -4980063310159547694, -4983471821416248610, -4992846054037653676,
>>> -5026994389965137674, -514302500353679181
>>> 0, -5198414516309928594, -5245363745777287346, -5346838390293957674,
>>> -5374413419545696184, -5427881744040857637, -5453876964430787287,
>>> -5491923669475601173, -55219734138599212
>>> 6, -5523011502670737422, -5537121117160410549, -5557015938925208697,
>>> -5572489682738121748, -5745899409803353484, -5771239101488682535,
>>> -5893479791287484099, -59766730414807540
>>> 44, -6014643892406938367, -6086002438656595783, -6129360679394503700,
>>> -6224240257573911174, -6290393495130499466, -6378712056928268929,
>>> -6430306056990093461, -6800188263839065
>>> 013, -6912720411187525051, -7160327814305587432, -7175004328733776324,
>>> -7272070430660252577, -7307945744786025148, -742448651973108101,
>>> -7539255117639002578, -7657460716997978
>>> 94, -7846698077070579798, -7870621904906244395, -7900841391761900719,
>>> -7918145426423910061, -7936795453892692473, -8070255024778921411,
>>> -8086888710627677669, -8124855925323654
>>> 631, -8175270408138820500, -8271197636596881168, -8336685710406477123,
>>> -8466220397076441627, -8534337908154758270, -8550484400487603561,
>>> -862246738021989870, -8727219287242892
>>> 185, -8895705475282612927, -8921801772904834063, -9057266752652143883,
>>> -9059183540698454288, -9067986437682229598, -9148183367896132028,
>>> -962208188860606543, 10859447725819218
>>> 30, 1189775396643491793, 1253728955879686947, 1389982523380382228,
>>> 1429632314664544045, 143610053770130548, 150118120072602242,
>>> 1575692041584712198, 1624575905722628764, 17894
>>> 76212785155173, 1995296121962835019, 2041217364870030239,
>>> 2120277336231792146, 2124445736743

Re: abnormal log after remove a node

2015-09-01 Thread 曹志富
Just restart all of the C* nodes.
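
For reference, a minimal shell sketch of such a rolling restart, assuming SSH access
and a hypothetical hosts.txt file with one node per line (the service name and the
fixed sleep are assumptions; in practice, wait until the node is back Up/Normal
before moving on):

    while read host; do
        ssh "$host" 'nodetool drain && sudo service cassandra restart'
        sleep 120   # crude pause; better: poll 'nodetool status' for UN on this host
    done < hosts.txt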

--
Ranger Tsao

2015-08-25 18:17 GMT+08:00 Alain RODRIGUEZ <arodr...@gmail.com>:

> Hi, I am facing the same issue on 2.0.16.
>
> Did you solve this ? How ?
>
> I plan to try a rolling restart and see if gossip state recover from this.
>
> C*heers,
>
> Alain
>
> 2015-06-19 11:40 GMT+02:00 曹志富 <cao.zh...@gmail.com>:
>
>> I have a C* 2.1.5 with 24 nodes.A few days ago ,I have remove a node from
>> this cluster using nodetool decommission.
>>
>> But tody I find some log like this:
>>
>> INFO  [GossipStage:1] 2015-06-19 17:38:05,616 Gossiper.java:968 -
>> InetAddress /172.19.105.41 is now DOWN
>> INFO  [GossipStage:1] 2015-06-19 17:38:05,617 StorageService.java:1885 -
>> Removing tokens [-1014432261309809702, -1055322450438958612,
>> -1120728727235087395, -1191392141261832305, -1203676771883970142,
>> -1215563040745505837, -1215648909329054362, -1269531760567530381,
>> -1278047879489577908, -1313427877031136549, -1342822572958042617,
>> -1350792764922315814, -1383390744017639599, -139000372807970456,
>> -140827955201469664, -1631551789771606023, -1633789813430312609,
>> -1795528665156349205, -1836619444785023397, -1879127294549041822,
>> -1962337787208890426, -2022309807234530256, -2033402140526360327,
>> -2089413865145942100, -210961549458416802, -2148530352195763113,
>> -2184481573787758786, -610790268720205, -2340762266634834427,
>> -2513416003567685694, -2520971378752190013, -2596695976621541808,
>> -2620636796023437199, -2640378596436678113, -2679143017361311011,
>> -2721176590519112233, -2749213392354746126, -279267896827516626,
>> -2872377759991294853, -2904711688111888325, -290489381926812623,
>> -3000574339499272616, -301428600802598523, -3019280155316984595,
>> -3024451041907074275, -3056898917375012425, -3161300347260716852,
>> -3166392383659271772, -3327634380871627036, -3530685865340274372,
>> -3563112657791369745, -366930313427781469, -3729582520450700795,
>> -3901838244986519991, -4065326606010524312, -4174346928341550117,
>> -4184239233207315432, -4204369933734181327, -4206479093137814808,
>> -421410317165821100, -4311166118017934135, -4407123461118340117,
>> -4466364858622123151, -4466939645485100087, -448955147512581975,
>> -4587780638857304626, -4649897584350376674, -4674234125365755024
>> , -4833801201210885896, -4857586579802212277, -4868896650650107463,
>> -4980063310159547694, -4983471821416248610, -4992846054037653676,
>> -5026994389965137674, -514302500353679181
>> 0, -5198414516309928594, -5245363745777287346, -5346838390293957674,
>> -5374413419545696184, -5427881744040857637, -5453876964430787287,
>> -5491923669475601173, -55219734138599212
>> 6, -5523011502670737422, -5537121117160410549, -5557015938925208697,
>> -5572489682738121748, -5745899409803353484, -5771239101488682535,
>> -5893479791287484099, -59766730414807540
>> 44, -6014643892406938367, -6086002438656595783, -6129360679394503700,
>> -6224240257573911174, -6290393495130499466, -6378712056928268929,
>> -6430306056990093461, -6800188263839065
>> 013, -6912720411187525051, -7160327814305587432, -7175004328733776324,
>> -7272070430660252577, -7307945744786025148, -742448651973108101,
>> -7539255117639002578, -7657460716997978
>> 94, -7846698077070579798, -7870621904906244395, -7900841391761900719,
>> -7918145426423910061, -7936795453892692473, -8070255024778921411,
>> -8086888710627677669, -8124855925323654
>> 631, -8175270408138820500, -8271197636596881168, -8336685710406477123,
>> -8466220397076441627, -8534337908154758270, -8550484400487603561,
>> -862246738021989870, -8727219287242892
>> 185, -8895705475282612927, -8921801772904834063, -9057266752652143883,
>> -9059183540698454288, -9067986437682229598, -9148183367896132028,
>> -962208188860606543, 10859447725819218
>> 30, 1189775396643491793, 1253728955879686947, 1389982523380382228,
>> 1429632314664544045, 143610053770130548, 150118120072602242,
>> 1575692041584712198, 1624575905722628764, 17894
>> 76212785155173, 1995296121962835019, 2041217364870030239,
>> 2120277336231792146, 2124445736743406711, 2154979704292433983,
>> 2340726755918680765, 23481654796845972, 23620268084352
>> 24407, 2366144489007464626, 2381492708106933027, 2398868971489617398,
>> 2427315953339163528, 2433999003913998534, 2633074510238705620,
>> 266659839023809792, 2677817641360639089, 2
>> 719725410894526151, 2751925111749406683, 2815703589803785617,
>> 3041515796379693113, 3044903149214270978, 3094954503756703989,
>> 3243933267690865263, 3246086646486800371, 33270068
>>

Re: abnormal log after remove a node

2015-08-25 Thread Alain RODRIGUEZ
Hi, I am facing the same issue on 2.0.16.

Did you solve this ? How ?

I plan to try a rolling restart and see if the gossip state recovers from this.

C*heers,

Alain

2015-06-19 11:40 GMT+02:00 曹志富 cao.zh...@gmail.com:

 I have a C* 2.1.5 cluster with 24 nodes. A few days ago, I removed a node from
 this cluster using nodetool decommission.

 But today I found some log entries like this:

 INFO  [GossipStage:1] 2015-06-19 17:38:05,616 Gossiper.java:968 -
 InetAddress /172.19.105.41 is now DOWN
 INFO  [GossipStage:1] 2015-06-19 17:38:05,617 StorageService.java:1885 -
 Removing tokens [-1014432261309809702, -1055322450438958612,
 -1120728727235087395, -1191392141261832305, -1203676771883970142,
 -1215563040745505837, -1215648909329054362, -1269531760567530381,
 -1278047879489577908, -1313427877031136549, -1342822572958042617,
 -1350792764922315814, -1383390744017639599, -139000372807970456,
 -140827955201469664, -1631551789771606023, -1633789813430312609,
 -1795528665156349205, -1836619444785023397, -1879127294549041822,
 -1962337787208890426, -2022309807234530256, -2033402140526360327,
 -2089413865145942100, -210961549458416802, -2148530352195763113,
 -2184481573787758786, -610790268720205, -2340762266634834427,
 -2513416003567685694, -2520971378752190013, -2596695976621541808,
 -2620636796023437199, -2640378596436678113, -2679143017361311011,
 -2721176590519112233, -2749213392354746126, -279267896827516626,
 -2872377759991294853, -2904711688111888325, -290489381926812623,
 -3000574339499272616, -301428600802598523, -3019280155316984595,
 -3024451041907074275, -3056898917375012425, -3161300347260716852,
 -3166392383659271772, -3327634380871627036, -3530685865340274372,
 -3563112657791369745, -366930313427781469, -3729582520450700795,
 -3901838244986519991, -4065326606010524312, -4174346928341550117,
 -4184239233207315432, -4204369933734181327, -4206479093137814808,
 -421410317165821100, -4311166118017934135, -4407123461118340117,
 -4466364858622123151, -4466939645485100087, -448955147512581975,
 -4587780638857304626, -4649897584350376674, -4674234125365755024
 , -4833801201210885896, -4857586579802212277, -4868896650650107463,
 -4980063310159547694, -4983471821416248610, -4992846054037653676,
 -5026994389965137674, -514302500353679181
 0, -5198414516309928594, -5245363745777287346, -5346838390293957674,
 -5374413419545696184, -5427881744040857637, -5453876964430787287,
 -5491923669475601173, -55219734138599212
 6, -5523011502670737422, -5537121117160410549, -5557015938925208697,
 -5572489682738121748, -5745899409803353484, -5771239101488682535,
 -5893479791287484099, -59766730414807540
 44, -6014643892406938367, -6086002438656595783, -6129360679394503700,
 -6224240257573911174, -6290393495130499466, -6378712056928268929,
 -6430306056990093461, -6800188263839065
 013, -6912720411187525051, -7160327814305587432, -7175004328733776324,
 -7272070430660252577, -7307945744786025148, -742448651973108101,
 -7539255117639002578, -7657460716997978
 94, -7846698077070579798, -7870621904906244395, -7900841391761900719,
 -7918145426423910061, -7936795453892692473, -8070255024778921411,
 -8086888710627677669, -8124855925323654
 631, -8175270408138820500, -8271197636596881168, -8336685710406477123,
 -8466220397076441627, -8534337908154758270, -8550484400487603561,
 -862246738021989870, -8727219287242892
 185, -8895705475282612927, -8921801772904834063, -9057266752652143883,
 -9059183540698454288, -9067986437682229598, -9148183367896132028,
 -962208188860606543, 10859447725819218
 30, 1189775396643491793, 1253728955879686947, 1389982523380382228,
 1429632314664544045, 143610053770130548, 150118120072602242,
 1575692041584712198, 1624575905722628764, 17894
 76212785155173, 1995296121962835019, 2041217364870030239,
 2120277336231792146, 2124445736743406711, 2154979704292433983,
 2340726755918680765, 23481654796845972, 23620268084352
 24407, 2366144489007464626, 2381492708106933027, 2398868971489617398,
 2427315953339163528, 2433999003913998534, 2633074510238705620,
 266659839023809792, 2677817641360639089, 2
 719725410894526151, 2751925111749406683, 2815703589803785617,
 3041515796379693113, 3044903149214270978, 3094954503756703989,
 3243933267690865263, 3246086646486800371, 33270068
 97333869434, 3393657685587750192, 3395065499228709345,
 3426126123948029459, 3500469615600510698, 3644011364716880512,
 3693249207133187620, 3776164494954636918, 38780676797
 8035, 3872151295451662867, 3937077827707223414, 4041082935346014761,
 4060208918173638435, 4086747843759164940, 4165638694482690057,
 4203996339238989224, 4220155275330961826, 4
 366784953339236686, 4390116924352514616, 4391225331964772681,
 4392419346255765958, 4448400054980766409, 4463335839328115373,
 4547306976104362915, 4588174843388248100, 48438580
 67983993745, 4912719175808770608, 499628843707992459, 5004392861473086088,
 5021047773702107258, 510226752691159107, 5109551630357971118,
 5157669927051121583, 51627694176199618
 24, 5238710860488961530, 5245958115092331518

Re: Question about how to remove data

2015-08-22 Thread Analia Lorenzatto
Thanks guys for the answers!

Saludos / Regards.

Analía Lorenzatto.

"Happiness is not something ready made. It comes from your own actions" by
Dalai Lama


On 21 Aug 2015 2:31 pm, Sebastian Estevez sebastian.este...@datastax.com
wrote:

 To clarify, you do not need a ttl for deletes to be compacted away in
 Cassandra. When you delete, we create a tombstone which will remain in the
 system __at least__ gc grace seconds. We wait this long to give the
 tombstone a chance to make it to all replica nodes, the best practice is to
 run repairs as often as gc grace seconds in order to ensure edge cases
 where data comes back to life (i.e. the tombstone was never sent to one of
 your replicas and when the tombstones and data are removed from the other
 two replicas, all that is left is the old value.

 __at least__ are the key words in the previous paragraph, there are more
 conditions that need to be met in order for a tombstone to actually get
 cleaned up. As most things in Cassandra, these conditions are configurable
 (via the following compaction sub-properties):


 http://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_configure_compaction_t.html

 All the best,


 [image: datastax_logo.png] http://www.datastax.com/

 Sebastián Estévez

 Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

 [image: linkedin.png] https://www.linkedin.com/company/datastax [image:
 facebook.png] https://www.facebook.com/datastax [image: twitter.png]
 https://twitter.com/datastax [image: g+.png]
 https://plus.google.com/+Datastax/about
 http://feeds.feedburner.com/datastax


 http://cassandrasummit-datastax.com/?utm_campaign=summit15utm_medium=summiticonutm_source=emailsignature

 DataStax is the fastest, most scalable distributed database technology,
 delivering Apache Cassandra to the world’s most innovative enterprises.
 Datastax is built to be agile, always-on, and predictably scalable to any
 size. With more than 500 customers in 45 countries, DataStax is the
 database technology and transactional backbone of choice for the worlds
 most innovative companies such as Netflix, Adobe, Intuit, and eBay.

 On Thu, Aug 20, 2015 at 4:13 PM, Daniel Chia danc...@coursera.org wrote:

 The TTL shouldn't matter if you deleted the data, since to my
 understanding the delete should shadow the data signaling to C* that the
 data is a candidate for removal on compaction.

 Others might know better, but it could very well be the fact that
 gc_grace_seconds is 0 that is causing your problems. Others might have
 other suggestions, but you could potentially use sstable2json to see the
 raw contents of the sstable on disk and see why data is still there.

 Thanks,
 Daniel

 On Thu, Aug 20, 2015 at 12:55 PM, Analia Lorenzatto 
 analialorenza...@gmail.com wrote:

 Hello,

 Daniel, I am using Size Tiered compaction.

 My concern is that as I do not have a TTL defined on the Column family,
 and I do not have the possibility to create it.   Perhaps, the deleted
 data is never actually going to be removed?

 Thanks a lot!


 On Thu, Aug 20, 2015 at 4:24 AM, Daniel Chia danc...@coursera.org
 wrote:

 Is this a LCS family, or Size Tiered? Manually running compaction on
 LCS doesn't do anything until C* 2.2 (
 https://issues.apache.org/jira/browse/CASSANDRA-7272)

 Thanks,
 Daniel

 On Wed, Aug 19, 2015 at 6:56 PM, Analia Lorenzatto 
 analialorenza...@gmail.com wrote:

 Hello Michael,

 Thanks for responding!

 I do not have snapshots on any node of the cluster.

 Saludos / Regards.

 Analía Lorenzatto.

 Hapiness is not something really made. It comes from your own
 actions by Dalai Lama


 On 19 Aug 2015 6:19 pm, Laing, Michael michael.la...@nytimes.com
 wrote:

 Possibly you have snapshots? If so, use nodetool to clear them.

 On Wed, Aug 19, 2015 at 4:54 PM, Analia Lorenzatto 
 analialorenza...@gmail.com wrote:

 Hello guys,

 I have a cassandra cluster 2.1 comprised of 4 nodes.

 I removed a lot of data in a Column Family, then I ran manually a
 compaction on this Column family on every node.   After doing that, If I
 query that data, cassandra correctly says this data is not there.  But 
 the
 space on disk is exactly the same before removing that data.

 Also, I realized that  gc_grace_seconds = 0.  Some people on the
 internet say that it could produce zombie data, what do you think?

 I do not have a TTL defined on the Column family, and I do not have
 the possibility to create it.   So my questions is, given that I do not
 have a TTL defined is data going to be removed?  or the deleted data is
 never actually going to be deleted due to I do not have a TTL?


 Thanks in advance!

 --
 Saludos / Regards.

 Analía Lorenzatto.

 “It's possible to commit no errors and still lose. That is not
 weakness.  That is life.  By Captain Jean-Luc Picard.






 --
 Saludos / Regards.

 Analía Lorenzatto.

 “It's possible to commit no errors and still lose. That is not
 weakness.  That is life.  By Captain Jean-Luc Picard.






Re: Question about how to remove data

2015-08-21 Thread Sebastian Estevez
To clarify, you do not need a TTL for deletes to be compacted away in
Cassandra. When you delete, we create a tombstone which will remain in the
system __at least__ gc grace seconds. We wait this long to give the
tombstone a chance to make it to all replica nodes; the best practice is to
run repairs at least as often as gc grace seconds in order to avoid edge cases
where data comes back to life (i.e. the tombstone was never sent to one of
your replicas, and when the tombstone and data are removed from the other
two replicas, all that is left is the old value).

__at least__ are the key words in the previous paragraph; there are more
conditions that need to be met in order for a tombstone to actually get
cleaned up. As with most things in Cassandra, these conditions are configurable
(via the following compaction sub-properties):

http://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_configure_compaction_t.html
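
For example, a hedged sketch of tuning those sub-properties from the shell (the
keyspace, table and values below are placeholders, not recommendations):

    cqlsh -e "ALTER TABLE my_ks.my_table WITH compaction = {
        'class': 'SizeTieredCompactionStrategy',
        'tombstone_threshold': '0.2',
        'tombstone_compaction_interval': '86400',
        'unchecked_tombstone_compaction': 'true'}"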

All the best,



Sebastián Estévez

Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com


DataStax is the fastest, most scalable distributed database technology,
delivering Apache Cassandra to the world’s most innovative enterprises.
Datastax is built to be agile, always-on, and predictably scalable to any
size. With more than 500 customers in 45 countries, DataStax is the
database technology and transactional backbone of choice for the worlds
most innovative companies such as Netflix, Adobe, Intuit, and eBay.

On Thu, Aug 20, 2015 at 4:13 PM, Daniel Chia danc...@coursera.org wrote:

 The TTL shouldn't matter if you deleted the data, since to my
 understanding the delete should shadow the data signaling to C* that the
 data is a candidate for removal on compaction.

 Others might know better, but it could very well be the fact that
 gc_grace_seconds is 0 that is causing your problems. Others might have
 other suggestions, but you could potentially use sstable2json to see the
 raw contents of the sstable on disk and see why data is still there.

 Thanks,
 Daniel

 On Thu, Aug 20, 2015 at 12:55 PM, Analia Lorenzatto 
 analialorenza...@gmail.com wrote:

 Hello,

 Daniel, I am using Size Tiered compaction.

 My concern is that as I do not have a TTL defined on the Column family,
 and I do not have the possibility to create it.   Perhaps, the deleted
 data is never actually going to be removed?

 Thanks a lot!


 On Thu, Aug 20, 2015 at 4:24 AM, Daniel Chia danc...@coursera.org
 wrote:

 Is this a LCS family, or Size Tiered? Manually running compaction on LCS
 doesn't do anything until C* 2.2 (
 https://issues.apache.org/jira/browse/CASSANDRA-7272)

 Thanks,
 Daniel

 On Wed, Aug 19, 2015 at 6:56 PM, Analia Lorenzatto 
 analialorenza...@gmail.com wrote:

 Hello Michael,

 Thanks for responding!

 I do not have snapshots on any node of the cluster.

 Saludos / Regards.

 Analía Lorenzatto.

 Hapiness is not something really made. It comes from your own actions
 by Dalai Lama


 On 19 Aug 2015 6:19 pm, Laing, Michael michael.la...@nytimes.com
 wrote:

 Possibly you have snapshots? If so, use nodetool to clear them.

 On Wed, Aug 19, 2015 at 4:54 PM, Analia Lorenzatto 
 analialorenza...@gmail.com wrote:

 Hello guys,

 I have a cassandra cluster 2.1 comprised of 4 nodes.

 I removed a lot of data in a Column Family, then I ran manually a
 compaction on this Column family on every node.   After doing that, If I
 query that data, cassandra correctly says this data is not there.  But 
 the
 space on disk is exactly the same before removing that data.

 Also, I realized that  gc_grace_seconds = 0.  Some people on the
 internet say that it could produce zombie data, what do you think?

 I do not have a TTL defined on the Column family, and I do not have
 the possibility to create it.   So my questions is, given that I do not
 have a TTL defined is data going to be removed?  or the deleted data is
 never actually going to be deleted due to I do not have a TTL?


 Thanks in advance!

 --
 Saludos / Regards.

 Analía Lorenzatto.

 “It's possible to commit no errors and still lose. That is not
 weakness.  That is life.  By Captain Jean-Luc Picard.






 --
 Saludos / Regards.

 Analía Lorenzatto.

 “It's possible to commit no errors and still lose. That is not weakness.
 That is life.  By Captain Jean-Luc Picard.





Re: Question about how to remove data

2015-08-20 Thread Daniel Chia
Is this a LCS family, or Size Tiered? Manually running compaction on LCS
doesn't do anything until C* 2.2 (
https://issues.apache.org/jira/browse/CASSANDRA-7272)
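
For reference, a quick way to check which compaction strategy a table is actually
using (the keyspace and table names are placeholders):

    cqlsh -e "DESCRIBE TABLE my_ks.my_cf" | grep -i compaction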

Thanks,
Daniel

On Wed, Aug 19, 2015 at 6:56 PM, Analia Lorenzatto 
analialorenza...@gmail.com wrote:

 Hello Michael,

 Thanks for responding!

 I do not have snapshots on any node of the cluster.

 Saludos / Regards.

 Analía Lorenzatto.

 Hapiness is not something really made. It comes from your own actions by
 Dalai Lama


 On 19 Aug 2015 6:19 pm, Laing, Michael michael.la...@nytimes.com
 wrote:

 Possibly you have snapshots? If so, use nodetool to clear them.

 On Wed, Aug 19, 2015 at 4:54 PM, Analia Lorenzatto 
 analialorenza...@gmail.com wrote:

 Hello guys,

 I have a cassandra cluster 2.1 comprised of 4 nodes.

 I removed a lot of data in a Column Family, then I ran manually a
 compaction on this Column family on every node.   After doing that, If I
 query that data, cassandra correctly says this data is not there.  But the
 space on disk is exactly the same before removing that data.

 Also, I realized that  gc_grace_seconds = 0.  Some people on the
 internet say that it could produce zombie data, what do you think?

 I do not have a TTL defined on the Column family, and I do not have the
 possibility to create it.   So my questions is, given that I do not have a
 TTL defined is data going to be removed?  or the deleted data is never
 actually going to be deleted due to I do not have a TTL?


 Thanks in advance!

 --
 Saludos / Regards.

 Analía Lorenzatto.

 “It's possible to commit no errors and still lose. That is not
 weakness.  That is life.  By Captain Jean-Luc Picard.





Re: Question about how to remove data

2015-08-20 Thread Analia Lorenzatto
Hello,

Daniel, I am using Size Tiered compaction.

My concern is that I do not have a TTL defined on the column family, and
I do not have the possibility to create one. Perhaps the deleted data
is never actually going to be removed?

Thanks a lot!


On Thu, Aug 20, 2015 at 4:24 AM, Daniel Chia danc...@coursera.org wrote:

 Is this a LCS family, or Size Tiered? Manually running compaction on LCS
 doesn't do anything until C* 2.2 (
 https://issues.apache.org/jira/browse/CASSANDRA-7272)

 Thanks,
 Daniel

 On Wed, Aug 19, 2015 at 6:56 PM, Analia Lorenzatto 
 analialorenza...@gmail.com wrote:

 Hello Michael,

 Thanks for responding!

 I do not have snapshots on any node of the cluster.

 Saludos / Regards.

 Analía Lorenzatto.

 Hapiness is not something really made. It comes from your own actions
 by Dalai Lama


 On 19 Aug 2015 6:19 pm, Laing, Michael michael.la...@nytimes.com
 wrote:

 Possibly you have snapshots? If so, use nodetool to clear them.

 On Wed, Aug 19, 2015 at 4:54 PM, Analia Lorenzatto 
 analialorenza...@gmail.com wrote:

 Hello guys,

 I have a cassandra cluster 2.1 comprised of 4 nodes.

 I removed a lot of data in a Column Family, then I ran manually a
 compaction on this Column family on every node.   After doing that, If I
 query that data, cassandra correctly says this data is not there.  But the
 space on disk is exactly the same before removing that data.

 Also, I realized that  gc_grace_seconds = 0.  Some people on the
 internet say that it could produce zombie data, what do you think?

 I do not have a TTL defined on the Column family, and I do not have the
 possibility to create it.   So my questions is, given that I do not have a
 TTL defined is data going to be removed?  or the deleted data is never
 actually going to be deleted due to I do not have a TTL?


 Thanks in advance!

 --
 Saludos / Regards.

 Analía Lorenzatto.

 “It's possible to commit no errors and still lose. That is not
 weakness.  That is life.  By Captain Jean-Luc Picard.






-- 
Saludos / Regards.

Analía Lorenzatto.

“It's possible to commit no errors and still lose. That is not weakness.
That is life.  By Captain Jean-Luc Picard.


Re: Question about how to remove data

2015-08-20 Thread Daniel Chia
The TTL shouldn't matter if you deleted the data, since to my understanding
the delete should shadow the data signaling to C* that the data is a
candidate for removal on compaction.

Others might know better, but it could very well be the fact that
gc_grace_seconds is 0 that is causing your problems. Others might have
other suggestions, but you could potentially use sstable2json to see the
raw contents of the sstable on disk and see why data is still there.

Thanks,
Daniel
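
For reference, a rough sketch of that inspection, assuming a default data directory
layout (the keyspace, table and file names below are placeholders):

    # dump one SSTable of the column family and look for deletion markers and
    # tombstone timestamps in the JSON output
    sstable2json /var/lib/cassandra/data/my_ks/my_cf-<id>/my_ks-my_cf-ka-1-Data.db | less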

On Thu, Aug 20, 2015 at 12:55 PM, Analia Lorenzatto 
analialorenza...@gmail.com wrote:

 Hello,

 Daniel, I am using Size Tiered compaction.

 My concern is that as I do not have a TTL defined on the Column family,
 and I do not have the possibility to create it.   Perhaps, the deleted
 data is never actually going to be removed?

 Thanks a lot!


 On Thu, Aug 20, 2015 at 4:24 AM, Daniel Chia danc...@coursera.org wrote:

 Is this a LCS family, or Size Tiered? Manually running compaction on LCS
 doesn't do anything until C* 2.2 (
 https://issues.apache.org/jira/browse/CASSANDRA-7272)

 Thanks,
 Daniel

 On Wed, Aug 19, 2015 at 6:56 PM, Analia Lorenzatto 
 analialorenza...@gmail.com wrote:

 Hello Michael,

 Thanks for responding!

 I do not have snapshots on any node of the cluster.

 Saludos / Regards.

 Analía Lorenzatto.

 Hapiness is not something really made. It comes from your own actions
 by Dalai Lama


 On 19 Aug 2015 6:19 pm, Laing, Michael michael.la...@nytimes.com
 wrote:

 Possibly you have snapshots? If so, use nodetool to clear them.

 On Wed, Aug 19, 2015 at 4:54 PM, Analia Lorenzatto 
 analialorenza...@gmail.com wrote:

 Hello guys,

 I have a cassandra cluster 2.1 comprised of 4 nodes.

 I removed a lot of data in a Column Family, then I ran manually a
 compaction on this Column family on every node.   After doing that, If I
 query that data, cassandra correctly says this data is not there.  But the
 space on disk is exactly the same before removing that data.

 Also, I realized that  gc_grace_seconds = 0.  Some people on the
 internet say that it could produce zombie data, what do you think?

 I do not have a TTL defined on the Column family, and I do not have
 the possibility to create it.   So my questions is, given that I do not
 have a TTL defined is data going to be removed?  or the deleted data is
 never actually going to be deleted due to I do not have a TTL?


 Thanks in advance!

 --
 Saludos / Regards.

 Analía Lorenzatto.

 “It's possible to commit no errors and still lose. That is not
 weakness.  That is life.  By Captain Jean-Luc Picard.






 --
 Saludos / Regards.

 Analía Lorenzatto.

 “It's possible to commit no errors and still lose. That is not weakness.
 That is life.  By Captain Jean-Luc Picard.



Question about how to remove data

2015-08-19 Thread Analia Lorenzatto
Hello guys,

I have a Cassandra 2.1 cluster comprised of 4 nodes.

I removed a lot of data in a column family, then I manually ran a
compaction on this column family on every node. After doing that, if I
query that data, Cassandra correctly says the data is not there. But the
space on disk is exactly the same as before removing that data.

Also, I realized that gc_grace_seconds = 0. Some people on the internet
say that it could produce zombie data, what do you think?

I do not have a TTL defined on the column family, and I do not have the
possibility to create one. So my question is: given that I do not have a
TTL defined, is the data going to be removed? Or is the deleted data never
actually going to be deleted because I do not have a TTL?


Thanks in advance!

-- 
Saludos / Regards.

Analía Lorenzatto.

“It's possible to commit no errors and still lose. That is not weakness.
That is life.  By Captain Jean-Luc Picard.


Re: Question about how to remove data

2015-08-19 Thread Laing, Michael
Possibly you have snapshots? If so, use nodetool to clear them.
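
For reference, a minimal sketch of checking for and clearing snapshots, assuming a
nodetool recent enough to have listsnapshots and the default data path:

    nodetool listsnapshots                                     # what snapshots exist
    du -sh /var/lib/cassandra/data/*/*/snapshots 2>/dev/null   # disk they occupy
    nodetool clearsnapshot                                     # remove all snapshots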

On Wed, Aug 19, 2015 at 4:54 PM, Analia Lorenzatto 
analialorenza...@gmail.com wrote:

 Hello guys,

 I have a cassandra cluster 2.1 comprised of 4 nodes.

 I removed a lot of data in a Column Family, then I manually ran a
 compaction on this Column Family on every node. After doing that, if I
 query that data, cassandra correctly says the data is not there.  But the
 space on disk is exactly the same as before removing that data.

 Also, I realized that gc_grace_seconds = 0.  Some people on the internet
 say that it could produce zombie data; what do you think?

 I do not have a TTL defined on the Column Family, and I do not have the
 possibility to create one.  So my question is: given that I do not have a
 TTL defined, is the data going to be removed, or is the deleted data never
 actually going to be deleted because I do not have a TTL?


 Thanks in advance!

 --
 Saludos / Regards.

 Analía Lorenzatto.

 “It's possible to commit no errors and still lose. That is not weakness.
 That is life.” - Captain Jean-Luc Picard.



Re: Question about how to remove data

2015-08-19 Thread Analia Lorenzatto
Hello Michael,

Thanks for responding!

I do not have snapshots on any node of the cluster.

Saludos / Regards.

Analía Lorenzatto.

Happiness is not something ready made. It comes from your own actions by
Dalai Lama


On 19 Aug 2015 6:19 pm, Laing, Michael michael.la...@nytimes.com wrote:

 Possibly you have snapshots? If so, use nodetool to clear them.

 On Wed, Aug 19, 2015 at 4:54 PM, Analia Lorenzatto 
 analialorenza...@gmail.com wrote:

 Hello guys,

 I have a cassandra cluster 2.1 comprised of 4 nodes.

 I removed a lot of data in a Column Family, then I manually ran a
 compaction on this Column Family on every node. After doing that, if I
 query that data, cassandra correctly says the data is not there.  But the
 space on disk is exactly the same as before removing that data.

 Also, I realized that gc_grace_seconds = 0.  Some people on the internet
 say that it could produce zombie data; what do you think?

 I do not have a TTL defined on the Column Family, and I do not have the
 possibility to create one.  So my question is: given that I do not have a
 TTL defined, is the data going to be removed, or is the deleted data never
 actually going to be deleted because I do not have a TTL?


 Thanks in advance!

 --
 Saludos / Regards.

 Analía Lorenzatto.

 “It's possible to commit no errors and still lose. That is not weakness.
 That is life.” - Captain Jean-Luc Picard.





Re: Is there a way to remove a node with Opscenter?

2015-07-28 Thread Sid Tantia
I know this is an old thread but just FYI for others having the same
problem (OpsCenter trying to connect to node that is already removed)...the
solution is to ssh into the OpsCenter node and run `sudo service opscenterd
restart`

On Thu, Jul 9, 2015 at 3:52 PM, Sid Tantia sid.tan...@baseboxsoftware.com
wrote:

 Found my mistake: I was typing the command on the node I was trying to
 remove from the cluster. After trying the command on another node in the
 cluster, it worked (`nodetool status` shows the node as removed), however
 OpsCenter still does not recognize the node as removed.

 Any ways to fix OpsCenter so that it stops trying to connect to the node
 that is already removed?



 On Tue, Jul 7, 2015 at 11:38 PM, Jean Tremblay 
 jean.tremb...@zen-innovations.com wrote:

 When you do a nodetool command and you don’t specify a hostname, it sends
 the requests via JMX to the localhost node. If that node is down then the
 command will not succeed.
 In your case you are probably running the command from a machine which does
 not have cassandra running; in that case you need to specify a node with the
 switch -h.

 So for you that would be:

 nodetool -h a-node-ip-address removenode Host ID

 where a-node-ip-address is the address of a server which has cassandra
 daemon running.

 Cheers

 Jean

  On 08 Jul 2015, at 01:39 , Sid Tantia sid.tan...@baseboxsoftware.com
 wrote:

 I tried both `nodetool remove node Host ID` and `nodetool decommission`
 and they both give the error:

  nodetool: Failed to connect to '127.0.0.1:7199' - ConnectException:
 'Connection refused’.

 Here is what I have tried to fix this:

 1) Uncommented JVM_OPTS=”$JVM_OPTS -Djava.rmi.server.hostname=public
 name”
 2) Changed rpc_address to 0.0.0.0
 3) Restarted cassandra
 4) Restarted datastax-agent

 (Note that I installed my cluster using opscenter so that may have
 something to do with it? )



 On Tue, Jul 7, 2015 at 2:08 PM, Surbhi Gupta surbhi.gupt...@gmail.com
 wrote:

  If node is down use :

 nodetool removenode Host ID

 We have to run this command when the node is down. If the cluster
 does not use vnodes, adjust the tokens before running the nodetool
 removenode command.

 If the node is up, then the command would be “nodetool decommission” to
 remove the node.

 Remove the node from the “seed list” within the configuration
 cassandra.yaml.

 On 7 July 2015 at 12:56, Sid Tantia sid.tan...@baseboxsoftware.com
 wrote:

 Thanks for the response. I’m trying to remove a node that’s already
 down for some reason so it's not allowing me to decommission it, is there
 some other way to do this?



 On Tue, Jul 7, 2015 at 12:45 PM, Kiran mk coolkiran2...@gmail.com
 wrote:

  Yes, if your intention is to decommission a node.  You can do that
 by clicking on the node and decommission.

 Best Regards,
 Kiran.M.K.
 On Jul 8, 2015 1:00 AM, Sid Tantia sid.tan...@baseboxsoftware.com
 wrote:

   I know you can use `nodetool removenode` from the command line but
 is there a way to remove a node from a cluster using OpsCenter?









Re: Is there a way to remove a node with Opscenter?

2015-07-09 Thread Sid Tantia
Found my mistake: I was typing the command on the node I was trying to remove 
from the cluster. After trying the command on another node in the cluster, it 
worked (`nodetool status` shows the node as removed), however OpsCenter still 
does not recognize the node as removed.


Any ways to fix OpsCenter so that it stops trying to connect to the node that 
is already removed?

On Tue, Jul 7, 2015 at 11:38 PM, Jean Tremblay
jean.tremb...@zen-innovations.com wrote:

 When you do a nodetool command and you don’t specify a hostname, it sends the 
 requests via JMX to the localhost node. If that node is down then the command 
 will not succeed.
 In your case you are probably running the command from a machine which does not
 have cassandra running; in that case you need to specify a node with the switch -h.
 So for you that would be:
 nodetool -h a-node-ip-address removenode Host ID
 where a-node-ip-address is the address of a server which has cassandra 
 daemon running.
 Cheers
 Jean
 On 08 Jul 2015, at 01:39 , Sid Tantia 
 sid.tan...@baseboxsoftware.com wrote:
 I tried both `nodetool remove node Host ID` and `nodetool decommission` and 
 they both give the error:
 nodetool: Failed to connect to '127.0.0.1:7199' - ConnectException: 
 'Connection refused’.
 Here is what I have tried to fix this:
 1) Uncommented JVM_OPTS=”$JVM_OPTS -Djava.rmi.server.hostname=public name”
 2) Changed rpc_address to 0.0.0.0
 3) Restarted cassandra
 4) Restarted datastax-agent
 (Note that I installed my cluster using opscenter so that may have something 
 to do with it? )
 On Tue, Jul 7, 2015 at 2:08 PM, Surbhi Gupta 
 surbhi.gupt...@gmail.com wrote:
 If node is down use :
 nodetool removenode Host ID
 We have to run this command when the node is down. If the cluster does
 not use vnodes, adjust the tokens before running the nodetool removenode
 command.
 If the node is up, then the command would be “nodetool decommission” to 
 remove the node.
 Remove the node from the “seed list” within the configuration cassandra.yaml.
 On 7 July 2015 at 12:56, Sid Tantia 
 sid.tan...@baseboxsoftware.com wrote:
 Thanks for the response. I’m trying to remove a node that’s already down for 
 some reason so it's not allowing me to decommission it, is there some other
 way to do this?
 On Tue, Jul 7, 2015 at 12:45 PM, Kiran mk 
 coolkiran2...@gmail.com wrote:
 Yes, if your intention is to decommission a node.  You can do that by
 clicking on the node and decommission.
 Best Regards,
 Kiran.M.K.
 On Jul 8, 2015 1:00 AM, Sid Tantia 
 sid.tan...@baseboxsoftware.com wrote:
 I know you can use `nodetool removenode` from the command line but is there a 
 way to remove a node from a cluster using OpsCenter?

Re: Is there a way to remove a node with Opscenter?

2015-07-08 Thread Jean Tremblay
When you do a nodetool command and you don’t specify a hostname, it sends the 
requests via JMX to the localhost node. If that node is down then the command 
will not succeed.
In your case you are probably running the command from a machine which does not
have cassandra running; in that case you need to specify a node with the switch -h.

So for you that would be:

nodetool -h a-node-ip-address removenode Host ID

where a-node-ip-address is the address of a server which has cassandra daemon 
running.

Cheers

Jean

On 08 Jul 2015, at 01:39 , Sid Tantia 
sid.tan...@baseboxsoftware.com wrote:

I tried both `nodetool remove node Host ID` and `nodetool decommission` and 
they both give the error:

nodetool: Failed to connect to '127.0.0.1:7199' - ConnectException: 'Connection 
refused’.

Here is what I have tried to fix this:

1) Uncommented JVM_OPTS=”$JVM_OPTS -Djava.rmi.server.hostname=public name”
2) Changed rpc_address to 0.0.0.0
3) Restarted cassandra
4) Restarted datastax-agent

(Note that I installed my cluster using opscenter so that may have something to 
do with it? )




On Tue, Jul 7, 2015 at 2:08 PM, Surbhi Gupta 
surbhi.gupt...@gmail.com wrote:

If node is down use :

nodetool removenode Host ID

We have to run this command when the node is down. If the cluster does
not use vnodes, adjust the tokens before running the nodetool removenode
command.

If the node is up, then the command would be “nodetool decommission” to remove 
the node.


Remove the node from the “seed list” within the configuration cassandra.yaml.

On 7 July 2015 at 12:56, Sid Tantia 
sid.tan...@baseboxsoftware.com wrote:
Thanks for the response. I’m trying to remove a node that’s already down for 
some reason so it's not allowing me to decommission it, is there some other way
to do this?




On Tue, Jul 7, 2015 at 12:45 PM, Kiran mk 
coolkiran2...@gmail.com wrote:

Yes, if your intention is to decommission a node.  You can do that by clicking
on the node and decommission.

Best Regards,
Kiran.M.K.

On Jul 8, 2015 1:00 AM, Sid Tantia 
sid.tan...@baseboxsoftware.com wrote:
I know you can use `nodetool removenode` from the command line but is there a 
way to remove a node from a cluster using OpsCenter?







Re: Is there a way to remove a node with Opscenter?

2015-07-07 Thread Sid Tantia
Thanks for the response. I’m trying to remove a node that’s already down for 
some reason so it's not allowing me to decommission it, is there some other way
to do this?

On Tue, Jul 7, 2015 at 12:45 PM, Kiran mk coolkiran2...@gmail.com wrote:

 Yes, if your intention is to decommission a node.  You can do that by
 clicking on the node and decommission.
 Best Regards,
 Kiran.M.K.
 On Jul 8, 2015 1:00 AM, Sid Tantia sid.tan...@baseboxsoftware.com wrote:
  I know you can use `nodetool removenode` from the command line but is
 there a way to remove a node from a cluster using OpsCenter?



Is there a way to remove a node with Opscenter?

2015-07-07 Thread Sid Tantia
I know you can use `nodetool removenode` from the command line but is there a 
way to remove a node from a cluster using OpsCenter?

Re: Is there a way to remove a node with Opscenter?

2015-07-07 Thread Kiran mk
Yes, if your intention is to decommission a node.  You can do that by
clicking on the node and decommission.

Best Regards,
Kiran.M.K.
On Jul 8, 2015 1:00 AM, Sid Tantia sid.tan...@baseboxsoftware.com wrote:

  I know you can use `nodetool removenode` from the command line but is
 there a way to remove a node from a cluster using OpsCenter?




Re: Is there a way to remove a node with Opscenter?

2015-07-07 Thread Surbhi Gupta
If the node is down, use:

nodetool removenode Host ID

We have to run this command when the node is down. If the cluster
does not use vnodes, adjust the tokens before running the nodetool
removenode command.

If the node is up, then the command would be “nodetool decommission” to
remove the node.

Remove the node from the “seed list” within the configuration file
cassandra.yaml.
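
Roughly, the sequence for a dead node looks like this (the Host ID below is a
made-up placeholder; read the real one from nodetool status):

    # 1. Find the Host ID of the down (DN) node:
    nodetool status

    # 2. From any live node, remove it by Host ID:
    nodetool removenode 11111111-2222-3333-4444-555555555555

    # 3. If the node were still up, you would instead run, on that node itself:
    #    nodetool decommission

    # 4. Remove the node's address from the "seeds" entry in cassandra.yaml on
    #    the remaining nodes.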

On 7 July 2015 at 12:56, Sid Tantia sid.tan...@baseboxsoftware.com wrote:

 Thanks for the response. I’m trying to remove a node that’s already down
 for some reason so it's not allowing me to decommission it, is there some
 other way to do this?



 On Tue, Jul 7, 2015 at 12:45 PM, Kiran mk coolkiran2...@gmail.com wrote:

 Yes, if your intention is to decommission a node.  You can do that by
 clicking on the node and decommission.

 Best Regards,
 Kiran.M.K.
 On Jul 8, 2015 1:00 AM, Sid Tantia sid.tan...@baseboxsoftware.com
 wrote:

  I know you can use `nodetool removenode` from the command line but is
 there a way to remove a node from a cluster using OpsCenter?





Re: Is there a way to remove a node with Opscenter?

2015-07-07 Thread Sid Tantia
I tried both `nodetool remove node Host ID` and `nodetool decommission` and 
they both give the error:




nodetool: Failed to connect to '127.0.0.1:7199' - ConnectException: 'Connection 
refused’.




Here is what I have tried to fix this:




1) Uncommented JVM_OPTS=”$JVM_OPTS -Djava.rmi.server.hostname=public name”

2) Changed rpc_address to 0.0.0.0

3) Restarted cassandra

4) Restarted datastax-agent




(Note that I installed my cluster using opscenter so that may have something to 
do with it? )

On Tue, Jul 7, 2015 at 2:08 PM, Surbhi Gupta surbhi.gupt...@gmail.com
wrote:

 If node is down use :
 nodetool removenode Host ID
 We have to run this command when the node is down. If the cluster
 does not use vnodes, adjust the tokens before running the nodetool
 removenode command.
 If the node is up, then the command would be “nodetool decommission” to
 remove the node.
 Remove the node from the “seed list” within the configuration
 cassandra.yaml.
 On 7 July 2015 at 12:56, Sid Tantia sid.tan...@baseboxsoftware.com wrote:
 Thanks for the response. I’m trying to remove a node that’s already down
 for some reason so it's not allowing me to decommission it, is there some
 other way to do this?



 On Tue, Jul 7, 2015 at 12:45 PM, Kiran mk coolkiran2...@gmail.com wrote:

 Yes, if your intention is to decommission a node.  You can do that by
 clicking on the node and decommission.

 Best Regards,
 Kiran.M.K.
 On Jul 8, 2015 1:00 AM, Sid Tantia sid.tan...@baseboxsoftware.com
 wrote:

  I know you can use `nodetool removenode` from the command line but is
 there a way to remove a node from a cluster using OpsCenter?




Re: Is there a way to remove a node with Opscenter?

2015-07-07 Thread Robert Coli
On Tue, Jul 7, 2015 at 4:39 PM, Sid Tantia sid.tan...@baseboxsoftware.com
wrote:

 I tried both `nodetool remove node Host ID` and `nodetool decommission`
 and they both give the error:

  nodetool: Failed to connect to '127.0.0.1:7199' - ConnectException:
 'Connection refused’.

 Here is what I have tried to fix this:


Instead of that stuff, why not :

1) use lsof to determine what IP is being listened to on 7199 by the
running process?
2) connect to that IP

?

=Rob
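
Something along these lines, assuming lsof is available (the 10.0.0.12 address is
just an example of what it might report):

    # See which address the Cassandra process has JMX (port 7199) bound to:
    sudo lsof -nP -iTCP:7199 -sTCP:LISTEN

    # Then aim nodetool at that address rather than 127.0.0.1:
    nodetool -h 10.0.0.12 status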


Re: Is there a way to remove a node with Opscenter?

2015-07-07 Thread Michael Shuler

On 07/07/2015 07:27 PM, Robert Coli wrote:

On Tue, Jul 7, 2015 at 4:39 PM, Sid Tantia
sid.tan...@baseboxsoftware.com
wrote:

I tried both `nodetool remove node Host ID` and `nodetool
decommission` and they both give the error:

nodetool: Failed to connect to '127.0.0.1:7199' - ConnectException: 'Connection refused’.

Here is what I have tried to fix this:


Instead of that stuff, why not :

1) use lsof to determine what IP is being listened to on 7199 by the
running process?
2) connect to that IP


OP said node was already down/dead:

Don't forget the hierarchy of node removal in #cassandra: decommission, 
removenode, removenode force, assassinate.  Escalate in that order.


https://twitter.com/faltering/status/559845791741657088

:)
Michael
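
Spelled out as commands, that escalation looks roughly like this (the Host ID and IP
are placeholders; nodetool assassinate only exists in newer releases, older ones do
the same thing through the Gossiper JMX bean):

    # 1. Node still up: run on the node being retired.
    nodetool decommission

    # 2. Node already dead: run from a live node with the dead node's Host ID.
    nodetool removenode 11111111-2222-3333-4444-555555555555

    # 3. removenode stuck on unreachable replicas:
    nodetool removenode force

    # 4. Last resort, forcibly evict the endpoint from gossip (newer versions only):
    nodetool assassinate 10.0.0.99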



abnormal log after remove a node

2015-06-19 Thread 曹志富
I have a C* 2.1.5 with 24 nodes.A few days ago ,I have remove a node from
this cluster using nodetool decommission.

But tody I find some log like this:

INFO  [GossipStage:1] 2015-06-19 17:38:05,616 Gossiper.java:968 -
InetAddress /172.19.105.41 is now DOWN
INFO  [GossipStage:1] 2015-06-19 17:38:05,617 StorageService.java:1885 -
Removing tokens [-1014432261309809702, -1055322450438958612,
-1120728727235087395, -1191392141261832305, -1203676771883970142,
-1215563040745505837, -1215648909329054362, -1269531760567530381,
-1278047879489577908, -1313427877031136549, -1342822572958042617,
-1350792764922315814, -1383390744017639599, -139000372807970456,
-140827955201469664, -1631551789771606023, -1633789813430312609,
-1795528665156349205, -1836619444785023397, -1879127294549041822,
-1962337787208890426, -2022309807234530256, -2033402140526360327,
-2089413865145942100, -210961549458416802, -2148530352195763113,
-2184481573787758786, -610790268720205, -2340762266634834427,
-2513416003567685694, -2520971378752190013, -2596695976621541808,
-2620636796023437199, -2640378596436678113, -2679143017361311011,
-2721176590519112233, -2749213392354746126, -279267896827516626,
-2872377759991294853, -2904711688111888325, -290489381926812623,
-3000574339499272616, -301428600802598523, -3019280155316984595,
-3024451041907074275, -3056898917375012425, -3161300347260716852,
-3166392383659271772, -3327634380871627036, -3530685865340274372,
-3563112657791369745, -366930313427781469, -3729582520450700795,
-3901838244986519991, -4065326606010524312, -4174346928341550117,
-4184239233207315432, -4204369933734181327, -4206479093137814808,
-421410317165821100, -4311166118017934135, -4407123461118340117,
-4466364858622123151, -4466939645485100087, -448955147512581975,
-4587780638857304626, -4649897584350376674, -4674234125365755024
, -4833801201210885896, -4857586579802212277, -4868896650650107463,
-4980063310159547694, -4983471821416248610, -4992846054037653676,
-5026994389965137674, -514302500353679181
0, -5198414516309928594, -5245363745777287346, -5346838390293957674,
-5374413419545696184, -5427881744040857637, -5453876964430787287,
-5491923669475601173, -55219734138599212
6, -5523011502670737422, -5537121117160410549, -5557015938925208697,
-5572489682738121748, -5745899409803353484, -5771239101488682535,
-5893479791287484099, -59766730414807540
44, -6014643892406938367, -6086002438656595783, -6129360679394503700,
-6224240257573911174, -6290393495130499466, -6378712056928268929,
-6430306056990093461, -6800188263839065
013, -6912720411187525051, -7160327814305587432, -7175004328733776324,
-7272070430660252577, -7307945744786025148, -742448651973108101,
-7539255117639002578, -7657460716997978
94, -7846698077070579798, -7870621904906244395, -7900841391761900719,
-7918145426423910061, -7936795453892692473, -8070255024778921411,
-8086888710627677669, -8124855925323654
631, -8175270408138820500, -8271197636596881168, -8336685710406477123,
-8466220397076441627, -8534337908154758270, -8550484400487603561,
-862246738021989870, -8727219287242892
185, -8895705475282612927, -8921801772904834063, -9057266752652143883,
-9059183540698454288, -9067986437682229598, -9148183367896132028,
-962208188860606543, 10859447725819218
30, 1189775396643491793, 1253728955879686947, 1389982523380382228,
1429632314664544045, 143610053770130548, 150118120072602242,
1575692041584712198, 1624575905722628764, 17894
76212785155173, 1995296121962835019, 2041217364870030239,
2120277336231792146, 2124445736743406711, 2154979704292433983,
2340726755918680765, 23481654796845972, 23620268084352
24407, 2366144489007464626, 2381492708106933027, 2398868971489617398,
2427315953339163528, 2433999003913998534, 2633074510238705620,
266659839023809792, 2677817641360639089, 2
719725410894526151, 2751925111749406683, 2815703589803785617,
3041515796379693113, 3044903149214270978, 3094954503756703989,
3243933267690865263, 3246086646486800371, 33270068
97333869434, 3393657685587750192, 3395065499228709345, 3426126123948029459,
3500469615600510698, 3644011364716880512, 3693249207133187620,
3776164494954636918, 38780676797
8035, 3872151295451662867, 3937077827707223414, 4041082935346014761,
4060208918173638435, 4086747843759164940, 4165638694482690057,
4203996339238989224, 4220155275330961826, 4
366784953339236686, 4390116924352514616, 4391225331964772681,
4392419346255765958, 4448400054980766409, 4463335839328115373,
4547306976104362915, 4588174843388248100, 48438580
67983993745, 4912719175808770608, 499628843707992459, 5004392861473086088,
5021047773702107258, 510226752691159107, 5109551630357971118,
5157669927051121583, 51627694176199618
24, 5238710860488961530, 5245958115092331518, 5302459768185143407,
5373077323749320571, 5445982956737768774, 5526076427753104565,
5531878975169972758, 5590672474842108747, 561
8238086143944892, 5645763748154253201, 5648082473497629258,
5799608283794045232, 5968931466409317704, 6080339666926312644,
6222992739052178144, 6329332485451402638

Re: Howto remove currently assigned data directory from 2.0.12 nodes

2015-03-05 Thread Robert Coli
On Wed, Mar 4, 2015 at 6:12 PM, Steffen Winther cassandra.u...@siimnet.dk
wrote:

 Robert Coli rcoli at eventbrite.com writes:

 
  1) stop node
  2) move sstables from no-longer-data-directories into
 still-data-directories

 Okay, just into any other random data dir?
 Few files here and there to spread amount of data between still-data-dirs?


Yep, that's more or less what JBOD allocation tries to do, heh.


 What about rebalancing amount of data among nodes or doesn't this matter?


Your node isn't losing any data, it's just storing the same data in fewer
directories. I'm not sure what rebalancing among nodes could be required in
this case?

=Rob


Re: Howto remove currently assigned data directory from 2.0.12 nodes

2015-03-04 Thread Robert Coli
On Wed, Mar 4, 2015 at 3:28 PM, Steffen Winther cassandra.u...@siimnet.dk
wrote:

 Howto remove already assigned
 data file directories from running nodes?


1) stop node
2) move sstables from no-longer-data-directories into still-data-directories
3) modify conf file
4) start node

I wonder how pending compactions are handled in this case; potential edge
case problem. Try it and see.

=Rob
http://twitter.com/rcolidba
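
As a shell sketch of those steps, assuming /data3 and /data4 are the directories
being retired and their contents are merged into /data1 (paths and service commands
are illustrative):

    # 1. Stop the node.
    sudo service cassandra stop

    # 2. Merge the sstables into a remaining data directory, preserving the
    #    keyspace/table subdirectory layout; delete the retired copies once verified.
    sudo rsync -a /data3/ /data4/ /data1/

    # 3. Edit cassandra.yaml so data_file_directories lists only the kept
    #    directories, e.g. [ /data1, /data2 ].

    # 4. Start the node and check it comes back healthy.
    sudo service cassandra start
    nodetool status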


Re: Howto remove currently assigned data directory from 2.0.12 nodes

2015-03-04 Thread Steffen Winther
Robert Coli rcoli at eventbrite.com writes:

 
 1) stop node
 2) move sstables from no-longer-data-directories into still-data-directories

Okay, just into any other random data dir?
Few files here and there to spread amount of data between still-data-dirs?

 3) modify conf file
 4) start node

 I wonder how pending compactions are handled in this case;
potential edge case problem.
Try it and see.
What about rebalancing amount of data among nodes
or doesn't this matter?



Howto remove currently assigned data directory from 2.0.12 nodes

2015-03-04 Thread Steffen Winther
Hi

I've got a cassandra cluster 2.0.12 with three nodes
whose storage capacity I would like to reduce, as I
would like to reuse some disks for a PoC
cassandra 1.2.15 cluster on the same nodes.

How do I remove already assigned
data file directories from running nodes?

f.ex. got:

data_file_directories : [ /data1, /data2, /data3, /data4]

which I would like to reduce to:

data_file_directories : [ /data1, /data2]

I would still of course have free storage in this cluster :)

TIA





Re: How to remove obsolete error message in Datastax Opscenter?

2015-02-10 Thread Björn Hachmann
Thank you!
I would like to add, that Opscenter is a valuable tool for my work. Thanks
for your work!

Kind regards
Björn

--

Björn Hachmann
metrigo GmbH

NEW ADDRESS:
Lagerstraße 36
20357 Hamburg
p: +49 40 2093108-88
Managing directors: Christian Müller, Tobias Schlottke, Philipp Westermeyer

The company is registered with the registry court of Hamburg,
no. HRB 120447.


2015-02-09 18:22 GMT+01:00 Nick Bailey n...@datastax.com:

 To clarify what Chris said, restarting opscenter will remove the
 notification, but we also have a bug filed to make that behavior a little
 better and allow dismissing that notification without a restart. Thanks for
 reporting the issue!

 -Nick

 On Mon, Feb 9, 2015 at 9:00 AM, Chris Lohfink clohfin...@gmail.com
 wrote:

 Restarting opscenter service will get rid of it.

 Chris

 On Mon, Feb 9, 2015 at 3:01 AM, Björn Hachmann 
 bjoern.hachm...@metrigo.de wrote:

 Good morning,

 unfortunately my last rolling restart of our Cassandra cluster issued
 from OpsCenter (5.0.2) failed. No big deal, but since then OpsCenter is
 showing an error message at the top of its screen:
 Error restarting cluster: Timed out waiting for Cassandra to start..

 Does anybody know how to remove that message permanently?

 Thank you very much in advance!

 Kind regards
 Björn Hachmann






How to remove obsolete error message in Datastax Opscenter?

2015-02-09 Thread Björn Hachmann
Good morning,

unfortunately my last rolling restart of our Cassandra cluster issued from
OpsCenter (5.0.2) failed. No big deal, but since then OpsCenter is showing
an error message at the top of its screen:
Error restarting cluster: Timed out waiting for Cassandra to start..

Does anybody know how to remove that message permanently?

Thank you very much in advance!

Kind regards
Björn Hachmann


Re: How to remove obsolete error message in Datastax Opscenter?

2015-02-09 Thread Nick Bailey
To clarify what Chris said, restarting opscenter will remove the
notification, but we also have a bug filed to make that behavior a little
better and allow dismissing that notification without a restart. Thanks for
reporting the issue!

-Nick

On Mon, Feb 9, 2015 at 9:00 AM, Chris Lohfink clohfin...@gmail.com wrote:

 Restarting opscenter service will get rid of it.

 Chris

 On Mon, Feb 9, 2015 at 3:01 AM, Björn Hachmann bjoern.hachm...@metrigo.de
  wrote:

 Good morning,

 unfortunately my last rolling restart of our Cassandra cluster issued
 from OpsCenter (5.0.2) failed. No big deal, but since then OpsCenter is
 showing an error message at the top of its screen:
 Error restarting cluster: Timed out waiting for Cassandra to start..

 Does anybody know how to remove that message permanently?

 Thank you very much in advance!

 Kind regards
 Björn Hachmann





Re: How to remove obsolete error message in Datastax Opscenter?

2015-02-09 Thread Colin
Stop using opscenter?

:)

Sorry, couldn't resist...

--
Colin Clark 
+1 612 859 6129
Skype colin.p.clark

 On Feb 9, 2015, at 3:01 AM, Björn Hachmann bjoern.hachm...@metrigo.de wrote:
 
 Good morning,
 
 unfortunately my last rolling restart of our Cassandra cluster issued from 
 OpsCenter (5.0.2) failed. No big deal, but since then OpsCenter is showing an 
 error message at the top of its screen:
 Error restarting cluster: Timed out waiting for Cassandra to start..
 
 Does anybody know how to remove that message permanently?
 
 Thank you very much in advance!
 
 Kind regards
 Björn Hachmann


Re: How to remove obsolete error message in Datastax Opscenter?

2015-02-09 Thread Chris Lohfink
Restarting opscenter service will get rid of it.

Chris

On Mon, Feb 9, 2015 at 3:01 AM, Björn Hachmann bjoern.hachm...@metrigo.de
wrote:

 Good morning,

 unfortunately my last rolling restart of our Cassandra cluster issued from
 OpsCenter (5.0.2) failed. No big deal, but since then OpsCenter is showing
 an error message at the top of its screen:
 Error restarting cluster: Timed out waiting for Cassandra to start..

 Does anybody know how to remove that message permanently?

 Thank you very much in advance!

 Kind regards
 Björn Hachmann



Re: nodetool compact cannot remove tombstone in system keyspace

2015-01-13 Thread Xu Zhongxing
Thanks for confirming the tombstones will only get removed during compaction 
if they are older than GC_Grace_Seconds for that CF. I didn't find such a 
clarification in the documentation. That answered my question.


Since the table that has too many tombstones is in the system keyspace, I 
cannot alter its gc_grace_seconds setting. gc_grace_seconds is now 7 days, 
which is certainly longer than the age of the tombstones. 


Is there any way that I can remove the tombstones in the system keyspace 
immediately?

At 2015-01-13 19:49:47, Rahul Neelakantan ra...@rahul.be wrote:

I am not sure about the tombstone_failure_threshold, but the tombstones will 
only get removed during compaction if they are older than GC_Grace_Seconds for 
that CF. How old are these tombstones?

Rahul

On Jan 12, 2015, at 11:27 PM, Xu Zhongxing xu_zhong_x...@163.com wrote:


Hi,


When I connect to C* with driver, I found some warnings in the log (I increased 
tombstone_failure_threshold to 15 to see the warning)


WARN [ReadStage:5] 2015-01-13 12:21:14,595 SliceQueryFilter.java (line 225) 
Read 34188 live and 104186 tombstoned cells in system.schema_columns (see 
tombstone_warn_threshold). 2147483387 columns was requested, slices=[-], 
delInfo={deletedAt=-9223372036854775808, localDeletion=2147483647}
 WARN [ReadStage:5] 2015-01-13 12:21:15,562 SliceQueryFilter.java (line 225) 
Read 34209 live and 104247 tombstoned cells in system.schema_columns (see 
tombstone_warn_threshold). 2147449199 columns was requested, slices=[-], 
delInfo={deletedAt=-9223372036854775808, localDeletion=2147483647}


I run the command:
nodetool compact system 


But the tombstone number does not decrease. I still see the warnings with the 
exact number of tombstones.
Why is this happening? What should I do to remove the tombstones in the system 
keyspace?

Re: nodetool compact cannot remove tombstone in system keyspace

2015-01-13 Thread Rahul Neelakantan
I am not sure about the tombstone_failure_threshold, but the tombstones will 
only get removed during compaction if they are older than GC_Grace_Seconds for 
that CF. How old are these tombstones?

Rahul

 On Jan 12, 2015, at 11:27 PM, Xu Zhongxing xu_zhong_x...@163.com wrote:
 
 Hi,
 
 When I connect to C* with driver, I found some warnings in the log (I 
 increased tombstone_failure_threshold to 15 to see the warning)
 
 WARN [ReadStage:5] 2015-01-13 12:21:14,595 SliceQueryFilter.java (line 225) 
 Read 34188 live and 104186 tombstoned cells in system.schema_columns (see 
 tombstone_warn_threshold). 2147483387 columns was requested, slices=[-], 
 delInfo={deletedAt=-9223372036854775808, localDeletion=2147483647}
  WARN [ReadStage:5] 2015-01-13 12:21:15,562 SliceQueryFilter.java (line 225) 
 Read 34209 live and 104247 tombstoned cells in system.schema_columns (see 
 tombstone_warn_threshold). 2147449199 columns was requested, slices=[-], 
 delInfo={deletedAt=-9223372036854775808, localDeletion=2147483647}
 
 I run the command:
 nodetool compact system 
 
 But the tombstone number does not decrease. I still see the warnings with the 
 exact number of tombstones.
 Why is this happening? What should I do to remove the tombstones in the 
 system keyspace?


nodetool compact cannot remove tombstone in system keyspace

2015-01-12 Thread Xu Zhongxing
Hi,


When I connect to C* with driver, I found some warnings in the log (I increased 
tombstone_failure_threshold to 15 to see the warning)


WARN [ReadStage:5] 2015-01-13 12:21:14,595 SliceQueryFilter.java (line 225) 
Read 34188 live and 104186 tombstoned cells in system.schema_columns (see 
tombstone_warn_threshold). 2147483387 columns was requested, slices=[-], 
delInfo={deletedAt=-9223372036854775808, localDeletion=2147483647}
 WARN [ReadStage:5] 2015-01-13 12:21:15,562 SliceQueryFilter.java (line 225) 
Read 34209 live and 104247 tombstoned cells in system.schema_columns (see 
tombstone_warn_threshold). 2147449199 columns was requested, slices=[-], 
delInfo={deletedAt=-9223372036854775808, localDeletion=2147483647}


I run the command:
nodetool compact system 


But the tombstone number does not decrease. I still see the warnings with the 
exact number of tombstones.
Why is this happening? What should I do to remove the tombstones in the system 
keyspace?

Re: Cassandra add a node and remove a node

2014-12-02 Thread Robert Coli
On Mon, Dec 1, 2014 at 7:10 PM, Neha Trivedi nehajtriv...@gmail.com wrote:

 No, the old node is not defective. We just want to separate out that server
 for testing.
 And add a new node. (The present cluster has two nodes and RF=2.)


If you currently have two nodes and RF=2, you must add the new node before
removing the old node, so that you always have at least two nodes in the
cluster.

=Rob
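
A rough outline of that order of operations (service commands and the wait step are
generic, adjust for your install):

    # On the new node: set cluster_name, seeds and listen_address in
    # cassandra.yaml, then start it and let it bootstrap.
    sudo service cassandra start

    # From an existing node, wait until the new node shows as UN (Up/Normal):
    nodetool status

    # Only once the new node is fully joined, retire the old one by running
    # this on the old node itself:
    nodetool decommission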


Re: Cassandra add a node and remove a node

2014-12-02 Thread Neha Trivedi
Thanks Jens and Robert !!!

On Wed, Dec 3, 2014 at 2:20 AM, Robert Coli rc...@eventbrite.com wrote:

 On Mon, Dec 1, 2014 at 7:10 PM, Neha Trivedi nehajtriv...@gmail.com
 wrote:

 No, the old node is not defective. We just want to separate out that
 server for testing.
 And add a new node. (The present cluster has two nodes and RF=2.)


 If you currently have two nodes and RF=2, you must add the new node before
 removing the old node, so that you always have at least two nodes in the
 cluster.

 =Rob




Re: Cassandra add a node and remove a node

2014-12-01 Thread Robert Coli
On Sun, Nov 30, 2014 at 10:15 PM, Neha Trivedi nehajtriv...@gmail.com
wrote:

 I need to add a new node and remove an existing node.


What is the purpose of this action? Is the old node defective, and being
replaced 1:1 with the new node?

=Rob


Re: Cassandra add a node and remove a node

2014-12-01 Thread Neha Trivedi
No, the old node is not defective. We just want to separate out that server
for testing.
And add a new node. (The present cluster has two nodes and RF=2.)

thanks

On Tue, Dec 2, 2014 at 12:04 AM, Robert Coli rc...@eventbrite.com wrote:

 On Sun, Nov 30, 2014 at 10:15 PM, Neha Trivedi nehajtriv...@gmail.com
 wrote:

 I need to add a new node and remove an existing node.


 What is the purpose of this action? Is the old node defective, and being
 replaced 1:1 with the new node?

 =Rob




  1   2   >