Cassandra Cleanup and disk space

2015-11-26 Thread Luigi Tagliamonte
Hi Everyone,
I'd like to understand what cleanup does on a running cluster when there is
no cluster topology change. I ran a test and saw the cluster's disk usage
shrink by 200GB.
I'm using Cassandra 2.1.9.
-- 
Luigi
---
“The only way to get smarter is by playing a smarter opponent.”


Change the rack of a server

2015-11-26 Thread Badrjan
I have an 8-node cluster and I would like to change the rack of one node. 
How should I do that? 
B.

Re: Cassandra Cleanup and disk space

2015-11-26 Thread Carlos Alonso
Could it be that a SizeTieredCompaction of big SSTables had just finished and
freed some space?

Carlos Alonso | Software Engineer | @calonso 



RE: list data value multiplied x2 in multi-datacenter environment

2015-11-26 Thread Ngoc Minh VO
Thanks Duy Hai for these details.

Do you know whether the problems have been fixed or are planned to be fixed? We are 
using C* 2.0.14.

I didn't find any JIRA ticket concerning the issue.
Regards,



From: DuyHai Doan
Sent: Wednesday, November 25, 2015 9:39:40 PM
To: user@cassandra.apache.org
Subject: Re: list data value multiplied x2 in multi-datacenter environment

There were several bugs in the past related to lists in CQL.

Indeed, the timestamps used for list columns are computed server side using a 
special algorithm. I wonder whether, in the case of read repair and/or hinted 
handoff, the original timestamp (the one generated by the coordinator at the 
first insert/update) is used, or whether the server generates another one using 
its algorithm. That might explain the behavior.



On Wed, Nov 25, 2015 at 9:36 PM, Ngoc Minh VO wrote:
Our insert/select queries use CL = QUORUM.

We don’t use BatchStatement to import data but executeAsync(Statement) with a 
fixed-size queue.

Regards,

From: Jack Krupansky [mailto:jack.krupan...@gmail.com]
Sent: Wednesday, November 25, 2015 18:09
To: user@cassandra.apache.org
Subject: Re: list data value multiplied x2 in multi-datacenter environment

Be sure to include your actual insert statement. Also, what consistency level 
was used for the insert (all, quorum, local quorum, one, or...)?


-- Jack Krupansky

On Wed, Nov 25, 2015 at 11:43 AM, Ngoc Minh VO wrote:
No. We do not use update.
All inserts are idempotent and there is no read-before-write query.

On the corrupted data row, we have verified that the data was only written once.

Thanks for your answer!

From: Laing, Michael [mailto:michael.la...@nytimes.com]
Sent: Wednesday, November 25, 2015 15:39
To: user@cassandra.apache.org
Subject: Re: list data value multiplied x2 in multi-datacenter environment

You don't have any statements anywhere in your application such as the following, do you?

UPDATE data SET field5 = field5 + [ 1,2,3 ] WHERE field1=...;

Just a quick idempotency check :)
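
To illustrate why this check matters: an append such as `field5 = field5 + [1, 2, 3]` is not idempotent, so a retry or replay applies it twice, whereas a full overwrite can be repeated safely. Below is a minimal Python sketch that uses a plain dict as a stand-in for a row. No Cassandra driver is involved, the helper names `overwrite` and `append` are hypothetical, and it only models the application-level cause, not any server-side timestamp behavior.

```python
def overwrite(row, column, values):
    """Model of INSERT / UPDATE ... SET col = [...]: replaces the whole list."""
    row[column] = list(values)


def append(row, column, values):
    """Model of UPDATE ... SET col = col + [...]: adds to whatever is stored."""
    row[column] = row.get(column, []) + list(values)


# Apply each write twice, as a client retry after a timeout would.
row_a = {}
overwrite(row_a, "field5", [1, 2, 3])
overwrite(row_a, "field5", [1, 2, 3])

row_b = {}
append(row_b, "field5", [1, 2, 3])
append(row_b, "field5", [1, 2, 3])

print(row_a["field5"])  # [1, 2, 3]
print(row_b["field5"])  # [1, 2, 3, 1, 2, 3]
```

The doubled list in the second case has the same shape as the symptom reported in this thread, which is why ruling out append-style updates is a sensible first step.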

On Wed, Nov 25, 2015 at 9:16 AM, Jack Krupansky wrote:
Is the data corrupted exactly the same way on all three nodes and in both data 
centers, or just on one or two nodes or in only one data center?

Are both columns doubled in the same row, or only one of them in a particular 
row?

It does sound like a bug, though, worthy of a Jira ticket.

-- Jack Krupansky

On Wed, Nov 25, 2015 at 4:05 AM, Ngoc Minh VO wrote:
Hello all,

We have encountered an issue in our Production environment that cannot be 
reproduced in the Test environment: a list<T> (T = double or text) value is 
randomly “multiplied” by 2 (i.e. value sent to C* = [a, b, c], value stored in 
C* = [a, b, c, a, b, c]).

I know that it sounds weird, but we just want to know whether it is a known 
issue (found nothing with Google…). We are working on a small dataset to narrow 
down the issue with log data and maybe create a ticket for the DataStax Java 
Driver or Cassandra teams.

Cassandra v2.0.14
DataStax Java Driver v2.1.7.1
OS RHEL6
Prod Cluster topology = 16 nodes over 2 datacenters (RF = 3 per DC)
UAT Cluster topology = 6 nodes on 1 datacenter (RF = 3)

The only difference between the Prod and UAT clusters is the multi-datacenter 
mode on the Prod one.
We do not insert the same data twice into the same column of any specific row. 
All inserts/updates are idempotent!

Data table:
CREATE TABLE data (
    field1 text,
    field2 int,
    field3 text,
    field4 double,
    field5 list<double>, -- randomly having corrupted data, containing [1, 2, 3, 1, 2, 3] instead of [1, 2, 3]
    field6 text,
    field7 list<text>,   -- randomly having corrupted data, containing [a, b, c, a, b, c] instead of [a, b, c]
    PRIMARY KEY ((field1, field2), field3)
) WITH compaction = { 'class' : 'LeveledCompactionStrategy' };

Thanks in advance for your help.
Best regards,
Minh



Re: Cassandra Cleanup and disk space

2015-11-26 Thread Luigi Tagliamonte
I ran it twice, and both times it freed a lot of space, so I don't think
that it's just a coincidence.


Re: list data value multiplied x2 in multi-datacenter environment

2015-11-26 Thread DuyHai Doan
First you need to provide a way to reproduce it; otherwise the issue won't
be processed. Then create a JIRA ticket.


Re: Cassandra Cleanup and disk space

2015-11-26 Thread sai krishnam raju potturi
Could it be that you expanded your cluster a while back but did not run
cleanup then?
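
To make this scenario concrete, here is a minimal Python sketch of a token ring in which a node keeps data for ranges it handed over when another node joined, until cleanup drops it. This is a toy model under stated assumptions (RF = 1, integer tokens 1..100, hypothetical helpers `owner` and `cleanup`), not Cassandra's actual implementation.

```python
import bisect


def owner(token, ring):
    """Return the node owning `token`: the first node token >= token, wrapping around."""
    tokens = sorted(ring)
    idx = bisect.bisect_left(tokens, token) % len(tokens)
    return ring[tokens[idx]]


def cleanup(node, stored_tokens, ring):
    """Keep only the tokens that `node` still owns under the current ring."""
    return {t for t in stored_tokens if owner(t, ring) == node}


# Two-node ring (RF = 1 for simplicity): node token -> node name.
old_ring = {50: "A", 100: "B"}
data_tokens = range(1, 101)
on_a = {t for t in data_tokens if owner(t, old_ring) == "A"}  # tokens 1..50

# Node C joins at token 25 and takes over 1..25; A keeps its copies on disk.
new_ring = {25: "C", 50: "A", 100: "B"}
after = cleanup("A", on_a, new_ring)

print(len(on_a), len(after))  # 50 25
```

In this model, running `cleanup` a second time on the same node with an unchanged ring removes nothing, so space freed by repeated runs would have to come from a different node or from another cause.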



Re: Cassandra Cleanup and disk space

2015-11-26 Thread Jai Bheemsen Rao Dhanwada
Cleanup is specific to a node; maybe cleanup was run on one node the first
time and on another node the second time.



Three questions about cassandra

2015-11-26 Thread Hadmut Danisch
Hi, 

I'm currently reading through heaps of docs and web pages to learn
Cassandra, but there are still three questions I could not find answers
for. Maybe someone could help:


1. What happens if a node is down for some time (hours, days,
   weeks,...) for whatever reason (hardware, power, or network
   failure, maintenance...) and then comes back online?

   Does the node remain in its former state and thus become
   inconsistent, with outdated data, or does it fetch the changes
   that occurred during its downtime from other nodes?

   Can nodes simply be offline for some time, then return and proceed,
   or do they have to be added as a fresh replacement node (of
   themselves) to start from scratch?



2. Cassandra lets you choose from several data consistency levels,
   in particular allowing writes that do not update all nodes
   (e.g. QUORUM, ONE, TWO, THREE). 

   What happens to those nodes that did not get an update? Will they
   synchronize with the updated nodes automatically, or will they
   remain in their old state (forever or until the next explicit write
   access)?





3. What exactly happens when a new node is added to a cluster? Will
   all records now belonging to the new node be automatically moved
   from the others?

   The web page
   http://docs.datastax.com/en/cassandra/2.0/cassandra/operations/ops_add_node_to_cluster_t.html
   describes a "streaming process", which sounds as if a new node is
   busy collecting its belongings from the others, but it also says to
   run

   nodetool cleanup

   on all the old nodes, which would "remove the keys no longer
   belonging to those nodes", which rather sounds like a simple drop,
   i.e. losing those records. 

   So does Cassandra safely fill new nodes, or do they start empty
   with their data lost?



Thank you!

regards
Hadmut


Re: Three questions about cassandra

2015-11-26 Thread Jeff Jirsa
1) It comes online in its former state. The operator is responsible for 
consistency beyond that point. The common solution is `nodetool repair` (and, 
if you want to be really careful, you can start the daemon with the 
thrift/native listeners disabled, run repair, and then enable the listeners, so 
that when it DOES serve requests, they’re not out of date).

2) Consistency level tells Cassandra how many replicas it will wait for to 
acknowledge the write - it doesn’t necessarily tell us how many replicas 
will/won’t get the write (even writing at QUORUM, it’s likely that all replicas 
will get the write). Those that do not may get the writes later via read 
repair or explicit repair (`nodetool repair`).

3) Yes, joining nodes acquire a part of the token range, and data will be 
streamed to the joining node.
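
The consistency-level point in 2) can be illustrated with a little arithmetic: a read is guaranteed to see the latest acknowledged write only when the replicas that acknowledged the write and the replicas consulted by the read must overlap, i.e. when write acks + read acks > RF. The following is a minimal Python sketch of that rule; the function names are hypothetical and nothing here talks to a cluster.

```python
def quorum(rf):
    """Number of replicas that make up a quorum for replication factor `rf`."""
    return rf // 2 + 1


def overlaps(rf, write_acks, read_acks):
    """True if every read set must share at least one replica with every write set."""
    return write_acks + read_acks > rf


rf = 3
q = quorum(rf)
print(q)                   # 2
print(overlaps(rf, q, q))  # True: QUORUM write + QUORUM read
print(overlaps(rf, 1, 1))  # False: ONE write + ONE read may miss the latest value
```

Replicas outside the acknowledging set are usually still sent the write; the overlap rule only describes what is guaranteed before repair or read repair catches up any replica that missed it.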








Re: Change the rack of a server

2015-11-26 Thread Jack Krupansky
What RF are you using? How many data centers? What rack configuration are
you currently using? Are you in fact using a rack-aware snitch with
NetworkTopologyStrategy?

Specifically, what are you attempting to accomplish - why change the rack
at all? Not that changing the rack is necessarily bad, just to clarify your
objective.

See:
http://docs.datastax.com/en/cassandra/2.0/cassandra/architecture/architectureDataDistributeReplication_c.html

You may just have to bootstrap the new node with the proper rack and then
run repair on the nodes which formerly held replicas of the old node.

Or, you may have to run a full repair for all nodes of the cluster:
http://docs.datastax.com/en/cassandra/2.0/cassandra/operations/opsMoveNodeRack.html


-- Jack Krupansky

On Thu, Nov 26, 2015 at 4:02 AM, Badrjan  wrote:

> So I have a 8 node cluster and I would like to change the rack of one
> node. How should I do that?
>
> B.
>


Re: Change the rack of a server

2015-11-26 Thread Paulo Motta
Changing the rack of a live node is discouraged, since the ring ranges the
node is responsible for will change, meaning the node will not hold part of
the data for its new ranges and other nodes may not have some of its
current data.

It will be a forbidden operation in upcoming versions of Cassandra,
since it has caused trouble before; see
https://issues.apache.org/jira/browse/CASSANDRA-10242 for more background.
The safest thing is to decommission the node and bootstrap it again
in a new rack, as Jack suggested.
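
The effect described above can be sketched with a deliberately simplified rack-aware placement rule: walk the ring clockwise and prefer nodes in racks that have not been used yet. This is an illustrative Python model only. It is not Cassandra's actual NetworkTopologyStrategy code, and the node names, rack names, and the `replicas` helper are hypothetical.

```python
def replicas(start, ring, racks, rf):
    """Walk the ring clockwise from index `start`, preferring nodes in racks not yet used."""
    order = ring[start:] + ring[:start]
    chosen, seen_racks, skipped = [], set(), []
    for node in order:
        if len(chosen) == rf:
            break
        if racks[node] not in seen_racks:
            chosen.append(node)
            seen_racks.add(racks[node])
        else:
            skipped.append(node)
    # Fall back to skipped nodes if there are fewer racks than rf.
    chosen += skipped[: rf - len(chosen)]
    return chosen


ring = ["n1", "n2", "n3", "n4"]
racks_before = {"n1": "r1", "n2": "r1", "n3": "r2", "n4": "r2"}
racks_after = dict(racks_before, n2="r3")  # only n2's rack label changes

before = replicas(0, ring, racks_before, rf=2)
after = replicas(0, ring, racks_after, rf=2)

print(before)  # ['n1', 'n3']
print(after)   # ['n1', 'n2']
```

In this toy model, relabeling n2's rack makes n2 a replica for a range whose data lives on n3, without any data having moved. Decommissioning and re-bootstrapping avoids this because the streaming step moves the data along with the ownership change.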



Re: Change the rack of a server

2015-11-26 Thread Jack Krupansky
Right, and I also meant to refer to the anti-pattern doc related to racks:
http://docs.datastax.com/en/cassandra/2.0/cassandra/architecture/architecturePlanningAntiPatterns_c.html

That doc seems to discourage rack selection entirely, though, when in fact
people should try to have replicas placed in separate racks, since a lot of
failures tend to be due to power, cooling, and network switching at the
rack level (or so I have heard; I have no personal experience).

-- Jack Krupansky
