Re: Deleted data comes back on node decommission

2017-12-15 Thread kurt greaves
X==5. I was meant to fill that in...

On 16 Dec. 2017 07:46, "kurt greaves"  wrote:

> Yep, if you don't run cleanup on all nodes (except the new node) after step X,
> then when you decommission nodes 4 and 5 later on, their tokens will be
> reclaimed by the previous owners. Suddenly the data in those SSTables is
> live again because the token ownership has changed, and any data in those
> SSTables will be returned.
>
> Remember that new nodes only add tokens to the ring; they don't affect other
> nodes' tokens, so if you remove those tokens everything goes back to how it
> was before those nodes were added.
>
> Adding a marker would be incredibly complicated and doesn't really fit the
> design of Cassandra. It's probably much easier to just follow the
> recommended procedure when adding and removing nodes.
>
> On 16 Dec. 2017 01:37, "Python_Max"  wrote:
>
> Hello, Jeff.
>
>
> Using your hint I was able to reproduce my situation on 5 VMs.
> Simplified steps are:
> 1) set up 3-node cluster
> 2) create keyspace with RF=3 and table with gc_grace_seconds=60,
> compaction_interval=10 and unchecked_tombstone_compaction=true (to force
> compaction later)
> 3) insert 10..20 records with different partition and clustering keys
> (consistency 'all')
> 4) 'nodetool flush' on all 3 nodes
> 5) add 4th node, add 5th node
> 6) using 'nodetool getendpoints' find a key that moved to both the 4th and 5th
> node
> 7) delete that record from table (consistency 'all')
> 8) 'nodetool flush' on all 5 nodes, wait gc_grace_seconds, 'nodetool
> compact' on the nodes which are responsible for that key, check that the key and
> tombstone are gone using sstabledump
> 9) decommission 5th node, decommission 4th node
> 10) select data from table where key=key (consistency quorum)
>
> And the row is here.
>
> It sounds like a bug in Cassandra, but since it is documented here
> https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsAddNodeToCluster.html
> I suppose this counts as a feature. It would be better if data that stays
> in an SSTable after a new node is added carried some marker and was never
> returned as a result of a select query.
>
> Thank you very much, Jeff, for pointing me in the right direction.
>
>
> On 13.12.17 18:43, Jeff Jirsa wrote:
>
>> Did you run cleanup before you shrank the cluster?
>>
>>
> --
>
> Best Regards,
> Python_Max.
>
>
>
>
>


Re: Deleted data comes back on node decommission

2017-12-15 Thread kurt greaves
Yep, if you don't run cleanup on all nodes (except the new node) after step X,
then when you decommission nodes 4 and 5 later on, their tokens will be
reclaimed by the previous owners. Suddenly the data in those SSTables is
live again because the token ownership has changed, and any data in those
SSTables will be returned.

Remember that new nodes only add tokens to the ring; they don't affect other
nodes' tokens, so if you remove those tokens everything goes back to how it
was before those nodes were added.

Adding a marker would be incredibly complicated and doesn't really fit the
design of Cassandra. It's probably much easier to just follow the
recommended procedure when adding and removing nodes.
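
For reference, that recommended procedure boils down to running cleanup on every
pre-existing node once the new nodes have finished joining; a minimal sketch (the
keyspace name is hypothetical):

# on each of the original nodes, after nodes 4 and 5 have fully joined:
nodetool cleanup my_keyspace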

On 16 Dec. 2017 01:37, "Python_Max"  wrote:

Hello, Jeff.


Using your hint I was able to reproduce my situation on 5 VMs.
Simplified steps are:
1) set up 3-node cluster
2) create keyspace with RF=3 and table with gc_grace_seconds=60,
compaction_interval=10 and unchecked_tombstone_compaction=true (to force
compaction later)
3) insert 10..20 records with different partition and clustering keys
(consistency 'all')
4) 'nodetool flush' on all 3 nodes
5) add 4th node, add 5th node
6) using 'nodetool getendpoints' find a key that moved to both the 4th and 5th
node
7) delete that record from table (consistency 'all')
8) 'nodetool flush' on all 5 nodes, wait gc_grace_seconds, 'nodetool
compact' on the nodes which are responsible for that key, check that the key and
tombstone are gone using sstabledump
9) decommission 5th node, decommission 4th node
10) select data from table where key=key (consistency quorum)

And the row is here.
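
For reference, steps 6 and 8 can be checked with something like the following
(keyspace, table and data file path are hypothetical):

nodetool getendpoints my_ks my_table 'some_key'    # which nodes own the key
sstabledump /var/lib/cassandra/data/my_ks/my_table-*/mc-1-big-Data.db | grep -A4 some_key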

It sounds like a bug in Cassandra, but since it is documented here
https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsAddNodeToCluster.html
I suppose this counts as a feature. It would be better if data that stays
in an SSTable after a new node is added carried some marker and was never
returned as a result of a select query.

Thank you very much, Jeff, for pointing me in the right direction.


On 13.12.17 18:43, Jeff Jirsa wrote:

> Did you run cleanup before you shrank the cluster?
>
>
-- 

Best Regards,
Python_Max.




Cassandra Replication Factor change from 2 to 3 for each data center

2017-12-15 Thread Harika Vangapelli -T (hvangape - AKRAYA INC at Cisco)
This is just a basic question, but it is worth asking.

We changed the replication factor from 2 to 3 in our production cluster. We have 2
data centers.
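
For context, the change itself would have been something along these lines (the
keyspace and data center names below are made up):

ALTER KEYSPACE my_keyspace
  WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': 3, 'DC2': 3};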

Is nodetool repair -dcpar from a single node in one data center sufficient
for the whole replication change to take effect? Please confirm.

Do I need to run it from each node?

Harika Vangapelli
Engineer - IT
hvang...@cisco.com
Cisco Systems, Inc.



Re: Deleted data comes back on node decommission

2017-12-15 Thread Alexander Dejanovski
Hi Max,

I don't know if it's related to your issue but on a side note, if you
decide to use Reaper (and use full repairs, not incremental ones), but mix
that with "nodetool repair", you'll end up with 2 pools of SSTables that
cannot get compacted together.
Reaper uses subrange repair which doesn't mark SSTables as repaired (no
anticompaction is performed, repairedAt remains at 0), while using nodetool
in full and incremental modes will perform anticompaction.

SSTables with repairedAt > 0 cannot be compacted with SSTables with
repairedAt = 0.
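
You can check which pool an SSTable belongs to with sstablemetadata (the path is
illustrative, and the exact output wording varies a bit between versions):

sstablemetadata /var/lib/cassandra/data/my_ks/my_table-*/mc-31384-big-Data.db | grep -i repaired
# e.g. "Repaired at: 0" for unrepaired SSTables, a timestamp otherwise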

Bottom line is that if you want your SSTables to be compacted together
naturally, you have to run repairs either exclusively through Reaper or
exclusively through nodetool.
If you decide to use Reaper exclusively, you have to revert the repairedAt
value to 0 for all SSTables on all nodes, using sstablerepairedset.
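
A rough sketch of that reset, assuming the node is stopped first and that the
flag names match your Cassandra version:

# run on each node while Cassandra is down, for every affected SSTable
sstablerepairedset --really-set --is-unrepaired /path/to/mc-*-big-Data.db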

Cheers,

On Fri, Dec 15, 2017 at 4:57 PM Jeff Jirsa  wrote:

> The generation (integer id in file names) doesn’t matter for ordering like
> this
>
> It matters in schema tables for addition of new columns/types, but it’s
> irrelevant for normal tables - you could do a user defined compaction on
> 31384 right now and it’d be rewritten as-is (minus purgable data) with the
> new highest generation, even though it’s all old data.
>
>
> --
> Jeff Jirsa
>
>
> On Dec 15, 2017, at 6:55 AM, Python_Max  wrote:
>
> Hi, Kurt.
>
> Thank you for response.
>
>
> Repairs are marked as 'done' without errors in reaper history.
>
> Example of 'wrong order':
>
> * file mc-31384-big-Data.db contains tombstone:
>
> {
> "type" : "row",
> "position" : 7782,
> "clustering" : [ "9adab970-b46d-11e7-a5cd-a1ba8cfc1426" ],
> "deletion_info" : { "marked_deleted" :
> "2017-10-28T04:51:20.589394Z", "local_delete_time" : "2017-10-28T04:51:20Z"
> },
> "cells" : [ ]
>   }
>
> * file mc-31389-big-Data.db contains data:
>
> {
> "type" : "row",
> "position" : 81317,
> "clustering" : [ "9adab970-b46d-11e7-a5cd-a1ba8cfc1426" ],
> "liveness_info" : { "tstamp" : "2017-10-19T01:34:10.055389Z" },
> "cells" : [...]
>   }
> Index 31384 is less than 31389 but I'm not sure whether it matters at all.
>
> I assume that data and tombstones are not compacting due to another reason:
> the tokens are not owned by that node anymore and the only way to purge
> such keys is 'nodetool cleanup', isn't it?
>
>
> On 14.12.17 16:14, kurt greaves wrote:
>
> Are you positive your repairs are completing successfully? Can you send
> through an example of the data in the wrong order? What you're saying
> certainly shouldn't happen, but there's a lot of room for mistakes.
>
> On 14 Dec. 2017 20:13, "Python_Max"  wrote:
>
>> Thank you for reply.
>>
>> No, I did not execute 'nodetool cleanup'. Documentation
>> https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsRemoveNode.html
>> does not mention that cleanup is required.
>>
>> Do you think that extra data which a node is not responsible for can lead to
>> zombie data?
>>
>>
>> On 13.12.17 18:43, Jeff Jirsa wrote:
>>
>>> Did you run cleanup before you shrank the cluster?
>>>
>>>
>> --
>>
>> Best Regards,
>> Python_Max.
>>
>>
>
>
> --
>
> Best Regards,
> Python_Max.
>
--
Alexander Dejanovski
France
@alexanderdeja

Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com


Re: Deleted data comes back on node decommission

2017-12-15 Thread Jeff Jirsa
The generation (integer id in file names) doesn’t matter for ordering like this 

It matters in schema tables for addition of new columns/types, but it’s 
irrelevant for normal tables - you could do a user defined compaction on 31384 
right now and it’d be rewritten as-is (minus purgeable data) with the new 
highest generation, even though it’s all old data. 
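
If you want to try that, one way on recent 3.x (where nodetool exposes it; older
versions need the CompactionManager JMX operation instead) looks roughly like:

nodetool compact --user-defined /var/lib/cassandra/data/my_ks/my_table-*/mc-31384-big-Data.db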


-- 
Jeff Jirsa


> On Dec 15, 2017, at 6:55 AM, Python_Max  wrote:
> 
> Hi, Kurt.
> 
> Thank you for response.
> 
> 
> Repairs are marked as 'done' without errors in reaper history.
> 
> Example of 'wrong order':
> * file mc-31384-big-Data.db contains tombstone:
> {
> "type" : "row",
> "position" : 7782,
> "clustering" : [ "9adab970-b46d-11e7-a5cd-a1ba8cfc1426" ],
> "deletion_info" : { "marked_deleted" : "2017-10-28T04:51:20.589394Z", 
> "local_delete_time" : "2017-10-28T04:51:20Z" },
> "cells" : [ ]
>   }
> 
> * file mc-31389-big-Data.db contains data:
> {
> "type" : "row",
> "position" : 81317,
> "clustering" : [ "9adab970-b46d-11e7-a5cd-a1ba8cfc1426" ],
> "liveness_info" : { "tstamp" : "2017-10-19T01:34:10.055389Z" },
> "cells" : [...]
>   }
> Index 31384 is less than 31389 but I'm not sure whether it matters at all.
> 
> I assume that data and tombstones are not compacting due to another reason: 
> the tokens are not owned by that node anymore and the only way to purge such 
> keys is 'nodetool cleanup', isn't it?
> 
> 
>> On 14.12.17 16:14, kurt greaves wrote:
>> Are you positive your repairs are completing successfully? Can you send 
>> through an example of the data in the wrong order? What you're saying 
>> certainly shouldn't happen, but there's a lot of room for mistakes.
>> 
>>> On 14 Dec. 2017 20:13, "Python_Max"  wrote:
>>> Thank you for reply.
>>> 
>>> No, I did not execute 'nodetool cleanup'. Documentation 
>>> https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsRemoveNode.html
>>>  does not mention that cleanup is required.
>>> 
>>> Do you think that extra data which a node is not responsible for can lead to 
>>> zombie data?
>>> 
>>> 
 On 13.12.17 18:43, Jeff Jirsa wrote:
 Did you run cleanup before you shrank the cluster?
 
>>> 
>>> -- 
>>> 
>>> Best Regards,
>>> Python_Max.
>>> 
>>> 
> 
> -- 
> 
> Best Regards,
> Python_Max.


Re: Data Node Density

2017-12-15 Thread Jeff Jirsa
Typing this on a phone during my commute, please excuse the inevitable typos in 
what I expect will be a long email because there’s nothing else for me to do 
right now. 

There’s a few reasons people don’t typically recommend huge nodes, the biggest 
reason being expansion and replacement. This question comes up from time to 
time, so here’s at least one other explanation I’ve written in the past: 
https://stackoverflow.com/questions/31563447/cassandra-cluster-data-density-data-size-per-node-looking-for-feedback-and/31690279#31690279

Streaming (the mechanism for bootstrap / rebuild / repair) doesn’t have a ton 
of retries built in. The larger the amount of data to stream, the more 
opportunities there are for failures. Streaming a terabyte probably succeeds 
just fine 99% of the time, 60TB probably much lower. In 2.2 and newer, 
resumable bootstrap makes this slightly less of a concern (assuming it’s 
implemented correctly).

There’s also some internals in play. When you bootstrap a new node, we create a
streaming plan. To create that, we need to inspect all of the data files on
disk, figure out which files to transfer, figure out how much actual data that is
(which involves interacting with the compression info), queue them up, and send
them; the other side then compresses it again, recalculates metadata, and
writes it to disk.

The compression/metadata work runs single-threaded per stream, so you’re typically
bound by the number of streams, which correlates to the number
of sending hosts. If you use vnodes, you can set the number of vnodes close to how
many cores/machines you’ll have, so you end up with approximately as many
streams as cores.
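
For illustration, that is just the num_tokens setting in cassandra.yaml (the value
below is only an example, and it has to be chosen before the node first bootstraps):

# cassandra.yaml
num_tokens: 32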

 If you’ve already bought the hardware, you can try to make it work. You’ll 
need the heap to be big enough to calculate the streaming plans, and you’ll 
want to think about how you lay out the data directories (for JBOD to be safe 
you’ll need to be on 3.11, otherwise just raid0 it). Alternatively, as someone 
mentioned on this list in the past few weeks, you can try to add some extra 
IPs and run more than one Cassandra instance per host - doing so lets you 
treat each of them as a smaller instance. If you do this you’ll need to use 
rack awareness to make sure you don’t have multiple copies of data on the same 
machine, or a single hardware failure could make you lose data.
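
A sketch of the rack-awareness part, assuming GossipingPropertyFileSnitch and
made-up names: give every instance running on the same physical machine the same
rack, so NetworkTopologyStrategy will try not to put two replicas of a range there.

# cassandra-rackdc.properties for both instances on physical host 1
dc=DC1
rack=host1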

If you’re having specific problems trying to run a rebuild or bootstrap, you 
may have better luck with subrange repair - you’ll stream less data, and you 
can do it in very small chunks. Most importantly, if you’re having specific 
problems, don’t ask us if it works, tell us what’s failing and show us the 
errors. 
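
For a single small chunk, a subrange repair looks roughly like this (the tokens
and keyspace are placeholders):

nodetool repair -full -st -9223372036854775808 -et -9200000000000000000 my_ks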

Having an outside firm come in and help explain and troubleshoot this for you 
is probably a good idea. The firms I’d personally trust if you were a close 
relative of mine asking for help are TheLastPickle and Instaclustr, but there’s 
also some very competent people at Pythian and SmartCat.io.



-- 
Jeff Jirsa


> On Dec 15, 2017, at 6:37 AM, Amit Agrawal  wrote:
> 
> Thanks Nicolas. I am aware of the official recommendations. However, in the
> last project, we tried with 5 TB and it worked fine.
> 
> So I am asking for experiences around this.
> 
> Does anybody know anyone who provides consultancy on open source Cassandra?
> Datastax only does it for the enterprise version!
> 
>> On Fri, Dec 15, 2017 at 3:08 PM, Nicolas Guyomar  
>> wrote:
>> Hi Amit,
>> 
>> This is way too much data per node; the official recommendation is to try to
>> stay below 2 TB per node. I have seen nodes up to 4 TB, but then maintenance
>> gets really complicated (backup, bootstrap, streaming for repair, etc.)
>> 
>> Nicolas
>> 
>>> On 15 December 2017 at 15:01, Amit Agrawal  
>>> wrote:
>>> Hi,
>>> 
>>> We are trying to set up a 3-node cluster with 20 TB HD on each node.
>>> It's a bare metal setup with 44 cores on each node.
>>> 
>>> So in total 60 TB, 66 cores, 3-node cluster.
>>> 
>>> The data velocity is very low, with low access rates.
>>> 
>>> Has anyone tried this configuration?
>>> 
>>> A bit urgent. 
>>> 
>>> Regards,
>>> -A
>>> 
>>> 
>> 
> 


Re: Deleted data comes back on node decommission

2017-12-15 Thread Python_Max

Hi, Kurt.

Thank you for response.


Repairs are marked as 'done' without errors in reaper history.

Example of 'wrong order':

* file mc-31384-big-Data.db contains tombstone:

    {
    "type" : "row",
    "position" : 7782,
    "clustering" : [ "9adab970-b46d-11e7-a5cd-a1ba8cfc1426" ],
    "deletion_info" : { "marked_deleted" : 
"2017-10-28T04:51:20.589394Z", "local_delete_time" : 
"2017-10-28T04:51:20Z" },

    "cells" : [ ]
  }

* file mc-31389-big-Data.db contains data:

    {
    "type" : "row",
    "position" : 81317,
    "clustering" : [ "9adab970-b46d-11e7-a5cd-a1ba8cfc1426" ],
    "liveness_info" : { "tstamp" : "2017-10-19T01:34:10.055389Z" },
    "cells" : [...]
  }

Index 31384 is less than 31389 but I'm not sure whether it matters at all.

I assume that data and tombstones are not compacting due to another 
reason: the tokens are not owned by that node anymore and the only way 
to purge such keys is 'nodetool cleanup', isn't it?



On 14.12.17 16:14, kurt greaves wrote:
Are you positive your repairs are completing successfully? Can you 
send through an example of the data in the wrong order? What you're 
saying certainly shouldn't happen, but there's a lot of room for mistakes.


On 14 Dec. 2017 20:13, "Python_Max" wrote:


Thank you for reply.

No, I did not execute 'nodetool cleanup'. Documentation
https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsRemoveNode.html
does not mention that cleanup is required.

Do you think that extra data which a node is not responsible for can
lead to zombie data?


On 13.12.17 18:43, Jeff Jirsa wrote:

Did you run cleanup before you shrank the cluster?


-- 


Best Regards,
Python_Max.





--

Best Regards,
Python_Max.



Re: Deleted data comes back on node decommission

2017-12-15 Thread Python_Max

Hello, Jeff.


Using your hint I was able to reproduce my situation on 5 VMs.
Simplified steps are:
1) set up 3-node cluster
2) create keyspace with RF=3 and table with gc_grace_seconds=60, 
compaction_interval=10 and unchecked_tombstone_compaction=true (to force 
compaction later)
3) insert 10..20 records with different partition and clustering keys 
(consistency 'all')

4) 'nodetool flush' on all 3 nodes
5) add 4th node, add 5th node
6) using 'nodetool getendpoints' find a key that moved to both the 4th and 5th 
node

7) delete that record from table (consistency 'all')
8) 'nodetool flush' on all 5 nodes, wait gc_grace_seconds, 'nodetool 
compact' on the nodes which are responsible for that key, check that the key and 
tombstone are gone using sstabledump

9) decommission 5th node, decommission 4th node
10) select data from table where key=key (consistency quorum)

And the row is here.

It sounds like a bug in Cassandra, but since it is documented here 
https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsAddNodeToCluster.html 
I suppose this counts as a feature. It would be better if data that stays 
in an SSTable after a new node is added carried some marker and was never 
returned as a result of a select query.


Thank you very much, Jeff, for pointing me in the right direction.

On 13.12.17 18:43, Jeff Jirsa wrote:

Did you run cleanup before you shrank the cluster?



--

Best Regards,
Python_Max.





Re: Data Node Density

2017-12-15 Thread Amit Agrawal
Thanks Nicolas. I am aware of the official recommendations. However, in the
last project, we tried with 5 TB and it worked fine.

So I am asking for experiences around this.

Does anybody know anyone who provides consultancy on open source Cassandra?
Datastax only does it for the enterprise version!

On Fri, Dec 15, 2017 at 3:08 PM, Nicolas Guyomar 
wrote:

> Hi Amit,
>
> This is way too much data per node; the official recommendation is to try to
> stay below 2 TB per node. I have seen nodes up to 4 TB, but then maintenance
> gets really complicated (backup, bootstrap, streaming for repair, etc.)
>
> Nicolas
>
> On 15 December 2017 at 15:01, Amit Agrawal 
> wrote:
>
>> Hi,
>>
>> We are trying to set up a 3-node cluster with 20 TB HD on each node.
>> It's a bare metal setup with 44 cores on each node.
>>
>> So in total 60 TB, 66 cores, 3-node cluster.
>>
>> The data velocity is very low, with low access rates.
>>
>> Has anyone tried this configuration?
>>
>> A bit urgent.
>>
>> Regards,
>> -A
>>
>>
>>
>


Re: Data Node Density

2017-12-15 Thread Nicolas Guyomar
Hi Amit,

This is way too much data per node; the official recommendation is to try to
stay below 2 TB per node. I have seen nodes up to 4 TB, but then maintenance
gets really complicated (backup, bootstrap, streaming for repair, etc.)

Nicolas

On 15 December 2017 at 15:01, Amit Agrawal 
wrote:

> Hi,
>
> We are trying to set up a 3-node cluster with 20 TB HD on each node.
> It's a bare metal setup with 44 cores on each node.
>
> So in total 60 TB, 66 cores, 3-node cluster.
>
> The data velocity is very low, with low access rates.
>
> Has anyone tried this configuration?
>
> A bit urgent.
>
> Regards,
> -A
>
>
>


Data Node Density

2017-12-15 Thread Amit Agrawal
Hi,

We are trying to set up a 3-node cluster with 20 TB HD on each node.
It's a bare metal setup with 44 cores on each node.

So in total 60 TB, 66 cores, 3-node cluster.

The data velocity is very low, with low access rates.

Has anyone tried this configuration?

A bit urgent.

Regards,
-A


Re: Batch : Isolation and Atomicity for same partition on multiple table

2017-12-15 Thread Mickael Delanoë
Yes, we try to rely on conditional batches when possible, but in this case
they could not be used:
We did some tests with conditional batches and they could not be
applied when several tables are involved in the batch, even if the tables
use the same partition key; we got the following error: "batch with
conditions cannot span multiple tables".
So they could not be applied in our case.
Moreover, we would like "isolation" to ensure all data are available in every
table (not only part of them) when a read occurs while the batch is
being applied, which is not achievable with conditional batches.
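
For reference, the shape of batch that gets rejected looks like this (reusing the
tableA/tableB definitions from earlier in the thread):

BEGIN BATCH
INSERT INTO tableA (user_id, clustering, value) VALUES ('1234', 'c1', 'val1') IF NOT EXISTS;
INSERT INTO tableB (user_id, clustering1, clustering2, value) VALUES ('1234', 'cl1', 'cl2', 'avalue');
APPLY BATCH;
-- rejected with: "Batch with conditions cannot span multiple tables"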

Mickaël




On 15 Dec. 2017 07:12, "Jeff Jirsa" wrote:

Again, a lot of potential problems can be solved with data modeling - in
particular consider things like conditional batches where the condition is
on a static cell/column and writes go to different CQL rows.
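
A sketch of what that can look like, using a hypothetical table: the condition
sits on a static column while the writes land on different clustering rows of
the same partition.

CREATE TABLE user_events (
    user_id  text,
    version  int static,
    event_id text,
    value    text,
    PRIMARY KEY (user_id, event_id));

BEGIN BATCH
UPDATE user_events SET version = 2 WHERE user_id = '1234' IF version = 1;
INSERT INTO user_events (user_id, event_id, value) VALUES ('1234', 'e1', 'v1');
INSERT INTO user_events (user_id, event_id, value) VALUES ('1234', 'e2', 'v2');
APPLY BATCH;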

-- 
Jeff Jirsa


On Dec 14, 2017, at 9:57 PM, Mickael Delanoë  wrote:

Thanks Jeff,
I am a little disappointed that you say the guarantees are even weaker. But
I will take a look at this and try to understand what is really done.



On 13 Dec. 2017 18:18, "Jeff Jirsa" wrote:

Entry point is here:
https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/cql3/statements/BatchStatement.java#L346
which will call through to
https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/StorageProxy.java#L938-L953

I believe the guarantees are weaker than the blog suggests, but it's
nuanced, and a lot of these types of questions come down to data model (you
can model it in a way that you can avoid problems with weaknesses in
isolation, but that requires a detailed explanation of your use case, etc).




On Wed, Dec 13, 2017 at 8:56 AM, Mickael Delanoë 
wrote:

> Hi Nicolas,
> Thanks for you answer.
> Is your assumption 100% certain?
> Because the few tests I did - using nodetool getendpoints - showed that the
> data for the two tables, when I used the same partition key, went to the same
> nodes. So I would have expected Cassandra to be smart enough to apply
> them to the memtable in a single operation to achieve the isolation, as the
> whole batch will be executed on a single node.
> Does anybody know where the batch operations are
> processed in the Cassandra source code, so I could check how all this is
> handled?
>
> Regards,
> Mickaël
>
>
>
> 2017-12-13 11:18 GMT+01:00 Nicolas Guyomar :
>
>> Hi Mickael,
>>
>> Partitions are related to the table they exist in, so in your case you
>> are targeting 2 partitions in 2 different tables.
>> Therefore, IMHO, you will only get atomicity using your batch statement.
>>
>> On 11 December 2017 at 15:59, Mickael Delanoë 
>> wrote:
>>
>>> Hello,
>>>
>>> I have a question regarding batch isolation and atomicity with queries
>>> using the same partition key.
>>>
>>> The Datastax documentation says about batches:
>>> "Combines multiple DML statements to achieve atomicity and isolation
>>> when targeting a single partition or only atomicity when targeting multiple
>>> partitions. A batch applies all DMLs within a single partition before the
>>> data is available, ensuring atomicity and isolation."
>>>
>>> But I am trying to find out exactly what can be considered a "single partition"
>>> and I cannot find a clear answer yet. The examples and explanations
>>> always talk about a partition with only one table used inside the batch. My
>>> concern is about partitions when we use different tables in a batch. So I
>>> would like some clarification.
>>>
>>> Here is my use case, I have 2 tables with the same partition-key which
>>> is "user_id" :
>>>
>>> CREATE TABLE tableA (
>>>user_id text,
>>>clustering text,
>>>value text,
>>>PRIMARY KEY (user_id, clustering));
>>>
>>> CREATE TABLE tableB (
>>>user_id text,
>>>clustering1 text,
>>>clustering2 text,
>>>value text,
>>>PRIMARY KEY (user_id, clustering1, clustering2));
>>>
>>> If I do a batch query like this :
>>>
>>> BEGIN BATCH
>>> INSERT INTO tableA (user_id, clustering, value) VALUES ('1234', 'c1',
>>> 'val1');
>>> INSERT INTO tableB (user_id, clustering1, clustering2, value) VALUES
>>> ('1234', 'cl1', 'cl2', 'avalue');
>>> APPLY BATCH;
>>>
>>> The DML statements use the same partition key; can we say they are
>>> targeting the same partition, or, since the partition keys are for different
>>> tables, should we consider these to be different partitions? And so does this
>>> batch ensure atomicity and isolation (in the sense described in the Datastax
>>> doc)? Or only atomicity?
>>>
>>> Thanks for you help,
>>> Mickaël Delanoë
>>>
>>
>>
>
>
> --
> Mickaël Delanoë
>


Re: Upgrade using rebuild

2017-12-15 Thread Anshu Vajpayee
Thanks Jon.

On Fri, Dec 15, 2017 at 12:05 AM, Jon Haddad  wrote:

> Heh, hit send accidentally.
>
> You generally can’t run rebuild to upgrade, because it’s a streaming
> operation.  Streaming isn’t supported between versions, although on 3.x it
> might work.
>
>
> On Dec 14, 2017, at 11:01 AM, Jon Haddad  wrote:
>
> no
>
> On Dec 14, 2017, at 10:59 AM, Anshu Vajpayee 
> wrote:
>
> Thanks! I am aware of these steps.
>
> I am just thinking: is it possible to do the upgrade using nodetool
> rebuild, like we rebuild a new DC?
>
> Has anyone tried an upgrade with nodetool rebuild?
>
>
>
> On Thu, 14 Dec 2017 at 7:08 PM, Hannu Kröger  wrote:
>
>> If you want to do a version upgrade, you basically need to do the following, node
>> by node:
>>
>> 0) stop repairs
>> 1) make sure your sstables are at the latest version (nodetool
>> upgradesstables can do it)
>> 2) stop cassandra
>> 3) update cassandra software and update cassandra.yaml and
>> cassandra-env.sh files
>> 4) start cassandra
>>
>> After all nodes are up, run “nodetool upgradesstables” on each node to
>> update your sstables to the latest version.
>>
>> Also please note that when you upgrade, you need to upgrade only between
>> compatible versions.
>>
>> E.g. 2.2.x -> 3.0.x  but not 1.2 to 3.11
>>
>> Cheers,
>> Hannu
>>
>> On 14 December 2017 at 12:33:49, Anshu Vajpayee (anshu.vajpa...@gmail.com)
>> wrote:
>>
>> Hi -
>>
>> Is it possible to upgrade a cluster (DC-wise) using nodetool rebuild?
>>
>>
>>
>> --
>> Cheers,
>> Anshu V
>>
>>
>> --
> Cheers,
> Anshu V
>
>
>
>
>


-- 
Cheers,
Anshu V