Hello guys,

We started noticing strange behavior after we migrated one keyspace from an existing cluster to a new cluster.
We expanded our source cluster from 18 nodes to 36 nodes and didn't run "nodetool cleanup". We then took sstable backups on the source cluster, which therefore contained duplicate data, and restored them onto the new cluster with sstableloader. After that, applications started seeing duplicate data, mostly in list-backed columns.

Below is the sstable2json output for one of the list-backed columns.

Cell name layout: Clustering Column1:Clustering Column2:mods (List collection type)

ModifierList:eb26e221-3a66-11e9-80b2-2102e728a233

["ModifierList:eb26e221-3a66-11e9-80b2-2102e728a233:mods:d120b050eac811e9ab2729ea208ce219","eb25d0b13a6611e980b22102e728a233",1570648383445000],
["ModifierList:eb26e221-3a66-11e9-80b2-2102e728a233:mods:d120b051eac811e9ab2729ea208ce219","eb26bb113a6611e980b22102e728a233",1570648383445000],
["ModifierList:eb26e221-3a66-11e9-80b2-2102e728a233:mods:d120b052eac811e9ab2729ea208ce219","a4fcf1f1eac811e99664732b9302ab46",1570648383445000],
["ModifierList:eb26e221-3a66-11e9-80b2-2102e728a233:mods:38973560ead811e98bf68711844fec13","eb25d0b13a6611e980b22102e728a233",1570654999478000],
["ModifierList:eb26e221-3a66-11e9-80b2-2102e728a233:mods:38973561ead811e98bf68711844fec13","eb26bb113a6611e980b22102e728a233",1570654999478000],
["ModifierList:eb26e221-3a66-11e9-80b2-2102e728a233:mods:38973562ead811e98bf68711844fec13","a4fcf1f1eac811e99664732b9302ab46",1570654999478000],

Below is the select statement. I would expect Cassandra to return only the data with the latest timestamp; instead it returns duplicate values.

select mods from keyspace.table where partition_key ='1117302' and type='ModifierList' and id=eb26e221-3a66-11e9-80b2-2102e728a233;

[image: image.png]

Any help or guidance is greatly appreciated.

--
Thanks & Regards
Murali K Gutha
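
As a sanity check on the dump above, here is a minimal Python sketch that groups the six list cells by write timestamp. It only re-reads the values already shown in the sstable2json output; the function name `group_by_write` and the tuple layout are assumptions made for illustration, not part of any Cassandra tooling. It shows that the cells fall into two batches carrying the same three values, which matches the duplicates the application sees.

```python
from collections import defaultdict

PREFIX = "ModifierList:eb26e221-3a66-11e9-80b2-2102e728a233:mods:"

# (cell name, value, write timestamp in microseconds), copied from the dump above
cells = [
    (PREFIX + "d120b050eac811e9ab2729ea208ce219", "eb25d0b13a6611e980b22102e728a233", 1570648383445000),
    (PREFIX + "d120b051eac811e9ab2729ea208ce219", "eb26bb113a6611e980b22102e728a233", 1570648383445000),
    (PREFIX + "d120b052eac811e9ab2729ea208ce219", "a4fcf1f1eac811e99664732b9302ab46", 1570648383445000),
    (PREFIX + "38973560ead811e98bf68711844fec13", "eb25d0b13a6611e980b22102e728a233", 1570654999478000),
    (PREFIX + "38973561ead811e98bf68711844fec13", "eb26bb113a6611e980b22102e728a233", 1570654999478000),
    (PREFIX + "38973562ead811e98bf68711844fec13", "a4fcf1f1eac811e99664732b9302ab46", 1570654999478000),
]


def group_by_write(cells):
    """Group list element values by the timestamp of the write that produced them."""
    batches = defaultdict(list)
    for _name, value, ts in cells:
        batches[ts].append(value)
    return dict(batches)


batches = group_by_write(cells)
latest_ts = max(batches)

for ts in sorted(batches):
    marker = "latest" if ts == latest_ts else "older"
    print(ts, marker, batches[ts])

# Values I would expect the select to return: only the latest batch
expected = batches[latest_ts]
# Values actually present on disk for this list: both batches
on_disk = [value for _name, value, _ts in cells]
print("expected elements:", len(expected), "elements on disk:", len(on_disk))
```

Each element of the list is stored under its own cell name, so the two writes do not overwrite each other cell by cell; both batches are live in the sstable at the same time.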