Re: Unexpected error while refreshing token map, keeping previous version (IllegalArgumentException: Multiple entries with same key)?
I just configured a 3-node cluster in this way and was able to reproduce the warning message:

cqlsh> select peer, rpc_address from system.peers;

 peer      | rpc_address
-----------+-------------
 127.0.0.3 |   127.0.0.1
 127.0.0.2 |   127.0.0.1

(2 rows)

cqlsh> select rpc_address from system.local;

 rpc_address
-------------
   127.0.0.1

10:22:40.399 [s0-admin-0] WARN c.d.o.d.i.c.metadata.DefaultMetadata - [s0] Unexpected error while refreshing token map, keeping previous version
java.lang.IllegalArgumentException: Multiple entries with same key: Murmur3Token(-100881582699237014)=/127.0.0.1:9042 and Murmur3Token(-100881582699237014)=/127.0.0.1:9042
    at com.datastax.oss.driver.shaded.guava.common.collect.ImmutableMap.conflictException(ImmutableMap.java:215)
    at com.datastax.oss.driver.shaded.guava.common.collect.ImmutableMap.checkNoConflict(ImmutableMap.java:209)
    at com.datastax.oss.driver.shaded.guava.common.collect.RegularImmutableMap.checkNoConflictInKeyBucket(RegularImmutableMap.java:147)
    at com.datastax.oss.driver.shaded.guava.common.collect.RegularImmutableMap.fromEntryArray(RegularImmutableMap.java:110)
    at com.datastax.oss.driver.shaded.guava.common.collect.ImmutableMap$Builder.build(ImmutableMap.java:393)
    at com.datastax.oss.driver.internal.core.metadata.token.DefaultTokenMap.buildTokenToPrimaryAndRing(DefaultTokenMap.java:261)
    at com.datastax.oss.driver.internal.core.metadata.token.DefaultTokenMap.build(DefaultTokenMap.java:57)
    at com.datastax.oss.driver.internal.core.metadata.DefaultMetadata.rebuildTokenMap(DefaultMetadata.java:146)
    at com.datastax.oss.driver.internal.core.metadata.DefaultMetadata.withNodes(DefaultMetadata.java:104)
    at com.datastax.oss.driver.internal.core.metadata.InitialNodeListRefresh.compute(InitialNodeListRefresh.java:96)
    at com.datastax.oss.driver.internal.core.metadata.MetadataManager.apply(MetadataManager.java:475)
    at com.datastax.oss.driver.internal.core.metadata.MetadataManager$SingleThreaded.refreshNodes(MetadataManager.java:299)
    at com.datastax.oss.driver.internal.core.metadata.MetadataManager$SingleThreaded.access$1700(MetadataManager.java:265)
    at com.datastax.oss.driver.internal.core.metadata.MetadataManager.lambda$refreshNodes$0(MetadataManager.java:155)
    at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:602)
    at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577)
    at java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)
    at io.netty.channel.DefaultEventLoop.run(DefaultEventLoop.java:54)
    at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:905)
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
    at java.lang.Thread.run(Thread.java:748)

Interestingly enough, version 3 of the driver only recognizes 1 node, whereas version 4 detects the 3 nodes separately. It's probably not a scenario that was given a lot of thought, since this is a misconfiguration. I will think about how it should be handled and log tickets in any case, as it would be nice to surface to the user more clearly that something isn't right. Can you please confirm, when you have a chance, that this is indeed a configuration issue with rpc_address? Just to make sure I'm not ignoring a possible bug ;)

Thanks,
Andy

On Thu, Jun 20, 2019 at 10:20 AM Andy Tolbert wrote:

> One thing that strikes me is that the endpoint reported is '127.0.0.1'.
> Is it possible that you have rpc_address set to 127.0.0.1 on each of your
> three nodes in cassandra.yaml? The driver uses the system.peers table to
> identify nodes in the cluster and associates them by rpc_address. Can you
> verify this by executing 'select peer, rpc_address from system.peers' to
> see what is being reported as the rpc_address and let me know?
>
> In any case, the driver should probably handle this better, I'll create a
> driver ticket.
> > Thanks, > Andy > > On Thu, Jun 20, 2019 at 10:03 AM Jeff Jirsa wrote: > >> There’s a reasonable chance this is a bug in the Datastax driver - may >> want to start there when debugging . >> >> It’s also just a warn, and the two entries with the same token are the >> same endpoint which doesn’t seem concerning to me, but I don’t know the >> Datastax driver that well >> >> On Jun 20, 2019, at 7:40 AM, Котельников Александр >> wrote: >> >> It appears that no such warning is issued if I connected to Cassandra >> from a remote server, not locally. >> >> >> >> *From: *Котельников Александр >> *Reply-To: *"user@cassandra.apache.org" >> *Date: *Thursday, 20 June 2019 at 10:46 >> *To: *"user@cassandra.apache.org" >> *Subject: *Unexpected error while refreshing token map, keeping previous >> version (IllegalArgumentException: Multiple entries with same key ? >> >> >> >> Hey! >> >> >> >> I’ve just configured a test 3-node Cassandra cluster and run very >> trivial java test against it. >> >> >> >> I see the following warning from java-driver on each C
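The duplicate-key failure discussed above comes from the driver keying nodes by rpc_address: with every peer advertising 127.0.0.1, the same token ends up inserted more than once into a map that, like Guava's ImmutableMap.Builder, rejects duplicate keys. A minimal Python sketch of that behavior (illustrative only, not the driver's actual code):

```python
# Illustrative sketch: a token-map builder that, like the Guava
# ImmutableMap shaded into the driver, throws on duplicate keys. When
# several peers all advertise the same rpc_address, the same token can be
# registered twice for the "same" endpoint, which trips this check.

def build_token_map(entries):
    """entries: iterable of (token, endpoint) pairs."""
    token_map = {}
    for token, endpoint in entries:
        if token in token_map:
            # Mirrors the Guava error message seen in the stack trace above.
            raise ValueError(
                "Multiple entries with same key: "
                f"{token}={token_map[token]} and {token}={endpoint}")
        token_map[token] = endpoint
    return token_map

# Misconfigured cluster: peers collapse onto one endpoint, so a token
# shows up twice and the build fails.
entries = [
    ("Murmur3Token(-100881582699237014)", "/127.0.0.1:9042"),
    ("Murmur3Token(-100881582699237014)", "/127.0.0.1:9042"),
]
try:
    build_token_map(entries)
except ValueError as e:
    print(e)
```

With distinct rpc_address values per node, every (token, endpoint) pair is unique and the build succeeds, which is why the warning disappears on a correctly configured cluster.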
Re: Tombstones not getting purged
Thank you for the information!

On Thu, Jun 20, 2019 at 9:50 AM Alexander Dejanovski wrote:

> Léo,
>
> if a major compaction isn't a viable option, you can give a go at the
> Instaclustr SSTables tools to target the partitions with the most
> tombstones:
> https://github.com/instaclustr/cassandra-sstable-tools/tree/cassandra-2.2#ic-purge
>
> It generates a report like this:
>
> Summary:
>
> +---------+---------+
> |         | Size    |
> +---------+---------+
> | Disk    | 1.9 GB  |
> | Reclaim | 11.7 MB |
> +---------+---------+
>
> Largest reclaimable partitions:
>
> +--------------+--------+---------+-----------------+
> | Key          | Size   | Reclaim | Generations     |
> +--------------+--------+---------+-----------------+
> | 001.2.340862 | 3.2 kB | 3.2 kB  | [534, 438, 498] |
> | 001.2.946243 | 2.9 kB | 2.8 kB  | [534, 434, 384] |
> | 001.1.527557 | 2.8 kB | 2.7 kB  | [534, 519, 394] |
> | 001.2.181797 | 2.6 kB | 2.6 kB  | [534, 424, 343] |
> | 001.3.475853 | 2.7 kB | 28 B    | [524, 462]      |
> | 001.0.159704 | 2.7 kB | 28 B    | [440, 247]      |
> | 001.1.311372 | 2.6 kB | 28 B    | [424, 458]      |
> | 001.0.756293 | 2.6 kB | 28 B    | [428, 358]      |
> | 001.2.681009 | 2.5 kB | 28 B    | [440, 241]      |
> | 001.2.474773 | 2.5 kB | 28 B    | [524, 484]      |
> | 001.2.974571 | 2.5 kB | 28 B    | [386, 517]      |
> | 001.0.143176 | 2.5 kB | 28 B    | [518, 368]      |
> | 001.1.185198 | 2.5 kB | 28 B    | [517, 386]      |
> | 001.3.503517 | 2.5 kB | 28 B    | [426, 346]      |
> | 001.1.847384 | 2.5 kB | 28 B    | [436, 396]      |
> | 001.0.949269 | 2.5 kB | 28 B    | [516, 356]      |
> | 001.0.756763 | 2.5 kB | 28 B    | [440, 249]      |
> | 001.3.973808 | 2.5 kB | 28 B    | [517, 386]      |
> | 001.0.312718 | 2.4 kB | 28 B    | [524, 467]      |
> | 001.3.632066 | 2.4 kB | 28 B    | [432, 377]      |
> | 001.1.946590 | 2.4 kB | 28 B    | [519, 389]      |
> | 001.1.798591 | 2.4 kB | 28 B    | [434, 388]      |
> | 001.3.953922 | 2.4 kB | 28 B    | [432, 375]      |
> | 001.2.585518 | 2.4 kB | 28 B    | [432, 375]      |
> | 001.3.284942 | 2.4 kB | 28 B    | [376, 432]      |
> +--------------+--------+---------+-----------------+
>
> Once you've identified these partitions you can run a compaction on the
> SSTables that contain them (identified using "nodetool
getsstables"). > Note that user defined compactions are only available for STCS. > Also ic-purge will perform a compaction but without writing to disk > (should look like a validation compaction), so it is rightfully reported by > the docs as an "intensive process" (not more than a repair though). > > - > Alexander Dejanovski > France > @alexanderdeja > > Consultant > Apache Cassandra Consulting > http://www.thelastpickle.com > > > On Thu, Jun 20, 2019 at 9:17 AM Alexander Dejanovski < > a...@thelastpickle.com> wrote: > >> My bad on date formatting, it should have been : %Y/%m/%d >> Otherwise the SSTables aren't ordered properly. >> >> You have 2 SSTables that claim to cover timestamps from 1940 to 2262, >> which is weird. >> Aside from that, you have big overlaps all over the SSTables, so that's >> probably why your tombstones are sticking around. >> >> Your best shot here will be a major compaction of that table, since it >> doesn't seem so big. Remember to use the --split-output flag on the >> compaction command to avoid ending up with a single SSTable after that. >> >> Cheers, >> >> - >> Alexander Dejanovski >> France >> @alexanderdeja >> >> Consultant >> Apache Cassandra Consulting >> http://www.thelastpickle.com >> >> >> On Thu, Jun 20, 2019 at 8:13 AM Léo FERLIN SUTTON >> wrote: >> >>> On Thu, Jun 20, 2019 at 7:37 AM Alexander Dejanovski < >>> a...@thelastpickle.com> wrote: >>> Hi Leo, The overlapping SSTables are indeed the most probable cause as suggested by Jeff. Do you know if the tombstone compactions actually triggered? (did the SSTables name change?) >>> >>> Hello ! >>> >>> I believe they have changed. I do not remember the sstable name but the >>> "last modified" has changed recently for these tables. >>> >>> Could you run the following command to list SSTables and provide us the output? It will display both their timestamp ranges along with the estimated droppable tombstones ratio. 
for f in *Data.db; do meta=$(sstablemetadata -gc_grace_seconds 259200 $f); echo $(date --date=@$(echo "$meta" | grep Maximum\ time | cut -d" " -f3| cut -c 1-10) '+%m/%d/%Y %H:%M:%S') $(date --date=@$(echo "$meta" | grep Minimum\ time | cut -d" " -f3| cut -c 1-10) '+%m/%d/%Y %H:%M:%S') $(echo "$meta" | grep droppable) $(ls -lh $f); done | sort >>> >>> Here is the results : >>> >>> ``` >>> 04/01/2019 22:53:12 03/06/2018 16:46:13 Estimated droppable tombstones: >>> 0.0 -rw-r--r-- 1 cassandra cassandra 16G Apr 13 14:35 md-147916-big-Data.db >>
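For anyone who finds the shell one-liner above hard to follow, here is a rough Python equivalent of what it extracts from each SSTable: the min/max timestamps and the estimated droppable-tombstone ratio. The labels match what the one-liner greps for, but the sample output below is made up for illustration:

```python
# Rough Python equivalent of the sstablemetadata one-liner above: pull out
# the timestamp range and droppable-tombstone ratio so SSTables can be
# sorted and overlaps spotted. The sample text is fabricated for the demo.
import re
from datetime import datetime, timezone

def parse_meta(text):
    # sstablemetadata reports timestamps in microseconds since the epoch.
    min_ts = int(re.search(r"Minimum timestamp:\s*(\d+)", text).group(1))
    max_ts = int(re.search(r"Maximum timestamp:\s*(\d+)", text).group(1))
    ratio = float(
        re.search(r"droppable tombstones:\s*([\d.]+)", text).group(1))
    to_dt = lambda micros: datetime.fromtimestamp(micros / 1e6,
                                                  tz=timezone.utc)
    return to_dt(min_ts), to_dt(max_ts), ratio

sample = """Minimum timestamp: 1520354773000000
Maximum timestamp: 1554158000000000
Estimated droppable tombstones: 0.25"""

mn, mx, ratio = parse_meta(sample)
print(mn.date(), mx.date(), ratio)
```

Sorting a directory's worth of these (min, max) ranges makes the overlaps Alexander describes easy to see: any SSTable whose range intersects another's can keep its tombstones from being purged.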
Re: Unexpected error while refreshing token map, keeping previous version (IllegalArgumentException: Multiple entries with same key)?
One thing that strikes me is that the endpoint reported is '127.0.0.1'. Is it possible that you have rpc_address set to 127.0.0.1 on each of your three nodes in cassandra.yaml? The driver uses the system.peers table to identify nodes in the cluster and associates them by rpc_address. Can you verify this by executing 'select peer, rpc_address from system.peers' to see what is being reported as the rpc_address and let me know? In any case, the driver should probably handle this better, I'll create a driver ticket. Thanks, Andy On Thu, Jun 20, 2019 at 10:03 AM Jeff Jirsa wrote: > There’s a reasonable chance this is a bug in the Datastax driver - may > want to start there when debugging . > > It’s also just a warn, and the two entries with the same token are the > same endpoint which doesn’t seem concerning to me, but I don’t know the > Datastax driver that well > > On Jun 20, 2019, at 7:40 AM, Котельников Александр > wrote: > > It appears that no such warning is issued if I connected to Cassandra from > a remote server, not locally. > > > > *From: *Котельников Александр > *Reply-To: *"user@cassandra.apache.org" > *Date: *Thursday, 20 June 2019 at 10:46 > *To: *"user@cassandra.apache.org" > *Subject: *Unexpected error while refreshing token map, keeping previous > version (IllegalArgumentException: Multiple entries with same key ? > > > > Hey! > > > > I’ve just configured a test 3-node Cassandra cluster and run very trivial > java test against it. > > > > I see the following warning from java-driver on each CqlSession > initialization: > > > > 13:54:13.913 [loader-admin-0] WARN c.d.o.d.i.c.metadata.DefaultMetadata - > [loader] Unexpected error while refreshing token map, keeping previous > version (IllegalArgumentException: Multiple entries with same key: > Murmur3Token(-1060405237057176857)=/127.0.0.1:9042 and > Murmur3Token(-1060405237057176857)=/127.0.0.1:9042) > > > > What does It mean? Why? > > > > Cassandra 3.11.4, driver 4.0.1. 
> > > > nodetool status > > Datacenter: datacenter1 > > === > > Status=Up/Down > > |/ State=Normal/Leaving/Joining/Moving > > -- Address Load Tokens Owns (effective) Host > ID Rack > > UN 10.73.66.36 419.36 MiB 256 100.0% > fafa2737-9024-437b-9a59-c1c037bce244 rack1 > > UN 10.73.66.100 336.47 MiB 256 100.0% > d5323ad0-f8cd-42d4-b34d-9afcd002ea47 rack1 > > UN 10.73.67.196 336.4 MiB 256 100.0% > 74dffe0c-32a4-4071-8b36-5ada5afa4a7d rack1 > > > > The issue persists if I reset the cluster, just the token changes its > value. > > Alexander > >
Re: Unexpected error while refreshing token map, keeping previous version (IllegalArgumentException: Multiple entries with same key)?
There’s a reasonable chance this is a bug in the Datastax driver - may want to start there when debugging . It’s also just a warn, and the two entries with the same token are the same endpoint which doesn’t seem concerning to me, but I don’t know the Datastax driver that well > On Jun 20, 2019, at 7:40 AM, Котельников Александр > wrote: > > It appears that no such warning is issued if I connected to Cassandra from a > remote server, not locally. > > From: Котельников Александр > Reply-To: "user@cassandra.apache.org" > Date: Thursday, 20 June 2019 at 10:46 > To: "user@cassandra.apache.org" > Subject: Unexpected error while refreshing token map, keeping previous > version (IllegalArgumentException: Multiple entries with same key ? > > Hey! > > I’ve just configured a test 3-node Cassandra cluster and run very trivial > java test against it. > > I see the following warning from java-driver on each CqlSession > initialization: > > 13:54:13.913 [loader-admin-0] WARN c.d.o.d.i.c.metadata.DefaultMetadata - > [loader] Unexpected error while refreshing token map, keeping previous > version (IllegalArgumentException: Multiple entries with same key: > Murmur3Token(-1060405237057176857)=/127.0.0.1:9042 and > Murmur3Token(-1060405237057176857)=/127.0.0.1:9042) > > What does It mean? Why? > > Cassandra 3.11.4, driver 4.0.1. > > nodetool status > Datacenter: datacenter1 > === > Status=Up/Down > |/ State=Normal/Leaving/Joining/Moving > -- Address Load Tokens Owns (effective) Host ID > Rack > UN 10.73.66.36 419.36 MiB 256 100.0% > fafa2737-9024-437b-9a59-c1c037bce244 rack1 > UN 10.73.66.100 336.47 MiB 256 100.0% > d5323ad0-f8cd-42d4-b34d-9afcd002ea47 rack1 > UN 10.73.67.196 336.4 MiB 256 100.0% > 74dffe0c-32a4-4071-8b36-5ada5afa4a7d rack1 > > The issue persists if I reset the cluster, just the token changes its value. > Alexander
Re: Unexpected error while refreshing token map, keeping previous version (IllegalArgumentException: Multiple entries with same key)?
It appears that no such warning is issued if I connect to Cassandra from a remote server, not locally.

From: Котельников Александр
Reply-To: "user@cassandra.apache.org"
Date: Thursday, 20 June 2019 at 10:46
To: "user@cassandra.apache.org"
Subject: Unexpected error while refreshing token map, keeping previous version (IllegalArgumentException: Multiple entries with same key)?

Hey!

I’ve just configured a test 3-node Cassandra cluster and ran a very trivial Java test against it.

I see the following warning from java-driver on each CqlSession initialization:

13:54:13.913 [loader-admin-0] WARN c.d.o.d.i.c.metadata.DefaultMetadata - [loader] Unexpected error while refreshing token map, keeping previous version (IllegalArgumentException: Multiple entries with same key: Murmur3Token(-1060405237057176857)=/127.0.0.1:9042 and Murmur3Token(-1060405237057176857)=/127.0.0.1:9042)

What does it mean? Why?

Cassandra 3.11.4, driver 4.0.1.

nodetool status
Datacenter: datacenter1
===
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address       Load        Tokens  Owns (effective)  Host ID                               Rack
UN  10.73.66.36   419.36 MiB  256     100.0%            fafa2737-9024-437b-9a59-c1c037bce244  rack1
UN  10.73.66.100  336.47 MiB  256     100.0%            d5323ad0-f8cd-42d4-b34d-9afcd002ea47  rack1
UN  10.73.67.196  336.4 MiB   256     100.0%            74dffe0c-32a4-4071-8b36-5ada5afa4a7d  rack1

The issue persists if I reset the cluster, just the token changes its value.

Alexander
Re: Decommissioned nodes are in UNREACHABLE state
Hello,

Assuming your nodes are out for a while and you don't need the data after 60 days (or cannot get it anyway), the way to fix this is to force the node out. I would try, in this order:

- nodetool removenode HOSTID
- nodetool removenode force

These 2 might really not work at this stage, but if they do, this is a clean way to do so. Now, to really push the ghost nodes to the exit door, it often takes:

- nodetool assassinate

I think Cassandra 2.1 doesn't have it; you might have to use JMX, more details here: https://thelastpickle.com/blog/2018/09/18/assassinate.html:

echo "run -b org.apache.cassandra.net:type=Gossiper unsafeAssassinateEndpoint $IP_TO_ASSASSINATE" | java -jar jmxterm-1.0.0-uber.jar -l $IP_OF_LIVE_NODE:7199

This should really remove the traces of the node, without any safety: no streaming, no checks, just get rid of it. So use it with a lot of care and understanding. In your situation I guess this is what will work.

As a last attempt, you could try removing traces of the dead node(s) from all the live nodes' 'system.peers' table. This table is local to each node, so the DELETE command is to be sent to all the nodes (that have a trace of an old node):

- cqlsh -e "DELETE FROM system.peers WHERE peer = '$IP_TO_REMOVE';"

> but I see the node IPs in UNREACHABLE state in "nodetool describecluster"
> output. I believe they appear only for 72 hours, but in my case I see
> those nodes in UNREACHABLE for ever (more than 60 days)

To be more accurate, you should never see a leaving node as unreachable I believe (not even for 72 hours). The 72 hours is the time Gossip should continue referencing the old nodes. Typically when you remove the ghost nodes, they should no longer appear in 'nodetool describecluster' at all, I would say immediately, but still appear in 'nodetool gossipinfo' with a 'left' or 'removed' status.

I hope that helps and that one of the above will do the trick (I'd bet on the assassinate :)).
Also sorry it took us a while to answer you this relatively common question :); C*heers, --- Alain Rodriguez - al...@thelastpickle.com France / Spain The Last Pickle - Apache Cassandra Consulting http://www.thelastpickle.com Le jeu. 13 juin 2019 à 00:55, Jai Bheemsen Rao Dhanwada < jaibheem...@gmail.com> a écrit : > Hello, > > I have a Cassandra cluster running with 2.1.16 version of Cassandra, where > I have decommissioned few nodes from the cluster using "nodetool > decommission", but I see the node IPs in UNREACHABLE state in "nodetool > describecluster" output. I believe they appear only for 72 hours, but in > my case I see those nodes in UNREACHABLE for ever (more than 60 days). > Rolling restart of the nodes didn't remove them. any idea what could be > causing here? > > Note: I don't see them in the nodetool status output. >
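Alain's escalation order can be summarized as a list of command strings. A small sketch follows; nothing is executed, the host ID and IPs are placeholders, and the JMX fallback for 2.1 is omitted:

```python
# Sketch only: the ghost-node cleanup escalation suggested above, as
# command strings. "HOSTID" and the IPs below are placeholders.

def ghost_node_cleanup(host_id, dead_ip):
    return [
        f"nodetool removenode {host_id}",   # clean removal; may not work
        "nodetool removenode force",        # force the pending removal
        f"nodetool assassinate {dead_ip}",  # no safety checks: last resort
        # Purge leftovers; run this against EVERY live node, since
        # system.peers is local to each node:
        f"cqlsh -e \"DELETE FROM system.peers WHERE peer = '{dead_ip}';\"",
    ]

for cmd in ghost_node_cleanup("HOSTID", "10.0.0.9"):
    print(cmd)
```

Stopping after the first command that succeeds keeps the removal as safe as possible; the later steps trade safety for certainty.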
Re: JanusGraph and Cassandra
Hello,

This looks more like a JanusGraph question, so I would rather try the support channels for that tool instead. I never saw anyone here or elsewhere using JanusGraph; after searching, I only found 4 threads about it here. Thus I think even people who know Cassandra monitoring/metrics very well will not be able to help you here. If you have questions around the metrics as such, we might be able to help you though :).

C*heers,
---
Alain Rodriguez - al...@thelastpickle.com
France / Spain

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

Le mer. 12 juin 2019 à 08:31, Vinayak Bali a écrit :

> Hi,
>
> I am using JanusGraph along with Cassandra as the backend. I have created
> a graph using the JanusGraphFactory.open() method. I want to create one more
> graph and traverse it. Can you please help me?
>
> Regards,
> Vinayak
Re: Cassandra Tombstone
Hello Aneesh,

Reading your message and the answers given, I really think this post I wrote about 3 years ago now (how quickly time goes by...) about tombstones might be of interest to you: https://thelastpickle.com/blog/2016/07/27/about-deletes-and-tombstones.html. Your problem is not related to tombstones I'd say, but the first part of the post explains how consistency works in Cassandra, and only then takes on the case of deletes/tombstones. This first part might help you solve your current issue: https://thelastpickle.com/blog/2016/07/27/about-deletes-and-tombstones.html#cassandra-some-availability-and-consistency-considerations. I really tried to reason from scratch, even for people who are new to Cassandra with a very basic knowledge of internals. For example, you'll still read things like:

> CL.READ = Consistency Level (CL) used for reads. Basically the number of
> nodes that will have to acknowledge the read for Cassandra to consider it
> successful.
> CL.WRITE = CL used for writes.
> RF = Replication Factor
>
> CL.READ + CL.WRITE > RF

If what you have is an availability issue, you should just make sure the CL is lower than or equal to the RF and that all nodes are up and responsive. FWIW, quorum = RF/2 + 1, thus if RF is 2 and the consistency level for deletes is quorum (i.e. "2 / 2 + 1 = 2"), one node down could start breaking availability (and most probably will), as CL > number of replicas available for certain partitions.

Reading the rest of the post might be useful while working out the design of the schema and queries, in particular if you plan to use deletes/TTLs.

I hope that helps,

C*heers,
---
Alain Rodriguez - al...@thelastpickle.com
France / Spain

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

Le mar. 18 juin 2019 à 09:56, Oleksandr Shulgin < oleksandr.shul...@zalando.de> a écrit :

> On Tue, Jun 18, 2019 at 8:06 AM ANEESH KUMAR K.M wrote:
>
>> I am using Cassandra cluster with 3 nodes which is hosted on AWS. Also we
>> have a NodeJS web application which is on AWS ELB. Now the issue is that,
>> when I add 2 or more servers (NodeJS) in AWS ELB, then the delete queries
>> are not working on Cassandra.
>
> Please provide a more concrete description than "not working". Do you get
> an error? Which one? Does it "not work" silently, i.e. w/o an error,
> but you don't observe the expected effect? What does the delete query look
> like, what is the effect you expect, and what do you observe instead?
>
> --
> Alex
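The availability arithmetic quoted above can be checked mechanically. A quick sketch of those rules — quorum = RF/2 + 1, and reads plus writes must overlap on at least one replica for strong consistency:

```python
# Quick check of the consistency rules discussed above.

def quorum(rf):
    # QUORUM = RF / 2 + 1 (integer division)
    return rf // 2 + 1

def strongly_consistent(cl_read, cl_write, rf):
    # Reads and writes overlap on at least one replica.
    return cl_read + cl_write > rf

def available(cl, replicas_up):
    # A request at consistency level cl needs that many live replicas.
    return replicas_up >= cl

assert quorum(2) == 2                # RF=2: QUORUM needs both replicas...
assert not available(quorum(2), 1)   # ...so one node down breaks availability
assert quorum(3) == 2                # RF=3: QUORUM tolerates one replica down
assert available(quorum(3), 2)
assert strongly_consistent(quorum(3), quorum(3), 3)  # 2 + 2 > 3
```

This is why RF=3 with QUORUM on both reads and writes is the usual recommendation: it gives strong consistency while surviving one replica being down.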
Re: node re-start delays, busy Deleting mc-txn-compaction / Adding log file replica
Also about your traces, and according to Jeff in another thread: the incomplete sstable will be deleted during startup (in 3.0 and newer > there’s a transaction log of each compaction in progress - that gets > cleaned during the startup process) > maybe that's what you are seeing? Again, I'm not really familiar with those traces. I find traces and debug pretty useless (or even counter-productive) in 99% of the cases, so I don't use them much. Le jeu. 20 juin 2019 à 12:25, Alain RODRIGUEZ a écrit : > Hello Asad, > > >> I’m on environment with apache Cassandra 3.11.1 with java 1.8.0_144. > > One Node went OOM and crashed. > > > If I remember well, firsts minor versions of C* 3.11 have memory leaks. It > seems it was fixed in your version though. > > 3.11.1 > > [...] > > * BTree.Builder memory leak (CASSANDRA-13754) > > > Yet other improvements were made later on: > > >> 3.11.3 > > [...] > > * Remove BTree.Builder Recycler to reduce memory usage (CASSANDRA-13929) > > * Reduce nodetool GC thread count (CASSANDRA-14475) > > > See: https://github.com/apache/cassandra/blob/cassandra-3.11/CHANGES.txt. > Before digging more I would upgrade to 3.11.latest (latest = 4 or 5 I > guess), because early versions of a major Cassandra versions are famous for > being quite broken, even though this major is a 'bug fix only' branch. > Also minor versions upgrades are not too risky to go through. I would > maybe start there if you're not too sure how to dig this. > > If it happens again or you don't want to upgrade, it would be interesting > to know: > - if the OOM happens inside the JVM or on native memory (then the OS > would be the one sending the kill signal). These 2 issues have different > (and sometime opposite) fixes. > - What's the host size (especially memory) and how the heap (and maybe > some off heap structures) are configured (at least what is not default). 
> - If you saw errors in the logs and what the 'nodetool tpstats' was > looking like when the node went down (it might have been dumped in the logs) > > I don't know much about those traces nor why Cassandra would take a long > time. Though they are traces and harder to interpret for me. What does the > INFO / WARN / ERR look like? > Maybe opening a lot of SSTables and/or replaying a lot of commit logs, > given the nature of the restart (post outage)? > To speed up things, when nodes are not crashing, under normal > circumstances, use 'nodetool drain' as part of stopping the node, before > stopping/killing the service/process. > > C*heers, > --- > Alain Rodriguez - al...@thelastpickle.com > France / Spain > > The Last Pickle - Apache Cassandra Consulting > http://www.thelastpickle.com > > Le mar. 18 juin 2019 à 23:43, ZAIDI, ASAD A a écrit : > >> >> >> I’m on environment with apache Cassandra 3.11.1 with java 1.8.0_144. >> >> >> >> One Node went OOM and crashed. Re-starting this crashed node is taking >> long time. 
Trace level debug log is showing messages like: >> >> >> >> >> >> Debug.log trace excerpt: >> >> >> >> >> >> TRACE [main] 2019-06-18 21:30:43,449 LogTransaction.java:217 - Deleting >> /cassandra/data/enterprise/device_connection_ws-f65649e0aea011e7baeb8166fa28890a/mc-9337720-big-CompressionInfo.db >> >> TRACE [main] 2019-06-18 21:30:43,449 LogTransaction.java:217 - Deleting >> /cassandra/data/enterprise/device_connection_ws-f65649e0aea011e7baeb8166fa28890a/mc-9337720-big-Filter.db >> >> TRACE [main] 2019-06-18 21:30:43,449 LogTransaction.java:217 - Deleting >> /cassandra/data/enterprise/device_connection_ws-f65649e0aea011e7baeb8166fa28890a/mc-9337720-big-TOC.txt >> >> TRACE [main] 2019-06-18 21:30:43,455 LogTransaction.java:217 - Deleting >> /cassandra/data/enterprise/device_connection_ws-f65649e0aea011e7baeb8166fa28890a/mc_txn_compaction_642976c0-91c3-11e9-97bb-6b1dee397c3f.log >> >> TRACE [main] 2019-06-18 21:30:43,458 LogReplicaSet.java:67 - Added log >> file replica >> /cassandra/data/enterprise/device_connection_ws-f65649e0aea011e7baeb8166fa28890a/mc_txn_compaction_5a6c8c90-91cc-11e9-97bb-6b1dee397c3f.log >> >> >> >> >> >> Above messages are repeated for unique [mc--* ] files. Such messages >> are repeating constantly. >> >> >> >> I’m seeking help here to find out what may be going on here , any hint to >> root cause and how I can quickly start the node. Thanks in advance. >> >> >> >> Regards/asad >> >> >> >> >> >> >> >
Re: node re-start delays, busy Deleting mc-txn-compaction / Adding log file replica
Hello Asad,

> I’m on environment with apache Cassandra 3.11.1 with java 1.8.0_144. One Node went OOM and crashed.

If I remember well, the first minor versions of C* 3.11 had memory leaks. It seems it was fixed in your version though:

> 3.11.1
> [...]
> * BTree.Builder memory leak (CASSANDRA-13754)

Yet other improvements were made later on:

> 3.11.3
> [...]
> * Remove BTree.Builder Recycler to reduce memory usage (CASSANDRA-13929)
> * Reduce nodetool GC thread count (CASSANDRA-14475)

See: https://github.com/apache/cassandra/blob/cassandra-3.11/CHANGES.txt. Before digging more I would upgrade to 3.11.latest (latest = 4 or 5 I guess), because early versions of a major Cassandra release are famous for being quite broken, even though this major is a 'bug fix only' branch. Also, minor version upgrades are not too risky to go through. I would maybe start there if you're not too sure how to dig into this.

If it happens again or you don't want to upgrade, it would be interesting to know:
- if the OOM happens inside the JVM or in native memory (then the OS would be the one sending the kill signal). These 2 issues have different (and sometimes opposite) fixes.
- What the host size is (especially memory) and how the heap (and maybe some off-heap structures) are configured (at least what is not default).
- If you saw errors in the logs and what the 'nodetool tpstats' output was looking like when the node went down (it might have been dumped in the logs).

I don't know much about those traces nor why Cassandra would take a long time; they are traces and harder to interpret for me. What do the INFO / WARN / ERROR logs look like? Maybe it is opening a lot of SSTables and/or replaying a lot of commit logs, given the nature of the restart (post outage)?

To speed things up when nodes are not crashing, under normal circumstances, use 'nodetool drain' as part of stopping the node, before stopping/killing the service/process.
C*heers, --- Alain Rodriguez - al...@thelastpickle.com France / Spain The Last Pickle - Apache Cassandra Consulting http://www.thelastpickle.com Le mar. 18 juin 2019 à 23:43, ZAIDI, ASAD A a écrit : > > > I’m on environment with apache Cassandra 3.11.1 with java 1.8.0_144. > > > > One Node went OOM and crashed. Re-starting this crashed node is taking > long time. Trace level debug log is showing messages like: > > > > > > Debug.log trace excerpt: > > > > > > TRACE [main] 2019-06-18 21:30:43,449 LogTransaction.java:217 - Deleting > /cassandra/data/enterprise/device_connection_ws-f65649e0aea011e7baeb8166fa28890a/mc-9337720-big-CompressionInfo.db > > TRACE [main] 2019-06-18 21:30:43,449 LogTransaction.java:217 - Deleting > /cassandra/data/enterprise/device_connection_ws-f65649e0aea011e7baeb8166fa28890a/mc-9337720-big-Filter.db > > TRACE [main] 2019-06-18 21:30:43,449 LogTransaction.java:217 - Deleting > /cassandra/data/enterprise/device_connection_ws-f65649e0aea011e7baeb8166fa28890a/mc-9337720-big-TOC.txt > > TRACE [main] 2019-06-18 21:30:43,455 LogTransaction.java:217 - Deleting > /cassandra/data/enterprise/device_connection_ws-f65649e0aea011e7baeb8166fa28890a/mc_txn_compaction_642976c0-91c3-11e9-97bb-6b1dee397c3f.log > > TRACE [main] 2019-06-18 21:30:43,458 LogReplicaSet.java:67 - Added log > file replica > /cassandra/data/enterprise/device_connection_ws-f65649e0aea011e7baeb8166fa28890a/mc_txn_compaction_5a6c8c90-91cc-11e9-97bb-6b1dee397c3f.log > > > > > > Above messages are repeated for unique [mc--* ] files. Such messages > are repeating constantly. > > > > I’m seeking help here to find out what may be going on here , any hint to > root cause and how I can quickly start the node. Thanks in advance. > > > > Regards/asad > > > > > > >
Re: How to query TTL on collections ?
Hello Maxim,

I think you won't be able to do what you want this way. Collections are supposed to be (ideally small) sets of data that you'll always read entirely, at once. At least it seems to be working this way. I'm not sure about the latest versions, but I did not hear about a new design for collections.

You can set values individually in a collection as you did above (and probably should do so, to avoid massive tombstone creation), but you have to read the whole thing at once:

```
$ ccm node1 cqlsh -e "SELECT items[10] FROM tlp_labs.products WHERE product_id=1;"
:1:SyntaxException: line 1:12 no viable alternative at input '[' (SELECT [items][...)

$ ccm node1 cqlsh -e "SELECT items FROM tlp_labs.products WHERE product_id=1;"
 items
{10: {csn: 100, name: 'item100'}, 20: {csn: 200, name: 'item200'}}
```

Furthermore, you cannot query the TTL for a single item in a collection, and as distinct columns can have distinct TTLs, you cannot query the TTL for the whole map (collection) either. As you cannot get the TTL for the whole thing, nor query a single item of the collection, I guess there is no way to get the currently set TTL for all or part of a collection.

If you need it, you would need to redesign this table, maybe split it. Make the collection a separate table, for example, that would then be referenced in your current table. Another hack I'm just thinking about could be to add a 'ttl' field that would get the updates as well: any time a client updates the TTL for an entry, you could update that 'ttl' field too. But again, you would still not be able to query this information for only an item or a few; it would mean querying the whole map again.

I had to test it because I could not remember this, and I think my observations make sense. Sadly, there is no 'good' syntax for this query; it's just not permitted at all, I would say. Sorry I have no better news for you :).
C*heers,
---
Alain Rodriguez - al...@thelastpickle.com
France / Spain

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

Le mer. 19 juin 2019 à 09:21, Maxim Parkachov a écrit :

> Hi everyone,
>
> I'm struggling to understand how I can query the TTL on a row in a
> collection (Cassandra 3.11.4). Here is my schema:
>
> CREATE TYPE item (
>     csn bigint,
>     name text
> );
>
> CREATE TABLE products (
>     product_id bigint PRIMARY KEY,
>     items map<bigint, frozen<item>>
> );
>
> And I'm creating records with TTL like this:
>
> UPDATE products USING TTL 10 SET items = items + {10: {csn: 100, name: 'item100'}} WHERE product_id = 1;
> UPDATE products USING TTL 20 SET items = items + {20: {csn: 200, name: 'item200'}} WHERE product_id = 1;
>
> As expected, the first record disappears after 10 seconds and the second
> after 20. But if I already have data in the table, I could not figure out
> how to query the TTL on the item value:
>
> SELECT TTL(items) FROM products WHERE product_id=1;
> InvalidRequest: Error from server: code=2200 [Invalid query]
> message="Cannot use selection function ttl on collections"
>
> SELECT TTL(items[10]) FROM products WHERE product_id=1;
> SyntaxException: line 1:16 mismatched input '[' expecting ')' (SELECT TTL(items[[]...)
>
> Any tips, hints, tricks are highly appreciated,
> Maxim.
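The 'ttl' field hack suggested above can be sketched client-side. This is an illustrative Python model (the class and method names are made up, not a driver API): alongside each map entry the client records an absolute expiry, so the remaining TTL of a single item becomes queryable even though Cassandra cannot answer `TTL(items[10])` for a collection entry:

```python
import time

# Hypothetical client-side helper sketching the 'ttl' field workaround:
# every write records an absolute expiry next to the item, so the remaining
# TTL per key can be computed later without server-side collection TTLs.
class ProductItems:
    def __init__(self):
        self.items = {}    # key -> item payload
        self.expiry = {}   # key -> absolute expiry timestamp (the 'ttl' field)

    def put(self, key, value, ttl_seconds):
        self.items[key] = value
        self.expiry[key] = time.time() + ttl_seconds

    def remaining_ttl(self, key):
        """Seconds left before `key` expires, or None if never written."""
        if key not in self.expiry:
            return None
        return max(0.0, self.expiry[key] - time.time())
```

Note the stored expiry only mirrors what this client wrote; it is not authoritative if another client rewrites the entry with a different TTL.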
Re: Tombstones not getting purged
Léo, if a major compaction isn't a viable option, you can give the Instaclustr SSTable tools a go to target the partitions with the most tombstones: https://github.com/instaclustr/cassandra-sstable-tools/tree/cassandra-2.2#ic-purge

It generates a report like this:

Summary:
+---------+---------+
|         | Size    |
+---------+---------+
| Disk    | 1.9 GB  |
| Reclaim | 11.7 MB |
+---------+---------+

Largest reclaimable partitions:
+--------------+--------+---------+-----------------+
| Key          | Size   | Reclaim | Generations     |
+--------------+--------+---------+-----------------+
| 001.2.340862 | 3.2 kB | 3.2 kB  | [534, 438, 498] |
| 001.2.946243 | 2.9 kB | 2.8 kB  | [534, 434, 384] |
| 001.1.527557 | 2.8 kB | 2.7 kB  | [534, 519, 394] |
| 001.2.181797 | 2.6 kB | 2.6 kB  | [534, 424, 343] |
| 001.3.475853 | 2.7 kB | 28 B    | [524, 462]      |
| 001.0.159704 | 2.7 kB | 28 B    | [440, 247]      |
| 001.1.311372 | 2.6 kB | 28 B    | [424, 458]      |
| 001.0.756293 | 2.6 kB | 28 B    | [428, 358]      |
| 001.2.681009 | 2.5 kB | 28 B    | [440, 241]      |
| 001.2.474773 | 2.5 kB | 28 B    | [524, 484]      |
| 001.2.974571 | 2.5 kB | 28 B    | [386, 517]      |
| 001.0.143176 | 2.5 kB | 28 B    | [518, 368]      |
| 001.1.185198 | 2.5 kB | 28 B    | [517, 386]      |
| 001.3.503517 | 2.5 kB | 28 B    | [426, 346]      |
| 001.1.847384 | 2.5 kB | 28 B    | [436, 396]      |
| 001.0.949269 | 2.5 kB | 28 B    | [516, 356]      |
| 001.0.756763 | 2.5 kB | 28 B    | [440, 249]      |
| 001.3.973808 | 2.5 kB | 28 B    | [517, 386]      |
| 001.0.312718 | 2.4 kB | 28 B    | [524, 467]      |
| 001.3.632066 | 2.4 kB | 28 B    | [432, 377]      |
| 001.1.946590 | 2.4 kB | 28 B    | [519, 389]      |
| 001.1.798591 | 2.4 kB | 28 B    | [434, 388]      |
| 001.3.953922 | 2.4 kB | 28 B    | [432, 375]      |
| 001.2.585518 | 2.4 kB | 28 B    | [432, 375]      |
| 001.3.284942 | 2.4 kB | 28 B    | [376, 432]      |
+--------------+--------+---------+-----------------+

Once you've identified these partitions you can run a compaction on the SSTables that contain them (identified using "nodetool getsstables"). Note that user-defined compactions are only available for STCS.
Also, ic-purge will perform a compaction but without writing to disk (it should look like a validation compaction), so it is rightfully reported by the docs as an "intensive process" (though not more so than a repair).

-
Alexander Dejanovski
France
@alexanderdeja

Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On Thu, Jun 20, 2019 at 9:17 AM Alexander Dejanovski wrote:

> My bad on date formatting, it should have been: %Y/%m/%d
> Otherwise the SSTables aren't ordered properly.
>
> You have 2 SSTables that claim to cover timestamps from 1940 to 2262,
> which is weird.
> Aside from that, you have big overlaps all over the SSTables, so that's
> probably why your tombstones are sticking around.
>
> Your best shot here will be a major compaction of that table, since it
> doesn't seem so big. Remember to use the --split-output flag on the
> compaction command to avoid ending up with a single SSTable after that.
>
> Cheers,
>
> -
> Alexander Dejanovski
> France
> @alexanderdeja
>
> Consultant
> Apache Cassandra Consulting
> http://www.thelastpickle.com
>
> On Thu, Jun 20, 2019 at 8:13 AM Léo FERLIN SUTTON wrote:
>
>> On Thu, Jun 20, 2019 at 7:37 AM Alexander Dejanovski <
>> a...@thelastpickle.com> wrote:
>>
>>> Hi Leo,
>>>
>>> The overlapping SSTables are indeed the most probable cause, as suggested
>>> by Jeff.
>>> Do you know if the tombstone compactions actually triggered? (Did the
>>> SSTable names change?)
>>
>> Hello!
>>
>> I believe they have changed. I do not remember the SSTable names but the
>> "last modified" has changed recently for these tables.
>>
>>> Could you run the following command to list SSTables and provide us the
>>> output? It will display both their timestamp ranges and the estimated
>>> droppable tombstones ratio.
>>> for f in *Data.db; do
>>>   meta=$(sstablemetadata -gc_grace_seconds 259200 $f)
>>>   echo $(date --date=@$(echo "$meta" | grep Maximum\ time | cut -d" " -f3 | cut -c 1-10) '+%m/%d/%Y %H:%M:%S') \
>>>        $(date --date=@$(echo "$meta" | grep Minimum\ time | cut -d" " -f3 | cut -c 1-10) '+%m/%d/%Y %H:%M:%S') \
>>>        $(echo "$meta" | grep droppable) $(ls -lh $f)
>>> done | sort
>>
>> Here are the results:
>>
>> ```
>> 04/01/2019 22:53:12 03/06/2018 16:46:13 Estimated droppable tombstones: 0.0 -rw-r--r-- 1 cassandra cassandra 16G Apr 13 14:35 md-147916-big-Data.db
>> 04/11/2262 23:47:16 10/09/1940 19:13:17 Estimated droppable tombstones: 0.0 -rw-r--r-- 1 cassandra cassandra 218M Jun 20 05:57 md-167948-big-Data.db
>> 04/11/2262 23:47:16 10/09/1940 19:13:17 Estimated droppable tombstones: 0.0 -rw-r--r-- 1 cassandra cassandra 2.2G Jun 20 05:57 md-167942-big-Data.db
>> 05/01/2019 08:03:24 03/06/2018 1
Unexpected error while refreshing token map, keeping previous version (IllegalArgumentException: Multiple entries with same key)?
Hey!

I’ve just configured a test 3-node Cassandra cluster and ran a very trivial Java test against it. I see the following warning from the java-driver on each CqlSession initialization:

13:54:13.913 [loader-admin-0] WARN c.d.o.d.i.c.metadata.DefaultMetadata - [loader] Unexpected error while refreshing token map, keeping previous version (IllegalArgumentException: Multiple entries with same key: Murmur3Token(-1060405237057176857)=/127.0.0.1:9042 and Murmur3Token(-1060405237057176857)=/127.0.0.1:9042)

What does it mean? Why? Cassandra 3.11.4, driver 4.0.1.

nodetool status

Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address       Load        Tokens  Owns (effective)  Host ID                               Rack
UN  10.73.66.36   419.36 MiB  256     100.0%            fafa2737-9024-437b-9a59-c1c037bce244  rack1
UN  10.73.66.100  336.47 MiB  256     100.0%            d5323ad0-f8cd-42d4-b34d-9afcd002ea47  rack1
UN  10.73.67.196  336.4 MiB   256     100.0%            74dffe0c-32a4-4071-8b36-5ada5afa4a7d  rack1

The issue persists if I reset the cluster; only the token value changes.

Alexander
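A minimal sketch of what the driver is doing when it logs this (illustrative Python, not the actual java-driver code): when several peers advertise the same rpc_address, two hosts collapse onto one endpoint, and the token-to-replica map build hits a duplicate key, much like Guava's ImmutableMap.Builder does inside the driver's DefaultTokenMap:

```python
# Illustrative stand-in for the driver's token map build: like Guava's
# ImmutableMap, it rejects duplicate keys, which is what produces the
# warning when two peers resolve to the same rpc_address.
def build_token_map(entries):
    token_map = {}
    for token, endpoint in entries:
        if token in token_map:
            # Mirrors the IllegalArgumentException message seen in the log.
            raise ValueError(
                f"Multiple entries with same key: "
                f"{token}={token_map[token]} and {token}={endpoint}")
        token_map[token] = endpoint
    return token_map
```

The likely fix is server-side: give each node a unique broadcast_rpc_address (or rpc_address), so that system.peers no longer reports the same address for every peer.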
Re: Tombstones not getting purged
My bad on date formatting, it should have been: %Y/%m/%d
Otherwise the SSTables aren't ordered properly.

You have 2 SSTables that claim to cover timestamps from 1940 to 2262, which is weird. Aside from that, you have big overlaps all over the SSTables, so that's probably why your tombstones are sticking around.

Your best shot here will be a major compaction of that table, since it doesn't seem so big. Remember to use the --split-output flag on the compaction command to avoid ending up with a single SSTable after that.

Cheers,

-
Alexander Dejanovski
France
@alexanderdeja

Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On Thu, Jun 20, 2019 at 8:13 AM Léo FERLIN SUTTON wrote:

> On Thu, Jun 20, 2019 at 7:37 AM Alexander Dejanovski <
> a...@thelastpickle.com> wrote:
>
>> Hi Leo,
>>
>> The overlapping SSTables are indeed the most probable cause, as suggested
>> by Jeff.
>> Do you know if the tombstone compactions actually triggered? (Did the
>> SSTable names change?)
>
> Hello!
>
> I believe they have changed. I do not remember the SSTable names but the
> "last modified" has changed recently for these tables.
>
>> Could you run the following command to list SSTables and provide us the
>> output? It will display both their timestamp ranges and the estimated
>> droppable tombstones ratio.
>> for f in *Data.db; do
>>   meta=$(sstablemetadata -gc_grace_seconds 259200 $f)
>>   echo $(date --date=@$(echo "$meta" | grep Maximum\ time | cut -d" " -f3 | cut -c 1-10) '+%m/%d/%Y %H:%M:%S') \
>>        $(date --date=@$(echo "$meta" | grep Minimum\ time | cut -d" " -f3 | cut -c 1-10) '+%m/%d/%Y %H:%M:%S') \
>>        $(echo "$meta" | grep droppable) $(ls -lh $f)
>> done | sort
>
> Here are the results:
>
> ```
> 04/01/2019 22:53:12 03/06/2018 16:46:13 Estimated droppable tombstones: 0.0 -rw-r--r-- 1 cassandra cassandra 16G Apr 13 14:35 md-147916-big-Data.db
> 04/11/2262 23:47:16 10/09/1940 19:13:17 Estimated droppable tombstones: 0.0 -rw-r--r-- 1 cassandra cassandra 218M Jun 20 05:57 md-167948-big-Data.db
> 04/11/2262 23:47:16 10/09/1940 19:13:17 Estimated droppable tombstones: 0.0 -rw-r--r-- 1 cassandra cassandra 2.2G Jun 20 05:57 md-167942-big-Data.db
> 05/01/2019 08:03:24 03/06/2018 16:46:13 Estimated droppable tombstones: 0.0 -rw-r--r-- 1 cassandra cassandra 4.6G May 1 08:39 md-152253-big-Data.db
> 05/09/2018 06:35:03 03/06/2018 16:46:07 Estimated droppable tombstones: 0.0 -rw-r--r-- 1 cassandra cassandra 30G Apr 13 22:09 md-147948-big-Data.db
> 05/21/2019 05:28:01 03/06/2018 16:46:16 Estimated droppable tombstones: 0.45150604672159905 -rw-r--r-- 1 cassandra cassandra 1.1G Jun 20 05:55 md-167943-big-Data.db
> 05/22/2019 11:54:33 03/06/2018 16:46:16 Estimated droppable tombstones: 0.30826566640798975 -rw-r--r-- 1 cassandra cassandra 7.6G Jun 20 04:35 md-167913-big-Data.db
> 06/13/2019 00:02:40 03/06/2018 16:46:08 Estimated droppable tombstones: 0.20980847354256815 -rw-r--r-- 1 cassandra cassandra 6.9G Jun 20 04:51 md-167917-big-Data.db
> 06/17/2019 05:56:12 06/16/2019 20:33:52 Estimated droppable tombstones: 0.6114260192855792 -rw-r--r-- 1 cassandra cassandra 257M Jun 20 05:29 md-167938-big-Data.db
> 06/18/2019 11:21:55 03/06/2018 17:48:22 Estimated droppable tombstones: 0.18655813086540254 -rw-r--r-- 1 cassandra cassandra 2.2G Jun 20 05:52 md-167940-big-Data.db
> 06/19/2019 16:53:04 06/18/2019 11:22:04 Estimated droppable tombstones: 0.0 -rw-r--r-- 1 cassandra cassandra 425M Jun 19 17:08 md-167782-big-Data.db
> 06/20/2019 04:17:22 06/19/2019 16:53:04 Estimated droppable tombstones: 0.0 -rw-r--r-- 1 cassandra cassandra 146M Jun 20 04:18 md-167921-big-Data.db
> 06/20/2019 05:50:23 06/20/2019 04:17:32 Estimated droppable tombstones: 0.0 -rw-r--r-- 1 cassandra cassandra 42M Jun 20 05:56 md-167946-big-Data.db
> 06/20/2019 05:56:03 06/20/2019 05:50:32 Estimated droppable tombstones: 0.0 -rw-r--r-- 2 cassandra cassandra 4.8M Jun 20 05:56 md-167947-big-Data.db
> 07/03/2018 17:26:54 03/06/2018 16:46:07 Estimated droppable tombstones: 0.0 -rw-r--r-- 1 cassandra cassandra 27G Apr 13 17:45 md-147919-big-Data.db
> 09/09/2018 18:55:23 03/06/2018 16:46:08 Estimated droppable tombstones: 0.0 -rw-r--r-- 1 cassandra cassandra 30G Apr 13 18:57 md-147926-big-Data.db
> 11/30/2018 11:52:33 03/06/2018 16:46:08 Estimated droppable tombstones: 0.0 -rw-r--r-- 1 cassandra cassandra 14G Apr 13 13:53 md-147908-big-Data.db
> 12/20/2018 07:30:03 03/06/2018 16:46:08 Estimated droppable tombstones: 0.0 -rw-r--r-- 1 cassandra cassandra 9.3G Apr 13 13:28 md-147906-big-Data.db
> ```
>
>> You could also check the min and max tokens in each SSTable (not sure if
>> you get that info from 3.0 sstablemetadata) so that you can detect the
>> SSTables that overlap on token ranges with the ones that carry the
>> tombstones, and have earlier timestamps. This way you'll be able to trigger
>> manual compactions, targeting those specific SSTables.
>
> I have checked and I don't be
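The overlap diagnosis in this thread boils down to an interval-intersection check on the Minimum/Maximum timestamps reported by sstablemetadata. A small sketch of that check (names illustrative, not a Cassandra tool):

```python
# Given (name, min_timestamp, max_timestamp) per SSTable, report the pairs
# whose timestamp ranges intersect. Overlapping older SSTables are what keep
# droppable tombstones from being purged.
def overlapping_pairs(sstables):
    pairs = []
    for i, (a, amin, amax) in enumerate(sstables):
        for b, bmin, bmax in sstables[i + 1:]:
            if amin <= bmax and bmin <= amax:  # classic interval overlap test
                pairs.append((a, b))
    return pairs
```

Run over the listing above, almost every old SSTable would pair with the tombstone-heavy ones, which is consistent with the advice to do a major compaction.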