Re: Increased read latency with Cassandra >= 3.11.7
Dear community, is someone working on this ticket? This is clearly a performance regression; we are stuck on 3.11.6 and cannot upgrade to the latest version. Regards, Maxim.

On Mon, Feb 22, 2021 at 10:37 AM Ahmed Eljami wrote:
> Hey,
> I have created the issue, here => https://issues.apache.org/jira/browse/CASSANDRA-16465
>
> On Fri, Feb 19, 2021 at 10:10 AM, Ahmed Eljami wrote:
>
>> Hi folks,
>>
>> If this can help, we encountered the same behaviour with 3.11.9. We are using LCS.
>> After upgrading from 3.11.3 to 3.11.9 in a bench environment, Cassandra read latency (p99) is multiplied by ~3.
>>
>> We are planning a second test with 3.11.6, I'll send you the results when it's done.
>> Cheers,
>>
>> On Mon, Feb 15, 2021 at 7:55 PM, Jai Bheemsen Rao Dhanwada <jaibheem...@gmail.com> wrote:
>>
>>> Any update on this? Any idea if this is already tracked under a JIRA issue, so we can follow for updates?
>>>
>>> On Wednesday, February 10, 2021, Johannes Weißl wrote:
>>> Hi Nico,
>>> On Mon, Sep 14, 2020 at 03:51PM +0200, Nicolai Lune Vest wrote:
>>> > after upgrading my Cassandra nodes from version 3.11.6 to either
>>> > 3.11.7 or 3.11.8 I experience a significant increase in read latency
>>> Any update here? Does version 3.11.10 provide any improvement?
>>> Thanks, Johannes
>>>
>>> -
>>> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
>>> For additional commands, e-mail: user-h...@cassandra.apache.org
>>
>> --
>> Regards,
>>
>> Ahmed ELJAMI
>
> --
> Regards,
>
> Ahmed ELJAMI
Re: Reduce num_tokens on single node cluster
On Fri, Jul 30, 2021 at 7:21 PM Bowen Song wrote:
> Since you have only one node, sstableloader is unnecessary. Copying/moving the data directory back to the right place and restarting Cassandra, or running 'nodetool refresh', is sufficient. Do not restore the 'system' keyspace, but do restore the other system keyspaces, such as 'system_auth' and 'system_schema'.
>
> In fact, it comes to my mind that the following is much quicker, simpler and should do the trick anyway:
>
> 1. shut down Cassandra
> 2. back up /var/lib/cassandra (just in case...)
> 3. run 'rm -rf /var/lib/cassandra/data/system/local-7ad54392bcdd35a684174e047860b377'
> 4. change num_tokens
> 5. start Cassandra

Thanks Bowen, I used the simple procedure, everything worked well, and the data is there. Regards, Maxim.
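For reference, Bowen's five steps could be written up as the dry-run script below. The data path, yaml path, and systemd service name are assumptions for a default package install; the script only echoes the commands until RUN is cleared.

```shell
# Dry-run sketch of the five steps above. RUN="echo" only prints each command;
# set RUN="" to execute for real. Paths and the service name assume a default
# package install -- adjust to your data_file_directories and init system.
RUN="echo"
DATA=/var/lib/cassandra
YAML=/etc/cassandra/cassandra.yaml

$RUN systemctl stop cassandra                       # 1. shut down Cassandra
$RUN cp -a "$DATA" "$DATA.bak"                      # 2. back up, just in case
$RUN rm -rf "$DATA/data/system/local-7ad54392bcdd35a684174e047860b377"  # 3. drop local token state
$RUN sed -i 's/^num_tokens:.*/num_tokens: 16/' "$YAML"   # 4. change num_tokens
$RUN systemctl start cassandra                      # 5. start Cassandra
```

On the next start the node regenerates its `system.local` entry with the new token count, while the user keyspaces stay in place.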
Re: Reduce num_tokens on single node cluster
Thanks for the quick answer.

> Do you ever intend to add nodes to this single node cluster? If not, I don't see how the number of tokens matters at all.

I understand that; I would like to have all environments with the same settings.

> However, if you really want to change it and don't mind downtime, you can do this:
>
> 1. make a backup of the data
> 2. completely destroy the node with all data in it
> 3. create a new node on the same server with the desired num_tokens
> 4. restore the backup

By 'restore the backup' do you mean sstableloader? Regards, Maxim.
Reduce num_tokens on single node cluster
Hi everyone, I have several development servers with a single node and num_tokens: 256. In preparation for testing 4.0 I would like to change num_tokens to 16. Unfortunately I cannot add any additional nodes or an additional DC, but I'm fine with downtime. The important part: the data should be preserved. What options do I have? Thanks in advance, Maxim.
Re: Cassandra 3.11 cqlsh doesn't work with latest JDK
Hi Sean, thanks for the quick answer. I have applied your suggestion and tested it on several environments; everything is working fine. Other communication protected by SSL, such as server-to-server and client-to-server, is working without problems as well. Regards, Maxim.

On Fri, Apr 30, 2021 at 3:18 PM Durity, Sean R wrote:
> Try adding this into the SSL section of your cqlshrc file:
>
> version = SSLv23
>
> Sean Durity
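For reference, the suggested line sits in the `[ssl]` section of `~/.cassandra/cqlshrc`; the certfile path below is a placeholder, not something from the thread:

```ini
; ~/.cassandra/cqlshrc -- [ssl] section only; certfile path is an example.
[ssl]
; SSLv23 means "negotiate the highest protocol both sides support", which
; lets cqlsh pick TLSv1.2 instead of defaulting to the now-disabled TLS 1.0.
version = SSLv23
certfile = ~/keys/node.cer.pem
validate = true
```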
Cassandra 3.11 cqlsh doesn't work with latest JDK
Hi everyone, I have Apache Cassandra 3.11.6 with SSL encryption, CentOS Linux release 7.9, Python 2.7.5. The JDK and Python come from the operating system. I updated the operating system today and with that I got a new JDK:

$ java -version
openjdk version "1.8.0_292"
OpenJDK Runtime Environment (build 1.8.0_292-b10)
OpenJDK 64-Bit Server VM (build 25.292-b10, mixed mode)

Now when I try to connect to my local instance of Cassandra with cqlsh I get an error:

$ cqlsh --ssl -u cassandra -p cassandra
Connection error: ('Unable to connect to any servers', {'127.0.0.1': error(1, u"Tried connecting to [('127.0.0.1', 9142)]. Last error: [SSL: WRONG_VERSION_NUMBER] wrong version number (_ssl.c:618)")})

Apparently, the latest JDK release (*_292) disabled TLS 1.0 and TLS 1.1. Is this a known issue? Is there something I could do to quickly remedy the situation? Thanks in advance, Maxim.
Repairs on table with daily full load
Hi everyone, there are a lot of articles, and this question has probably been asked many times already, but I'm still not 100% sure. We have a table which we reload almost in full every night with a Spark job, at consistency LOCAL_QUORUM and with a record TTL of 7 days. This is to remove records that are not present in the last 7 imports. The table is located in 2 DCs. We are interested only in the last record state. The definition of the table is below. After the load, we run a repair with Reaper on this table, which takes a lot of time and resources. We have multiple such tables, and most of the repair time is spent on them. Running the full load again takes less time than a repair of this table. The question is: do we actually need to run repairs on this table at all? If yes, how often: daily, weekly? Thanks in advance, Maxim.

WITH bloom_filter_fp_chance = 0.01
  AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
  AND comment = ''
  AND compaction = {'class': 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy'}
  AND compression = {'chunk_length_in_kb': '16', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
  AND crc_check_chance = 1.0
  AND dclocal_read_repair_chance = 0.1
  AND default_time_to_live = 0
  AND gc_grace_seconds = 864000
  AND max_index_interval = 2048
  AND memtable_flush_period_in_ms = 0
  AND min_index_interval = 128
  AND read_repair_chance = 0.0
  AND speculative_retry = '99PERCENTILE';
Re: Re: Increased read latency with Cassandra >= 3.11.7
Hi, I'm not sure if this will help, but today I tried changing one node from 3.11.6 to 3.11.9. We are NOT using TWCS. Very heavy read pattern, almost no writes, with a constant performance-test load. Cassandra read latency (p99) increased significantly, but NOT on the node where I changed the version (top line 90 ms vs. the normal 2 ms). Going back to 3.11.6 immediately fixed the issue. It would be interesting to check all nodes on the new version, but I cannot do such a test for several weeks. At least in a mixed cluster this is a no-go. Regards, Maxim.

[Screenshot omitted: read-latency graph.]
Re: Re: Increased read latency with Cassandra >= 3.11.7
Hi Nico, we wanted to upgrade to 3.11.8 (from .6), but now I'm very concerned, as our load is mostly read-only and very latency sensitive. Did you figure out the reason for this behaviour with the new version? Regards, Maxim.

On Tue, Sep 15, 2020 at 5:24 AM Sagar Jambhulkar wrote:
> That is odd. I thought maybe the lower version had a bigger custom cache or some other setting missed in the upgrade.
>
> On Mon, 14 Sep 2020, 23:00 Nicolai Lune Vest wrote:
>
>> While the cache configuration stays the same, the cache hit rate for both the chunk cache and the key cache is worse (but still ok) after the update. You can see the change in the attached screenshot (the update was performed on September 7th, ~4pm).
>>
>> @Sagar Any specific reason you are suspecting the cache?
>>
>> -"Sagar Jambhulkar" wrote: -
>> To: user@cassandra.apache.org
>> From: "Sagar Jambhulkar"
>> Date: 14.09.2020 16:25
>> Subject: Re: Increased read latency with Cassandra >= 3.11.7
>>
>> Maybe compare the cache sizes and see if anything is different between the two versions?
>>
>> On Mon, 14 Sep 2020, 19:21 Nicolai Lune Vest wrote:
>>
>> Dear Cassandra community,
>>
>> after upgrading my Cassandra nodes from version 3.11.6 to either 3.11.7 or 3.11.8 I experience a significant increase in read latency.
>> With 3.11.6 the average read latency is ~0.35 ms. With 3.11.7 the average read latency increases to ~2.9 ms, almost 10 times worse! This behavior applies to all my environments, testing and production.
>>
>> I also observe that the bloom-filter false positive rate and SSTable reads increase significantly with versions >= 3.11.7.
>>
>> I'm wondering if someone experienced the same problem when updating Cassandra from 3.11.6 to either 3.11.7 or 3.11.8 and can help me out.
>>
>> I already tried nodetool scrub, sstablescrub, manual compaction, and repair on selected tables without impact. When downgrading back to 3.11.6, read latency recovers immediately.
>> My update routine is very simple. I'm just stopping the service, installing the new version (using apt), and starting the service again, one node after the other. Coming from 3.11.6, updating to 3.11.7 (or 3.11.8) is just a patch update and I expect this routine should do well, but maybe I'm missing some important steps here.
>>
>> I can provide more information about my system or configuration if required.
>>
>> Appreciate your support!
>>
>> Kind regards
>> Nico
>>
>> LANCOM Systems GmbH, Adenauerstr. 20 / B2, 52146 Würselen, Germany
>> Tel: +49 2405 49936-0, Fax: +49 2405 49936-99
>> Web: https://www.lancom-systems.de
>> Managing directors: Ralf Koenzen, Stefan Herrlich
>> Registered office: Aachen, Amtsgericht Aachen, HRB 16976
Re: [EXTERNAL] How to reduce vnodes without downtime
Hi guys, thanks a lot for the useful tips. I obviously underestimated the complexity of such a change. Thanks again, Maxim.
How to reduce vnodes without downtime
Hi everyone, with the discussion about reducing the default vnodes in version 4.0, I would like to ask what the optimal procedure would be to reduce vnodes in an existing 3.11.x cluster which was set up with the default value of 256. The cluster has 2 DCs with 5 nodes each and RF=3. There is one more restriction: I cannot add more servers, nor create an additional DC; everything is physical. This should be done without downtime. My idea for such a procedure would be, for each node:

- decommission the node
- set auto_bootstrap to true and num_tokens to 4
- start and wait till the node joins the cluster
- run cleanup on the rest of the nodes in the cluster
- run repair on the whole cluster (not sure if needed after cleanup)
- set auto_bootstrap to false

Repeat for each node, then a rolling restart of the cluster and a full cluster repair. Does this sound right? My concern is that after decommission the node will start on the same IP, which could create some confusion. Regards, Maxim.
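The per-node steps above could be sketched as the dry-run script below (it only echoes the commands; clear RUN to execute). The paths, the systemd service name, and the assumption that an `auto_bootstrap:` line already exists in the yaml are mine, not from the thread:

```shell
# Dry-run sketch of the proposed per-node loop; RUN="" executes for real.
RUN="echo"
YAML=/etc/cassandra/cassandra.yaml

$RUN nodetool decommission                         # stream this node's ranges away
$RUN systemctl stop cassandra
$RUN rm -rf /var/lib/cassandra/data                # old data; node will re-bootstrap
$RUN sed -i -e 's/^num_tokens:.*/num_tokens: 4/' \
            -e 's/^auto_bootstrap:.*/auto_bootstrap: true/' "$YAML"
$RUN systemctl start cassandra                     # wait for UN in 'nodetool status'
# afterwards, on every OTHER node in the cluster:
$RUN nodetool cleanup
```

With RF=3 and 5 nodes per DC this keeps two replicas of every range online while one node is out, which is why the cluster stays available throughout.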
Re: How to query TTL on collections ?
Hi Alain, thanks a lot for the detailed answer.

> You can set values individually in a collection as you did above (and probably should do so to avoid massive tombstone creation), but you have to read the whole thing at once.

This, actually, is one of the design goals. At the moment I have two (actually more) "normalised" tables, which store the data as separate columns. But the use case requires reading all items every time a product is queried, hence the move to a collection to reduce the number of queries. Moreover, items in the collection are append-only and expire via TTL. A map seems to be an excellent fit for that.

> Furthermore, you cannot query the TTL for a single item in a collection, and as distinct columns can have distinct TTLs, you cannot query the TTL for the whole map (collection). As you cannot get the TTL for the whole thing, nor query a single item of the collection, I guess there is no way to get the currently set TTL for all or part of a collection.

Yes, this is unfortunate. I could not find a way to query an individual element of a collection, nor the TTL of an individual element; thanks for confirming that this is not possible.

> Another idea of a hack I'm just thinking about could be to add a 'ttl' field that would get the updates as well; any time a client updates the TTL for an entry, you could update that 'ttl' field as well. But again, you would still not be able to query this information for only an item or a few; it would be querying the whole map again.

This is, actually, a very good idea; maybe adding a field like "expires_at" will solve my problem. I received the same advice elsewhere. For a one-off query, it is possible to get the TTL by finding the corresponding SSTable and using sstabledump; it shows all information including the TTL, but this is very cumbersome. Regards, Maxim. P.S. Your company's blog is excellent.
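The "expires_at" idea could be sketched in CQL against the schema from the original question; the companion map, its name, and the literal timestamp are illustrative assumptions, not something from the thread:

```sql
-- Hypothetical companion map mirroring each item's expiry, written in the
-- same statement (and with the same TTL) as the item itself.
ALTER TABLE products ADD expires_at map<bigint, timestamp>;

UPDATE products USING TTL 10
   SET items      = items      + {10: {csn: 100, name: 'item100'}},
       expires_at = expires_at + {10: '2021-05-01 12:00:10'}
 WHERE product_id = 1;

-- The whole map is still read at once, but the expiry is now visible:
SELECT expires_at FROM products WHERE product_id = 1;
```

Because both map entries share one write and one TTL, they expire together, so the companion entry never outlives its item.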
How to query TTL on collections ?
Hi everyone, I'm struggling to understand how I can query the TTL of an item in a collection (Cassandra 3.11.4). Here is my schema:

CREATE TYPE item (
    csn bigint,
    name text
);

CREATE TABLE products (
    product_id bigint PRIMARY KEY,
    items map<bigint, frozen<item>>
);

And I'm creating records with TTL like this:

UPDATE products USING TTL 10 SET items = items + {10: {csn: 100, name: 'item100'}} WHERE product_id = 1;
UPDATE products USING TTL 20 SET items = items + {20: {csn: 200, name: 'item200'}} WHERE product_id = 1;

As expected, the first record disappears after 10 seconds and the second after 20. But for data already in the table I cannot figure out how to query the TTL of an item value:

SELECT TTL(items) FROM products WHERE product_id=1;
InvalidRequest: Error from server: code=2200 [Invalid query] message="Cannot use selection function ttl on collections"

SELECT TTL(items[10]) FROM products WHERE product_id=1;
SyntaxException: line 1:16 mismatched input '[' expecting ')' (SELECT TTL(items[[]...)

Any tips, hints, tricks are highly appreciated, Maxim.
Re: Repairs are slow after upgrade to 3.11.3
Hi Alex, I'm using Cassandra Reaper as well. It could be https://issues.apache.org/jira/browse/CASSANDRA-14332, as it was committed in both versions. Regards, Maxim.

On Wed, Aug 29, 2018 at 2:14 PM Oleksandr Shulgin <oleksandr.shul...@zalando.de> wrote:
> On Wed, Aug 29, 2018 at 3:06 AM Maxim Parkachov wrote:
>
>> a couple of days ago I upgraded Cassandra from 3.11.2 to 3.11.3 and I see that repair time has practically doubled. Does someone else experience the same regression?
>
> We have upgraded from 3.0.16 to 3.0.17 two days ago and we see the same symptom. We are using Cassandra Reaper, and the average time to repair one segment increased from 5-6 to 10-12 min.
>
> --
> Alex
Re: Repairs are slow after upgrade to 3.11.3
Hi, I wanted to get rid of https://issues.apache.org/jira/browse/CASSANDRA-14332 and https://issues.apache.org/jira/browse/CASSANDRA-14470. I haven't seen these errors yet, but it is too early to say after a couple of days of operation. Regards, Maxim.

On Wed, Aug 29, 2018 at 10:27 AM Jean Carlo wrote:
> Hello,
>
> Can I ask why you upgraded from 3.11.2? Did you experience some Java heap problems?
>
> Unfortunately I cannot answer your question :( I am on 2.1 and about to upgrade to 3.11.
>
> Best greetings,
>
> Jean Carlo
>
> "The best way to predict the future is to invent it" Alan Kay
>
> On Wed, Aug 29, 2018 at 3:06 AM Maxim Parkachov wrote:
>
>> Hi everyone,
>>
>> a couple of days ago I upgraded Cassandra from 3.11.2 to 3.11.3 and I see that repair time has practically doubled. Does someone else experience the same regression?
>>
>> Regards,
>> Maxim.
Repairs are slow after upgrade to 3.11.3
Hi everyone, a couple of days ago I upgraded Cassandra from 3.11.2 to 3.11.3 and I see that repair time has practically doubled. Does someone else experience the same regression? Regards, Maxim.
Re: Repair daily refreshed table
Hi Rahul, I cannot afford delete-and-then-load, as that would create downtime for the record; that's why I'm upserting with TTL today()+7 days, as I mentioned in my original question. And at the moment I don't have an issue with either loading or access times. My question is: should I repair such a table or not, and if yes, before the load or after (or does it not matter)? Thanks, Maxim.

On Sun, Aug 19, 2018 at 8:52 AM Rahul Singh wrote:
> If you wanted to be certain that all replicas were acknowledging receipt of the data, then you could use ALL or EACH_QUORUM (if you have multiple DCs), but you must really want high consistency if you do that.
>
> You should avoid consciously creating tombstones if possible; they end up making reads slower because they need to be accounted for until they are compacted / garbage-collected out.
>
> Tombstones are created when data is either deleted or nulled. When marking data with a TTL, the actual delete is not done until after the TTL has expired.
>
> When you say you are overwriting, are you deleting and then loading? That's the only way you should see tombstones; or maybe you are setting nulls?
>
> Rahul
> On Aug 18, 2018, 11:16 PM -0700, Maxim Parkachov wrote:
>
> Hi Rahul,
>
> I'm already using LOCAL_QUORUM in the batch process and it runs every day. As far as I understand, because I'm overwriting the whole table with a new TTL, the process creates tons of tombstones, and I'm more concerned with them.
>
> Regards,
> Maxim.
>
> On Sun, Aug 19, 2018 at 3:02 AM Rahul Singh wrote:
>
>> Are you loading using a batch process? What's the frequency of the data ingest, and does it have to be very fast? If it's not too frequent and can be a little slower, you may consider a higher consistency level to ensure the data is on the replicas.
>>
>> Rahul
>> On Aug 18, 2018, 2:29 AM -0700, Maxim Parkachov wrote:
>>
>> Hi community,
>>
>> I'm currently puzzled with the following challenge. I have a CF with a 7-day TTL on all rows.
>> Daily there is a process which loads the actual data with a +7 day TTL. Thus records which are not present in the last 7 days of loads expire. The amount of such expired records is very small, < 1%. I have a daily repair process, which takes a considerable amount of time and resources, and a snapshot after that. Obviously I'm concerned only with the last loaded data. Basically, my question: should I run repair before the load, after the load, or maybe I don't need to repair such a table at all?
>>
>> Regards,
>> Maxim.
Re: Repair daily refreshed table
Hi Rahul, I'm already using LOCAL_QUORUM in the batch process and it runs every day. As far as I understand, because I'm overwriting the whole table with a new TTL, the process creates tons of tombstones, and I'm more concerned with them. Regards, Maxim.

On Sun, Aug 19, 2018 at 3:02 AM Rahul Singh wrote:
> Are you loading using a batch process? What's the frequency of the data ingest, and does it have to be very fast? If it's not too frequent and can be a little slower, you may consider a higher consistency level to ensure the data is on the replicas.
>
> Rahul
> On Aug 18, 2018, 2:29 AM -0700, Maxim Parkachov wrote:
>
> Hi community,
>
> I'm currently puzzled with the following challenge. I have a CF with a 7-day TTL on all rows. Daily there is a process which loads the actual data with a +7 day TTL. Thus records which are not present in the last 7 days of loads expire. The amount of such expired records is very small, < 1%. I have a daily repair process, which takes a considerable amount of time and resources, and a snapshot after that. Obviously I'm concerned only with the last loaded data. Basically, my question: should I run repair before the load, after the load, or maybe I don't need to repair such a table at all?
>
> Regards,
> Maxim.
Repair daily refreshed table
Hi community, I'm currently puzzled with the following challenge. I have a CF with a 7-day TTL on all rows. Daily there is a process which loads the actual data with a +7 day TTL. Thus records which are not present in the last 7 days of loads expire. The amount of such expired records is very small, < 1%. I have a daily repair process, which takes a considerable amount of time and resources, and a snapshot after that. Obviously I'm concerned only with the last loaded data. Basically, my question: should I run repair before the load, after the load, or maybe I don't need to repair such a table at all? Regards, Maxim.
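The load pattern described (a nightly upsert with a fresh 7-day TTL, so rows absent from 7 consecutive loads age out) looks roughly like this in CQL; the table and column names are made up for illustration:

```sql
-- Hypothetical table for the pattern described above.
CREATE TABLE IF NOT EXISTS feed (
    id bigint PRIMARY KEY,
    payload text
);

-- Nightly load: upsert every currently-valid record with a fresh 7-day TTL.
-- Rows missing from 7 consecutive loads are never re-written and simply expire.
UPDATE feed USING TTL 604800  -- 7 * 24 * 3600 seconds
   SET payload = 'current state'
 WHERE id = 42;
```

Note that TTL expiry produces no explicit delete; the cells become tombstones only once the TTL has passed, which is why the repair-vs-tombstone trade-off discussed in the replies matters here.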