Re: How do I upgrade a single cassandra node in production to 3 nodes cluster ?

2014-02-16 Thread Erick Ramirez
Ertio, It's not so much upgrading, but simply adding more nodes to your existing setup. Cheers, Erick On Sun, Feb 16, 2014 at 2:13 PM, Ertio Lew ertio...@gmail.com wrote: I started off with a single cassandra node on my 2GB digital ocean VPS, but now I'm planning to upgrade it to 3 node

Re: Where to I start to get to the bottom of this WriteTimeout issue?

2014-02-16 Thread Erick Ramirez
Jacob, Are you able to post log snippets around the time that the timeouts occur? I have a suspicion you may be running out of heap memory and might need to tune your environment. The INFO entries in the log should indicate this. Cheers, Erick

Re: Where to I start to get to the bottom of this WriteTimeout issue?

2014-02-16 Thread Erick Ramirez
the cause of the issue. And keep us posted. Thanks! Cheers, Erick On Mon, Feb 17, 2014 at 1:41 PM, Jacob Rhoden jacob.rho...@me.com wrote: Hi Erick, On 17 Feb 2014, at 1:19 pm, Erick Ramirez er...@ramirez.com.au wrote: Are you able to post log snippets around the time that the timeouts occur

Re: Performance problem with large wide row inserts using CQL

2014-02-20 Thread Erick Ramirez
Wow! What a fantastic robust discussion. I've just been educated. Peter --- Thanks for providing those use cases. They are great examples. Rudiger --- From what you've done so far, I wouldn't have said you are new to Cassandra. Well done. Cheers, Erick

Re: CQL Alter table does not propagate to all nodes.

2014-10-15 Thread Erick Ramirez
Hello, dlu66061. A common issue with schema disagreements is time drift on the nodes. Are you using NTP? The only other issue is when the nodes are not reachable at the time that the schema update was being propagated ---

Re: CQL Alter table does not propagate to all nodes.

2014-10-17 Thread Erick Ramirez
Lu, Thanks for letting me know that you figured it out. Cheers, Erick

Re: Denormalization leads to terrible, rather than better, Cassandra performance -- I am really puzzled

2015-05-03 Thread Erick Ramirez
in the next 24-48 hours that reverts JAVA-425 to resolve this issue. Cheers, Erick

Re: Can't connect to Cassandra server

2015-07-19 Thread Erick Ramirez
Cheers, Erick On Sun, Jul 19, 2015 at 11:22 PM, Chamila

Re: Can't connect to Cassandra server

2015-07-20 Thread Erick Ramirez
- cassandra-env.sh Cheers, Erick

Re: How to remove huge files with all expired data sooner?

2015-09-28 Thread Erick Ramirez
Hello, You should never run `nodetool compact` since this will result in a massive SSTable that will almost never get compacted out or take a very long time to get compacted out. You are correct that there needs to be 4 similar-sized SSTables for them to get compacted. If you want the expired

Re: Attempted to write commit log entry for unrecognized table

2017-08-15 Thread Erick Ramirez
Myron, it just means that while the node was down, one of the tables got dropped. When you eventually brought the node back online and the commit logs were getting replayed, it tried to replay a mutation for a table which no longer exists. Cheers! On Wed, Aug 16, 2017 at 5:56 AM, Myron A. Semack

Re: Creating a copy of a C* cluster

2017-08-15 Thread Erick Ramirez
onfiguration of the "stand by" cluster (i.e. number of the > nodes). > however, there is a issue we ran into with the sstableloader > (java.lang.OutOfMemoryError: > GC overhead limit exceeded) > https://issues.apache.org/jira/browse/CASSANDRA-7385 > > Thanks, > F

Re: Migrate from DSE (Datastax) to Apache Cassandra

2017-08-15 Thread Erick Ramirez
Ioannis, it's not a straightforward process to migrate from DSE to COSS. There are some parts of DSE which are not recognised by COSS, e.g. EverywhereStrategy for replication only known to DSE. You are better off standing up a new COSS 3.11 cluster and restore app keyspaces to the new cluster.

Re: SASI index returns no results

2017-08-15 Thread Erick Ramirez
Have you tried tracing (TRACING ON) the query? That would usually give you clues as to where it's failing. Cheers! On Wed, Aug 16, 2017 at 12:03 AM, Vladimir Yudovin wrote: > Hi, > > I recently encountered with strange issue. > Assuming there is table > > id PRIMARY KEY >

Re: Dropping down replication factor

2017-08-15 Thread Erick Ramirez
I would discourage dropping to RF=2 because if you're using CL=*QUORUM, it won't be able to tolerate a node outage. You mentioned a couple of days ago that there's an index file that is corrupted on 10.40.17.114. Could you try moving out the sstable set associated with that corrupt file and try
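The arithmetic behind the RF=2 warning in the reply above can be sketched in a few lines of Python. This is only an illustration, assuming the usual quorum rule of floor(RF / 2) + 1 replicas; the function names are made up for this example and are not part of any Cassandra tooling.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must respond for a QUORUM request: floor(RF / 2) + 1."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return replication_factor // 2 + 1


def tolerated_outages(replication_factor: int) -> int:
    """Replicas that can be down while QUORUM requests can still be satisfied."""
    return replication_factor - quorum(replication_factor)


# Compare a few replication factors side by side.
for rf in (2, 3, 5):
    print(f"RF={rf}: quorum={quorum(rf)}, tolerated outages={tolerated_outages(rf)}")
```

With RF=2 the quorum is both replicas, so a single node outage makes QUORUM requests for the affected ranges fail; with RF=3 one replica can be down.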

Re: MemtablePostFlush pending

2017-08-15 Thread Erick Ramirez
Check what you have set for memtable_cleanup_threshold. If it's set too low, more flushes get triggered. Cheers! On Sat, Aug 12, 2017 at 5:05 AM, ZAIDI, ASAD A wrote: > Hello Folks, > > > > I’m using Cassandra 2.2 on 14 node cluster. > > > > Now a days, I’m observing

Re: Creating a copy of a C* cluster

2017-08-15 Thread Erick Ramirez
A slight variation to Ben Slater's idea is to build the second cluster like-for-like and assigning the same tokens used by the original nodes. If you restore the data onto the equivalent nodes with the same tokens, the data will be accessible as normal. Cheers! On Tue, Aug 8, 2017 at 7:08 AM,

Re: Questions on time series use case, tombstones, TWCS

2017-08-15 Thread Erick Ramirez
Not sure if these are what Jeff was referring to but as a workaround, you can configure the following STCS compaction subproperties: - min_threshold - set to 2 so that only a minimum of 2 similar-sized sstables are required to trigger a minor compaction instead of the default 4 -

Re: Error Exception in Repair Thread

2017-08-15 Thread Erick Ramirez
2 common causes of interrupted streams are (a) network interruptions, or (b) nodes becoming unresponsive, e.g. GC pause during high loads. As far as network is concerned, is there a firewall in the middle? If so, it's quite common for firewalls to close sockets when they think the connection is

Re: live dsc upgrade from 2.0 to 2.1 behind the scenes

2017-08-15 Thread Erick Ramirez
1) You should not perform any streaming operations (repair, bootstrap, decommission) in the middle of an upgrade. Note that an upgrade is not complete until you have completed upgradesstables on all nodes in the cluster. 2) No streaming involved with writes so it's not an issue. 3) It doesn't

Re: cqlsh -e output - How to change the default delimiter '|' in the output

2017-08-15 Thread Erick Ramirez
+1 to Jim and Tobin. cqlsh wasn't designed for what you're trying to achieve. Cheers! On Tue, Aug 15, 2017 at 1:34 AM, Tobin Landricombe wrote: > Can't change the delimiter (I'm on cqlsh 5.0.1). Best I can offer is >

Re: MUTATION messages were dropped in last 5000 ms for cross node timeout

2017-08-15 Thread Erick Ramirez
Mutations get dropped because a node can't keep up with writes. If you understand the Cassandra write path, writes are ACKed when the mutation is appended to the commitlog which is why it's very fast. Knowing that, dropped mutations mean that the disk is not able to keep up with the IO. Another

Re: Data in multi disks is not evenly distributed

2017-06-11 Thread Erick Ramirez
That's the cause of the imbalance -- an excessively large sstable which suggests to me that at some point you performed a manual major compaction with nodetool compact. If the table is using STCS, there won't be other compaction partners in the near future so you split the sstable manually with

Re: Using Cassandra for my usecase

2017-06-11 Thread Erick Ramirez
> > *Given my use case is cassandra the best suited one or is there any other > database which suits my requirement better?* Probably not the right forum for that question. It's like walking into a Ford dealership and asking if the Mustang is the best car for you.  In any case, you would

Re: Invalid Gossip generation

2017-08-30 Thread Erick Ramirez
Unfortunately, the only available workaround is a rolling restart of the cluster until you get the fix in C* 2.1.13 (CASSANDRA-10969). On Thu, Aug 31, 2017 at 5:52 AM, Mark Furlong wrote: > I have a 2.1.12 cluster

Re: Hints replay incompatible between 2.x and 3.x

2017-08-30 Thread Erick Ramirez
1TB of hints suggests you don't have enough capacity in your cluster. The only way around that is to add more nodes. On Thu, Aug 31, 2017 at 3:05 AM, Jason Brown wrote: > Hi Andrew, > > This question is best for the user@ list, included here. > > Thanks, > > -Jason > > On

Re: Cassandra All host(s) tried for query failed (no host was tried)

2017-08-30 Thread Erick Ramirez
No host was tried because nodes were unresponsive and the driver marked them as down. When too many nodes get marked as down, the driver eventually runs out of nodes so ends up in NoHostAvailableException. Nodes become unresponsive because they are overloaded. You either throttle back the app

Re: system_auth replication factor in Cassandra 2.1

2017-08-30 Thread Erick Ramirez
It looks like nodes .113 and .116 have a problem. Repairing system_auth which only contains 5 users should not take that long. Run with just nodetool repair system_auth (without the -pr flag). But first investigate why those 2 nodes are slow to respond. Cheers! On Thu, Aug 31, 2017 at 3:00 AM,

Re: timeouts on counter tables

2017-08-29 Thread Erick Ramirez
Is it possible at all that you may have a data hotspot if it's not hardware-related? On Mon, Aug 28, 2017 at 11:30 AM, kurt greaves wrote: > If every node is a replica it sounds like you've got hardware issues. Have > you compared iostat to the "normal" nodes? I assume

Re: Cassandra snapshot restore with VNODES missing some data

2017-08-30 Thread Erick Ramirez
For your method to work, you have to restore like-for-like, i.e. you need to mirror the source nodes by using the exact same tokens in system.local. For example, if source node A has tokens 567, 678 and 789, then you need to setup the equivalent target node with exactly those tokens. Otherwise,
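The like-for-like check described above can be sketched as a small Python helper. This is a hedged illustration only: it assumes tokens are pinned on the target node through a comma-separated `initial_token` entry in cassandra.yaml, and the node names and dictionaries are hypothetical stand-ins for whatever you collect from system.local on each node.

```python
def initial_token_line(tokens):
    """Format a node's tokens as a cassandra.yaml initial_token entry."""
    return "initial_token: " + ",".join(str(t) for t in sorted(tokens))


def token_mismatches(source, target):
    """Return {node: (missing_on_target, unexpected_on_target)} for nodes whose tokens differ."""
    problems = {}
    for node, src_tokens in source.items():
        src, tgt = set(src_tokens), set(target.get(node, ()))
        if src != tgt:
            problems[node] = (sorted(src - tgt), sorted(tgt - src))
    return problems


# Hypothetical data: source node A owns the tokens from the example above,
# while its target equivalent was set up with one wrong token.
source = {"A": [567, 678, 789]}
target = {"A": [567, 678, 790]}

print(initial_token_line(source["A"]))
print(token_mismatches(source, target))
```

An empty result from `token_mismatches` means every target node mirrors its source node, which is the precondition the reply describes.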

Re: Cassandra 3.7 repair error messages

2017-08-30 Thread Erick Ramirez
No, it isn't normal for sessions to fail and you will need to investigate. You need to review the logs on node .204 to determine why the session failed. For example, did it timeout because of a very large sstable? Or did the connection get truncated after a while? You will need to address the

Re: How to know if bootstrap is still running

2017-11-12 Thread Erick Ramirez
+1 and run nodetool compactionstats so you can see 2Is in progress. On Mon, Nov 13, 2017 at 7:00 AM, kurt greaves wrote: > bootstrap will wait for secondary indexes and MV's to build before > completing. if either are still shown in compactions then it will wait for > them

Re: unrecognized column family in logs

2017-11-12 Thread Erick Ramirez
ment? > You can try "nodetool resetlocalschema" to fix the issue on the node > experiencing disagreement. > > Romain > > Le jeudi 9 novembre 2017 à 02:55:22 UTC+1, Erick Ramirez < > flightc...@gmail.com> a écrit : > > > It looks like you have a schema disagreem

Re: Node Failure Scenario

2017-11-12 Thread Erick Ramirez
Use the replace_address method with its own IP address. Make sure you delete the contents of the following directories: - data/ - commitlog/ - saved_caches/ Forget rejoining with repair -- it will just cause more problems. Cheers! On Mon, Nov 13, 2017 at 2:54 PM, Anshu Vajpayee

Re: Warning for large batch sizes with a small number of statements

2017-11-13 Thread Erick Ramirez
You can increase it if you're sure that it fits your use case. For an explanation of why batch size vs number of statements, see the discussion in CASSANDRA-6487. Cheers! On Mon, Nov 13, 2017 at 6:31 PM, Tim Moore wrote: > Hi, > > I'm trying to understand some of the

Re: Cassandra Query

2017-11-13 Thread Erick Ramirez
That is such a frequently asked question that it prompted me to write a blog post last month -- https://academy.datastax.com/support-blog/counting-keys-might-well-be-counting-stars On Mon, Nov 13, 2017 at 6:55 PM, Hareesh Veduraj wrote: > Hi Team, > > I have a new

Re: Repair failing after it was interrupted once

2017-11-15 Thread Erick Ramirez
Check that there are no running repair threads on the nodes with nodetool netstats. For those that do have running repairs, restart C* on them to kill the repair threads and you should be able to repair the nodes again. Cheers! On Wed, Nov 15, 2017 at 8:08 PM, Dipan Shah

Re: 3.0.6 - CorruptSSTableException

2017-11-08 Thread Erick Ramirez
system.local only contains a single partition with key='local'. There won't be anything to repair for it since it only contains data about that node, e.g. saved tokens, cluster name, etc. Cheers! On Wed, Nov 8, 2017 at 5:03 AM, Riccardo Ferrari wrote: > Thanks you Adama, > >

Re: What can NOT be done during repairs (2.2.x and 3.0.x)

2017-11-08 Thread Erick Ramirez
You can continue to perform operations while running a repair but just be aware that you will get repair failures, e.g. if you drop a table while that table is getting repaired. In which case, the failure is benign. Creating new keyspaces or tables will not pose any problems for the repair

Re: unrecognized column family in logs

2017-11-08 Thread Erick Ramirez
It looks like you have a schema disagreement in your cluster which you need to look into. And you're right since that column family ID is equivalent to Friday, June 24, 2016 10:14:49 AM PDT. Have a look at the table IDs in system.schema_columnfamilies for clues. Cheers! On Thu, Nov 9, 2017 at
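The reply above treats the column family ID as something that can be read as a date. A minimal Python sketch of that conversion follows, assuming the ID is a version 1 (time-based) UUID; the function name is hypothetical and the ID you pass in would be the one from your own logs.

```python
import uuid
from datetime import datetime, timedelta, timezone

# 100-nanosecond intervals between the UUID epoch (1582-10-15) and the Unix epoch (1970-01-01).
UUID_EPOCH_OFFSET = 0x01B21DD213814000


def timeuuid_to_datetime(table_id: str) -> datetime:
    """Extract the creation time (UTC) embedded in a version 1 UUID."""
    parsed = uuid.UUID(table_id)
    if parsed.version != 1:
        raise ValueError("not a version 1 (time-based) UUID")
    # UUID.time counts 100-ns intervals since the UUID epoch; shift to Unix time.
    unix_100ns = parsed.time - UUID_EPOCH_OFFSET
    return datetime(1970, 1, 1, tzinfo=timezone.utc) + timedelta(microseconds=unix_100ns // 10)
```

Comparing the decoded time for the unrecognized ID against the IDs listed in system.schema_columnfamilies can hint at when the stray table was created.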

Re: Meltdown/Spectre Linux patch - Performance impact on Cassandra?

2018-01-06 Thread Erick Ramirez
Thanks for the insight, Romain, and providing those numbers. Looking forward to others posting their stats here. We are running up some tests and will share when available. Cheers! On Sat, Jan 6, 2018 at 3:44 AM, Romain Hardouin wrote: > Hi, > > We also noticed an

Re: Overload because of hint pressure + MVs

2020-02-10 Thread Erick Ramirez
> > Currently the value of phi_convict_threshold is not set which makes it to > 8 (default) . > Can this also cause hints buildup even when we can see that all nodes are > UP ? You can bump it up to 12 to reduce the sensitivity but it's likely GC pauses causing it. Phi convict is the

Re: Connection reset by peer

2020-02-13 Thread Erick Ramirez
> > Last question: In all your experiences, how high can the latency (simple > ping response times go) before it becomes a problem? (Obviously the lower > the better but is there some sort of cut off/formula where problems can be > expected intermittently like the connection resets)

cassandra-cli on 3.x

2020-02-11 Thread Erick Ramirez
Jai, Thrift was deprecated years ago (maybe 5 or 6?) and COMPACT STORAGE was dropped since the refactor of the storage engine in C* 3.0 so there won't be support for any legacy CLI. In fact, you need to migrate off legacy storage when you upgrade using ALTER TABLE ks.table DROP COMPACT STORAGE.

Re: cassandra-cli on 3.x

2020-02-11 Thread Erick Ramirez
> > I am using astyanax client Right. It was announced as being retired back in 2016 [1] which ended in 2018 [2]: > > *Deprecation: Astyanax has been retired and is no longer under active > development but may receive dependency updates to ease migration away from > Astyanax. In place of Astyanax

Re: New seed node in the cluster immediately UN without passing for UJ state

2020-02-25 Thread Erick Ramirez
> > Just follow up to your statement: > Limiting the seeds to 2 per DC means : > A) Each node in a DC has at least 2 seeds and those seeds belong to the > same DC > or > B) Each node in a DC has at least 2 seeds even across different DC > I apologise for the ambiguity of my previous response, I

Re: Hints replays very slow in one DC

2020-02-25 Thread Erick Ramirez
Krish, with the limited info and assuming things like hint throttle and delivery threads all being equal, my guess would be DC1 is your primary DC and is busier than DC2. Got any diagnostic data/troubleshooting info you could share? Otherwise, it's a little difficult to speculate as to what may be

Re: Hints replays very slow in one DC

2020-02-25 Thread Erick Ramirez
What's the reason for nodes going down? Is it because the cluster is overloaded? Hints will get handed off periodically when nodes come back to life but if they happen to go down again or become unresponsive (for whatever reason), the handoff will be delayed until the next cycle. I think it's

Re: Should we use Materialised Views or ditch them ?

2020-02-28 Thread Erick Ramirez
Personally, I think MVs are still experimental and not ready for primetime. They work for some but if you run into issues, fixing them has a huge impact on your application. For example, if the view updates get too far behind, there's no effective way to resolve them other than having to drop the

Re: Deleting Compaction Strategy for Cassandra 3.0?

2020-02-28 Thread Erick Ramirez
I'm not personally aware of anyone who is using it successfully other than ProtectWise where it was a good fit for their narrow use case. My limited knowledge of it is that it has some sharp edges which is the reason they haven't pushed for it to be added to Cassandra (that's second hand info so

Re: Downgrading from 3.11.5 to 3.11.0

2020-03-04 Thread Erick Ramirez
Once you've upgraded to a version of Cassandra that uses a new SSTable format, you cannot downgrade the binaries -- for whatever reason -- because the older version of C* will not be able to read the SSTables using the old format. I'm more concerned that you think reverting to a .0 release is a

Re: Downgrading from 3.11.5 to 3.11.0

2020-03-04 Thread Erick Ramirez
> > ... the older version of C* will not be able to read the SSTables using > the old format. > Sorry, I meant to say *"... using the NEW format"*. Cheers!

Re: Downgrading from 3.11.5 to 3.11.0

2020-03-04 Thread Erick Ramirez
> > Are you telling me that cassandra changes sstanle file format when > updating from 3.11.4 to 3.11.5? > It was introduced in C* 3.0.18 and 3.11.4. The format changed from -mc- to -md- because of a bug with min/max clustering values. It had the potential to result in data loss so there was no
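To see which SSTable format versions are actually on disk before considering a downgrade, something like the following Python sketch could help. It assumes the 3.x data file naming of `<version>-<generation>-<format>-Data.db` (for example `md-12-big-Data.db`); the function names and the directory you pass in are illustrative only.

```python
import re
from collections import Counter
from pathlib import Path

# Assumed 3.x naming: <version>-<generation>-<format>-Data.db, e.g. md-12-big-Data.db
SSTABLE_DATA_FILE = re.compile(r"^(?P<version>[a-z]{2})-(?P<generation>\d+)-(?P<format>[a-z]+)-Data\.db$")


def sstable_version(file_name):
    """Return the format version prefix (such as 'mc' or 'md'), or None if the name does not match."""
    match = SSTABLE_DATA_FILE.match(file_name)
    return match.group("version") if match else None


def count_versions(data_dir):
    """Count SSTable data files per format version under a data directory."""
    counts = Counter()
    for path in Path(data_dir).rglob("*-Data.db"):
        version = sstable_version(path.name)
        if version:
            counts[version] += 1
    return counts
```

Any `md` files in the result are in the newer format mentioned above, which older binaries would not be able to read.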

Re: Should we use Materialised Views or ditch them ?

2020-03-03 Thread Erick Ramirez
I received quite a number of follow up questions directly as a result of my response to Tobias' question. My comments on MVs specifically only relate to MVs in OSS C*. :) I'd like to reiterate that my responses to this mailing list are almost exclusively relating to OSS C* but I'm going to make

Re: Deleting data from future

2020-03-03 Thread Erick Ramirez
> > Inspecting sstables, we found that timestamp(TS field value) of these > records was 1584349956844022, Monday, 16 March 2020 09:12:36.844. > We neither delete these records nor truncate table. > Is there anyway to manipulate records inside sstable manually? > Not really unless you plan to

Re: [EXTERNAL] Cassandra 3.11.X upgrades

2020-03-03 Thread Erick Ramirez
> > Should upgradesstables not be run after every node is upgraded? If we need > to rollback then we will not be able to downgrade sstables to older version > You can choose to (a) upgrade the SSTables one node at a time as you complete the binary upgrade, or (b) upgrade the binaries on all

Re: Hints replays very slow in one DC

2020-02-26 Thread Erick Ramirez
> > Nodes are going down due to Out of Memory and we are using 31GB heap size > in DC1 , however in DC2 (Which serves the traffic) has 16GB heap . > Why we had to increase heap in DC1 is because , DC1 nodes were going down > due Out of Memory issue but DC2 nodes never went down . > It doesn't

Re:

2020-01-22 Thread Erick Ramirez
You need to email user-unsubscr...@cassandra.apache.org if you don't want to receive emails anymore. Cheers! On Thu, Jan 23, 2020 at 3:12 AM Sowjanya Karangula wrote: > stop >

Re: How to read content of hints file and apply them manually?

2020-01-27 Thread Erick Ramirez
> > Increase the max_hint_window_in_ms setting in cassandra.yaml to more than > 3 hours, perhaps 6 hours. If the issue still persists networking may need > to be tested for bandwidth issues. > Just a note of warning about bumping up the hint window without understanding the pros and cons. Be

Re: How to read content of hints file and apply them manually?

2020-01-27 Thread Erick Ramirez
There isn't a tool that I'm aware of that's readily available to do that. Your best bet is to run a regular repair. But really, hints are just a side-issue of a much wider problem and that is the nodes are overloaded. Is your application getting hit with a much higher than expected traffic? The

Re: new node stops streaming..

2020-01-27 Thread Erick Ramirez
You can increase the max number of open files on the new node. We find that 65K is too low for most production clusters and you can bump it up to 100K or 200K. We generally recommend 1 million but YMMV: - nofile 1048576 On Tue, Jan 28, 2020 at 11:55 AM Eunsu Kim wrote: > Hi experts > > I had
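A quick way to see what open-file limit a process actually gets is Python's standard `resource` module (Unix only). This is a rough sketch: it reports the limits of the Python process itself, so it only reflects the Cassandra process if run under the same user and limits configuration, and the 1048576 figure is simply the value quoted in the reply above.

```python
import resource

# Value quoted in the reply above; treat it as a starting point, not a rule.
RECOMMENDED_NOFILE = 1_048_576


def nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def limit_is_sufficient(limit, recommended=RECOMMENDED_NOFILE):
    """True if the limit is unlimited or at least the recommended value."""
    return limit == resource.RLIM_INFINITY or limit >= recommended


soft, hard = nofile_limits()
print(f"soft={soft} hard={hard} sufficient={limit_is_sufficient(soft)}")
```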

Re: How to read content of hints file and apply them manually?

2020-01-28 Thread Erick Ramirez
I would do a thread dump and work out the threads with the highest CPU consumers from it. But in my experience, 90% of the time it's GC from high app traffic unless you've hit an edge case bug. Which means the cluster doesn't have enough capacity and you need to review the cluster size. Cheers!

Re: Cassandra going OOM due to tombstones (heapdump screenshots provided)

2020-01-29 Thread Erick Ramirez
> > It looks like the number of tables is the problem, with 5,000 - 10,000 > tables, that is way above the recommendations. > Take a look here: > https://docs.datastax.com/en/dse-planning/doc/planning/planningAntiPatterns.html#planningAntiPatterns__AntiPatTooManyTables > This suggests that 5-10GB

Re: KeyCache Harmless Error on Startup

2020-01-29 Thread Erick Ramirez
Oh, I just saw the example. Never mind. :)

Re: KeyCache Harmless Error on Startup

2020-01-29 Thread Erick Ramirez
> > Does anyone perhaps have an idea on what could've gone wrong here? > Could it be just a calculation error on startup? > Specifically for the NegativeArraySizeException, what's happening is that the keyLength is so huge that it blows up MAX_UNSIGNED_SHORT so it looks like it's a negative

Re: KeyCache Harmless Error on Startup

2020-01-29 Thread Erick Ramirez
> Specifically for the NegativeArraySizeException, what's happening is that > the keyLength is so huge that it blows up MAX_UNSIGNED_SHORT so it looks > like it's a negative value. Someone will correct me if I got that wrong but > the "Key length longer than max" error confirms that. > Is it

Re: Cassandra OS Patching.

2020-01-30 Thread Erick Ramirez
There is no need to shut down the application because you should be able to carry out the operating system upgrade without an outage to the database, particularly since you have a lot of nodes in your cluster. Provided your cluster has sufficient capacity, you might even have the ability to

Re: sstableloader: How much does it actually need?

2020-02-05 Thread Erick Ramirez
> > Another option is the DSE-bulk loader but it will require to convert to > csv/json (good option if you don't like to play with sstableloader and deal > to get all the sstables from all the nodes) > https://docs.datastax.com/en/dsbulk/doc/index.html > Thanks, Sergio. The DataStax Bulk Loader

Re: Nodes becoming unresponsive

2020-02-05 Thread Erick Ramirez
Surbhi, just a *friendly* reminder that it's customary to reply back to the mailing list instead of emailing me directly so that everyone else in the list can participate. ☺ > I tried taking thread dump using kill -3 but it just came back and > no file generated. > How do you take the thread

Re: Nodes becoming unresponsive

2020-02-05 Thread Erick Ramirez
I wrote that article 5 years ago but I didn't think it would still be relevant today.  Have you tried to do a thread dump to see which are the most dominant threads? That's the most effective way of troubleshooting high CPU situations. Cheers! >

Re: sstableloader: How much does it actually need?

2020-02-05 Thread Erick Ramirez
Unfortunately, there isn't a guarantee that 2 nodes alone will have the full copy of data. I'd rather not say "it depends".  TIP: If the nodes in the target cluster have identical tokens allocated, you can just do a straight copy of the sstables node-for-node then do nodetool refresh. If the

Re: [EXTERNAL] How to reduce vnodes without downtime

2020-01-31 Thread Erick Ramirez
There's an active discussion going on right now in a separate dev thread. The current "default recommendation" is 32 tokens. But there's a push for 4 in combination with allocate_tokens_for_keyspace from Jon Haddad & co (based on a paper from Joe Lynch & Josh Snyder). If you're satisfied with the

Re: Apache vs Datastax cassandra

2020-02-03 Thread Erick Ramirez
Adarsh, a very *friendly* note that anyone is more than welcome to ask questions -- in fact as a group it's encouraged -- but a *gentle reminder* that this mailing list is for open-source Apache Cassandra. By all means, feel free to respond and not saying at all that it's not allowed (I'm just

Re: nodetool load does not match du

2020-02-03 Thread Erick Ramirez
> > Why the df -h and du -sh shows a big discrepancy? nodetool load is it > computed with df -h? > In Linux terms, df reports the filesystem disk usage while du is an *estimate* of the file space usage. What that means is that the operating system uses different accounting between the two

Re: nodetool load does not match du

2020-02-03 Thread Erick Ramirez
> I thought that the snapshot size was not counted in the load. > That's correct. I suggested looking at what nodetool tablestats reports so you can compare that against du/df outputs for clues as to why there is such a large discrepancy. Cheers!

Re: Running select against cassandra

2020-02-06 Thread Erick Ramirez
> > Also is materialized view good for production? I agree with Sean's and Reid's sentiments about MVs. I still think of MVs as being experimental and not ready for primetime. I would wait for the improvements which may be coming in C* 4.0 but no promises there... yet. :) Cheers!

Re: Query timeouts after Cassandra Migration

2020-02-06 Thread Erick Ramirez
> > So do you advise copying tokens in such cases ? What procedure is > advisable ? > Specifically for your case with 3 nodes + RF=3, it won't make a difference so leave it as it is. > Latency increased on target cluster. > Have you tried to run a trace of the queries which are slow? It will

Re: Nodes becoming unresponsive

2020-02-06 Thread Erick Ramirez
> > I tried to debug more and could see using top that Command is > MutationStage in top output , Any clue we get from this ? > That just means there's lots of writes hitting your cluster. Without the thread dump, it would be difficult to know if the threads are blocked by futex_wait or whatever

Re: Query timeouts after Cassandra Migration

2020-02-06 Thread Erick Ramirez
> > I didn’t copy tokens since it’s an identical cluster and we have RF as 3 > on 3 node cluster. Is it still needed , why? > In C*, same number of nodes alone isn't enough. Clusters aren't really identical unless token assignments are the same. In your case though since each node has a full copy

Re: [RELEASE] Apache Cassandra 4.0-alpha3 released

2020-02-07 Thread Erick Ramirez
Congratulations!  For those who may not be familiar with the behind-the-scenes, this is a major milestone for the project and another step closer to the release of Apache Cassandra 4.0. I'm so excited about this news and you should be too! 

Re: sstableloader - warning vs. failure?

2020-02-07 Thread Erick Ramirez
> > INFO [pool-1-thread-4] 2020-02-08 01:35:37,946 NoSpamLogger.java:91 - > Maximum memory usage reached (536870912), cannot allocate chunk of 1048576 > The message gets logged when SSTables are being cached and the cache fills up faster than objects are evicted from it. Note that the message is

Re: sstableloader & num_tokens change

2020-01-24 Thread Erick Ramirez
> If I may just loop this back to the question at hand: > > I'm curious if there are any gotchas with using sstableloader to restore > snapshots taken from 256-token nodes into a cluster with 32-token (or your > preferred number of tokens) nodes (otherwise same # of nodes and same RF). > No,

Re: sstableloader & num_tokens change

2020-01-24 Thread Erick Ramirez
On the subject of DSBulk, sstableloader is the tool of choice for this scenario. +1 to Sergio and I'm confirming that DSBulk is designed as a bulk loader for CSV/JSON formats. Cheers!

Re: [EXTERNAL] How to reduce vnodes without downtime

2020-02-02 Thread Erick Ramirez
A%22%5C%5BDiscuss%5C%5D+num_tokens+default+in+Cassandra+4.0%22=oldest> > ". > > Regards, > Anthony > > On Sat, 1 Feb 2020 at 10:07, Sergio wrote: > >> >> https://thelastpickle.com/blog/2019/02/21/set-up-a-cluster-with-even-token-distribution.html >>

Re: Consequences of dropping Materialized views

2020-02-18 Thread Erick Ramirez
> > We are Cassandra 3.11.0 unfortunately :( > Oh, right. That's why the hint read failure is causing the node to shutdown. We've at least identified that. I was worried that there was another bug we didn't know about. Cheers! >

Re: Consequences of dropping Materialized views

2020-02-18 Thread Erick Ramirez
> > We are on cassandra 3.11 , we are using G1GC and using 16GB of heap. > Which exact version of C* is it again? > WARN [MessagingService-Incoming-/10.X.X.X] 2020-02-18 14:21:47,115 > IncomingTcpConnection.java:103 - UnknownColumnFamilyException reading from > socket; closing > This is

Re: Understanding the difference between write and read operations

2020-02-18 Thread Erick Ramirez
A quick eyeball seems correct to me. Do you have concerns with the TRACE output? As a side note, the QUORUM consistency is not really relevant because you only have 1 replica in the dc1 DC. Cheers!

Re: Consequences of dropping Materialized views

2020-02-18 Thread Erick Ramirez
I'm not sure I understand your last response. Was there a question in there somewhere? Cheers! >

Re: Consequences of dropping Materialized views

2020-02-18 Thread Erick Ramirez
> > Clearly the hint error invoked the fs error handler - probably incorrectly > - which shut down the db. That’s not ok and deserves a JIRA. > It's supposed to have been fixed by CASSANDRA-13696 in 3.0.15/3.11.1 but I'm waiting for Surbhi to confirm the exact C* version. Cheers!

Re: Consequences of dropping Materialized views

2020-02-18 Thread Erick Ramirez
> > Just to add to my above point because here we are dropping MV not a > regular table. > And MV does read before write , Is this the reason we are seeing the below > message? Trying to understand > > WARN [HintsDispatcher:6737] 2020-02-18 14:22:24,932 HintsReader.java:237 - > Failed to read a

Re: Consequences of dropping Materialized views

2020-02-18 Thread Erick Ramirez
> > So should upgrading to 3.11.1 will solve this issue? > Upgrading off 3.11.0 will prevent nodes going down as a result of the hint replay bug in CASSANDRA-13696, yes. But I'd recommend upgrading to the latest C* 3.11.6 unless you have a very specific reason for upgrading to 3.11.1 which was

Re: IN OPERATOR VS BATCH QUERY

2020-02-20 Thread Erick Ramirez
Batches aren't really meant for optimisation in the same way as RDBMS. If anything, it will just put pressure on the coordinator having to fire off multiple requests to lots of replicas. The IN operator falls into the same category and I personally wouldn't use it with more than 2 or 3 partitions

Re: Corrupt SSTable Cassandra 3.11.2

2020-02-13 Thread Erick Ramirez
s a result of a faulty disk or hardware failure, it wouldn't be isolated to just one table. If you provide a bit more background information, we would be able to give you a better response. Cheers!

Re: Corruption of frozen UDT during upgrade

2020-02-13 Thread Erick Ramirez
> ) - run sstablescrub -e fix-only to just fix the headers without doing a normal scrub If the headers are fine, the scrub will be a no-op. Otherwise, it will report that new metadata files are being written. For more details, see https://support.datastax.com/hc/en-us/articles/36002595

Re: Corruption of frozen UDT during upgrade

2020-02-14 Thread Erick Ramirez
> > I am still having problems reproducing this, so I am wondering if I have > created the tables correctly to create this issue. Paul, I've since had clarification on the bug and I hope I can explain it correctly here (happy to be corrected if anyone else has insight on the issue). When you

Re: AWS I3.XLARGE retiring instances advices

2020-02-14 Thread Erick Ramirez
> > Erick, a question purely as a point of curiosity. The entire model of a > commit log, historically (speaking in RDBS terms), depended on a notion of > stable store. The idea being that if your data volume lost recent writes, > the failure mode there would be independent of writes to the

Re: New seed node in the cluster immediately UN without passing for UJ state

2020-02-13 Thread Erick Ramirez
> > Should I do something to fix it or leave as it? It depends on what your intentions are. I would use the "replace" method to build it correctly. At a high level: - remove the IP from its own seeds list - delete the contents of data, commitlog and saved_caches - add the replace flag in

Re: New seed node in the cluster immediately UN without passing for UJ state

2020-02-13 Thread Erick Ramirez
> > I want to have more than one seed node in each DC, so unless I don't > restart the node after changing the seed_list in that node it will not > become the seed. That's not really going to hurt you if you have other seeds in other DCs. But if you're willing to take the hit from the restart

Re: New seed node in the cluster immediately UN without passing for UJ state

2020-02-13 Thread Erick Ramirez
> > I did decommission of this node and I did all the steps mentioned except > the -Dcassandra.replace_address and now it is streaming correctly! That works too but I was trying to avoid the rebalance operations (like streaming to restore replica counts) since they can be expensive. So

Re: New seed node in the cluster immediately UN without passing for UJ state

2020-02-13 Thread Erick Ramirez
Not a problem. And I've just responded on the new thread. Cheers!  >
