No, actually, in this case I didn’t really have an opinion, because C* is an architecturally different beast from an RDBMS. That’s kinda what piqued my curiosity when you made the suggestion about co-locating commit and data. It raises an interesting question for me. As for the 10-second delay, I’m used to looking at Graphite, so bad is relative. 😉
The question that comes to mind is this: if a commit log isn’t really an important recovery mechanism, should one even be part of C* at all? It’s a lot of code complexity, I/O volume, and O/S tuning complexity to worry about having good I/O resiliency and performance with both commit and data volumes. If the proper way to deal with all data volume problems in C* is to burn the node (or at least its state) and rebuild from the state of its neighbours, then repairs (whether administratively triggered or a side effect of ongoing operations) should always catch up with any mutations anyway, so long as the data is appropriately replicated. The benefit of having a commit log would seem limited to data which isn’t replicated.

However, I shouldn’t derail Sergio’s thread. It was just something that caught my interest and got me mulling, but it’s a tangent.

From: Erick Ramirez <erick.rami...@datastax.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Friday, February 14, 2020 at 9:04 PM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: AWS I3.XLARGE retiring instances advices

Erick, a question purely as a point of curiosity. The entire model of a commit log, historically (speaking in RDBMS terms), depended on a notion of stable store. The idea was that if your data volume lost recent writes, the failure mode there would be independent of writes to the volume holding the commit log, so that replay of the commit log could generally be depended on to recover the missing data. I’d be curious what the C* expert viewpoint on that would be, with the commit log and data on the same volume.

Those are fair points so thanks for bringing them up. I'll comment from a personal viewpoint and others can provide their opinions/feedback.👍

If you think about it, you've lost the data volume -- not just the recent writes.
Replaying the mutations in the commit log is probably insignificant compared to having to recover the data through various ways (re-bootstrap, refresh from off-volume/off-server snapshots, etc). The data and redo/archive logs being on the same volume is (in my opinion) more relevant in RDBMS, since they're mostly deployed on SANs, compared to the shared-nothing architecture of C*. I know that's debatable and others will have their own view. :)

How about you, Reid? Do you have concerns about both data and commitlog being on the same disk? And slightly off-topic but by extension, do you also have concerns about the default commitlog fsync() being 10 seconds?

Cheers!
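
To make the 10-second fsync question a bit more concrete, here is a rough toy model in Python of a periodically synced commit log. It is only a sketch under stated assumptions and not how C* implements it: the class `ToyCommitLog` and its methods are hypothetical, time is simulated, and the sync is triggered lazily on append rather than by a background thread. It just shows that writes acknowledged since the last sync are the ones a local replay cannot bring back, which is where replication and repair would have to step in.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCommitLog:
    """Toy model of a periodically synced commit log (hypothetical, not Cassandra's code)."""

    sync_period_s: float = 10.0   # mirrors the 10-second default discussed in the thread
    last_sync_s: float = 0.0      # simulated time of the last fsync
    durable: list = field(default_factory=list)  # mutations already fsynced
    pending: list = field(default_factory=list)  # acknowledged but not yet fsynced

    def append(self, now_s: float, mutation: str) -> None:
        """Acknowledge a write at once; sync first if the period has elapsed."""
        if now_s - self.last_sync_s >= self.sync_period_s:
            self.sync(now_s)
        self.pending.append(mutation)

    def sync(self, now_s: float) -> None:
        """Simulate fsync(): everything pending becomes durable."""
        self.durable.extend(self.pending)
        self.pending.clear()
        self.last_sync_s = now_s

    def crash(self) -> list:
        """Simulate power loss: return mutations a local replay cannot recover."""
        lost, self.pending = self.pending, []
        return lost


log = ToyCommitLog()
log.append(1.0, "w1")
log.append(9.0, "w2")
log.append(12.0, "w3")  # 12s since the last sync, so w1 and w2 are synced first
lost = log.crash()
print("replayable after restart:", log.durable)
print("lost locally, must come from replicas:", lost)
```

In this model a shorter `sync_period_s` shrinks the set of locally lost writes at the cost of more frequent syncs, which is the trade-off behind the question above.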