Re: Future direction for the row cache and OHC implementation

2023-12-15 Thread Josh McKenzie
Gotcha; wasn't sure given the earlier phrasing. Makes sense.

Dinesh's compromise position makes sense to me.

On Fri, Dec 15, 2023, at 11:21 PM, Ariel Weisberg wrote:
> Hi,
> 
> I did get one response from Robert indicating that he didn’t want to do the 
> work to contribute it.
> 
> I offered to do the work and asked for permission to contribute it and no 
> response. Followed up later with a ping and also no response.
> 
> Ariel
> 
> On Fri, Dec 15, 2023, at 9:58 PM, Josh McKenzie wrote:
>>> I have reached out to the original maintainer about it and it seems like if 
>>> we want to keep using it we will need to start releasing it under a new 
>>> package from a different repo.
>> 
>>> the current maintainer is not interested in donating it to the ASF
>> Is that the case Ariel or could you just not reach Robert?
>> 
>> On Fri, Dec 15, 2023, at 11:55 AM, Jeremiah Jordan wrote:
>>>> from a maintenance and
>>>> integration testing perspective I think it would be better to keep the
>>>> ohc in-tree, so we will be aware of any issues immediately after the
>>>> full CI run.
>>> 
>>> From the original email, bringing OHC in-tree is not an option because the 
>>> current maintainer is not interested in donating it to the ASF. Thus 
>>> option 1: some set of people fork it to their own GitHub org and 
>>> maintain a version outside of the ASF C* project.
>>> 
>>> -Jeremiah
>>> 
>>> On Dec 15, 2023 at 5:57:31 AM, Maxim Muzafarov  wrote:
>>>> Ariel,
>>>> thank you for bringing this topic to the ML.
>>>> 
>>>> I may be missing something, so correct me if I'm wrong somewhere in
>>>> the management of the Cassandra ecosystem. As I see it, the problem
>>>> right now is that if we fork the ohc and put it under its own root,
>>>> the use of that row cache is still not well tested (the same as it is
>>>> now). I am particularly emphasising the dependency management side:
>>>> any version change/upgrade in Cassandra (and, as a result of that
>>>> change, a new set of libraries on the classpath) should be tested
>>>> against this integration.
>>>> 
>>>> So, unless it is being widely used by someone else outside of the
>>>> community (which it doesn't seem to be), from a maintenance and
>>>> integration testing perspective I think it would be better to keep the
>>>> ohc in-tree, so we will be aware of any issues immediately after the
>>>> full CI run.
>>>> 
>>>> I'm also +1 for not deprecating it, even if it is used in narrow
>>>> cases, while the cost of maintaining its source code remains quite
>>>> low and it brings some benefits.
 On Fri, 15 Dec 2023 at 05:39, Ariel Weisberg  wrote:
> 
> Hi,
> 
> To add some additional context.
> 
> The row cache is disabled by default and it is already pluggable, but 
> there isn’t a Caffeine implementation present. I think one used to exist 
> and could be resurrected.
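To make "pluggable" concrete: a row-cache implementation is essentially a bounded key-to-row map with an eviction policy. Below is a minimal on-heap LRU sketch; the names are illustrative only (they are not Cassandra's actual cache-provider interfaces), and a real Caffeine-backed implementation would delegate to a Caffeine cache rather than a LinkedHashMap.

```java
import java.nio.ByteBuffer;
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative sketch of a pluggable row cache: a bounded LRU map from
 *  partition key to serialized row. Not Cassandra's real interfaces. */
public class LruRowCache {
    private final Map<String, ByteBuffer> map;

    public LruRowCache(int capacity) {
        // accessOrder=true makes iteration order = access order, giving LRU
        // eviction via removeEldestEntry once capacity is exceeded
        this.map = new LinkedHashMap<>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, ByteBuffer> eldest) {
                return size() > capacity;
            }
        };
    }

    public ByteBuffer get(String key) { return map.get(key); }
    public void put(String key, ByteBuffer row) { map.put(key, row); }
    public int size() { return map.size(); }

    public static void main(String[] args) {
        LruRowCache cache = new LruRowCache(2);
        cache.put("a", ByteBuffer.allocate(4));
        cache.put("b", ByteBuffer.allocate(4));
        cache.get("a");                          // touch "a" so "b" is eldest
        cache.put("c", ByteBuffer.allocate(4));  // evicts "b"
        System.out.println(cache.get("b") == null); // true
        System.out.println(cache.size());           // 2
    }
}
```

A Caffeine-based provider would replace the map with a weight-bounded Caffeine cache, gaining concurrent, size-aware eviction off the hot path.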
> 
> I personally also think that people should be able to scratch their own 
> itch, row-cache-wise, so removing it entirely just because it isn’t 
> commonly used isn’t the right move unless the feature is very far out of 
> scope for Cassandra.
> 
> Auto enabling/disabling the cache is a can of worms that could result in 
> performance and reliability inconsistency as the DB enables/disables the 
> cache based on heuristics when you don’t want it to. It being off by 
> default seems good enough to me.
> 
> RE forking, we could create a GitHub org for OHC and then add people to 
> it. There are some examples of dependencies that haven’t been contributed 
> to the project that live outside like CCM and JAMM.
> 
> Ariel
> 
> On Thu, Dec 14, 2023, at 5:07 PM, Dinesh Joshi wrote:
> 
> I would avoid taking away a feature even if it works in a narrow set of 
> use-cases. I would instead suggest:
> 
> 1. Leave it disabled by default.
> 2. Detect when Row Cache has a low hit rate and warn the operator to turn 
> it off. Cassandra should ideally detect this and do it automatically.
> 3. Move to Caffeine instead of OHC.
> 
> I would suggest having this as the middle ground.
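Dinesh's point 2 (detect a low hit rate and warn the operator) is straightforward to sketch. The class below is illustrative only; the thresholds and names are assumptions, not Cassandra's actual metrics API.

```java
/** Sketch of low-hit-rate detection for a row cache: count hits and
 *  misses, and flag a warning once enough traffic has been observed
 *  and the hit rate sits below a configured threshold. */
public class RowCacheHitRateMonitor {
    private long hits, misses;
    private final double warnBelow;   // e.g. 0.10 = warn under 10% hit rate
    private final long minRequests;   // don't judge on too small a sample

    public RowCacheHitRateMonitor(double warnBelow, long minRequests) {
        this.warnBelow = warnBelow;
        this.minRequests = minRequests;
    }

    public void recordHit()  { hits++; }
    public void recordMiss() { misses++; }

    public double hitRate() {
        long total = hits + misses;
        return total == 0 ? 0.0 : (double) hits / total;
    }

    /** True when the sample is large enough and the hit rate is poor;
     *  a caller would log a warning (or, per Dinesh's suggestion,
     *  disable the cache) when this fires. */
    public boolean shouldWarn() {
        return hits + misses >= minRequests && hitRate() < warnBelow;
    }
}
```

In practice the counters would be reset periodically so the warning tracks recent behavior rather than lifetime averages.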
> 
> On Dec 14, 2023, at 4:41 PM, Mick Semb Wever  wrote:
> 
> 
> 
> 
> 3. Deprecate the row cache entirely in either 5.0 or 5.1 and remove it in 
> a later release
> 
> 
> 
> 
> I'm for deprecating and removing it.
> It constantly trips users up and just causes pain.
> 
> Yes it works in some very narrow situations, but those situations often 
> change over time and again just bites the user.  Without the row-cache I 
> believe users would quickly find other, more suitable and lasting, 
> solutions.


Re: Moving Semver4j from test to main dependencies

2023-12-15 Thread Mick Semb Wever
> I'd like to add Semver4j to the production dependencies. It is currently on
> the test classpath. The library is pretty lightweight, licensed with MIT
> and has no transitive dependencies.
>
> We need to represent the kernel version somehow in CASSANDRA-19196 and
> Semver4j looks as the right tool for it. Maybe at some point we can replace
> our custom implementation of CassandraVersion as well.
>


I'm +1 on both counts.

But IMO you need to include those that were involved in this past
discussion that touched on the use of this library:
https://lists.apache.org/thread/zz3x1zl1lo8rkqpf0cl992y6fsy4r9gc
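For readers unfamiliar with what the library buys us: semantic-version ordering is easy to get subtly wrong with string comparison ("4.0.11" sorts before "4.0.2" lexically). The sketch below is not the Semver4j API, just a minimal illustration of the numeric major.minor.patch ordering it provides; Semver4j additionally handles pre-release identifiers, build metadata, and range expressions.

```java
/** Minimal major.minor.patch comparator, illustrating the ordering a
 *  semver library provides. Pre-release tags (e.g. "-alpha1") are NOT
 *  handled here; that is part of why a real library is preferable. */
public class MiniSemver implements Comparable<MiniSemver> {
    final int major, minor, patch;

    MiniSemver(String v) {
        String[] parts = v.split("\\.", 3);
        major = Integer.parseInt(parts[0]);
        minor = parts.length > 1 ? Integer.parseInt(parts[1]) : 0;
        patch = parts.length > 2 ? Integer.parseInt(parts[2]) : 0;
    }

    @Override
    public int compareTo(MiniSemver o) {
        if (major != o.major) return Integer.compare(major, o.major);
        if (minor != o.minor) return Integer.compare(minor, o.minor);
        return Integer.compare(patch, o.patch);
    }
}
```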


Re: Moving Semver4j from test to main dependencies

2023-12-15 Thread Josh McKenzie
+1

On Fri, Dec 15, 2023, at 1:29 PM, Derek Chen-Becker wrote:
> +1
> 
> Semver4j seems reasonable to me. I looked through the code and it's 
> relatively easy to understand. I'm not sure how easy it would be to replace 
> CassandraVersion, but that's not an immediate concern I guess.
> 
> Cheers,
> 
> Derek
> 
> On Fri, Dec 15, 2023 at 2:56 AM Jacek Lewandowski 
>  wrote:
>> Hi,
>> 
>> I'd like to add Semver4j to the production dependencies. It is currently on 
>> the test classpath. The library is pretty lightweight, licensed with MIT and 
>> has no transitive dependencies.
>> 
>> We need to represent the kernel version somehow in CASSANDRA-19196 and 
>> Semver4j looks as the right tool for it. Maybe at some point we can replace 
>> our custom implementation of CassandraVersion as well. 
>> 
>> Thanks,
>> - - -- --- -  -
>> Jacek Lewandowski
> 
> 
> --
> +---+
> | Derek Chen-Becker |
> | GPG Key available at https://keybase.io/dchenbecker and   |
> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
> +---+
> 

Re: Long tests, Burn tests, Simulator tests, Fuzz tests - can we clarify the diffs?

2023-12-15 Thread Josh McKenzie
> First of all - when you want to have a parameterized test case you do not 
> have to make the whole test class parameterized - it is per test case. Also, 
> each method can have different parameters.
This is a pretty compelling improvement to me, having just had to use the 
somewhat painful and blunt instrument of our current framework's 
parameterization; it's pretty clunky and broad.
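For those who haven't used JUnit 5: per-method parameterization means each test method declares its own parameter source, rather than parameterizing the whole class as JUnit 4 requires. The sketch below is framework-free so it runs without the dependency; in real JUnit 5 the loops disappear and each method is annotated with @ParameterizedTest plus @ValueSource or @MethodSource.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

/** Framework-free sketch of per-method parameterization: each test owns
 *  its own parameter source, independent of the others in the class. */
public class PerMethodParamsSketch {
    // equivalent of a @MethodSource for one test...
    static List<Integer> evenInputs() { return List.of(2, 4, 6); }
    // ...and an independent source for another test in the same class
    static List<Integer> positiveInputs() { return List.of(1, 2, 3); }

    /** Runs both "tests" over their own inputs; returns failure labels. */
    static List<String> run() {
        List<String> failures = new ArrayList<>();
        check(failures, "isEven", evenInputs(), x -> x % 2 == 0);
        check(failures, "isPositive", positiveInputs(), x -> x > 0);
        return failures;
    }

    static void check(List<String> failures, String name,
                      List<Integer> inputs, IntPredicate property) {
        for (int x : inputs)
            if (!property.test(x)) failures.add(name + "(" + x + ")");
    }
}
```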

It also looks like they moved in 5 to a "test engine abstracted away from test 
identification" approach to their architecture, w/the "vintage" engine 
providing native, unchanged backwards-compatibility w/JUnit 4. Assuming they 
didn't bork up their architecture, that *should* lower the risk of the framework 
change leading to disruption or failure (famous last words...).

A brief perusal shows jqwik as integrated with JUnit 5 taking a fairly 
interesting annotation-based approach to property testing. Curious if you've 
looked into or used that at all David (Capwell)? (link for the lazy: 
https://jqwik.net/docs/current/user-guide.html#detailed-table-of-contents).

On Tue, Dec 12, 2023, at 11:39 AM, Jacek Lewandowski wrote:
> First of all - when you want to have a parameterized test case you do not 
> have to make the whole test class parameterized - it is per test case. Also, 
> each method can have different parameters.
> 
> For the extensions - we can have extensions which provide Cassandra 
> configuration, extensions which provide a running cluster and others. We 
> could for example apply some extensions to all test classes externally 
> without touching those classes, something like logging the begin and end of 
> each test case. 
> 
> 
> 
> wt., 12 gru 2023 o 12:07 Benedict  napisał(a):
>> 
>> Could you give (or link to) some examples of how this would actually benefit 
>> our test suites?
>> 
>> 
>>> On 12 Dec 2023, at 10:51, Jacek Lewandowski  
>>> wrote:
>>> 
>>> I have two major pros for JUnit 5:
>>> - much better support for parameterized tests
>>> - global test hooks (automatically detectable extensions) + 
>>> multi-inheritance
>>> 
>>> 
>>> 
>>> 
>>> pon., 11 gru 2023 o 13:38 Benedict  napisał(a):
 
 Why do we want to move to JUnit 5? 
 
 I’m generally opposed to churn unless well justified, which it may be - 
 just not immediately obvious to me.
 
 
> On 11 Dec 2023, at 08:33, Jacek Lewandowski  
> wrote:
> 
> Nobody referred so far to the idea of moving to JUnit 5, what are the 
> opinions?
> 
> 
> 
> niedz., 10 gru 2023 o 11:03 Benedict  napisał(a):
>> 
>> Alex’s suggestion was that we meta randomise, ie we randomise the config 
>> parameters to gain better rather than lesser coverage overall. This 
>> means we cover these specific configs and more - just not necessarily on 
>> any single commit.
>> 
>> I strongly endorse this approach over the status quo.
>> 
>> 
>>> On 8 Dec 2023, at 13:26, Mick Semb Wever  wrote:
>>> 
>>>  
>>>  
>>>  
 
> I think everyone agrees here, but…. these variations are still 
> catching failures, and until we have an improvement or replacement we 
> do rely on them.   I'm not in favour of removing them until we have 
> proof /confidence that any replacement is catching the same failures. 
>  Especially oa, tries, vnodes. (Not tries and offheap is being 
> replaced with "latest", which will be valuable simplification.)  
 
 What kind of proof do you expect? I cannot imagine how we could prove 
 that, because the ability to detect failures results from the 
 randomness of those tests. That's why, when such a test fails, you 
 usually cannot reproduce it easily.
>>> 
>>> 
>>> Unit tests that fail consistently but only on one configuration, should 
>>> not be removed/replaced until the replacement also catches the failure.
>>>  
 We could extrapolate that to: why do we only have those configurations? 
 Why don't we test trie / oa + compression, or CDC, or system memtable? 
>>> 
>>> 
>>> Because, along the way, people have decided a certain configuration 
>>> deserves additional testing and it has been done this way in lieu of 
>>> any other more efficient approach.
>>> 
>>> 
>>> 


Re: Custom FSError and CommitLog Error Handling

2023-12-15 Thread Josh McKenzie
Adding a poison-pill error option on finding corrupt data makes sense to me. 
Not sure if there's enough demand / other customization being done in this 
space to justify the user-customizable aspect; do any other approaches come 
to mind? If not, this isn't an area of the code that's changed all that 
much, so just adding a new option seems surgical and minimal to me.

On Tue, Dec 12, 2023, at 4:21 AM, Claude Warren, Jr via dev wrote:
> I can see this as a strong improvement in Cassandra management and support 
> it. 
> 
> +1 non binding
> 
> On Mon, Dec 11, 2023 at 8:28 PM Raymond Huffman  
> wrote:
>> Hello All,
>> 
>> On our fork of Cassandra, we've implemented some custom behavior for 
>> handling CommitLog and SSTable Corruption errors. Specifically, if a node 
>> detects one of those errors, we want the node to stop itself, and if the 
>> node is restarted, we want initialization to fail. This is useful in 
>> Kubernetes when you expect nodes to be restarted frequently and makes our 
>> corruption remediation workflows less error-prone. I think we could make 
>> this behavior more pluggable by allowing users to provide custom 
>> implementations of the FSErrorHandler, and the error handler that's 
>> currently implemented at 
>> org.apache.cassandra.db.commitlog.CommitLog#handleCommitError via config in 
>> the same way one can provide custom Partitioners and 
>> Authenticators/Authorizers.
>> 
>> Would you take as a contribution one of the following?
>> 1. user provided implementations of FSErrorHandler and 
>> CommitLogErrorHandler, set via config; and/or
>> 2. new commit failure and disk failure policies that write a poison pill 
>> file to disk and fail on startup if that file exists
>> 
>> The poison pill implementation is what we currently use - we call this a 
>> "Non Transient Error" and we want these states to always require manual 
>> intervention to resolve, including manual action to clear the error. I'd be 
>> happy to contribute this if other users would find it beneficial. I had 
>> initially shared this question in Slack, but I'm now sharing it here for 
>> broader visibility.
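The poison-pill pattern described above can be sketched in a few lines; the marker file name and API here are illustrative, not the proposed Cassandra interfaces.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Sketch of a "non transient error" poison pill: on corruption the
 *  failure policy drops a marker file; startup refuses to proceed while
 *  the marker exists; only manual operator action clears it. */
public class PoisonPill {
    private final Path marker;

    public PoisonPill(Path dataDir) {
        this.marker = dataDir.resolve("non_transient_error"); // illustrative name
    }

    /** Called from the failure policy when corruption is detected. */
    public void arm(String reason) {
        try {
            Files.writeString(marker, reason);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Checked during startup; initialization aborts when true. */
    public boolean isArmed() {
        return Files.exists(marker);
    }

    /** Manual operator action after remediation. */
    public void disarm() {
        try {
            Files.deleteIfExists(marker);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The key property is that the state survives restarts: a crash-looping node in Kubernetes stays down until an operator has looked at it.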
>> 
>> -Raymond Huffman


Re: [DISCUSS] CEP-39: Cost Based Optimizer

2023-12-15 Thread Josh McKenzie
> Goals
>  • Introduce a Cascades(2) query optimizer with rules easily extendable 
>  • Improve query performance for most common queries
>  • Add support for EXPLAIN and EXPLAIN ANALYZE to help with query 
> optimization and troubleshooting
>  • Lay the groundwork for the addition of features like joins, subqueries, 
> OR/NOT and index ordering
>  • Put in place some performance benchmarks to validate query optimizations
I think these are sensible goals. We're possibly going to face a chicken-or-egg 
problem with a feature like this that so heavily intersects with other 
as-yet-unwritten features, where much of the value is in the intersection of 
them; if we continue down the current "one heuristic to rule them all" query 
planning approach we have now, we'll struggle to meaningfully explore or 
conceptualize the value of potential alternatives different optimizers could 
present us. Flip side, to Benedict's point, until SAI hits and/or some other 
potential future things we've all talked about, this cbo would likely fall 
directly into the same path that we effectively have hard-coded today (primary 
index path only).

One thing I feel pretty strongly about: even if the only outcome of all this 
work were to tighten up inconsistencies in our grammar and provide more robust 
EXPLAIN and EXPLAIN ANALYZE functionality to our end users, I think that would 
be highly valuable. This path of "only" would be predicated on us not having 
successful introduction of a robust secondary index implementation and a 
variety of other things we have a lot of interest in, so I find it unlikely, 
but worth calling out.

re: the removal of ALLOW FILTERING - is there room for compromise here and 
instead converting it to a guardrail that defaults to being enabled? That could 
theoretically give us a more gradual path to migration to a cost-based 
guardrail for instance, and would preserve the current robustness of the system 
while making it at least a touch more configurable.
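For readers new to cost-based optimization, the core mechanic is simple even though real Cascades-style optimizers (with rule-driven plan exploration and memoization) are not: enumerate candidate plans, estimate a cost for each from statistics, and keep the cheapest. A toy sketch follows; the numbers and names are illustrative only, not the CEP's design.

```java
import java.util.Comparator;
import java.util.List;

/** Toy cost model: each candidate plan derives a cost from simple
 *  statistics and the optimizer picks the minimum. Illustrative only. */
public class ToyCostModel {
    record Plan(String name, double estimatedRows, double costPerRow) {
        double cost() { return estimatedRows * costPerRow; }
    }

    static Plan choose(List<Plan> candidates) {
        return candidates.stream()
                         .min(Comparator.comparingDouble(Plan::cost))
                         .orElseThrow();
    }

    public static void main(String[] args) {
        // 1M-row table: the index touches ~100 rows, but each read is
        // more expensive (random I/O); the full scan streams sequentially.
        Plan fullScan  = new Plan("full scan", 1_000_000, 1.0);
        Plan indexScan = new Plan("index scan", 100, 50.0);
        System.out.println(choose(List.of(fullScan, indexScan)).name());
    }
}
```

EXPLAIN then becomes a matter of printing the chosen plan and its estimates, and EXPLAIN ANALYZE of comparing those estimates against observed counts at execution time.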

On Fri, Dec 15, 2023, at 11:03 AM, Chris Lohfink wrote:
> Thanks for time in addressing concerns. At least with initial versions, as 
> long as there is a way to replace it with noop or disable it I would be 
> happy. This is pretty standard practice with features nowadays but I wanted 
> to highlight it as this might require some pretty tight coupling.
> 
> Chris
> 
> On Fri, Dec 15, 2023 at 7:57 AM Benjamin Lerer  wrote:
>> Hey Chris,
>> You raise some valid points.
>> 
>> I believe that there are 3 points that you mentioned:
>> 1) CQL restrictions are some form of safety net and should be kept
>> 2) A lot of Cassandra features do not scale and/or are too easy to use in a 
>> wrong way that can make the whole system collapse. We should not add more to 
>> that list. Especially not joins.
>> 
>> 3) Should we not start to fix features like secondary index rather than 
>> adding new ones? Which is heavily linked to 2).
>> 
>> Feel free to correct me if I got them wrong or missed one.
>> 
>> Regarding 1), I believe that you refer to the "Removing unnecessary CQL 
>> query limitations and inconsistencies" section. We are not planning to 
>> remove any safety net here.
>> What we want to remove is a certain amount of limitations which make things 
>> confusing for a user trying to write a query for no good reason. Like "why 
>> can I define a column alias but not use it anywhere in my query?" or "Why 
>> can I not create a list with 2 bind parameters?". While refactoring some CQL 
>> code, I kept on finding those types of exceptions that we can easily remove 
>> while simplifying the code at the same time.
>> 
>> For 2), I agree that at a certain scale or for some scenarios, some features 
>> simply do not scale or catch users by surprise. The goal of the CEP is to 
>> improve things in 2 ways. One is by making Cassandra smarter in the way it 
>> chooses how to process queries, hopefully improving its overall scalability. 
>> The other by being transparent about how Cassandra will execute the queries 
>> through the use of EXPLAIN. One problem of GROUP BY for example is that most 
>> users do not realize what is actually happening under the hood and therefore 
>> its limitations. I do not believe that EXPLAIN will change everything but it 
>> will help people to get a better understanding of the limitations of some 
>> features.
>> 
>> I do not know which features will be added in the future to C*. That will be 
>> discussed through some future CEPs. Nevertheless, I do not believe that it 
>> makes sense to write a CEP for a query optimizer without taking into account 
>> that we might at some point add some level of support for joins or 
>> subqueries. We have been too often delivering features without looking at 
>> what could be the possible evolutions which resulted in code where adding 
>> new features was more complex than it should have been. I do not want to 
>> make the same mistake. I want to create an optimizer that can be improved 
>> easily and 

Re: [DISCUSS] CEP-36: A Configurable ChannelProxy to alias external storage locations

2023-12-15 Thread Jon Haddad
At a high level I really like the idea of being able to better leverage
cheaper storage, especially object stores like S3.

One important thing though - I feel pretty strongly that there's a big,
deal-breaking downside. Backups, disk failure policies, snapshots and
possibly repairs would get more complicated; they haven't been particularly
great in the past. And of course there's the issue of failure recovery
being only partially possible if you're looking at a durable block store
paired with an ephemeral one, with some of your data not replicated to the
cold side. That introduces a failure case that's unacceptable for most
teams, and it results in needing to implement potentially 2 different backup
solutions. This is operationally complex with a lot of surface area for
headaches. I think a lot of teams would probably have an issue with the
big question mark around durability, and I probably would avoid it myself.

On the other hand, I'm +1 if we approach it something slightly differently
- where _all_ the data is located on the cold storage, with the local hot
storage used as a cache.  This means we can use the cold directories for
the complete dataset, simplifying backups and node replacements.

For a little background, we had a ticket several years ago where I pointed
out it was possible to do this *today* at the operating system level as
long as you're using block devices (vs an object store) and LVM [1].  For
example, this works well with GP3 EBS w/ low IOPS provisioning + local NVMe
to get a nice balance of great read performance without going nuts on the
cost for IOPS.  I also wrote about this in a little more detail in my blog
[2].  There's also the new mount point tech in AWS which pretty much does
exactly what I've suggested above [3] that's probably worth evaluating just
to get a feel for it.

I'm not insisting we require LVM or the AWS S3 fs, since that would rule
out other cloud providers, but I am pretty confident that the entire
dataset should reside in the "cold" side of things for the practical and
technical reasons I listed above.  I don't think it massively changes the
proposal, and should simplify things for everyone.

Jon

[1] https://issues.apache.org/jira/browse/CASSANDRA-8460
[2] https://rustyrazorblade.com/post/2018/2018-04-24-intro-to-lvm/
[3] https://aws.amazon.com/about-aws/whats-new/2023/03/mountpoint-amazon-s3/
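Whichever side of the hot/cold split wins, CEP-36's aliasing mechanism itself reduces to rewriting a logical file path to a storage-specific one before a channel is opened on it. A minimal sketch (the names are illustrative, not the proposed API; a real ChannelProxy would open a FileChannel on the resolved path):

```java
import java.nio.file.Path;
import java.util.Map;

/** Sketch of path aliasing: logical data directories are remapped to
 *  other storage locations by prefix before files are opened. */
public class PathAliaser {
    // logical prefix -> actual prefix; with overlapping prefixes, an
    // ordered map (longest prefix first) would be needed in practice
    private final Map<Path, Path> aliases;

    public PathAliaser(Map<Path, Path> aliases) {
        this.aliases = aliases;
    }

    public Path resolve(Path logical) {
        for (Map.Entry<Path, Path> e : aliases.entrySet()) {
            if (logical.startsWith(e.getKey()))
                return e.getValue().resolve(e.getKey().relativize(logical));
        }
        return logical; // no alias configured: stay on local storage
    }
}
```

In Jon's "cold copy plus hot cache" framing, the resolver would always target the cold location, with reads served from a local cache when the file is already present there.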


On Thu, Dec 14, 2023 at 1:56 AM Claude Warren  wrote:

> Is there still interest in this?  Can we get some points down on electrons
> so that we all understand the issues?
>
> While it is fairly simple to redirect the read/write to something other
> than the local system for a single node this will not solve the problem for
> tiered storage.
>
> Tiered storage will require that on read/write the primary key be assessed
> to determine whether the read/write should be redirected.  My reasoning for
> this statement is that in a cluster with a replication factor greater than
> 1, the node will store data for the keys that would be allocated to it in a
> cluster with a replication factor = 1, as well as some keys from nodes
> earlier in the ring.
>
> Even if we can get the primary keys for all the data we want to write to
> "cold storage" to map to a single node a replication factor > 1 means that
> data will also be placed in "normal storage" on subsequent nodes.
>
> To overcome this, we have to explore ways to route data to different
> storage based on the keys and that different storage may have to be
> available on _all_  the nodes.
>
> Have any of the partial solutions mentioned in this email chain (or
> others) solved this problem?
>
> Claude
>


Re: Moving Semver4j from test to main dependencies

2023-12-15 Thread Derek Chen-Becker
+1

Semver4j seems reasonable to me. I looked through the code and it's
relatively easy to understand. I'm not sure how easy it would be to replace
CassandraVersion, but that's not an immediate concern I guess.

Cheers,

Derek

On Fri, Dec 15, 2023 at 2:56 AM Jacek Lewandowski <
lewandowski.ja...@gmail.com> wrote:

> Hi,
>
> I'd like to add Semver4j to the production dependencies. It is currently
> on the test classpath. The library is pretty lightweight, licensed with MIT
> and has no transitive dependencies.
>
> We need to represent the kernel version somehow in CASSANDRA-19196 and
> Semver4j looks as the right tool for it. Maybe at some point we can replace
> our custom implementation of CassandraVersion as well.
>
> Thanks,
> - - -- --- -  -
> Jacek Lewandowski
>


-- 
+---+
| Derek Chen-Becker |
| GPG Key available at https://keybase.io/dchenbecker and   |
| https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
| Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
+---+


Re: Future direction for the row cache and OHC implementation

2023-12-15 Thread Jeremiah Jordan
>
> from a maintenance and
> integration testing perspective I think it would be better to keep the
> ohc in-tree, so we will be aware of any issues immediately after the
> full CI run.


>From the original email bringing OHC in tree is not an option because the
current maintainer is not interested in donating it to the ASF.  Thus the
option 1 of some set of people forking it to their own github org and
maintaining a version outside of the ASF C* project.

-Jeremiah

On Dec 15, 2023 at 5:57:31 AM, Maxim Muzafarov  wrote:

> Ariel,
> thank you for bringing this topic to the ML.
>
> I may be missing something, so correct me if I'm wrong somewhere in
> the management of the Cassandra ecosystem.  As I see it, the problem
> right now is that if we fork the ohc and put it under its own root,
> the use of that row cache is still not well tested (the same as it is
> now). I am particularly emphasising the dependency management side, as
> any version change/upgrade in Cassandra, and the new set of libraries
> it puts on the classpath as a result, would need to be tested against
> this integration.
>
> So, unless it is being widely used by someone else outside of the
> community (which it doesn't seem to be), from a maintenance and
> integration testing perspective I think it would be better to keep the
> ohc in-tree, so we will be aware of any issues immediately after the
> full CI run.
>
> I'm also +1 for not deprecating it, even if it is only used in narrow
> cases, as long as the cost of maintaining its source code remains quite
> low and it brings some benefits.
>
> On Fri, 15 Dec 2023 at 05:39, Ariel Weisberg  wrote:
>
>
> Hi,
>
>
> To add some additional context.
>
>
> The row cache is disabled by default and it is already pluggable, but
> there isn’t a Caffeine implementation present. I think one used to exist
> and could be resurrected.
>
>
> I personally also think that people should be able to scratch their own
> itch row cache wise so removing it entirely just because it isn’t commonly
> used isn’t the right move unless the feature is very far out of scope for
> Cassandra.
>
>
> Auto enabling/disabling the cache is a can of worms that could result in
> performance and reliability inconsistency as the DB enables/disables the
> cache based on heuristics when you don’t want it to. It being off by
> default seems good enough to me.
>
>
> RE forking, we could create a GitHub org for OHC and then add people to
> it. There are some examples of dependencies that haven’t been contributed
> to the project that live outside like CCM and JAMM.
>
>
> Ariel
>
>
> On Thu, Dec 14, 2023, at 5:07 PM, Dinesh Joshi wrote:
>
>
> I would avoid taking away a feature even if it only works in a narrow set
> of use-cases. I would instead suggest -
>
>
> 1. Leave it disabled by default.
>
> 2. Detect when Row Cache has a low hit rate and warn the operator to turn
> it off. Cassandra should ideally detect this and do it automatically.
>
> 3. Move to Caffeine instead of OHC.
>
>
> I would suggest having this as the middle ground.
>
>
> On Dec 14, 2023, at 4:41 PM, Mick Semb Wever  wrote:
>
>
>
>
>
> 3. Deprecate the row cache entirely in either 5.0 or 5.1 and remove it in
> a later release
>
>
>
>
>
> I'm for deprecating and removing it.
>
> It constantly trips users up and just causes pain.
>
>
> Yes it works in some very narrow situations, but those situations often
> change over time and again just bites the user.  Without the row-cache I
> believe users would quickly find other, more suitable and lasting,
> solutions.
>
>
>
>


Re: [DISCUSS] CEP-39: Cost Based Optimizer

2023-12-15 Thread Chris Lohfink
Thanks for taking the time to address these concerns. At least with initial
versions, as long as there is a way to replace it with a noop or disable it,
I would be happy. This is pretty standard practice with features nowadays,
but I wanted to highlight it as this might require some pretty tight
coupling.

Chris

On Fri, Dec 15, 2023 at 7:57 AM Benjamin Lerer  wrote:

> Hey Chris,
> You raise some valid points.
>
> I believe that there are 3 points that you mentioned:
> 1) CQL restrictions are some form of safety net and should be kept
> 2) A lot of Cassandra features do not scale and/or are too easy to use in
> a wrong way that can make the whole system collapse. We should not add more
> to that list. Especially not joins.
> 3) Should we not start to fix features like secondary indexes rather than
> adding new ones? This is heavily linked to 2).
>
> Feel free to correct me if I got them wrong or missed one.
>
> Regarding 1), I believe that you refer to the "Removing unnecessary CQL
> query limitations and inconsistencies" section. We are not planning to
> remove any safety net here.
> What we want to remove is a number of limitations which, for no good
> reason, make things confusing for a user trying to write a query. Like
> "why can I define a column alias but not use it anywhere in my query?"
> or "why can I not create a list with 2 bind parameters?". While
> refactoring some CQL code, I kept finding those types of exceptions that
> we can easily remove while simplifying the code at the same time.
>
> For 2), I agree that at a certain scale or for some scenarios, some
> features simply do not scale or catch users by surprise. The goal of the
> CEP is to improve things in 2 ways. One is by making Cassandra smarter in
> the way it chooses how to process queries, hopefully improving its overall
> scalability. The other by being transparent about how Cassandra will
> execute the queries through the use of EXPLAIN. One problem of GROUP BY for
> example is that most users do not realize what is actually happening
> under the hood, and therefore do not understand its limitations. I do
> not believe that EXPLAIN will
> change everything but it will help people to get a better understanding of
> the limitations of some features.
>
> I do not know which features will be added in the future to C*. That will
> be discussed through some future CEPs. Nevertheless, I do not believe that
> it makes sense to write a CEP for a query optimizer without taking into
> account that we might at some point add some level of support for joins or
> subqueries. We have been too often delivering features without looking at
> what could be the possible evolutions which resulted in code where adding
> new features was more complex than it should have been. I do not want to
> make the same mistake. I want to create an optimizer that can be improved
> easily, and considering joins or other features simply helps to build things
> in a more generic way.
>
> Regarding feature stabilization, I believe that it is happening. I have
> heard plans of how to solve MVs, range queries, hot partitions, ... and
> there was a lot of thinking behind those plans. Secondary indexes are being
> worked on. We hope that the optimizer will also help with some index
> queries.
>
> It seems to me that this proposal is going in the direction that you
> want without introducing new scalability problems.
>
>
> Le jeu. 14 déc. 2023 à 16:47, Chris Lohfink  a
> écrit :
>
>> I don't wanna be a blocker for this CEP or anything but did want to put
>> my 2 cents in. This CEP is horrifying to me.
>>
>> I have seen thousands of clusters across multiple companies and helped
>> them get working successfully. A vast majority of that involved blocking
>> the use of MVs, GROUP BY, secondary indexes, and even just simple _range
>> queries_. The "unnecessary restrictions of cql" are not only necessary IMHO,
>> more restrictions are necessary to be successful at scale. The idea of just
>> opening up CQL to general purpose relational queries and lines like 
>> "supporting
>> queries with joins in an efficient way" ... I would really like us to
>> make secondary indexes be a viable option before we start opening up
>> floodgates on stuff like this.
>>
>> Chris
>>
>> On Thu, Dec 14, 2023 at 9:37 AM Benedict  wrote:
>>
>>> > So yes, this physical plan is the structure that you have in mind but
>>> the idea of sharing it is not part of the CEP.
>>>
>>>
>>> I think it should be. This should form a major part of the API on which
>>> any CBO is built.
>>>
>>>
>>> > It seems that there is a difference between the goal of your proposal
>>> and the one of the CEP. The goal of the CEP is first to ensure optimal
>>> performance. It is ok to change the execution plan for one that delivers
>>> better performance. What we want to minimize is having a node performing
>>> queries in an inefficient way for a long period of time.
>>>
>>>
>>> You have made a goal of the CEP synchronising summary statistics across
>>> the 

Re: [DISCUSS] CEP-39: Cost Based Optimizer

2023-12-15 Thread Benjamin Lerer
>
> I'm also torn on the CEP as presented. I think some of it is my negative
> emotional response to the examples - e.g. I've literally never seen a real
> use case where unfolding constants matters, and I'm trying to convince
> myself to read past that.


I totally agree with you, Jeff, if you think about it as an optimization.
It is not one. The goal of such a rule is to normalize the query: what you
want is to convert equivalent queries into a consistent form, so that the
result of your optimization is predictable.
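
As a toy illustration of such a normalization rule (the `Expr` types here are
invented for the sketch, not Cassandra's CQL AST), constant folding rewrites
equivalent spellings of an expression into one canonical form before any
cost-based decision is made:

```java
// Hypothetical sketch: constant folding as query *normalization*, so that
// "LIMIT 2 + 3" and "LIMIT 5" reach the optimizer as the same plan input.
interface Expr {}
record Constant(int value) implements Expr {}
record Add(Expr left, Expr right) implements Expr {}

final class Normalizer {
    static Expr fold(Expr e) {
        if (e instanceof Add a) {
            Expr l = fold(a.left());
            Expr r = fold(a.right());
            if (l instanceof Constant cl && r instanceof Constant cr)
                return new Constant(cl.value() + cr.value()); // fold to one literal
            return new Add(l, r); // keep structure when a side is not constant
        }
        return e; // constants (and, in a real tree, columns/binds) pass through
    }

    public static void main(String[] args) {
        Expr q = new Add(new Constant(2), new Add(new Constant(1), new Constant(2)));
        System.out.println(fold(q)); // Constant[value=5]
    }
}
```

Two queries that differ only in how a constant is spelled then produce the
same normalized tree, which is what makes the optimizer's output predictable.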

I also can't tell what exactly you mean when you say "In order to ensure
> that the execution plans on each node are the same, the cardinality
> estimator should provide the same global statistics on every node as well
> as some notification mechanism that can be used to trigger
> re-optimization." In my experience, you'll see variable cost on each host,
> where a machine that went offline temporarily got a spike in sstables from
> repair and has a compaction backlog, causing a higher cost per read on that
> host due to extra sstables/duplicate rows/merges. Is the cost based
> optimizer in your model going to understand the different cost per replica
> and also use that in choosing the appropriate replicas to query?
>

No. Optimization will occur at query preparation time, so it cannot really
use that kind of per-node information. The statistics the CEP is talking
about describe the data distribution of the table (or indexes) being
queried.
Based on that distribution, you pick the access method that is the most
efficient.
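
As a sketch of the kind of decision such table statistics feed (the cost
constants and class names are invented for illustration), the optimizer
estimates a cost per access method and picks the cheapest:

```java
// Illustrative-only cost model: pick an access method from per-table
// statistics. The constants are invented; real optimizers calibrate them.
record TableStats(long rowCount, double predicateSelectivity) {}

final class AccessPathChooser {
    static final double SEQ_COST_PER_ROW = 1.0;   // sequential read of one row
    static final double INDEX_COST_PER_ROW = 4.0; // random read through an index

    static String choose(TableStats stats) {
        double fullScanCost = stats.rowCount() * SEQ_COST_PER_ROW;
        double indexScanCost =
            stats.rowCount() * stats.predicateSelectivity() * INDEX_COST_PER_ROW;
        return indexScanCost < fullScanCost ? "index-scan" : "full-scan";
    }

    public static void main(String[] args) {
        // A highly selective predicate favors the index...
        System.out.println(choose(new TableStats(1_000_000, 0.001))); // index-scan
        // ...while a predicate matching most rows favors reading sequentially.
        System.out.println(choose(new TableStats(1_000_000, 0.9)));   // full-scan
    }
}
```

This is also why the same statistics must be visible on every node: with
different selectivity estimates, two nodes would prepare different plans for
the same query.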

The problem you mention is related to how you route your query, which is at
a different level.

Finally: ALLOW FILTERING should not be deprecated. It doesn't matter if the
> CBO may be able to help improve queries that have filtering. That guard
> exists because most people who are new to cassandra don't understand the
> difference and it prevents far more self-inflicted failures than anyone can
> count. Please do not remove this. You will instantly create a world where
> most new users to the database tip over as soon as their adoption picks
> up.
>

It is interesting. I have heard so many people complain about ALLOW
FILTERING, saying that it should be removed, that I was quite surprised to
see Benedict and you asking to keep it.
My impression is that its original goal makes sense but that the
implementation is not the right one, which makes it feel like a burden to a
lot of people.
We should probably have a discussion about it in a separate thread so that
we can get to the bottom of the problem. If it is fine with everybody, I
will remove the section from the CEP and see what comes out of the
discussion.

Le jeu. 14 déc. 2023 à 17:29, Jeff Jirsa  a écrit :

> I'm also torn on the CEP as presented. I think some of it is my negative
> emotional response to the examples - e.g. I've literally never seen a real
> use case where unfolding constants matters, and I'm trying to convince
> myself to read past that.
>
> I also can't tell what exactly you mean when you say "In order to ensure
> that the execution plans on each node are the same, the cardinality
> estimator should provide the same global statistics on every node as well
> as some notification mechanism that can be used to trigger
> re-optimization." In my experience, you'll see variable cost on each host,
> where a machine that went offline temporarily got a spike in sstables from
> repair and has a compaction backlog, causing a higher cost per read on that
> host due to extra sstables/duplicate rows/merges. Is the cost based
> optimizer in your model going to understand the different cost per replica
> and also use that in choosing the appropriate replicas to query?
>
> Finally: ALLOW FILTERING should not be deprecated. It doesn't matter if
> the CBO may be able to help improve queries that have filtering. That guard
> exists because most people who are new to cassandra don't understand the
> difference and it prevents far more self-inflicted failures than anyone can
> count. Please do not remove this. You will instantly create a world where
> most new users to the database tip over as soon as their adoption picks up.
>
>
>
> On Thu, Dec 14, 2023 at 7:49 AM Chris Lohfink 
> wrote:
>
>> I don't wanna be a blocker for this CEP or anything but did want to put
>> my 2 cents in. This CEP is horrifying to me.
>>
>> I have seen thousands of clusters across multiple companies and helped
>> them get working successfully. A vast majority of that involved blocking
>> the use of MVs, GROUP BY, secondary indexes, and even just simple _range
>> queries_. The "unnecessary restrictions of cql" are not only necessary IMHO,
>> more restrictions are necessary to be successful at scale. The idea of just
>> opening up CQL to general purpose relational queries and lines like 
>> "supporting
>> queries with joins in an efficient way" ... I would really like us to
>> make secondary indexes be a viable option 

Re: [DISCUSS] CEP-39: Cost Based Optimizer

2023-12-15 Thread Benjamin Lerer
Hey Chris,
You raise some valid points.

I believe that there are 3 points that you mentioned:
1) CQL restrictions are some form of safety net and should be kept
2) A lot of Cassandra features do not scale and/or are too easy to use in a
wrong way that can make the whole system collapse. We should not add more
to that list. Especially not joins.
3) Should we not start to fix features like secondary indexes rather than
adding new ones? This is heavily linked to 2).

Feel free to correct me if I got them wrong or missed one.

Regarding 1), I believe that you refer to the "Removing unnecessary CQL
query limitations and inconsistencies" section. We are not planning to
remove any safety net here.
What we want to remove is a number of limitations which, for no good
reason, make things confusing for a user trying to write a query. Like "why
can I define a column alias but not use it anywhere in my query?" or "why
can I not create a list with 2 bind parameters?". While refactoring some
CQL code, I kept finding those types of exceptions that we can easily
remove while simplifying the code at the same time.

For 2), I agree that at a certain scale or for some scenarios, some
features simply do not scale or catch users by surprise. The goal of the
CEP is to improve things in 2 ways. One is by making Cassandra smarter in
the way it chooses how to process queries, hopefully improving its overall
scalability. The other by being transparent about how Cassandra will
execute the queries through the use of EXPLAIN. One problem of GROUP BY for
example is that most users do not realize what is actually happening under
the hood, and therefore do not understand its limitations. I do not believe
that EXPLAIN will
change everything but it will help people to get a better understanding of
the limitations of some features.

I do not know which features will be added in the future to C*. That will
be discussed through some future CEPs. Nevertheless, I do not believe that
it makes sense to write a CEP for a query optimizer without taking into
account that we might at some point add some level of support for joins or
subqueries. We have been too often delivering features without looking at
what could be the possible evolutions which resulted in code where adding
new features was more complex than it should have been. I do not want to
make the same mistake. I want to create an optimizer that can be improved
easily, and considering joins or other features simply helps to build things
in a more generic way.

Regarding feature stabilization, I believe that it is happening. I have
heard plans of how to solve MVs, range queries, hot partitions, ... and
there was a lot of thinking behind those plans. Secondary indexes are being
worked on. We hope that the optimizer will also help with some index
queries.

It seems to me that this proposal is going in the direction that you want
without introducing new scalability problems.


Le jeu. 14 déc. 2023 à 16:47, Chris Lohfink  a écrit :

> I don't wanna be a blocker for this CEP or anything but did want to put my
> 2 cents in. This CEP is horrifying to me.
>
> I have seen thousands of clusters across multiple companies and helped
> them get working successfully. A vast majority of that involved blocking
> the use of MVs, GROUP BY, secondary indexes, and even just simple _range
> queries_. The "unnecessary restrictions of cql" are not only necessary IMHO,
> more restrictions are necessary to be successful at scale. The idea of just
> opening up CQL to general purpose relational queries and lines like 
> "supporting
> queries with joins in an efficient way" ... I would really like us to
> make secondary indexes be a viable option before we start opening up
> floodgates on stuff like this.
>
> Chris
>
> On Thu, Dec 14, 2023 at 9:37 AM Benedict  wrote:
>
>> > So yes, this physical plan is the structure that you have in mind but
>> the idea of sharing it is not part of the CEP.
>>
>>
>> I think it should be. This should form a major part of the API on which
>> any CBO is built.
>>
>>
>> > It seems that there is a difference between the goal of your proposal
>> and the one of the CEP. The goal of the CEP is first to ensure optimal
>> performance. It is ok to change the execution plan for one that delivers
>> better performance. What we want to minimize is having a node performing
>> queries in an inefficient way for a long period of time.
>>
>>
>> You have made a goal of the CEP synchronising summary statistics across
>> the whole cluster in order to achieve some degree of uniformity of query
>> plan. So this is explicitly a goal of the CEP, and synchronising summary
>> statistics is a hard problem and won’t provide strong guarantees.
>>
>>
>> > The client side proposal targets consistency for a given query on a
>> given driver instance. In practice, it would be possible to have 2 similar
>> queries with 2 different execution plans on the same driver
>>
>>
>> This would only be possible if the driver permitted 

Re: Future direction for the row cache and OHC implementation

2023-12-15 Thread Maxim Muzafarov
Ariel,
thank you for bringing this topic to the ML.

I may be missing something, so correct me if I'm wrong somewhere in
the management of the Cassandra ecosystem.  As I see it, the problem
right now is that if we fork the ohc and put it under its own root,
the use of that row cache is still not well tested (the same as it is
now). I am particularly emphasising the dependency management side, as
any version change/upgrade in Cassandra, and the new set of libraries
it puts on the classpath as a result, would need to be tested against
this integration.

So, unless it is being widely used by someone else outside of the
community (which it doesn't seem to be), from a maintenance and
integration testing perspective I think it would be better to keep the
ohc in-tree, so we will be aware of any issues immediately after the
full CI run.

I'm also +1 for not deprecating it, even if it is only used in narrow
cases, as long as the cost of maintaining its source code remains quite low
and it brings some benefits.

On Fri, 15 Dec 2023 at 05:39, Ariel Weisberg  wrote:
>
> Hi,
>
> To add some additional context.
>
> The row cache is disabled by default and it is already pluggable, but there 
> isn’t a Caffeine implementation present. I think one used to exist and could 
> be resurrected.
>
> I personally also think that people should be able to scratch their own itch 
> row cache wise so removing it entirely just because it isn’t commonly used 
> isn’t the right move unless the feature is very far out of scope for 
> Cassandra.
>
> Auto enabling/disabling the cache is a can of worms that could result in 
> performance and reliability inconsistency as the DB enables/disables the 
> cache based on heuristics when you don’t want it to. It being off by default 
> seems good enough to me.
>
> RE forking, we could create a GitHub org for OHC and then add people to it. 
> There are some examples of dependencies that haven’t been contributed to the 
> project that live outside like CCM and JAMM.
>
> Ariel
>
> On Thu, Dec 14, 2023, at 5:07 PM, Dinesh Joshi wrote:
>
> I would avoid taking away a feature even if it only works in a narrow set
> of use-cases. I would instead suggest -
>
> 1. Leave it disabled by default.
> 2. Detect when Row Cache has a low hit rate and warn the operator to turn it 
> off. Cassandra should ideally detect this and do it automatically.
> 3. Move to Caffeine instead of OHC.
>
> I would suggest having this as the middle ground.
>
> On Dec 14, 2023, at 4:41 PM, Mick Semb Wever  wrote:
>
>
>
>
> 3. Deprecate the row cache entirely in either 5.0 or 5.1 and remove it in a 
> later release
>
>
>
>
> I'm for deprecating and removing it.
> It constantly trips users up and just causes pain.
>
> Yes it works in some very narrow situations, but those situations often 
> change over time and again just bites the user.  Without the row-cache I 
> believe users would quickly find other, more suitable and lasting, solutions.
>
>
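
To make the thread above concrete: a minimal, hypothetical sketch of what a
pluggable row cache could look like, including the hit-rate tracking that
Dinesh's second point would warn on. The class and method names are invented
for illustration and are not Cassandra's actual cache SPI; a real
replacement for OHC would likely wrap Caffeine rather than this hand-rolled
LRU.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch (not Cassandra's cache SPI): an on-heap LRU row cache
// that tracks its own hit rate so a low-value cache can be flagged.
final class LruRowCache<K, V> {
    private final Map<K, V> map;
    private long hits, misses;

    LruRowCache(int capacity) {
        // accessOrder=true gives LRU iteration order; evict the eldest entry
        // once the configured capacity is exceeded.
        this.map = new LinkedHashMap<>(16, 0.75f, true) {
            @Override protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > capacity;
            }
        };
    }

    V get(K key) {
        V v = map.get(key);
        if (v == null) misses++; else hits++;
        return v;
    }

    void put(K key, V value) { map.put(key, value); }

    double hitRate() {
        long total = hits + misses;
        return total == 0 ? 0.0 : (double) hits / total;
    }

    // The operator warning Dinesh suggests: flag a cache that mostly misses.
    boolean shouldWarn(double threshold) {
        return hits + misses > 0 && hitRate() < threshold;
    }

    public static void main(String[] args) {
        LruRowCache<String, byte[]> cache = new LruRowCache<>(1000);
        cache.put("ks.tbl:key1", new byte[]{1});
        cache.get("ks.tbl:key1");            // hit
        cache.get("ks.tbl:key2");            // miss
        System.out.println(cache.hitRate()); // 0.5
    }
}
```

Keeping the cache behind an interface like this is what lets OHC live
outside the tree, or be swapped for Caffeine, without touching callers.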


Moving Semver4j from test to main dependencies

2023-12-15 Thread Jacek Lewandowski
Hi,

I'd like to add Semver4j to the production dependencies. It is currently on
the test classpath. The library is pretty lightweight, licensed with MIT
and has no transitive dependencies.

We need to represent the kernel version somehow in CASSANDRA-19196 and
Semver4j looks like the right tool for it. Maybe at some point we can replace
our custom implementation of CassandraVersion as well.
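
To illustrate what a proper version type buys over plain strings, here is a
stand-in sketch of numeric major.minor.patch precedence. It is not
Semver4j's actual API, and it ignores the pre-release and build-metadata
rules that the real library handles:

```java
// Stand-in sketch (not Semver4j): numeric major.minor.patch comparison of
// the kind a kernel-version check needs. Pre-release/build metadata and
// input validation are deliberately omitted.
final class SimpleSemver implements Comparable<SimpleSemver> {
    final int major, minor, patch;

    SimpleSemver(String version) {
        String[] parts = version.split("\\.", 3);
        major = Integer.parseInt(parts[0]);
        minor = parts.length > 1 ? Integer.parseInt(parts[1]) : 0;
        patch = parts.length > 2 ? Integer.parseInt(parts[2]) : 0;
    }

    @Override public int compareTo(SimpleSemver o) {
        if (major != o.major) return Integer.compare(major, o.major);
        if (minor != o.minor) return Integer.compare(minor, o.minor);
        return Integer.compare(patch, o.patch);
    }

    public static void main(String[] args) {
        // Numeric comparison orders 6.10.0 after 6.9.0, where a naive
        // lexicographic comparison on the raw strings would not.
        System.out.println(new SimpleSemver("6.10.0").compareTo(new SimpleSemver("6.9.0")) > 0); // true
        System.out.println("6.10.0".compareTo("6.9.0") > 0);                                     // false
    }
}
```

The string-comparison pitfall in `main` is exactly what a library like
Semver4j avoids: versions compared as raw strings sort incorrectly once a
component reaches two digits.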

Thanks,
- - -- --- -  -
Jacek Lewandowski