It is instructive to listen to the concerns of new and existing users in order 
to improve a product like Cassandra, but I think the schoolyard-taunt model 
isn’t the most effective.

In my experience with open and closed source databases, there are always things 
that could be improved. Many are rooted in how the product evolved over time, 
and a newcomer sees those as rough edges right away. In other cases, the 
database creators have widened their scope to try to solve every data problem, 
which creates complexity such as too many configuration options. Even the best 
RDBMS (Informix!) battled these kinds of issues.

Cassandra, though, introduced another angle of difficulty. In trying to relate 
to RDBMS users (pun intended), it often borrowed terminology to make it seem 
familiar. But the borrowed concepts don’t work the same way or even solve the 
same problems. The classic example is secondary indexes. For an RDBMS, they are 
very useful; for Cassandra, they are anathema (except for very narrow cases).

However, I think the shots at Cassandra are generally unfair. When I started 
working with it, the DataStax documentation was some of the best documentation 
I had seen on any project, especially an open source one. (If anything, the 
cooling off between Apache Cassandra and DataStax may be the most serious 
misstep so far…) The more I learned about how Cassandra worked, the more I 
marveled at the clever combination of intricate solutions (gossip, Merkle 
trees, compaction strategies, etc.) to solve specific data problems. This is a 
great product! It has given me lots of sleep-filled nights over the last 4+ 
years. My customers love it, once I explain what it should be used for (and 
what it shouldn’t). I applaud the contributors, whether coders or users. Thank 
you!

Finally, a note on backup. Backing up a distributed system is tough, but 
restores are even more complex (if you want no down-time, no extra disk space, 
point-in-time recovery, etc.). If you want to investigate why it is a tough 
problem for Cassandra, go look at RecoverX from Datos IO. They have solved many 
of the problems, but it isn’t an easy task. You could ask people to try to 
recreate all that, or just point them to a working solution. If backup and 
recovery are required (and I would argue they aren’t always required), a 
working solution is probably worth paying for.


Sean Durity
From: Josh McKenzie [mailto:jmcken...@apache.org]
Sent: Wednesday, February 21, 2018 11:28 AM
To: dev@cassandra.apache.org
Cc: User <u...@cassandra.apache.org>
Subject: [EXTERNAL] Re: Cassandra Needs to Grow Up by Version Five!

There's a disheartening amount of "here's where Cassandra is bad, and here's 
what it needs to do for me for free" happening in this thread.

This is open-source software. Everyone is *strongly encouraged* to submit a 
patch to move the needle on *any* of these things being complained about in 
this thread.

For the Apache Way <https://www.apache.org/foundation/governance/> to work, 
people need to step up and meaningfully contribute to a project to scratch 
their own itch instead of just waiting for a random corporation-subsidized 
engineer to happen to have interests that align with them and contribute that 
to the project.

Beating a dead horse for things everyone on the project knows are serious pain 
points is not productive.

On Wed, Feb 21, 2018 at 5:45 AM, Oleksandr Shulgin 
<oleksandr.shul...@zalando.de> wrote:
On Mon, Feb 19, 2018 at 10:01 AM, Kenneth Brotman 
<kenbrot...@yahoo.com.invalid> wrote:

>
> >> Cluster wide management should be a big theme in any next major release.
> >>
> >Na. Stability and testing should be a big theme in the next major release.
> >
>
> Double Na on that one, Jeff.  I think you have a concern there about the
> need to test sufficiently to ensure the stability of the next major
> release.  That makes perfect sense for every release, especially the
> major ones.  Continuous improvement is not a phase of development, for
> example.  CI should be in everything, in every phase.  Stability and
> testing are a part of every release, not just one.  A major release should
> be a nice step from the previous major release, though.
>

I guess what Jeff refers to is the tick-tock release cycle experiment,
which has proven to be a complete disaster by popular opinion.

There's also the "materialized views" feature which failed to materialize
in the end (pun intended) and had to be declared experimental retroactively.

Another prominent example is incremental repair, which was introduced as the
default option in 2.2 and is now not recommended because of the many corner
cases where it can fail.  So again, experimental as an afterthought.

Not to mention that even if you are aware that incremental is the default and
go with full repair instead, you're still in for a sad surprise:
anti-compaction will be triggered despite the "full" repair, because
anti-compaction is only disabled in the case of sub-range repair (don't ask
why).  So you need to use something advanced like Reaper if you want to
avoid that.  I don't think you'll ever find this in the documentation.

Honestly, for an eventually-consistent system like Cassandra, anti-entropy
repair is one of the most important pieces to get right.  And Cassandra
fails really badly on that one: the feature is not well designed, is
poorly implemented, and is under-documented.

In summary, IMO, Cassandra is a poor implementation of some good ideas.
It is a collection of hacks, not features.  They sometimes play together
accidentally, and rarely by design.

Regards,
--
Alex

