I ask Cassandra to be a database that is high-performance and highly
scalable, with no single point of failure. Anything "cool" added beyond
that must come only as a separate, optional ring around Cassandra and
must not get in the way of my usage.
Yes, I would like some help with some of what's listed here, but you
should understand that most shops adopting Cassandra are already going
to have DevOps/database management personnel, expertise, methods,
protocols and, in some instances, tools already in place. Even the small
shop I work in has guys saddled with taking care of Cassandra (I'm a
developer and not one of these guys), and they seem not to share these
concerns because they've already got them covered (like the specific
YAML configuration complaint).
If there were an option or two I'd like to see, one would be the ability
to duplicate data centers exactly (as part of what we stipulate when
creating our KEYSPACE), but this is probably something I want because of
what we were doing before we adopted Cassandra, or what we wanted from
it for our future product direction. I would also like to see an option
in Cassandra configuration for absolutely locking out access to certain
commands (like DROP TABLE, DROP INDEX and DELETE).
From my point of view as a developer, I've also had to do many of these
things for MongoDB, PostgreSQL, MySQL and other databases over my
career.
I'm not criticizing these concerns and suggestions. I'm just pointing
out that, in my opinion, not everything said here is in the realm of,
"duh, Cassandra needs to grow up."
There's so much right about Cassandra, from the great, unequaled
technology to the very liberal licensing model without which I could not
be here.
Russ Bateman
On 02/18/2018 10:39 PM, Kenneth Brotman wrote:
Cassandra feels like an unfinished program to me. The problem is not
that it’s open source or cutting edge. It’s an open source cutting
edge program that lacks some of its basic functionality. We are all
stuck addressing fundamental mechanical tasks for Cassandra because
the basic code that would do that part has not been contributed yet.
Ease of use issues need to be given much more attention. For an
administrator, the ease of use of Cassandra is very poor.
Furthermore, currently Cassandra is an idiot. We have to do
everything for Cassandra. Contrast that with the fact that we are in
the dawn of artificial intelligence.
Software exists to automate tasks for humans, not to mechanize humans
to administer tasks for a database. I’m an engineering type. My job is
to apply science and technology to solve real world problems. And
that’s where I need an organization’s I.T. talent to focus, not on
crank-starting an unfinished database.
For example, I should be able to go to any node, replace the
cassandra.yaml file and have a prompt on the display ask me if I want
to update all the yaml files across the cluster. I shouldn’t have to
manually modify yaml files on each node or have to create a script for
some third party automation tool to do it.
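
To make the yaml complaint concrete, here is a minimal Python sketch of
the first half of such a feature: working out which nodes have a
configuration that differs from a reference node. Nothing here is part
of Cassandra; the function names, the idea of passing the yaml text in
as a dict, and the list of ignored per-node keys are all assumptions
made for illustration. It compares lines rather than parsing YAML, so
it is only a rough check.

```python
# Hypothetical helper, not a Cassandra API: compare cassandra.yaml text
# collected from several nodes and report which nodes have drifted.

# Keys that legitimately differ per node (assumed list, adjust as needed).
PER_NODE_KEYS = ("listen_address", "rpc_address")


def normalized(text, ignore_keys=PER_NODE_KEYS):
    """Drop blank lines, comments and per-node keys before comparing."""
    kept = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        key = stripped.split(":", 1)[0]
        if key in ignore_keys:
            continue
        kept.append(line.rstrip())
    return "\n".join(kept)


def nodes_out_of_sync(configs, reference_node):
    """Return the sorted names of nodes whose config differs from the reference.

    configs maps a node name to the full text of its yaml file.
    """
    reference = normalized(configs[reference_node])
    return sorted(
        node for node, text in configs.items()
        if normalized(text) != reference
    )


if __name__ == "__main__":
    configs = {
        "node1": "cluster_name: Prod\nlisten_address: 10.0.0.1\n",
        "node2": "cluster_name: Prod\nlisten_address: 10.0.0.2\n",
        "node3": "cluster_name: Test\nlisten_address: 10.0.0.3\n",
    }
    print(nodes_out_of_sync(configs, "node1"))
```

How the yaml text is fetched from each node, and how an updated file is
pushed back, is left out on purpose; that part is exactly what today
falls to scripts or third party tools.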
I should not have to turn off service, clear directories, restart
service in coordination with the other nodes. It’s already a computer
system. It can do those things on its own.
How about read repair? First, there is something wrong with the name.
Maybe it should be called Consistency Repair. An administrator
shouldn’t have to do anything. It should be a behavior of Cassandra
that is programmed in. It should consider the GC setting of each node,
calculate how often it has to run repair, when it should run it so all
the nodes aren’t trying at the same time and when other circumstances
indicate it should also run it.
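
The scheduling asked for above can be sketched in a few lines of
Python. This is an illustration under stated assumptions, not anything
Cassandra does itself: it takes the "GC setting" to mean
gc_grace_seconds (default 864000 seconds, i.e. 10 days), and the
function name, the safety_factor parameter and the node names are
invented here. The idea is to fit one full repair cycle inside a
fraction of the grace period and give each node its own evenly spaced
start time, so that no two nodes start at the same moment.

```python
from datetime import datetime, timedelta

# Cassandra's default gc_grace_seconds (10 days).
DEFAULT_GC_GRACE_SECONDS = 864_000


def staggered_repair_schedule(nodes, gc_grace_seconds=DEFAULT_GC_GRACE_SECONDS,
                              safety_factor=0.5, start=None):
    """Return ({node: first_start_time}, cycle_length).

    Hypothetical helper. The whole cluster is repaired once per cycle,
    where cycle = gc_grace_seconds * safety_factor, and node start
    times are spread evenly across that cycle.
    """
    if not nodes:
        raise ValueError("at least one node is required")
    if not 0 < safety_factor <= 1:
        raise ValueError("safety_factor must be in (0, 1]")
    start = start or datetime.now()
    cycle = timedelta(seconds=gc_grace_seconds * safety_factor)
    slot = cycle / len(nodes)  # gap between consecutive node start times
    schedule = {node: start + i * slot for i, node in enumerate(nodes)}
    return schedule, cycle


if __name__ == "__main__":
    nodes = ["node1", "node2", "node3", "node4", "node5"]
    schedule, cycle = staggered_repair_schedule(nodes, start=datetime(2018, 2, 19))
    for node, when in schedule.items():
        print(node, when.isoformat())
    print("repeat every", cycle)
```

A real scheduler would also have to account for how long a repair
takes, for tables with different grace settings, and for the "other
circumstances" mentioned above, none of which this sketch attempts.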
Certificate management should be automated.
Cluster wide management should be a big theme in any next major
release. What is a major release? How many major releases can a
program have before all the coding for basic stuff like installation,
configuration and maintenance is included?
Finish the basic coding of Cassandra, make it easy to use for
administrators, make it smart, add cluster wide management. Keep
Cassandra competitive or it will soon be the old Model T we all
remember fondly.
I ask the Committee to compile a list of all such items, make a plan,
and commit to including the completed and tested code as part of major
release 5.0. I further ask that release 4.0 not be delayed and that
there then be an unusually short gap before version 5.0.
Kenneth Brotman