RE: Cassandra Needs to Grow Up by Version Five!

2018-02-19 Thread Kenneth Brotman
Jeff, you helped me figure out what I was missing.  It just took me a day to 
digest what you wrote.  I’m coming over from another type of engineering.  I 
didn’t know and it’s not really documented.  Cassandra runs in a data center.  
Nowadays that means the nodes are going to be in managed containers, Docker 
containers, managed by Kubernetes, Mesos or something, and for that reason 
anyone operating Cassandra in a real world setting would not encounter the 
issues I raised in the way I described.

 

Shouldn’t the architectural diagrams people reference indicate that in some 
way?  That would have helped me.

 

Kenneth Brotman

 

From: Kenneth Brotman [mailto:kenbrot...@yahoo.com] 
Sent: Monday, February 19, 2018 10:43 AM
To: 'u...@cassandra.apache.org'
Cc: 'dev@cassandra.apache.org'
Subject: RE: Cassandra Needs to Grow Up by Version Five!

 

Well said.  Very fair.  I wouldn’t mind hearing from others still.  You’re a 
good guy!

 

Kenneth Brotman

 

From: Jeff Jirsa [mailto:jji...@gmail.com] 
Sent: Monday, February 19, 2018 9:10 AM
To: cassandra
Cc: Cassandra DEV
Subject: Re: Cassandra Needs to Grow Up by Version Five!

 

There's a lot of things below I disagree with, but it's ok. I convinced myself 
not to nit-pick every point.

 

https://issues.apache.org/jira/browse/CASSANDRA-13971 has some of Stefan's work 
with cert management

 

Beyond that, I encourage you to do what Michael suggested: open JIRAs for 
things you care strongly about, work on them if you have time. Sometime this 
year we'll schedule a NGCC (Next Generation Cassandra Conference) where we talk 
about future project work and direction, I encourage you to attend if you're 
able (I encourage anyone who cares about the direction of Cassandra to attend, 
it'll probably be either free or very low cost, just to cover a venue and some 
food). If nothing else, you'll meet some of the teams who are working on the 
project, and learn why they've selected the projects on which they're working. 
You'll have an opportunity to pitch your vision, and maybe you can talk some 
folks into helping out. 

 

- Jeff

On Mon, Feb 19, 2018 at 1:01 AM, Kenneth Brotman  
wrote:

Comments inline

>-Original Message-
>From: Jeff Jirsa [mailto:jji...@gmail.com]
>Sent: Sunday, February 18, 2018 10:58 PM
>To: u...@cassandra.apache.org
>Cc: dev@cassandra.apache.org
>Subject: Re: Cassandra Needs to Grow Up by Version Five!
>
>Comments inline
>
>
>> On Feb 18, 2018, at 9:39 PM, Kenneth Brotman  
>> wrote:
>>
> >Cassandra feels like an unfinished program to me. The problem is not that 
> >it’s open source or cutting edge.  It’s an open source cutting edge program 
> >that lacks some of its basic functionality.  We are all stuck addressing 
> >fundamental mechanical tasks for Cassandra because the basic code that would 
> >do that part has not been contributed yet.
>>
>There’s probably 2-3 reasons why here:
>
>1) Historically the pmc has tried to keep the scope of the project very 
>narrow. It’s a database. We don’t ship drivers. We don’t ship developer tools. 
>We don’t ship fancy UIs. We ship a database. I think for the most part the 
>narrow vision has been for the best, but maybe it’s time to reconsider some of 
>the scope.
>
>Postgres will autovacuum to prevent wraparound (hopefully),  but everyone I 
>know running Postgres uses flexible-freeze in cron - sometimes it’s ok to let 
>the database have its opinions and let third party tools fill in the gaps.
>

I can appreciate the desire to stay in scope.  I believe usability is king. 
 When users have to learn the database, then learn what they have to automate, 
then learn an automation tool and then use the automation tool to do something 
that is as fundamental as the fundamental tasks I described, then something is 
missing from the database itself that is adversely affecting usability - and 
that is very bad.  Where those big companies need to calculate the ROI is in 
the cost of acquiring or training the next group of users.  Consider how steep 
the learning curve is for new users.  Consider the business case for improving 
ease of use.

>2) Cassandra is, by definition, a database for large scale problems. Most of 
>the companies working on/with it tend to be big companies. Big companies often 
>have pre-existing automation that solved the stuff you consider fundamental 
>tasks, so there’s probably nobody actively working on the solved problems that 
>you may consider missing features - for many people they’re already solved.
>

I could be wrong but it sounds like a lot of the code work is done, and if the 
companies would take the time to contribute more code, then the rest of the 
code needed could be generated easily.

>3) It’s not nearly as basic as you think it is. Datastax seemingly had a 
>multi-person team on opscenter, and while it was better than anything else 
>around last time I used it (before it stopped 

[RELEASE] Apache Cassandra 3.11.2 released - PLEASE READ NOTICE

2018-02-19 Thread Michael Shuler
PLEASE READ: MAXIMUM TTL EXPIRATION DATE NOTICE (CASSANDRA-14092)
--

The maximum expiration timestamp that can be represented by the storage
engine is 2038-01-19T03:14:06+00:00, which means that inserts with TTL
that expire after this date are not currently supported. By default,
INSERTS with TTL exceeding the maximum supported date are rejected, but
it's possible to choose a different expiration overflow policy. See
CASSANDRA-14092.txt for more details.

Prior to 3.0.16 (3.0.X) and 3.11.2 (3.11.x) there was no protection
against INSERTS with TTL expiring after the maximum supported date,
causing the expiration time field to overflow and the records to expire
immediately. Clusters in the 2.X and lower series are not subject to
this when assertions are enabled. Backed up SSTables can be potentially
recovered and recovery instructions can be found on the
CASSANDRA-14092.txt file.

If you use or plan to use very large TTLs (10 to 20 years), read
CASSANDRA-14092.txt for more information.
--
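
The arithmetic behind the notice above can be sketched in a few lines.
This is an illustration only, not Cassandra code: it assumes the expiration
time is kept as a signed 32-bit count of epoch seconds whose largest usable
value is 2**31 - 2, and it treats 20 years as 20 * 365 days. The helper
names (expiration_ts, wrap_int32, exceeds_max) are hypothetical.

```python
from datetime import datetime, timezone

# Assumption: expiration times are signed 32-bit epoch seconds, and the
# largest usable value is 2**31 - 2, i.e. 2038-01-19T03:14:06+00:00.
MAX_EXPIRATION_TS = 2**31 - 2
TWENTY_YEARS = 20 * 365 * 24 * 60 * 60  # 20 years in seconds, ignoring leap days


def expiration_ts(now_ts: int, ttl_seconds: int) -> int:
    """Expiration time as an unbounded integer: write time plus TTL."""
    return now_ts + ttl_seconds


def wrap_int32(value: int) -> int:
    """Reinterpret an integer as a signed 32-bit value (two's complement)."""
    return (value + 2**31) % 2**32 - 2**31


def exceeds_max(now_ts: int, ttl_seconds: int) -> bool:
    """True when the expiration time lies past the maximum supported date."""
    return expiration_ts(now_ts, ttl_seconds) > MAX_EXPIRATION_TS


# A write on the date of this announcement with a 20-year TTL.
now = int(datetime(2018, 2, 19, tzinfo=timezone.utc).timestamp())

print(datetime.fromtimestamp(MAX_EXPIRATION_TS, tz=timezone.utc).isoformat())
print(exceeds_max(now, TWENTY_YEARS))
# Without a guard, the wrapped value is negative, i.e. already in the past.
print(wrap_int32(expiration_ts(now, TWENTY_YEARS)) < now)
```

Under these assumptions a 20-year TTL written today lands past the limit,
and the unguarded 32-bit value wraps to a time in the past, which matches
the "expire immediately" behavior described in the notice.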

The Cassandra team is pleased to announce the release of Apache
Cassandra version 3.11.2.

Apache Cassandra is a fully distributed database. It is the right choice
when you need scalability and high availability without compromising
performance.

 http://cassandra.apache.org/

Downloads of source and binary distributions are listed in our download
section:

 http://cassandra.apache.org/download/

This version is a bug fix release[1] on the 3.11 series. As always,
please pay attention to the release notes[2] and let us know[3] if you
encounter any problems.

Enjoy!

[1]: (CHANGES.txt) https://goo.gl/mQjYnb
[2]: (NEWS.txt) https://goo.gl/NJGdhu
[3]: https://issues.apache.org/jira/browse/CASSANDRA





[RELEASE] Apache Cassandra 3.0.16 released - PLEASE READ NOTICE

2018-02-19 Thread Michael Shuler
PLEASE READ: MAXIMUM TTL EXPIRATION DATE NOTICE (CASSANDRA-14092)
--

The maximum expiration timestamp that can be represented by the storage
engine is 2038-01-19T03:14:06+00:00, which means that inserts with TTL
that expire after this date are not currently supported. By default,
INSERTS with TTL exceeding the maximum supported date are rejected, but
it's possible to choose a different expiration overflow policy. See
CASSANDRA-14092.txt for more details.

Prior to 3.0.16 (3.0.X) and 3.11.2 (3.11.x) there was no protection
against INSERTS with TTL expiring after the maximum supported date,
causing the expiration time field to overflow and the records to expire
immediately. Clusters in the 2.X and lower series are not subject to
this when assertions are enabled. Backed up SSTables can be potentially
recovered and recovery instructions can be found on the
CASSANDRA-14092.txt file.

If you use or plan to use very large TTLs (10 to 20 years), read
CASSANDRA-14092.txt for more information.
--

The Cassandra team is pleased to announce the release of Apache
Cassandra version 3.0.16.

Apache Cassandra is a fully distributed database. It is the right choice
when you need scalability and high availability without compromising
performance.

 http://cassandra.apache.org/

Downloads of source and binary distributions are listed in our download
section:

 http://cassandra.apache.org/download/

This version is a bug fix release[1] on the 3.0 series. As always,
please pay attention to the release notes[2] and let us know[3] if you
encounter any problems.

Enjoy!

[1]: (CHANGES.txt) https://goo.gl/ST7ij6
[2]: (NEWS.txt) https://goo.gl/Ek5hve
[3]: https://issues.apache.org/jira/browse/CASSANDRA





RE: Cassandra Needs to Grow Up by Version Five!

2018-02-19 Thread Kenneth Brotman
Well said.  Very fair.  I wouldn’t mind hearing from others still.  You’re a 
good guy!

 

Kenneth Brotman

 

From: Jeff Jirsa [mailto:jji...@gmail.com] 
Sent: Monday, February 19, 2018 9:10 AM
To: cassandra
Cc: Cassandra DEV
Subject: Re: Cassandra Needs to Grow Up by Version Five!

 

There's a lot of things below I disagree with, but it's ok. I convinced myself 
not to nit-pick every point.

 

https://issues.apache.org/jira/browse/CASSANDRA-13971 has some of Stefan's work 
with cert management

 

Beyond that, I encourage you to do what Michael suggested: open JIRAs for 
things you care strongly about, work on them if you have time. Sometime this 
year we'll schedule a NGCC (Next Generation Cassandra Conference) where we talk 
about future project work and direction, I encourage you to attend if you're 
able (I encourage anyone who cares about the direction of Cassandra to attend, 
it'll probably be either free or very low cost, just to cover a venue and some 
food). If nothing else, you'll meet some of the teams who are working on the 
project, and learn why they've selected the projects on which they're working. 
You'll have an opportunity to pitch your vision, and maybe you can talk some 
folks into helping out. 

 

- Jeff

On Mon, Feb 19, 2018 at 1:01 AM, Kenneth Brotman  
wrote:

Comments inline

>-Original Message-
>From: Jeff Jirsa [mailto:jji...@gmail.com]
>Sent: Sunday, February 18, 2018 10:58 PM
>To: u...@cassandra.apache.org
>Cc: dev@cassandra.apache.org
>Subject: Re: Cassandra Needs to Grow Up by Version Five!
>
>Comments inline
>
>
>> On Feb 18, 2018, at 9:39 PM, Kenneth Brotman  
>> wrote:
>>
> >Cassandra feels like an unfinished program to me. The problem is not that 
> >it’s open source or cutting edge.  It’s an open source cutting edge program 
> >that lacks some of its basic functionality.  We are all stuck addressing 
> >fundamental mechanical tasks for Cassandra because the basic code that would 
> >do that part has not been contributed yet.
>>
>There’s probably 2-3 reasons why here:
>
>1) Historically the pmc has tried to keep the scope of the project very 
>narrow. It’s a database. We don’t ship drivers. We don’t ship developer tools. 
>We don’t ship fancy UIs. We ship a database. I think for the most part the 
>narrow vision has been for the best, but maybe it’s time to reconsider some of 
>the scope.
>
>Postgres will autovacuum to prevent wraparound (hopefully),  but everyone I 
>know running Postgres uses flexible-freeze in cron - sometimes it’s ok to let 
>the database have its opinions and let third party tools fill in the gaps.
>

I can appreciate the desire to stay in scope.  I believe usability is king. 
 When users have to learn the database, then learn what they have to automate, 
then learn an automation tool and then use the automation tool to do something 
that is as fundamental as the fundamental tasks I described, then something is 
missing from the database itself that is adversely affecting usability - and 
that is very bad.  Where those big companies need to calculate the ROI is in 
the cost of acquiring or training the next group of users.  Consider how steep 
the learning curve is for new users.  Consider the business case for improving 
ease of use.

>2) Cassandra is, by definition, a database for large scale problems. Most of 
>the companies working on/with it tend to be big companies. Big companies often 
>have pre-existing automation that solved the stuff you consider fundamental 
>tasks, so there’s probably nobody actively working on the solved problems that 
>you may consider missing features - for many people they’re already solved.
>

I could be wrong but it sounds like a lot of the code work is done, and if the 
companies would take the time to contribute more code, then the rest of the 
code needed could be generated easily.

>3) It’s not nearly as basic as you think it is. Datastax seemingly had a 
>multi-person team on opscenter, and while it was better than anything else 
>around last time I used it (before it stopped supporting the OSS version), it 
>left a lot to be desired. It’s probably 2-3 engineers working for a month  to 
>have any sort of meaningful, reliable, mostly trivial cluster-managing UI, and 
>I can think of about 10 JIRAs I’d rather see that time be spent on first.

How about 6-9 engineers working 12 months a year on it then.  I'm not kidding.  
For a big company with revenues in the tens of billions or more, and a heavy 
use of Cassandra nodes, it's easy to make a case for having a full time person 
or more that involved.  They aren't paying for using the open source code that 
is Cassandra.  Let's see, what would the licensing fees be for a big company if 
the costs were like what Microsoft or Oracle would charge for their enterprise 
level relational database?   What's the contribution of one or two people in 
comparison?

>> Ease of use issues need 

[VOTE PASSED] (Take 3) Release Apache Cassandra 3.11.2

2018-02-19 Thread Michael Shuler
I count 6 binding +1, 1 non-binding +1, and no other votes for this
release of 3.11.2. I will get the artifacts uploaded shortly.

-- 
Kind regards,
Michael

On 02/14/2018 03:09 PM, Michael Shuler wrote:
> I propose the following artifacts for release as 3.11.2.
> 
> sha1: 1d506f9d09c880ff2b2693e3e27fa58c02ecf398
> Git:
> http://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=shortlog;h=refs/tags/3.11.2-tentative
> Artifacts:
> https://repository.apache.org/content/repositories/orgapachecassandra-1158/org/apache/cassandra/apache-cassandra/3.11.2/
> Staging repository:
> https://repository.apache.org/content/repositories/orgapachecassandra-1158/
> 
> Debian and RPM packages are available here:
> http://people.apache.org/~mshuler
> 
> *** This release addresses an important fix for CASSANDRA-14092 ***
> "Max ttl of 20 years will overflow localDeletionTime"
> https://issues.apache.org/jira/browse/CASSANDRA-14092
> 
> The vote will be open for 72 hours (longer if needed).
> 
> [1]: (CHANGES.txt) https://goo.gl/RLZLrR
> [2]: (NEWS.txt) https://goo.gl/kpnVHp
> 






[VOTE PASSED] (Take 2) Release Apache Cassandra 3.0.16

2018-02-19 Thread Michael Shuler
With 5 binding +1, 2 non-binding +1, and no other votes, this release
has passed. I will get 3.0.16 uploaded as soon as I can!

-- 
Kind regards,
Michael

On 02/14/2018 02:40 PM, Michael Shuler wrote:
> I propose the following artifacts for release as 3.0.16.
> 
> sha1: 890f319142ddd3cf2692ff45ff28e71001365e96
> Git:
> http://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=shortlog;h=refs/tags/3.0.16-tentative
> Artifacts:
> https://repository.apache.org/content/repositories/orgapachecassandra-1157/org/apache/cassandra/apache-cassandra/3.0.16/
> Staging repository:
> https://repository.apache.org/content/repositories/orgapachecassandra-1157/
> 
> Debian and RPM packages are available here:
> http://people.apache.org/~mshuler
> 
> *** This release addresses an important fix for CASSANDRA-14092 ***
> "Max ttl of 20 years will overflow localDeletionTime"
> https://issues.apache.org/jira/browse/CASSANDRA-14092
> 
> The vote will be open for 72 hours (longer if needed).
> 
> [1]: (CHANGES.txt) https://goo.gl/rLj59Z
> [2]: (NEWS.txt) https://goo.gl/EkrT4G
> 






Re: Cassandra Needs to Grow Up by Version Five!

2018-02-19 Thread Jeff Jirsa
There's a lot of things below I disagree with, but it's ok. I convinced
myself not to nit-pick every point.

https://issues.apache.org/jira/browse/CASSANDRA-13971 has some of Stefan's
work with cert management

Beyond that, I encourage you to do what Michael suggested: open JIRAs for
things you care strongly about, work on them if you have time. Sometime
this year we'll schedule a NGCC (Next Generation Cassandra Conference)
where we talk about future project work and direction, I encourage you to
attend if you're able (I encourage anyone who cares about the direction of
Cassandra to attend, it'll probably be either free or very low cost, just to
cover a venue and some food). If nothing else, you'll meet some of the
teams who are working on the project, and learn why they've selected the
projects on which they're working. You'll have an opportunity to pitch your
vision, and maybe you can talk some folks into helping out.

- Jeff




On Mon, Feb 19, 2018 at 1:01 AM, Kenneth Brotman <
kenbrot...@yahoo.com.invalid> wrote:

> Comments inline
>
> >-Original Message-
> >From: Jeff Jirsa [mailto:jji...@gmail.com]
> >Sent: Sunday, February 18, 2018 10:58 PM
> >To: u...@cassandra.apache.org
> >Cc: dev@cassandra.apache.org
> >Subject: Re: Cassandra Needs to Grow Up by Version Five!
> >
> >Comments inline
> >
> >
> >> On Feb 18, 2018, at 9:39 PM, Kenneth Brotman
>  wrote:
> >>
> > >Cassandra feels like an unfinished program to me. The problem is not
> that it’s open source or cutting edge.  It’s an open source cutting edge
> program that lacks some of its basic functionality.  We are all stuck
> addressing fundamental mechanical tasks for Cassandra because the basic
> code that would do that part has not been contributed yet.
> >>
> >There’s probably 2-3 reasons why here:
> >
> >1) Historically the pmc has tried to keep the scope of the project very
> narrow. It’s a database. We don’t ship drivers. We don’t ship developer
> tools. We don’t ship fancy UIs. We ship a database. I think for the most
> part the narrow vision has been for the best, but maybe it’s time to
> reconsider some of the scope.
> >
> >Postgres will autovacuum to prevent wraparound (hopefully),  but everyone
> I know running Postgres uses flexible-freeze in cron - sometimes it’s ok to
> let the database have its opinions and let third party tools fill in the
> gaps.
> >
>
> I can appreciate the desire to stay in scope.  I believe usability is
> king.  When users have to learn the database, then learn what they have to
> automate, then learn an automation tool and then use the automation tool to
> do something that is as fundamental as the fundamental tasks I described,
> then something is missing from the database itself that is adversely
> affecting usability - and that is very bad.  Where those big companies need
> to calculate the ROI is in the cost of acquiring or training the next group
> of users.  Consider how steep the learning curve is for new users.
> Consider the business case for improving ease of use.
>
> >2) Cassandra is, by definition, a database for large scale problems. Most
> of the companies working on/with it tend to be big companies. Big companies
> often have pre-existing automation that solved the stuff you consider
> fundamental tasks, so there’s probably nobody actively working on the
> solved problems that you may consider missing features - for many people
> they’re already solved.
> >
>
> I could be wrong but it sounds like a lot of the code work is done, and if
> the companies would take the time to contribute more code, then the rest of
> the code needed could be generated easily.
>
> >3) It’s not nearly as basic as you think it is. Datastax seemingly had a
> multi-person team on opscenter, and while it was better than anything else
> around last time I used it (before it stopped supporting the OSS version),
> it left a lot to be desired. It’s probably 2-3 engineers working for a
> month  to have any sort of meaningful, reliable, mostly trivial
> cluster-managing UI, and I can think of about 10 JIRAs I’d rather see that
> time be spent on first.
>
> How about 6-9 engineers working 12 months a year on it then.  I'm not
> kidding.  For a big company with revenues in the tens of billions or more,
> and a heavy use of Cassandra nodes, it's easy to make a case for having a
> full time person or more that involved.  They aren't paying for using the
> open source code that is Cassandra.  Let's see, what would the licensing
> fees be for a big company if the costs were like what Microsoft or Oracle
> would charge for their enterprise level relational database?   What's the
> contribution of one or two people in comparison?
>
> >> Ease of use issues need to be given much more attention.  For an
> administrator, the ease of use of Cassandra is very poor.
> >>
> >>Furthermore, currently Cassandra is an idiot.  We have to do everything
> for Cassandra. Contrast that with the fact that we are in 

Re: Cassandra Needs to Grow Up by Version Five!

2018-02-19 Thread Michael Kjellman
the things you are asking for are unfortunately not a tiny effort. as you don’t 
seem to have the time to contribute code, the best way you can personally create 
change would be (again) to file individual jiras for each enhancement or 
feature request.

highlight key ones you filed via the mailing list that you’d personally like to 
see prioritized - and advocate to have resources allocated towards implementing 
and ultimately get those scheduled for a release over other ones.

best,
kjellman

> On Feb 18, 2018, at 11:07 PM, Kenneth Brotman  
> wrote:
> 
> Hi Michael, actually I do very much like the database.  thanks for the 
> thoughts... a few comments:
> 
> 1) Lots of big companies like, let's see, Apple is a big one, probably could 
> easily justify contributing resources to finish up the basic development of 
> Cassandra. 
> 2) There are lots of big companies using Cassandra.  Each could contribute a 
> tiny effort and everyone would benefit greatly.
> 3) A focused effort by a small group of talented people like there are in 
> this group could knock it out easily.
> 4) Not everyone is a Cassandra coder.  It's not for me to do Michael.
> 5) I'm an individual.  I am not working at a big company at the moment 
> Michael.  
> 
> Best,
> Kenneth Brotman
> 
> 
> -Original Message-
> From: Michael Kjellman [mailto:kjell...@apple.com] 
> Sent: Sunday, February 18, 2018 10:18 PM
> To: dev@cassandra.apache.org
> Subject: Re: Cassandra Needs to Grow Up by Version Five!
> 
> hi ken, sorry you don’t like the database. some thoughts:
> 
> 1) please file actionable jiras for places you feel need to be improved in 
> the database... this is the best way to make and encourage the change you’re 
> looking for. it seems you have quite a few ideas from your post that could be 
> broken down into individual actionable jiras.
> 2) please don’t cross post between mailing lists.
> 3) pull requests are always welcomed!
> 
> best,
> kjellman
> 
>> On Feb 18, 2018, at 9:39 PM, Kenneth Brotman  
>> wrote:
>> 
>> Cassandra feels like an unfinished program to me.  The problem is not 
>> that it's open source or cutting edge.  It's an open source cutting 
>> edge program that lacks some of its basic functionality.  We are all 
>> stuck addressing fundamental mechanical tasks for Cassandra because 
>> the basic code that would do that part has not been contributed yet.
>> 
>> Ease of use issues need to be given much more attention.  For an 
>> administrator, the ease of use of Cassandra is very poor.
>> 
>> Furthermore, currently Cassandra is an idiot.  We have to do 
>> everything for Cassandra. Contrast that with the fact that we are in 
>> the dawn of artificial intelligence.
>> 
>> Software exists to automate tasks for humans, not mechanize humans to 
>> administer tasks for a database.  I'm an engineering type.  My job is 
>> to apply science and technology to solve real world problems.  And 
>> that's where I need an organization's I.T. talent to focus; not in 
>> crank starting an unfinished database.
>> 
>> For example, I should be able to go to any node, replace the 
>> Cassandra.yaml file and have a prompt on the display ask me if I want 
>> to update all the yaml files across the cluster.  I shouldn't have to 
>> manually modify yaml files on each node or have to create a script for 
>> some third party automation tool to do it.
>> 
>> I should not have to turn off service, clear directories, restart 
>> service in coordination with the other nodes.  It's already a computer 
>> system.  It can do those things on its own.
>> 
>> How about read repair.  First there is something wrong with the name.  
>> Maybe it should be called Consistency Repair.  An administrator 
>> shouldn't have to do anything.  It should be a behavior of Cassandra 
>> that is programmed in. It should consider the GC setting of each node, 
>> calculate how often it has to run repair, when it should run it so all 
>> the nodes aren't trying at the same time and when other circumstances 
>> indicate it should also run it.
>> 
>> Certificate management should be automated.
>> 
>> Cluster wide management should be a big theme in any next major release.
>> What is a major release?  How many major releases could a program have 
>> before all the coding for basic stuff like installation, configuration 
>> and maintenance is included!
>> 
>> Finish the basic coding of Cassandra, make it easy to use for 
>> administrators, make it smart, add cluster wide management.  Keep 
>> Cassandra competitive or it will soon be the old Model T we all remember 
>> fondly.
>> 
>> I ask the Committee to compile a list of all such items, make a plan, 
>> and commit to including the completed and tested code as part of major 
>> release 5.0.  I further ask that release 4.0 not be delayed and then 
>> there be an unusually short skip to version 5.0.
>> 
>> Kenneth Brotman
>> 
> 
> 
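
The repair scheduling idea in the original post (spread the runs so the
nodes are not all repairing at once, and finish well inside the GC grace
window) can be sketched roughly as below. This is a hypothetical
illustration, not something Cassandra ships: stagger_repairs, safety_factor
and the node names are made up, and 864000 seconds (10 days) is used as the
gc_grace_seconds value.

```python
from datetime import timedelta


def stagger_repairs(nodes, gc_grace_seconds, safety_factor=0.5):
    """Return {node: start offset} spreading repairs over one cycle.

    The cycle is a fraction (safety_factor, hypothetical) of the GC grace
    window, so every node starts its repair well before the window ends.
    """
    if not nodes:
        return {}
    cycle = timedelta(seconds=gc_grace_seconds * safety_factor)
    slot = cycle / len(nodes)  # one evenly sized slot per node
    return {node: slot * index for index, node in enumerate(nodes)}


# Four made-up nodes and a 10-day grace window.
schedule = stagger_repairs(
    ["node1", "node2", "node3", "node4"], gc_grace_seconds=864000
)
for node, offset in schedule.items():
    print(node, "starts repair at +", offset)
```

A real scheduler would also need to account for how long each repair takes,
token ranges, and cluster load, which is part of why the replies in this
thread describe the work as non-trivial.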

RE: Cassandra Needs to Grow Up by Version Five!

2018-02-19 Thread Kenneth Brotman
Comments inline

>-Original Message-
>From: Jeff Jirsa [mailto:jji...@gmail.com] 
>Sent: Sunday, February 18, 2018 10:58 PM
>To: u...@cassandra.apache.org
>Cc: dev@cassandra.apache.org
>Subject: Re: Cassandra Needs to Grow Up by Version Five!
>
>Comments inline 
>
>
>> On Feb 18, 2018, at 9:39 PM, Kenneth Brotman  
>> wrote:
>>
> >Cassandra feels like an unfinished program to me. The problem is not that 
> >it’s open source or cutting edge.  It’s an open source cutting edge program 
> >that lacks some of its basic functionality.  We are all stuck addressing 
> >fundamental mechanical tasks for Cassandra because the basic code that would 
> >do that part has not been contributed yet.
>> 
>There’s probably 2-3 reasons why here:
>
>1) Historically the pmc has tried to keep the scope of the project very 
>narrow. It’s a database. We don’t ship drivers. We don’t ship developer tools. 
>We don’t ship fancy UIs. We ship a database. I think for the most part the 
>narrow vision has been for the best, but maybe it’s time to reconsider some of 
>the scope. 
>
>Postgres will autovacuum to prevent wraparound (hopefully),  but everyone I 
>know running Postgres uses flexible-freeze in cron - sometimes it’s ok to let 
>the database have its opinions and let third party tools fill in the gaps.
>

I can appreciate the desire to stay in scope.  I believe usability is king. 
 When users have to learn the database, then learn what they have to automate, 
then learn an automation tool and then use the automation tool to do something 
that is as fundamental as the fundamental tasks I described, then something is 
missing from the database itself that is adversely affecting usability - and 
that is very bad.  Where those big companies need to calculate the ROI is in 
the cost of acquiring or training the next group of users.  Consider how steep 
the learning curve is for new users.  Consider the business case for improving 
ease of use. 

>2) Cassandra is, by definition, a database for large scale problems. Most of 
>the companies working on/with it tend to be big companies. Big companies often 
>have pre-existing automation that solved the stuff you consider fundamental 
>tasks, so there’s probably nobody actively working on the solved problems that 
>you may consider missing features - for many people they’re already solved.
>

I could be wrong but it sounds like a lot of the code work is done, and if the 
companies would take the time to contribute more code, then the rest of the 
code needed could be generated easily.

>3) It’s not nearly as basic as you think it is. Datastax seemingly had a 
>multi-person team on opscenter, and while it was better than anything else 
>around last time I used it (before it stopped supporting the OSS version), it 
>left a lot to be desired. It’s probably 2-3 engineers working for a month  to 
>have any sort of meaningful, reliable, mostly trivial cluster-managing UI, and 
>I can think of about 10 JIRAs I’d rather see that time be spent on first. 

How about 6-9 engineers working 12 months a year on it then.  I'm not kidding.  
For a big company with revenues in the tens of billions or more, and a heavy 
use of Cassandra nodes, it's easy to make a case for having a full time person 
or more that involved.  They aren't paying for using the open source code that 
is Cassandra.  Let's see, what would the licensing fees be for a big company if 
the costs were like what Microsoft or Oracle would charge for their enterprise 
level relational database?   What's the contribution of one or two people in 
comparison?

>> Ease of use issues need to be given much more attention.  For an 
>> administrator, the ease of use of Cassandra is very poor. 
>>
>>Furthermore, currently Cassandra is an idiot.  We have to do everything for 
>>Cassandra. Contrast that with the fact that we are in the dawn of artificial 
>>intelligence.
>> 
>
>And for everything you think is obvious, there’s a 50% chance someone else 
>will have already solved it differently, and your obvious new solution will be 
>seen as an inconvenient assumption and complexity they won’t appreciate. Open 
>source projects get to walk a fine line of trying to be useful without making 
>too many assumptions, being “too” opinionated, or overstepping bounds. We may 
>be too conservative, but it’s very easy to go too far in the opposite 
>direction. 
>

I appreciate that but when such concerns result in inaction instead of 
resolution that is no good.

>> Software exists to automate tasks for humans, not mechanize humans to 
>> administer tasks for a database.  I’m an engineering type.  My job is to 
>> apply science and technology to solve real world problems.  And that’s where 
>> I need an organization’s I.T. talent to focus; not in crank starting an 
>> unfinished database.
>> 
>
>And that’s why nobody’s done it - we all have bigger problems we’re being paid 
>to solve, and nobody’s felt it necessary. Because it’s