Re: Cassandra Needs to Grow Up by Version Five!

2018-02-21 Thread Oleksandr Shulgin
On Wed, Feb 21, 2018 at 7:54 PM, Durity, Sean R  wrote:

>
>
> However, I think the shots at Cassandra are generally unfair. When I
> started working with it, the DataStax documentation was some of the best
> documentation I had seen on any project, especially an open source one.
>

Oh, don't get me started on documentation, especially the DataStax one.  I
come from Postgres.  In comparison, Cassandra documentation is mostly
non-existent (and this is just a way to avoid listing other uncomfortable
epithets).

Not sure if I would be able to submit patches to improve that, however,
since most of the time it would require me to already know the answer to my
questions when the doc is incomplete.

The move from DataStax to Apache.org for docs is actually good, IMO, since
the docs were maintained very poorly and there was no real leverage to
influence that.

Cheers,
--
Alex


Re: Cassandra Needs to Grow Up by Version Five!

2018-02-21 Thread Michael Kjellman
Please do send them! There was a *lot* of really great, hard work by a lot of 
people over the past year to significantly improve the documentation in tree.

http://cassandra.apache.org/doc/latest/
https://github.com/apache/cassandra/tree/trunk/doc

I still haven't seen a reply from you re: my request for your Jira information, so 
I'm unable to follow what issues you're referring to, as you haven't linked to 
any in your emails either. If you still see holes in the new and improved 
documentation above, _please_ do create tickets to track that so we can improve 
it ASAP! A fresh set of eyes on areas not covered is obviously welcome, 
especially where they overlap with the links you're referring to in your email.

best,
kjellman

On Feb 21, 2018, at 4:13 PM, Kenneth Brotman <kenbrot...@yahoo.com.INVALID> wrote:



Jeff,



I already addressed everything you said.  Boy! Would I like to bring up the out 
of date articles on the web that trip people up and the lousy documentation on 
the Apache website but I can’t because a lot of folks don’t know me or why I’m 
saying these things.



I will be making another post that I hope clarifies what’s going on with me.  
After that I will either be a freakishly valuable asset to this community or I 
will be a freakishly valuable asset to another community.



You sure have a funny way of reining in people that are used to helping out.  
You sure misjudged me.  Wow.



Kenneth Brotman



From: Jeff Jirsa [mailto:jji...@gmail.com]
Sent: Wednesday, February 21, 2018 3:12 PM
To: cassandra
Cc: Cassandra DEV
Subject: Re: Cassandra Needs to Grow Up by Version Five!





On Wed, Feb 21, 2018 at 2:53 PM, Kenneth Brotman <kenbrot...@yahoo.com.invalid> wrote:

Hi Akash,

I get the part about outside work which is why in replying to Jeff Jirsa I was 
suggesting the big companies could justify taking it on easy enough and you 
know actually pay the people who would be working at it so those people could 
have a life.

The part I don't get is the aversion to usability.  Isn't that what you think 
about when you are coding?  "Am I making this thing I'm building easy to use?"  
If you were programming for me, we would be constantly talking about what we 
are building and how we can make things easier for users.  If I had to fight 
with a developer, architect or engineer about usability all the time, they 
would be gone and quick.  How do you approach programming if you aren't trying to 
make things easy?





There's no aversion to usability, you're assuming things that just aren't true. 
Nobody's against usability, we've just prioritized other things HIGHER. We make 
those decisions in part by looking at open JIRAs and determining what's asked 
for the most, what members of the community have contributed, and then balance 
that against what we ourselves care about. You're making a statement that it 
should be the top priority for the next release, with no JIRA, no history of 
contributing (and indeed, no real clear sign that you even understand the full 
extent of the database), no sign that you're willing to do the work yourself, 
and making a ton of assumptions about the level of effort and ROI.



I would love for Cassandra to be easier to use, I'm sure everyone does. There's 
a dozen features I'd love to add if I had infinite budget and infinite 
manpower. But what you're asking for is A LOT of effort and / or A LOT of 
money, and you're assuming someone's going to step up and foot the bill, but 
there's no real reason to believe that's the case.



In the meantime, everyone's spending hours replying to this thread that is 0% 
actionable. We would all have been objectively better off had everyone ignored 
this thread and just spent 10 minutes writing some section of the docs. So the 
next time I get the urge to reply, I'm just going to do that instead.










RE: Cassandra Needs to Grow Up by Version Five!

2018-02-21 Thread Kenneth Brotman
 

Jeff,

 

I already addressed everything you said.  Boy! Would I like to bring up the out 
of date articles on the web that trip people up and the lousy documentation on 
the Apache website but I can’t because a lot of folks don’t know me or why I’m 
saying these things.  

 

I will be making another post that I hope clarifies what’s going on with me.  
After that I will either be a freakishly valuable asset to this community or I 
will be a freakishly valuable asset to another community.  

 

You sure have a funny way of reining in people that are used to helping out.  
You sure misjudged me.  Wow.

 

Kenneth Brotman

 

From: Jeff Jirsa [mailto:jji...@gmail.com] 
Sent: Wednesday, February 21, 2018 3:12 PM
To: cassandra
Cc: Cassandra DEV
Subject: Re: Cassandra Needs to Grow Up by Version Five!

 

 

On Wed, Feb 21, 2018 at 2:53 PM, Kenneth Brotman wrote:

Hi Akash,

I get the part about outside work which is why in replying to Jeff Jirsa I was 
suggesting the big companies could justify taking it on easy enough and you 
know actually pay the people who would be working at it so those people could 
have a life.

The part I don't get is the aversion to usability.  Isn't that what you think 
about when you are coding?  "Am I making this thing I'm building easy to use?"  
If you were programming for me, we would be constantly talking about what we 
are building and how we can make things easier for users.  If I had to fight 
with a developer, architect or engineer about usability all the time, they 
would be gone and quick.  How do you approach programming if you aren't trying to 
make things easy?

 

 

There's no aversion to usability, you're assuming things that just aren't true. 
Nobody's against usability, we've just prioritized other things HIGHER. We make 
those decisions in part by looking at open JIRAs and determining what's asked 
for the most, what members of the community have contributed, and then balance 
that against what we ourselves care about. You're making a statement that it 
should be the top priority for the next release, with no JIRA, no history of 
contributing (and indeed, no real clear sign that you even understand the full 
extent of the database), no sign that you're willing to do the work yourself, 
and making a ton of assumptions about the level of effort and ROI.

 

I would love for Cassandra to be easier to use, I'm sure everyone does. There's 
a dozen features I'd love to add if I had infinite budget and infinite 
manpower. But what you're asking for is A LOT of effort and / or A LOT of 
money, and you're assuming someone's going to step up and foot the bill, but 
there's no real reason to believe that's the case. 

 

In the meantime, everyone's spending hours replying to this thread that is 0% 
actionable. We would all have been objectively better off had everyone ignored 
this thread and just spent 10 minutes writing some section of the docs. So the 
next time I get the urge to reply, I'm just going to do that instead.

 

 

 



Re: Cassandra Needs to Grow Up by Version Five!

2018-02-21 Thread Chris Lohfink
Instead of saying "Make X better" you can quantify "Here's how we can make X 
better" in a jira and the conversation will continue with interested parties 
(opening jiras is free!). Being combative and insulting the project on the mailing 
list may help vent some frustrations, but it is counterproductive and makes 
people defensive.

People are not averse to usability, quite the opposite actually. People do tend 
to be averse to conversations opened up with "cassandra is an idiot" with no 
clear definition of how to make it better or what a better solution would look 
like though. Note however that saying "make backups better" or "look at 
marketing literature for these guys" is hard for an engineer or architect to 
break into actionable items. Coming up with cool ideas on how to do something 
will more likely hook a developer into working on it than trying to shame the 
community with a sales pitch from another DB's sales guy.

Chris

> On Feb 21, 2018, at 4:53 PM, Kenneth Brotman  
> wrote:
> 
> Hi Akash,
> 
> I get the part about outside work which is why in replying to Jeff Jirsa I 
> was suggesting the big companies could justify taking it on easy enough and 
> you know actually pay the people who would be working at it so those people 
> could have a life.
> 
> The part I don't get is the aversion to usability.  Isn't that what you think 
> about when you are coding?  "Am I making this thing I'm building easy to 
> use?"  If you were programming for me, we would be constantly talking about 
> what we are building and how we can make things easier for users.  If I had 
> to fight with a developer, architect or engineer about usability all the 
> time, they would be gone and quick.  How do approach programming if you 
> aren't trying to make things easy.
> 
> Kenneth Brotman
> 
> -Original Message-
> From: Akash Gangil [mailto:akashg1...@gmail.com] 
> Sent: Wednesday, February 21, 2018 2:24 PM
> To: dev@cassandra.apache.org
> Cc: u...@cassandra.apache.org
> Subject: Re: Cassandra Needs to Grow Up by Version Five!
> 
> I would second Jon in the arguments he made. Contributing outside work is 
> draining and really requires a lot of commitment. If someone requires 
> features around usability etc, just pay for it, period.
> 
> On Wed, Feb 21, 2018 at 2:20 PM, Kenneth Brotman < 
> kenbrot...@yahoo.com.invalid> wrote:
> 
>> Jon,
>> 
>> Very sorry that you don't see the value of the time I'm taking for this.
>> I don't have demands; I do have a stern warning and I'm right Jon.  
>> Please be very careful not to mischaracterized my words Jon.
>> 
>> You suggest I put things in JIRA's, then seem to suggest that I'd be 
>> lucky if anyone looked at it and did anything. That's what I figured too.
>> 
>> I don't appreciate the hostility.  You will understand more fully in 
>> the next post where I'm coming from.  Try to keep the conversation civilized.
>> I'm trying or at least so you understand I think what I'm doing is 
>> saving your gig and mine.  I really like a lot of people is this group.
>> 
>> I've come to a preliminary assessment on things.  Soon the cloud will 
>> clear or I'll be gone.  Don't worry.  I'm a very peaceful person and 
>> like you I am driven by real important projects that I feel compelled 
>> to work on for the good of others.  I don't have time for people to 
>> hand hold a database and I can't get stuck with my projects on the wrong 
>> stuff.
>> 
>> Kenneth Brotman
>> 
>> 
>> -Original Message-
>> From: Jon Haddad [mailto:jonathan.had...@gmail.com] On Behalf Of Jon 
>> Haddad
>> Sent: Wednesday, February 21, 2018 12:44 PM
>> To: u...@cassandra.apache.org
>> Cc: dev@cassandra.apache.org
>> Subject: Re: Cassandra Needs to Grow Up by Version Five!
>> 
>> Ken,
>> 
>> Maybe it’s not clear how open source projects work, so let me try to 
>> explain.  There’s a bunch of us who either get paid by someone or 
>> volunteer on our free time.  The folks that get paid, (yay!) usually 
>> take direction on what the priorities are, and work on projects that 
>> directly affect our jobs.  That means that someone needs to care 
>> enough about the features you want to work on them, if you’re not going to 
>> do it yourself.
>> 
>> Now as others have said already, please put your list of demands in 
>> JIRA, if someone is interested, they will work on it.  You may need to 
>> contribute a little more than you’ve done already, be prepared to get 
>> involved if you actually want to to see something get done.  Perhaps 
>> learning a little more about Cassandra’s internals and the people 
>> involved will reveal some of the design decisions and priorities of the 
>> project.
>> 
>> Third, you seem to be a little obsessed with market share.  While 
>> market share is fun to talk about, *most* of us that are working on 
>> and contributing to Cassandra do so because it does actually solve a 
>> problem we have, and solves it reasonably well.  If some magic open 
>> source DB appears out of no wher

Re: Cassandra Needs to Grow Up by Version Five!

2018-02-21 Thread Jason Brown
Hi all,

I'd like to deescalate a bit here.

Since this is an Apache and an OSS project, contributions come in many
forms: code, speaking/advocacy, documentation, support, project management,
and so on. None of these things come for free.

Ken, I appreciate you bringing up these usability topics; they are certainly
valid concerns. You've mentioned you are working on a posting of some sort
that I think will amount to an enumerated list of the topics/issues you
feel need addressing. Some may be simple changes, some may be more
invasive, some we can consider implementing, some not. I look forward to a
positive discussion.

I think what would be best would be for you to complete that list and work
with the community, in a *positive and constructive manner*, towards
getting it done. That is certainly contributing, and contributing in a big
way: project management. Working with the community is going to be the most
beneficial path for everyone.

Ken, if you feel like you'd like some help getting such an initiative
going, and contributing substantively to it (not necessarily in terms of
code) please feel free to reach out to me directly (jasedbr...@gmail.com).

Hoping this leads somewhere positive, that benefits everyone,

-Jason



On Wed, Feb 21, 2018 at 2:53 PM, Kenneth Brotman <
kenbrot...@yahoo.com.invalid> wrote:

> Hi Akash,
>
> I get the part about outside work which is why in replying to Jeff Jirsa I
> was suggesting the big companies could justify taking it on easy enough and
> you know actually pay the people who would be working at it so those people
> could have a life.
>
> The part I don't get is the aversion to usability.  Isn't that what you
> think about when you are coding?  "Am I making this thing I'm building easy
> to use?"  If you were programming for me, we would be constantly talking
> about what we are building and how we can make things easier for users.  If
> I had to fight with a developer, architect or engineer about usability all
> the time, they would be gone and quick.  How do approach programming if you
> aren't trying to make things easy.
>
> Kenneth Brotman
>
> -Original Message-
> From: Akash Gangil [mailto:akashg1...@gmail.com]
> Sent: Wednesday, February 21, 2018 2:24 PM
> To: dev@cassandra.apache.org
> Cc: u...@cassandra.apache.org
> Subject: Re: Cassandra Needs to Grow Up by Version Five!
>
> I would second Jon in the arguments he made. Contributing outside work is
> draining and really requires a lot of commitment. If someone requires
> features around usability etc, just pay for it, period.
>
> On Wed, Feb 21, 2018 at 2:20 PM, Kenneth Brotman <
> kenbrot...@yahoo.com.invalid> wrote:
>
> > Jon,
> >
> > Very sorry that you don't see the value of the time I'm taking for this.
> > I don't have demands; I do have a stern warning and I'm right Jon.
> > Please be very careful not to mischaracterized my words Jon.
> >
> > You suggest I put things in JIRA's, then seem to suggest that I'd be
> > lucky if anyone looked at it and did anything. That's what I figured too.
> >
> > I don't appreciate the hostility.  You will understand more fully in
> > the next post where I'm coming from.  Try to keep the conversation
> civilized.
> > I'm trying or at least so you understand I think what I'm doing is
> > saving your gig and mine.  I really like a lot of people is this group.
> >
> > I've come to a preliminary assessment on things.  Soon the cloud will
> > clear or I'll be gone.  Don't worry.  I'm a very peaceful person and
> > like you I am driven by real important projects that I feel compelled
> > to work on for the good of others.  I don't have time for people to
> > hand hold a database and I can't get stuck with my projects on the wrong
> stuff.
> >
> > Kenneth Brotman
> >
> >
> > -Original Message-
> > From: Jon Haddad [mailto:jonathan.had...@gmail.com] On Behalf Of Jon
> > Haddad
> > Sent: Wednesday, February 21, 2018 12:44 PM
> > To: u...@cassandra.apache.org
> > Cc: dev@cassandra.apache.org
> > Subject: Re: Cassandra Needs to Grow Up by Version Five!
> >
> > Ken,
> >
> > Maybe it’s not clear how open source projects work, so let me try to
> > explain.  There’s a bunch of us who either get paid by someone or
> > volunteer on our free time.  The folks that get paid, (yay!) usually
> > take direction on what the priorities are, and work on projects that
> > directly affect our jobs.  That means that someone needs to care
> > enough about the features you want to work on them, if you’re not going
> to do it yourself.
> >
> > Now as others have said already, please put your list of demands in
> > JIRA, if someone is interested, they will work on it.  You may need to
> > contribute a little more than you’ve done already, be prepared to get
> > involved if you actually want to to see something get done.  Perhaps
> > learning a little more about Cassandra’s internals and the people
> > involved will reveal some of the design decisions and priorities of the
> project.
> >
> >

Re: Cassandra Needs to Grow Up by Version Five!

2018-02-21 Thread Jeff Jirsa
On Wed, Feb 21, 2018 at 2:53 PM, Kenneth Brotman <
kenbrot...@yahoo.com.invalid> wrote:

> Hi Akash,
>
> I get the part about outside work which is why in replying to Jeff Jirsa I
> was suggesting the big companies could justify taking it on easy enough and
> you know actually pay the people who would be working at it so those people
> could have a life.
>
> The part I don't get is the aversion to usability.  Isn't that what you
> think about when you are coding?  "Am I making this thing I'm building easy
> to use?"  If you were programming for me, we would be constantly talking
> about what we are building and how we can make things easier for users.  If
> I had to fight with a developer, architect or engineer about usability all
> the time, they would be gone and quick.  How do approach programming if you
> aren't trying to make things easy.
>


There's no aversion to usability, you're assuming things that just aren't
true. Nobody's against usability, we've just prioritized other things
HIGHER. We make those decisions in part by looking at open JIRAs and
determining what's asked for the most, what members of the community have
contributed, and then balance that against what we ourselves care about.
You're making a statement that it should be the top priority for the next
release, with no JIRA, no history of contributing (and indeed, no real
clear sign that you even understand the full extent of the database), no
sign that you're willing to do the work yourself, and making a ton of
assumptions about the level of effort and ROI.

I would love for Cassandra to be easier to use, I'm sure everyone does.
There's a dozen features I'd love to add if I had infinite budget and
infinite manpower. But what you're asking for is A LOT of effort and / or A
LOT of money, and you're assuming someone's going to step up and foot the
bill, but there's no real reason to believe that's the case.

In the meantime, everyone's spending hours replying to this thread that is
0% actionable. We would all have been objectively better off had everyone
ignored this thread and just spent 10 minutes writing some section of the
docs. So the next time I get the urge to reply, I'm just going to do that
instead.


Re: Cassandra Needs to Grow Up by Version Five!

2018-02-21 Thread Brandon Williams
The only progress from this point is what Jon said: enumerate and detail
your issues in jira tickets.

On Wed, Feb 21, 2018 at 4:53 PM, Kenneth Brotman <
kenbrot...@yahoo.com.invalid> wrote:

> Hi Akash,
>
> I get the part about outside work which is why in replying to Jeff Jirsa I
> was suggesting the big companies could justify taking it on easy enough and
> you know actually pay the people who would be working at it so those people
> could have a life.
>
> The part I don't get is the aversion to usability.  Isn't that what you
> think about when you are coding?  "Am I making this thing I'm building easy
> to use?"  If you were programming for me, we would be constantly talking
> about what we are building and how we can make things easier for users.  If
> I had to fight with a developer, architect or engineer about usability all
> the time, they would be gone and quick.  How do approach programming if you
> aren't trying to make things easy.
>
> Kenneth Brotman
>
> -Original Message-
> From: Akash Gangil [mailto:akashg1...@gmail.com]
> Sent: Wednesday, February 21, 2018 2:24 PM
> To: dev@cassandra.apache.org
> Cc: u...@cassandra.apache.org
> Subject: Re: Cassandra Needs to Grow Up by Version Five!
>
> I would second Jon in the arguments he made. Contributing outside work is
> draining and really requires a lot of commitment. If someone requires
> features around usability etc, just pay for it, period.
>
> On Wed, Feb 21, 2018 at 2:20 PM, Kenneth Brotman <
> kenbrot...@yahoo.com.invalid> wrote:
>
> > Jon,
> >
> > Very sorry that you don't see the value of the time I'm taking for this.
> > I don't have demands; I do have a stern warning and I'm right Jon.
> > Please be very careful not to mischaracterized my words Jon.
> >
> > You suggest I put things in JIRA's, then seem to suggest that I'd be
> > lucky if anyone looked at it and did anything. That's what I figured too.
> >
> > I don't appreciate the hostility.  You will understand more fully in
> > the next post where I'm coming from.  Try to keep the conversation
> civilized.
> > I'm trying or at least so you understand I think what I'm doing is
> > saving your gig and mine.  I really like a lot of people is this group.
> >
> > I've come to a preliminary assessment on things.  Soon the cloud will
> > clear or I'll be gone.  Don't worry.  I'm a very peaceful person and
> > like you I am driven by real important projects that I feel compelled
> > to work on for the good of others.  I don't have time for people to
> > hand hold a database and I can't get stuck with my projects on the wrong
> stuff.
> >
> > Kenneth Brotman
> >
> >
> > -Original Message-
> > From: Jon Haddad [mailto:jonathan.had...@gmail.com] On Behalf Of Jon
> > Haddad
> > Sent: Wednesday, February 21, 2018 12:44 PM
> > To: u...@cassandra.apache.org
> > Cc: dev@cassandra.apache.org
> > Subject: Re: Cassandra Needs to Grow Up by Version Five!
> >
> > Ken,
> >
> > Maybe it’s not clear how open source projects work, so let me try to
> > explain.  There’s a bunch of us who either get paid by someone or
> > volunteer on our free time.  The folks that get paid, (yay!) usually
> > take direction on what the priorities are, and work on projects that
> > directly affect our jobs.  That means that someone needs to care
> > enough about the features you want to work on them, if you’re not going
> to do it yourself.
> >
> > Now as others have said already, please put your list of demands in
> > JIRA, if someone is interested, they will work on it.  You may need to
> > contribute a little more than you’ve done already, be prepared to get
> > involved if you actually want to to see something get done.  Perhaps
> > learning a little more about Cassandra’s internals and the people
> > involved will reveal some of the design decisions and priorities of the
> project.
> >
> > Third, you seem to be a little obsessed with market share.  While
> > market share is fun to talk about, *most* of us that are working on
> > and contributing to Cassandra do so because it does actually solve a
> > problem we have, and solves it reasonably well.  If some magic open
> > source DB appears out of no where and does everything you want
> > Cassandra to, and is bug free, keeps your data consistent,
> > automatically does backups, comes with really nice cert management, ad
> > hoc querying, amazing materialized views that are perfect, no caveats
> > to secondary indexes, and somehow still gives you linear scalability
> > without any mental overhead whatsoever then sure, people might start
> > using it.  And that’s actually OK, because if that happens we’ll all
> > be incredibly pumped out of our minds because we won’t have to work as
> > hard.  If on the slim chance that doesn’t manifest, those of us that
> > use Cassandra and are part of the community will keep working on the
> > things we care about, iterating, and improving things.  Maybe someone
> will even take a look at your JIRA issues.
> >
> > Further 

Re: Issues with Materialized-Views updates during a cluster change?

2018-02-21 Thread Paulo Motta
> 1. It seems that for example when RF=3, each one of the three base replicas 
> will send a view update to the fourth "pending node". While this is not 
> wrong, it's also inefficient - why send three copies of the same update? 
> Wouldn't it be more efficient that just one of the base replicas - the one 
> which eventually will be paired with the pending node - should send the 
> updates to it? Is there a problem with such a scheme?

This optimization can be done when there's a single pending range per
view replica set, but when there are multiple pending ranges and there
are failures, it's possible that the paired view replica changes, which
can lead to missing updates. For instance, see the following scenario:
- There are 2 pending ranges A' and B'.
- Base replica A sends update to pending-paired view replica A'.
- Base replica B is down, so pending-paired view replica B' does not get update.
- Range movement A' fails and B' succeeds.
- B' becomes A's new paired view replica.
- A will be out of sync with B'.
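
The scenario above can be sketched as a toy simulation. This is only an
illustration in Python, not Cassandra code: the function name, the
dictionaries and the "optimized" flag are hypothetical, and the model assumes
each base replica is paired with exactly one pending view replica.

```python
def pending_replicas_updated(base_up, pending_pair, optimized):
    """Return the set of pending view replicas that received the view update.

    base_up:      base replica -> True if it is up (hypothetical model)
    pending_pair: base replica -> the pending view replica paired with it
    optimized:    True  = each base replica sends only to its own pending pair
                  False = every live base replica sends to all pending replicas
    """
    received = set()
    for base, is_up in base_up.items():
        if not is_up:
            continue  # a down base replica sends nothing
        if optimized:
            received.add(pending_pair[base])
        else:
            received.update(pending_pair.values())
    return received


base_up = {"A": True, "B": False}       # base replica B is down
pending_pair = {"A": "A'", "B": "B'"}   # two pending ranges
completed = "B'"                        # movement A' fails, B' succeeds

for optimized in (False, True):
    received = pending_replicas_updated(base_up, pending_pair, optimized)
    status = "has" if completed in received else "is MISSING"
    print(f"optimized={optimized}: {completed} {status} the update")
```

With the optimized scheme only A' receives the update, so once B' becomes A's
paired view replica it has never seen it.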

Furthermore we would need to cache the ring state after the range
movement is completed to be able to compute the pending-paired view
replica but we don't have this info easily available currently, so it
seems that it would not be a trivial change but perhaps worth pursuing
in the single pending range case.

> 2. There's an optimization that when we're lucky enough that the paired view 
> replica is the same as this base replica, mutateMV doesn't use the normal 
> view-mutation-sending code (wrapViewBatchResponseHandler) and just writes the 
> mutation locally. In particular, in this case we do NOT write to the pending 
> node (unless I'm missing something). But, sometimes all replicas will be 
> paired with themselves - this can happen for example when number of nodes is 
> equal to RF, or when the base and view table have the same partition keys 
> (but different clustering keys). In this case, it seems the pending node will 
> not be written at all... Isn't this a bug?

Good catch! This indeed seems to be a regression caused by
CASSANDRA-13069, so I created CASSANDRA-14251 to restore the correct
behavior.

bq. Being paired with yourself is not only a "trick", but also
something which really happens (by chance or in some cases as I showed
above, always), and needs to be handled correctly, even if the cluster
grows. If none of the base replicas will send the view update to the
pending node, it will end up missing this update...

Exactly, I only considered the case where the local address was used
as a marker to indicate there was no paired endpoint, and
brainfarted/missed the more important case when the local node is the
paired endpoint and there is a pending endpoint. Unfortunately this
oversight was not caught by any tests, so I will add one on
CASSANDRA-14251.
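
To make the missed case concrete, here is a small Python model of the decision
described above. It is a sketch under stated assumptions, not the mutateMV
code and not necessarily the CASSANDRA-14251 patch: the function names, the
node names and the skip_pending_for_local flag are all hypothetical, and the
"fixed" branch simply mirrors the pendingEndpoints.isEmpty() condition
mentioned in the quoted message.

```python
def view_write_targets(local, paired, pending, skip_pending_for_local):
    """Endpoints that get the view mutation from base replica `local`.

    skip_pending_for_local=True models the regressed behaviour discussed in
    this thread: a self-paired replica writes locally and ignores pending
    endpoints. False models taking the local shortcut only when there are no
    pending endpoints.
    """
    if paired == local and (skip_pending_for_local or not pending):
        return {local}            # local-only write, no remote view update
    return {paired, *pending}     # normal path: paired replica plus pending


def cluster_targets(nodes, pending, skip_pending_for_local):
    """Union of targets when every base replica is paired with itself."""
    targets = set()
    for node in nodes:
        targets |= view_write_targets(node, node, pending,
                                      skip_pending_for_local)
    return targets


nodes = ["n1", "n2", "n3"]   # number of nodes equal to RF=3
pending = ["n4"]             # a bootstrapping node

print(sorted(cluster_targets(nodes, pending, True)))   # regressed behaviour
print(sorted(cluster_targets(nodes, pending, False)))  # with the isEmpty check
```

In the first call no base replica ever writes to n4, which is the missing
update described in the question; in the second call n4 is included.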

2018-02-21 11:26 GMT-03:00 Nadav Har'El :
> Hi, I was trying to understand how view tables are updated during a period
> of range movements, namely bootstrapping of a new node or decommissioning
> one of the nodes. In particular, during the period of data streaming, we can
> have a new replica on a "pending node" to which we also need to send the
> view update.
>
> I looked at the mutateMV() code, and think I spotted two issues with it, and
> I wonder if I'm missing something or these are real problems:
>
> 1. It seems that for example when RF=3, each one of the three base replicas
> will send a view update to the fourth "pending node". While this is not
> wrong, it's also inefficient - why send three copies of the same update?
> Wouldn't it be more efficient that just one of the base replicas - the one
> which eventually will be paired with the pending node - should send the
> updates to it? Is there a problem with such a scheme?
>
> 2. There's an optimization that when we're lucky enough that the paired view
> replica is the same as this base replica, mutateMV doesn't use the normal
> view-mutation-sending code (wrapViewBatchResponseHandler) and just writes
> the mutation locally. In particular, in this case we do NOT write to the
> pending node (unless I'm missing something). But, sometimes all replicas
> will be paired with themselves - this can happen for example when number of
> nodes is equal to RF, or when the base and view table have the same
> partition keys (but different clustering keys). In this case, it seems the
> pending node will not be written at all... Isn't this a bug?
>
> The strange thing about issue 2 is that this code used to be correct (at
> least according to my understanding...) - it used to avoid this optimization
> if pendingNodes was not empty. But then this was changed in commit
> 12103653f31. Why?
> https://issues.apache.org/jira/browse/CASSANDRA-13069 contains an
> explanation to that change:
>  "I also removed the pendingEndpoints.isEmpty() condition to skip the
> batchlog for local mutations, since this was a pre-CASSANDRA-10674 leftover
> when ViewUtils.getViewNaturalEndpoint returned the local address 

RE: Cassandra Needs to Grow Up by Version Five!

2018-02-21 Thread Kenneth Brotman
Hi Akash,

I get the part about outside work which is why in replying to Jeff Jirsa I was 
suggesting the big companies could justify taking it on easy enough and you 
know actually pay the people who would be working at it so those people could 
have a life.

The part I don't get is the aversion to usability.  Isn't that what you think 
about when you are coding?  "Am I making this thing I'm building easy to use?"  
If you were programming for me, we would be constantly talking about what we 
are building and how we can make things easier for users.  If I had to fight 
with a developer, architect or engineer about usability all the time, they 
would be gone and quick.  How do you approach programming if you aren't trying to 
make things easy?

Kenneth Brotman

-Original Message-
From: Akash Gangil [mailto:akashg1...@gmail.com] 
Sent: Wednesday, February 21, 2018 2:24 PM
To: dev@cassandra.apache.org
Cc: u...@cassandra.apache.org
Subject: Re: Cassandra Needs to Grow Up by Version Five!

I would second Jon in the arguments he made. Contributing outside work is 
draining and really requires a lot of commitment. If someone requires features 
around usability etc, just pay for it, period.

On Wed, Feb 21, 2018 at 2:20 PM, Kenneth Brotman < 
kenbrot...@yahoo.com.invalid> wrote:

> Jon,
>
> Very sorry that you don't see the value of the time I'm taking for this.
> I don't have demands; I do have a stern warning and I'm right Jon.  
> Please be very careful not to mischaracterize my words Jon.
>
> You suggest I put things in JIRA's, then seem to suggest that I'd be 
> lucky if anyone looked at it and did anything. That's what I figured too.
>
> I don't appreciate the hostility.  You will understand more fully in 
> the next post where I'm coming from.  Try to keep the conversation civilized.
> I'm trying or at least so you understand I think what I'm doing is 
> saving your gig and mine.  I really like a lot of people in this group.
>
> I've come to a preliminary assessment on things.  Soon the cloud will 
> clear or I'll be gone.  Don't worry.  I'm a very peaceful person and 
> like you I am driven by real important projects that I feel compelled 
> to work on for the good of others.  I don't have time for people to 
> hand hold a database and I can't get stuck with my projects on the wrong 
> stuff.
>
> Kenneth Brotman
>
>
> -----Original Message-----
> From: Jon Haddad [mailto:jonathan.had...@gmail.com] On Behalf Of Jon 
> Haddad
> Sent: Wednesday, February 21, 2018 12:44 PM
> To: u...@cassandra.apache.org
> Cc: dev@cassandra.apache.org
> Subject: Re: Cassandra Needs to Grow Up by Version Five!
>
> Ken,
>
> Maybe it’s not clear how open source projects work, so let me try to 
> explain.  There’s a bunch of us who either get paid by someone or 
> volunteer on our free time.  The folks that get paid, (yay!) usually 
> take direction on what the priorities are, and work on projects that 
> directly affect our jobs.  That means that someone needs to care 
> enough about the features you want to work on them, if you’re not going to do 
> it yourself.
>
> Now as others have said already, please put your list of demands in 
> JIRA, if someone is interested, they will work on it.  You may need to 
> contribute a little more than you’ve done already, be prepared to get 
> involved if you actually want to see something get done.  Perhaps
> learning a little more about Cassandra’s internals and the people 
> involved will reveal some of the design decisions and priorities of the 
> project.
>
> Third, you seem to be a little obsessed with market share.  While 
> market share is fun to talk about, *most* of us that are working on 
> and contributing to Cassandra do so because it does actually solve a 
> problem we have, and solves it reasonably well.  If some magic open 
> source DB appears out of nowhere and does everything you want
> Cassandra to, and is bug free, keeps your data consistent, 
> automatically does backups, comes with really nice cert management, ad 
> hoc querying, amazing materialized views that are perfect, no caveats 
> to secondary indexes, and somehow still gives you linear scalability 
> without any mental overhead whatsoever then sure, people might start 
> using it.  And that’s actually OK, because if that happens we’ll all 
> be incredibly pumped out of our minds because we won’t have to work as 
> hard.  If on the slim chance that doesn’t manifest, those of us that 
> use Cassandra and are part of the community will keep working on the 
> things we care about, iterating, and improving things.  Maybe someone will 
> even take a look at your JIRA issues.
>
> Further filling the mailing list with your grievances will likely not 
> help you progress towards your goal of a Cassandra that’s easier to 
> use, so I encourage you to try to be a little more productive and try 
> to help rather than just complain, which is not constructive.  I did a 
> quick search for your name on the mailing l

Re: Cassandra Needs to Grow Up by Version Five!

2018-02-21 Thread Akash Gangil
I would second Jon in the arguments he made. Contributing outside work is
draining and really requires a lot of commitment. If someone requires
features around usability etc, just pay for it, period.

On Wed, Feb 21, 2018 at 2:20 PM, Kenneth Brotman <
kenbrot...@yahoo.com.invalid> wrote:

> Jon,
>
> Very sorry that you don't see the value of the time I'm taking for this.
> I don't have demands; I do have a stern warning and I'm right Jon.  Please
> be very careful not to mischaracterize my words Jon.
>
> You suggest I put things in JIRA's, then seem to suggest that I'd be lucky
> if anyone looked at it and did anything. That's what I figured too.
>
> I don't appreciate the hostility.  You will understand more fully in the
> next post where I'm coming from.  Try to keep the conversation civilized.
> I'm trying or at least so you understand I think what I'm doing is saving
> your gig and mine.  I really like a lot of people in this group.
>
> I've come to a preliminary assessment on things.  Soon the cloud will
> clear or I'll be gone.  Don't worry.  I'm a very peaceful person and like
> you I am driven by real important projects that I feel compelled to work on
> for the good of others.  I don't have time for people to hand hold a
> database and I can't get stuck with my projects on the wrong stuff.
>
> Kenneth Brotman
>
>
> -----Original Message-----
> From: Jon Haddad [mailto:jonathan.had...@gmail.com] On Behalf Of Jon
> Haddad
> Sent: Wednesday, February 21, 2018 12:44 PM
> To: u...@cassandra.apache.org
> Cc: dev@cassandra.apache.org
> Subject: Re: Cassandra Needs to Grow Up by Version Five!
>
> Ken,
>
> Maybe it’s not clear how open source projects work, so let me try to
> explain.  There’s a bunch of us who either get paid by someone or volunteer
> on our free time.  The folks that get paid, (yay!) usually take direction
> on what the priorities are, and work on projects that directly affect our
> jobs.  That means that someone needs to care enough about the features you
> want to work on them, if you’re not going to do it yourself.
>
> Now as others have said already, please put your list of demands in JIRA,
> if someone is interested, they will work on it.  You may need to contribute
> a little more than you’ve done already, be prepared to get involved if you
> actually want to see something get done.  Perhaps learning a little more
> about Cassandra’s internals and the people involved will reveal some of the
> design decisions and priorities of the project.
>
> Third, you seem to be a little obsessed with market share.  While market
> share is fun to talk about, *most* of us that are working on and
> contributing to Cassandra do so because it does actually solve a problem we
> have, and solves it reasonably well.  If some magic open source DB appears
> out of nowhere and does everything you want Cassandra to, and is bug free,
> keeps your data consistent, automatically does backups, comes with really
> nice cert management, ad hoc querying, amazing materialized views that are
> perfect, no caveats to secondary indexes, and somehow still gives you
> linear scalability without any mental overhead whatsoever then sure, people
> might start using it.  And that’s actually OK, because if that happens
> we’ll all be incredibly pumped out of our minds because we won’t have to
> work as hard.  If on the slim chance that doesn’t manifest, those of us
> that use Cassandra and are part of the community will keep working on the
> things we care about, iterating, and improving things.  Maybe someone will
> even take a look at your JIRA issues.
>
> Further filling the mailing list with your grievances will likely not help
> you progress towards your goal of a Cassandra that’s easier to use, so I
> encourage you to try to be a little more productive and try to help rather
> than just complain, which is not constructive.  I did a quick search for
> your name on the mailing list, and I’ve seen very little from you, so to
> everyone who’s been around for a while and trying to help you it looks
> like you’re just some random dude asking for people to work for free on the
> things you’re asking for, without offering anything back in return.
>
> Jon
>
>
> > On Feb 21, 2018, at 11:56 AM, Kenneth Brotman
>  wrote:
> >
> > Josh,
> >
> > To say nothing is indifference.  If you care about your community,
> sometimes don't you have to bring up a subject even though you know it's
> also temporarily adding some discomfort?
> >
> > As to opening a JIRA, I've got a very specific topic to try in mind
> now.  An easy one I'll work on and then announce.  Someone else will have
> to do the coding.  A year from now I would probably just knock it out to
> make sure it's as easy as I expect it to be but to be honest, as I've been
> saying, I'm not set up to do that right now.  I've barely looked at any
> Cassandra code; for one; everyone on this list probably codes more than I
> do, secondly; and lastly, it's a good one for someon

Re: Cassandra Needs to Grow Up by Version Five!

2018-02-21 Thread Michael Kjellman
kenneth: could you please send your jira information? i'm unable to even find 
an account on http://issues.apache.org with your name despite multiple 
attempts. thanks!

best,
kjellman

> On Feb 21, 2018, at 2:20 PM, Kenneth Brotman  
> wrote:
> 
> Jon,
> 
> Very sorry that you don't see the value of the time I'm taking for this.  I 
> don't have demands; I do have a stern warning and I'm right Jon.  Please be 
> very careful not to mischaracterize my words Jon.
> 
> You suggest I put things in JIRA's, then seem to suggest that I'd be lucky if 
> anyone looked at it and did anything. That's what I figured too.  
> 
> I don't appreciate the hostility.  You will understand more fully in the next 
> post where I'm coming from.  Try to keep the conversation civilized.  I'm 
> trying or at least so you understand I think what I'm doing is saving your 
> gig and mine.  I really like a lot of people in this group.
> 
> I've come to a preliminary assessment on things.  Soon the cloud will clear 
> or I'll be gone.  Don't worry.  I'm a very peaceful person and like you I am 
> driven by real important projects that I feel compelled to work on for the 
> good of others.  I don't have time for people to hand hold a database and I 
> can't get stuck with my projects on the wrong stuff.  
> 
> Kenneth Brotman
> 
> 
> -----Original Message-----
> From: Jon Haddad [mailto:jonathan.had...@gmail.com] On Behalf Of Jon Haddad
> Sent: Wednesday, February 21, 2018 12:44 PM
> To: u...@cassandra.apache.org
> Cc: dev@cassandra.apache.org
> Subject: Re: Cassandra Needs to Grow Up by Version Five!
> 
> Ken,
> 
> Maybe it’s not clear how open source projects work, so let me try to explain. 
>  There’s a bunch of us who either get paid by someone or volunteer on our 
> free time.  The folks that get paid, (yay!) usually take direction on what 
> the priorities are, and work on projects that directly affect our jobs.  That 
> means that someone needs to care enough about the features you want to work 
> on them, if you’re not going to do it yourself. 
> 
> Now as others have said already, please put your list of demands in JIRA, if 
> someone is interested, they will work on it.  You may need to contribute a 
> little more than you’ve done already, be prepared to get involved if you 
> actually want to see something get done.  Perhaps learning a little more
> about Cassandra’s internals and the people involved will reveal some of the 
> design decisions and priorities of the project.  
> 
> Third, you seem to be a little obsessed with market share.  While market 
> share is fun to talk about, *most* of us that are working on and contributing 
> to Cassandra do so because it does actually solve a problem we have, and 
> solves it reasonably well.  If some magic open source DB appears out of
> nowhere and does everything you want Cassandra to, and is bug free, keeps your
> data consistent, automatically does backups, comes with really nice cert 
> management, ad hoc querying, amazing materialized views that are perfect, no 
> caveats to secondary indexes, and somehow still gives you linear scalability 
> without any mental overhead whatsoever then sure, people might start using 
> it.  And that’s actually OK, because if that happens we’ll all be incredibly 
> pumped out of our minds because we won’t have to work as hard.  If on the 
> slim chance that doesn’t manifest, those of us that use Cassandra and are 
> part of the community will keep working on the things we care about, 
> iterating, and improving things.  Maybe someone will even take a look at your 
> JIRA issues.  
> 
> Further filling the mailing list with your grievances will likely not help 
> you progress towards your goal of a Cassandra that’s easier to use, so I 
> encourage you to try to be a little more productive and try to help rather 
> than just complain, which is not constructive.  I did a quick search for your 
> name on the mailing list, and I’ve seen very little from you, so to 
> everyone who’s been around for a while and trying to help you it looks like
> you’re just some random dude asking for people to work for free on the things 
> you’re asking for, without offering anything back in return.
> 
> Jon
> 
> 
>> On Feb 21, 2018, at 11:56 AM, Kenneth Brotman  
>> wrote:
>> 
>> Josh,
>> 
>> To say nothing is indifference.  If you care about your community, sometimes 
>> don't you have to bring up a subject even though you know it's also 
>> temporarily adding some discomfort?  
>> 
>> As to opening a JIRA, I've got a very specific topic to try in mind now.  An 
>> easy one I'll work on and then announce.  Someone else will have to do the 
>> coding.  A year from now I would probably just knock it out to make sure 
>> it's as easy as I expect it to be but to be honest, as I've been saying, I'm 
>> not set up to do that right now.  I've barely looked at any Cassandra code; 
>> for one; everyone on this list probably codes more than I do, secondly; and 

RE: Cassandra Needs to Grow Up by Version Five!

2018-02-21 Thread Kenneth Brotman
Jon,

Very sorry that you don't see the value of the time I'm taking for this.  I 
don't have demands; I do have a stern warning and I'm right Jon.  Please be 
very careful not to mischaracterize my words Jon.

You suggest I put things in JIRA's, then seem to suggest that I'd be lucky if 
anyone looked at it and did anything. That's what I figured too.  

I don't appreciate the hostility.  You will understand more fully in the next 
post where I'm coming from.  Try to keep the conversation civilized.  I'm 
trying or at least so you understand I think what I'm doing is saving your gig 
and mine.  I really like a lot of people in this group.

I've come to a preliminary assessment on things.  Soon the cloud will clear or 
I'll be gone.  Don't worry.  I'm a very peaceful person and like you I am 
driven by real important projects that I feel compelled to work on for the good 
of others.  I don't have time for people to hand hold a database and I can't 
get stuck with my projects on the wrong stuff.  

Kenneth Brotman


-----Original Message-----
From: Jon Haddad [mailto:jonathan.had...@gmail.com] On Behalf Of Jon Haddad
Sent: Wednesday, February 21, 2018 12:44 PM
To: u...@cassandra.apache.org
Cc: dev@cassandra.apache.org
Subject: Re: Cassandra Needs to Grow Up by Version Five!

Ken,

Maybe it’s not clear how open source projects work, so let me try to explain.  
There’s a bunch of us who either get paid by someone or volunteer on our free 
time.  The folks that get paid, (yay!) usually take direction on what the 
priorities are, and work on projects that directly affect our jobs.  That means 
that someone needs to care enough about the features you want to work on them, 
if you’re not going to do it yourself. 

Now as others have said already, please put your list of demands in JIRA, if 
someone is interested, they will work on it.  You may need to contribute a 
little more than you’ve done already, be prepared to get involved if you 
actually want to see something get done.  Perhaps learning a little more
about Cassandra’s internals and the people involved will reveal some of the 
design decisions and priorities of the project.  

Third, you seem to be a little obsessed with market share.  While market share 
is fun to talk about, *most* of us that are working on and contributing to 
Cassandra do so because it does actually solve a problem we have, and solves it 
reasonably well.  If some magic open source DB appears out of nowhere and does
everything you want Cassandra to, and is bug free, keeps your data consistent, 
automatically does backups, comes with really nice cert management, ad hoc 
querying, amazing materialized views that are perfect, no caveats to secondary 
indexes, and somehow still gives you linear scalability without any mental 
overhead whatsoever then sure, people might start using it.  And that’s 
actually OK, because if that happens we’ll all be incredibly pumped out of our 
minds because we won’t have to work as hard.  If on the slim chance that 
doesn’t manifest, those of us that use Cassandra and are part of the community 
will keep working on the things we care about, iterating, and improving things. 
 Maybe someone will even take a look at your JIRA issues.  

Further filling the mailing list with your grievances will likely not help you 
progress towards your goal of a Cassandra that’s easier to use, so I encourage 
you to try to be a little more productive and try to help rather than just 
complain, which is not constructive.  I did a quick search for your name on the 
mailing list, and I’ve seen very little from you, so to everyone who’s been
around for a while and trying to help you it looks like you’re just some random 
dude asking for people to work for free on the things you’re asking for, 
without offering anything back in return.

Jon


> On Feb 21, 2018, at 11:56 AM, Kenneth Brotman  
> wrote:
> 
> Josh,
> 
> To say nothing is indifference.  If you care about your community, sometimes 
> don't you have to bring up a subject even though you know it's also 
> temporarily adding some discomfort?  
> 
> As to opening a JIRA, I've got a very specific topic to try in mind now.  An 
> easy one I'll work on and then announce.  Someone else will have to do the 
> coding.  A year from now I would probably just knock it out to make sure it's 
> as easy as I expect it to be but to be honest, as I've been saying, I'm not 
> set up to do that right now.  I've barely looked at any Cassandra code; for 
> one; everyone on this list probably codes more than I do, secondly; and 
> lastly, it's a good one for someone that wants an easy one to start with: 
> vNodes.  I've already seen too many people seeking assistance with the vNode 
> setting.
> 
> And you can expect as others have been mentioning that there should be 
> similar ones on compaction, repair and backup. 
> 
> Microsoft knows poor usability gives them an easy market to take over. And 
> they make it easy to switch.
> 

Re: Cassandra Needs to Grow Up by Version Five!

2018-02-21 Thread DuyHai Doan
So before buying any marketing claims from Microsoft or whoever, maybe
you should try using it extensively?

And talking about backup, have a look at DynamoDB:
http://i68.tinypic.com/n1b6yr.jpg

From my POV, if a multi-billion-dollar company like Amazon doesn't get it right
or can't make it easy for the end user (without involving unwieldy Hadoop
machinery:
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/DynamoDBPipeline.html),
what Cassandra offers in terms of backup and restore is more than satisfactory.




On Wed, Feb 21, 2018 at 8:56 PM, Kenneth Brotman <
kenbrot...@yahoo.com.invalid> wrote:

>  Josh,
>
> To say nothing is indifference.  If you care about your community,
> sometimes don't you have to bring up a subject even though you know it's
> also temporarily adding some discomfort?
>
> As to opening a JIRA, I've got a very specific topic to try in mind now.
> An easy one I'll work on and then announce.  Someone else will have to do
> the coding.  A year from now I would probably just knock it out to make
> sure it's as easy as I expect it to be but to be honest, as I've been
> saying, I'm not set up to do that right now.  I've barely looked at any
> Cassandra code; for one; everyone on this list probably codes more than I
> do, secondly; and lastly, it's a good one for someone that wants an easy
> one to start with: vNodes.  I've already seen too many people seeking
> assistance with the vNode setting.
>
> And you can expect as others have been mentioning that there should be
> similar ones on compaction, repair and backup.
>
> Microsoft knows poor usability gives them an easy market to take over. And
> they make it easy to switch.
>
> Beginning at 4:17 in the video, it says the following:
>
> "You don't need to worry about replica sets, quorum or read
> repair.  You can focus on writing correct application logic."
>
> At 4:42, it says:
> "Hopefully this gives you a quick idea of how seamlessly you can
> bring your existing Cassandra applications to Azure Cosmos DB.  No code
> changes are required.  It works with your favorite Cassandra tools and
> drivers including for example native Cassandra driver for Spark. And it
> takes seconds to get going, and it's elastically and globally scalable."
>
> More to come,
>
> Kenneth Brotman
>
> -----Original Message-----
> From: Josh McKenzie [mailto:jmcken...@apache.org]
> Sent: Wednesday, February 21, 2018 8:28 AM
> To: dev@cassandra.apache.org
> Cc: User
> Subject: Re: Cassandra Needs to Grow Up by Version Five!
>
> There's a disheartening amount of "here's where Cassandra is bad, and
> here's what it needs to do for me for free" happening in this thread.
>
> This is open-source software. Everyone is *strongly encouraged* to submit
> a patch to move the needle on *any* of these things being complained about
> in this thread.
>
> For the Apache Way  to
> work, people need to step up and meaningfully contribute to a project to
> scratch their own itch instead of just waiting for a random
> corporation-subsidized engineer to happen to have interests that align with
> them and contribute that to the project.
>
> Beating a dead horse for things everyone on the project knows are serious
> pain points is not productive.
>
> On Wed, Feb 21, 2018 at 5:45 AM, Oleksandr Shulgin <
> oleksandr.shul...@zalando.de> wrote:
>
> > On Mon, Feb 19, 2018 at 10:01 AM, Kenneth Brotman <
> > kenbrot...@yahoo.com.invalid> wrote:
> >
> > >
> > > >> Cluster wide management should be a big theme in any next major
> > > >> release.
> > > >>
> > > >Na. Stability and testing should be a big theme in the next major
> > > >release.
> > > >
> > >
> > > Double Na on that one Jeff.  I think you have a concern there about
> > > the need to test sufficiently to ensure the stability of the next
> > > major release.  That makes perfect sense.- for every release,
> > > especially the major ones.  Continuous improvement is not a phase of
> > > development for example.  CI should be in everything, in every
> > > phase.  Stability and testing a part of every release not just one.
> > > A major release should be a nice step from the previous major
> > > release though.
> > >
> >
> > I guess what Jeff refers to is the tick-tock release cycle experiment,
> > which has proven to be a complete disaster by popular opinion.
> >
> > There's also the "materialized views" feature which failed to
> > materialize in the end (pun intended) and had to be declared
> > experimental retroactively.
> >
> > Another prominent example is incremental repair which was introduced
> > as the default option in 2.2 and now is not recommended to use because
> > of so many corner cases where it can fail.  So again experimental as
> > an afterthought.
> >
> > Not to mention that even if you are aware of the default incremental
> > and go with full repair instead, you're still up for a sad surprise:
> > anti-compaction will be triggered despite the "full" repair.

Re: Cassandra Needs to Grow Up by Version Five!

2018-02-21 Thread Jon Haddad
Ken,

Maybe it’s not clear how open source projects work, so let me try to explain.  
There’s a bunch of us who either get paid by someone or volunteer on our free 
time.  The folks that get paid, (yay!) usually take direction on what the 
priorities are, and work on projects that directly affect our jobs.  That means 
that someone needs to care enough about the features you want to work on them, 
if you’re not going to do it yourself. 

Now as others have said already, please put your list of demands in JIRA, if 
someone is interested, they will work on it.  You may need to contribute a 
little more than you’ve done already, be prepared to get involved if you 
actually want to see something get done.  Perhaps learning a little more
about Cassandra’s internals and the people involved will reveal some of the 
design decisions and priorities of the project.  

Third, you seem to be a little obsessed with market share.  While market share 
is fun to talk about, *most* of us that are working on and contributing to 
Cassandra do so because it does actually solve a problem we have, and solves it 
reasonably well.  If some magic open source DB appears out of nowhere and does
everything you want Cassandra to, and is bug free, keeps your data consistent, 
automatically does backups, comes with really nice cert management, ad hoc 
querying, amazing materialized views that are perfect, no caveats to secondary 
indexes, and somehow still gives you linear scalability without any mental 
overhead whatsoever then sure, people might start using it.  And that’s 
actually OK, because if that happens we’ll all be incredibly pumped out of our 
minds because we won’t have to work as hard.  If on the slim chance that 
doesn’t manifest, those of us that use Cassandra and are part of the community 
will keep working on the things we care about, iterating, and improving things. 
 Maybe someone will even take a look at your JIRA issues.  

Further filling the mailing list with your grievances will likely not help you 
progress towards your goal of a Cassandra that’s easier to use, so I encourage 
you to try to be a little more productive and try to help rather than just 
complain, which is not constructive.  I did a quick search for your name on the 
mailing list, and I’ve seen very little from you, so to everyone who’s been
around for a while and trying to help you it looks like you’re just some random 
dude asking for people to work for free on the things you’re asking for, 
without offering anything back in return.

Jon


> On Feb 21, 2018, at 11:56 AM, Kenneth Brotman  
> wrote:
> 
> Josh, 
> 
> To say nothing is indifference.  If you care about your community, sometimes 
> don't you have to bring up a subject even though you know it's also 
> temporarily adding some discomfort?  
> 
> As to opening a JIRA, I've got a very specific topic to try in mind now.  An 
> easy one I'll work on and then announce.  Someone else will have to do the 
> coding.  A year from now I would probably just knock it out to make sure it's 
> as easy as I expect it to be but to be honest, as I've been saying, I'm not 
> set up to do that right now.  I've barely looked at any Cassandra code; for 
> one; everyone on this list probably codes more than I do, secondly; and 
> lastly, it's a good one for someone that wants an easy one to start with: 
> vNodes.  I've already seen too many people seeking assistance with the vNode 
> setting.
> 
> And you can expect as others have been mentioning that there should be 
> similar ones on compaction, repair and backup. 
> 
> Microsoft knows poor usability gives them an easy market to take over. And 
> they make it easy to switch.
> 
> Beginning at 4:17 in the video, it says the following:
> 
>   "You don't need to worry about replica sets, quorum or read repair.  
> You can focus on writing correct application logic."
> 
> At 4:42, it says:
>   "Hopefully this gives you a quick idea of how seamlessly you can bring 
> your existing Cassandra applications to Azure Cosmos DB.  No code changes are 
> required.  It works with your favorite Cassandra tools and drivers including 
> for example native Cassandra driver for Spark. And it takes seconds to get 
> going, and it's elastically and globally scalable."
> 
> More to come,
> 
> Kenneth Brotman
> 
> -----Original Message-----
> From: Josh McKenzie [mailto:jmcken...@apache.org] 
> Sent: Wednesday, February 21, 2018 8:28 AM
> To: dev@cassandra.apache.org
> Cc: User
> Subject: Re: Cassandra Needs to Grow Up by Version Five!
> 
> There's a disheartening amount of "here's where Cassandra is bad, and here's 
> what it needs to do for me for free" happening in this thread.
> 
> This is open-source software. Everyone is *strongly encouraged* to submit a 
> patch to move the needle on *any* of these things being complained about in 
> this thread.
> 
> For the Apache Way  to work, 
> people need to step up and 

Re: Cassandra Needs to Grow Up by Version Five!

2018-02-21 Thread Durity, Sean R
It is instructive to listen to the concerns of new and existing users in order 
to improve a product like Cassandra, but I think the school yard taunt model 
isn’t the most effective.

In my experience with open and closed source databases, there are always things 
that could be improved. Many have a historical base in how the product evolved 
over time. A newcomer sees those as rough edges right away. In other cases, the 
database creators have often widened their scope to try and solve every data 
problem. This creates the complexity of too many configuration options, etc. 
Even the best RDBMS (Informix!) battled these kinds of issues.

Cassandra, though, introduced another angle of difficulty. In trying to relate 
to RDBMS users (pun intended), it often borrowed terminology to make it seem 
familiar. But they don’t work the same way or even solve the same problems. The 
classic example is secondary indexes. For RDBMS, they are very useful; for 
Cassandra, they are anathema (except for very narrow cases).

However, I think the shots at Cassandra are generally unfair. When I started 
working with it, the DataStax documentation was some of the best documentation 
I had seen on any project, especially an open source one. (If anything the 
cooling off between Apache Cassandra and DataStax may be the most serious 
misstep so far…) The more I learned about how Cassandra worked, the more I 
marveled at the clever combination of intricate solutions (gossip, merkle 
trees, compaction strategies, etc.) to solve specific data problems. This is a 
great product! It has given me lots of sleep-filled nights over the last 4+ 
years. My customers love it, once I explain what it should be used for (and 
what it shouldn’t). I applaud the contributors, whether coders or users. Thank 
you!

Finally, a note on backup. Backing up a distributed system is tough, but 
restores are even more complex (if you want no down-time, no extra disk space, 
point-in-time recovery, etc). If you want to investigate why it is a tough 
problem for Cassandra, go look at RecoverX from Datos IO. They have solved many 
of the problems, but it isn’t an easy task. You could ask people to try and 
recreate all that, or just point them to a working solution. If backup and 
recovery is required (and I would argue it isn’t always required), it is 
probably worth paying for.


Sean Durity
From: Josh McKenzie [mailto:jmcken...@apache.org]
Sent: Wednesday, February 21, 2018 11:28 AM
To: dev@cassandra.apache.org
Cc: User 
Subject: [EXTERNAL] Re: Cassandra Needs to Grow Up by Version Five!

There's a disheartening amount of "here's where Cassandra is bad, and here's 
what it needs to do for me for free" happening in this thread.

This is open-source software. Everyone is *strongly encouraged* to submit a 
patch to move the needle on *any* of these things being complained about in 
this thread.

For the Apache Way to work, people need to step up and meaningfully contribute
to a project to
scratch their own itch instead of just waiting for a random 
corporation-subsidized engineer to happen to have interests that align with 
them and contribute that to the project.

Beating a dead horse for things everyone on the project knows are serious pain 
points is not productive.

On Wed, Feb 21, 2018 at 5:45 AM, Oleksandr Shulgin 
<oleksandr.shul...@zalando.de> wrote:
On Mon, Feb 19, 2018 at 10:01 AM, Kenneth Brotman <
kenbrot...@yahoo.com.invalid> wrote:

>
> >> Cluster wide management should be a big theme in any next major release.
> >>
> >Na. Stability and testing should be a big theme in the next major release.
> >
>
> Double Na on that one Jeff.  I think you have a concern there about the
> need to test sufficiently to ensure the stability of the next major
> release.  That makes perfect sense for every release, especially the
> major ones.  Continuous improvement is not a phase of development, for
> example.  CI should be in everything, in every phase.  Stability and
> testing are a part of every release, not just one.  A major release should
> be a nice step from the previous major release, though.
>

I guess what Jeff refers to is the tick-tock release cycle experiment,
which has proven to be a complete disaster by popular opinion.

There's also the "materialized views" feature which failed to materialize
in the end (pun intended) and had to be declared experimental retroactively.

Another prominent example is incremental repair, which was introduced as the
default option in 2.2 and is now not recommended because of the many
corner cases where it can fail.  So, again, experimental as an afterthought.

Not to mention that even if you are aware of the defa

RE: Cassandra Needs to Grow Up by Version Five!

2018-02-21 Thread Kenneth Brotman
 Josh, 

To say nothing is indifference.  If you care about your community, sometimes 
don't you have to bring up a subject even though you know it's also temporarily 
adding some discomfort?  

As to opening a JIRA, I've got a very specific topic in mind to try now.  An 
easy one I'll work on and then announce.  Someone else will have to do the 
coding.  A year from now I would probably just knock it out to make sure it's 
as easy as I expect it to be but, to be honest, as I've been saying, I'm not set 
up to do that right now.  For one, I've barely looked at any Cassandra code; 
secondly, everyone on this list probably codes more than I do; and lastly, it's 
a good one for someone who wants an easy one to start with: vNodes.  I've 
already seen too many people seeking assistance with the vNode setting.

And you can expect as others have been mentioning that there should be similar 
ones on compaction, repair and backup. 

Microsoft knows poor usability gives them an easy market to take over. And they 
make it easy to switch.

Beginning at 4:17 in the video, it says the following:

"You don't need to worry about replica sets, quorum or read repair.  
You can focus on writing correct application logic."

At 4:42, it says:
"Hopefully this gives you a quick idea of how seamlessly you can bring 
your existing Cassandra applications to Azure Cosmos DB.  No code changes are 
required.  It works with your favorite Cassandra tools and drivers including 
for example native Cassandra driver for Spark. And it takes seconds to get 
going, and it's elastically and globally scalable."

More to come,

Kenneth Brotman

-Original Message-
From: Josh McKenzie [mailto:jmcken...@apache.org] 
Sent: Wednesday, February 21, 2018 8:28 AM
To: dev@cassandra.apache.org
Cc: User
Subject: Re: Cassandra Needs to Grow Up by Version Five!

There's a disheartening amount of "here's where Cassandra is bad, and here's 
what it needs to do for me for free" happening in this thread.

This is open-source software. Everyone is *strongly encouraged* to submit a 
patch to move the needle on *any* of these things being complained about in 
this thread.

For the Apache Way to work, 
people need to step up and meaningfully contribute to a project to scratch 
their own itch instead of just waiting for a random corporation-subsidized 
engineer to happen to have interests that align with them and contribute that 
to the project.

Beating a dead horse for things everyone on the project knows are serious pain 
points is not productive.

On Wed, Feb 21, 2018 at 5:45 AM, Oleksandr Shulgin < 
oleksandr.shul...@zalando.de> wrote:

> On Mon, Feb 19, 2018 at 10:01 AM, Kenneth Brotman < 
> kenbrot...@yahoo.com.invalid> wrote:
>
> >
> > >> Cluster wide management should be a big theme in any next major
> > >> release.
> > >>
> > >Na. Stability and testing should be a big theme in the next major
> > >release.
> > >
> >
> > Double Na on that one Jeff.  I think you have a concern there about
> > the need to test sufficiently to ensure the stability of the next
> > major release.  That makes perfect sense for every release,
> > especially the major ones.  Continuous improvement is not a phase of
> > development, for example.  CI should be in everything, in every
> > phase.  Stability and testing are a part of every release, not just
> > one.  A major release should be a nice step from the previous major
> > release, though.
> >
>
> I guess what Jeff refers to is the tick-tock release cycle experiment, 
> which has proven to be a complete disaster by popular opinion.
>
> There's also the "materialized views" feature which failed to 
> materialize in the end (pun intended) and had to be declared 
> experimental retroactively.
>
> Another prominent example is incremental repair, which was introduced
> as the default option in 2.2 and is now not recommended because of the
> many corner cases where it can fail.  So, again, experimental as an
> afterthought.
>
> Not to mention that even if you are aware of the incremental default
> and go with full repair instead, you're still in for a sad surprise:
> anti-compaction will be triggered despite the "full" repair.
> Anti-compaction is only disabled in the case of sub-range repair (don't
> ask why), so you need to use something advanced like Reaper if you
> want to avoid that.  I don't think you'll ever find this in the documentation.
>
> Honestly, for an eventually-consistent system like Cassandra,
> anti-entropy repair is one of the most important pieces to get right.
> And Cassandra fails really badly on that one: the feature is not
> really well designed, and it is poorly implemented and under-documented.
>
> In summary, IMO, Cassandra is a poor implementation of some good ideas.
> It is a collection of hacks, not features.  They sometimes play 
> together accidentally, and rarely by design.
>
> Regards,
> --
> Alex
>



FINAL REMINDER: CFP for Apache EU Roadshow Closes 25th February

2018-02-21 Thread Sharan F

Hello Apache Supporters and Enthusiasts

This is your FINAL reminder that the Call for Papers (CFP) for the 
Apache EU Roadshow is closing soon. Our Apache EU Roadshow will focus on 
Cloud, IoT, Apache Tomcat, Apache Http and will run from 13-14 June 2018 
in Berlin.
Note that the CFP deadline has been extended to *25th February* and 
it will be your final opportunity to submit a talk for this event.


Please make your submissions at http://apachecon.com/euroadshow18/

Also note that early bird ticket registrations to attend FOSS Backstage, 
including the Apache EU Roadshow, have also been extended and will be 
available until 23rd February. Please register at 
https://foss-backstage.de/tickets


We look forward to seeing you in Berlin!

Thanks
Sharan Foga, VP Apache Community Development

PLEASE NOTE: You are receiving this message because you are subscribed 
to a user@ or dev@ list of one or more Apache Software Foundation projects.




Re: Cassandra Needs to Grow Up by Version Five!

2018-02-21 Thread Josh McKenzie
There's a disheartening amount of "here's where Cassandra is bad, and
here's what it needs to do for me for free" happening in this thread.

This is open-source software. Everyone is *strongly encouraged* to submit a
patch to move the needle on *any* of these things being complained about in
this thread.

For the Apache Way to work,
people need to step up and meaningfully contribute to a project to scratch
their own itch instead of just waiting for a random corporation-subsidized
engineer to happen to have interests that align with them and contribute
that to the project.

Beating a dead horse for things everyone on the project knows are serious
pain points is not productive.

On Wed, Feb 21, 2018 at 5:45 AM, Oleksandr Shulgin <
oleksandr.shul...@zalando.de> wrote:

> On Mon, Feb 19, 2018 at 10:01 AM, Kenneth Brotman <
> kenbrot...@yahoo.com.invalid> wrote:
>
> >
> > >> Cluster wide management should be a big theme in any next major
> > >> release.
> > >>
> > >Na. Stability and testing should be a big theme in the next major
> > >release.
> > >
> >
> > Double Na on that one Jeff.  I think you have a concern there about the
> > need to test sufficiently to ensure the stability of the next major
> > release.  That makes perfect sense for every release, especially the
> > major ones.  Continuous improvement is not a phase of development, for
> > example.  CI should be in everything, in every phase.  Stability and
> > testing are a part of every release, not just one.  A major release
> > should be a nice step from the previous major release, though.
> >
>
> I guess what Jeff refers to is the tick-tock release cycle experiment,
> which has proven to be a complete disaster by popular opinion.
>
> There's also the "materialized views" feature which failed to materialize
> in the end (pun intended) and had to be declared experimental
> retroactively.
>
> Another prominent example is incremental repair, which was introduced as the
> default option in 2.2 and is now not recommended because of the many
> corner cases where it can fail.  So, again, experimental as an afterthought.
>
> Not to mention that even if you are aware of the incremental default and go
> with full repair instead, you're still in for a sad surprise:
> anti-compaction will be triggered despite the "full" repair.
> Anti-compaction is only disabled in the case of sub-range repair (don't ask
> why), so you need to use something advanced like Reaper if you want to
> avoid that.  I don't think you'll ever find this in the documentation.
>
> Honestly, for an eventually-consistent system like Cassandra, anti-entropy
> repair is one of the most important pieces to get right.  And Cassandra
> fails really badly on that one: the feature is not really well designed,
> and it is poorly implemented and under-documented.
>
> In summary, IMO, Cassandra is a poor implementation of some good ideas.
> It is a collection of hacks, not features.  They sometimes play together
> accidentally, and rarely by design.
>
> Regards,
> --
> Alex
>


Issues with Materialized-Views updates during a cluster change?

2018-02-21 Thread Nadav Har'El
Hi, I was trying to understand how view tables are updated during a period
of range movements, namely bootstrapping of a new node or decommissioning
one of the nodes. In particular, during the period of data streaming, we
can have a new replica on a "pending node" to which we also need to send
the view update.

I looked at the mutateMV() code, and think I spotted two issues with it,
and I wonder if I'm missing something or these are real problems:

1. It seems that, for example, when RF=3, each one of the three base replicas
will send a view update to the fourth "pending node". While this is not
wrong, it's also inefficient - why send three copies of the same update?
Wouldn't it be more efficient if just one of the base replicas - the one
which will eventually be paired with the pending node - sent the
updates to it? Is there a problem with such a scheme?

2. There's an optimization that when we're lucky enough that the paired
view replica is the same as this base replica, mutateMV doesn't use the
normal view-mutation-sending code (wrapViewBatchResponseHandler) and just
writes the mutation locally. In particular, in this case we do NOT write to
the pending node (unless I'm missing something). But, sometimes all
replicas will be paired with themselves - this can happen for example when
number of nodes is equal to RF, or when the base and view table have the
same partition keys (but different clustering keys). In this case, it seems
the pending node will not be written at all... Isn't this a bug?
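
To make the two scenarios above concrete, here is a minimal Python model of
the behaviour as described in this message. It is not Cassandra's mutateMV()
code: the names (BaseReplica, view_update_targets, guard_on_pending) and the
node names are hypothetical, and the model only counts which nodes would
receive a view update under the assumptions stated in the comments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BaseReplica:
    name: str
    paired_view_replica: str  # view replica this base replica is paired with


def view_update_targets(replica, pending_nodes, guard_on_pending):
    """Return the set of nodes this base replica sends a view update to.

    Assumption: guard_on_pending=True models the older check described in the
    message (take the local shortcut only when there are no pending nodes);
    guard_on_pending=False models the behaviour after that check was removed.
    """
    paired_with_self = replica.paired_view_replica == replica.name
    if paired_with_self and (not guard_on_pending or not pending_nodes):
        # Local shortcut: the view mutation is written locally and nowhere else.
        return {replica.name}
    # Normal path: the paired view replica plus every pending node.
    return {replica.paired_view_replica, *pending_nodes}


def deliveries_to(node, replicas, pending_nodes, guard_on_pending):
    """Count how many base replicas send the view update to `node`."""
    return sum(
        node in view_update_targets(r, pending_nodes, guard_on_pending)
        for r in replicas
    )


pending = ["P"]
# Issue 1: RF=3, every base replica is paired with a different node.
distinct = [BaseReplica("A", "V1"), BaseReplica("B", "V2"), BaseReplica("C", "V3")]
# Issue 2: every base replica is paired with itself.
self_paired = [BaseReplica(n, n) for n in "ABC"]

issue1_copies = deliveries_to("P", distinct, pending, guard_on_pending=False)
issue2_without_guard = deliveries_to("P", self_paired, pending, guard_on_pending=False)
issue2_with_guard = deliveries_to("P", self_paired, pending, guard_on_pending=True)
print(issue1_copies, issue2_without_guard, issue2_with_guard)
```

In this model the pending node receives three copies in the first scenario,
none in the second scenario without the guard, and three again once the guard
is restored, which is the gap this message is asking about.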

The strange thing about issue 2 is that this code used to be correct (at
least according to my understanding...) - it used to avoid this
optimization if pendingNodes was not empty. But then this was changed in
commit 12103653f31. Why?
https://issues.apache.org/jira/browse/CASSANDRA-13069 contains an
explanation of that change:
 "I also removed the pendingEndpoints.isEmpty() condition to skip the
batchlog for local mutations, since this was a pre-CASSANDRA-10674 leftover
when ViewUtils.getViewNaturalEndpoint returned the local address to force
non-paired replicas to be written to the batchlog." (Paulo Motta, 21/Dec/16)

But I don't understand this explanation... Being paired with yourself is
not only a "trick", but also something which really happens (by chance or
in some cases as I showed above, always), and needs to be handled
correctly, even if the cluster grows. If none of the base replicas will
send the view update to the pending node, it will end up missing this
update...

Thanks,
Nadav.


--
Nadav Har'El
n...@scylladb.com


Re: Cassandra Needs to Grow Up by Version Five!

2018-02-21 Thread Oleksandr Shulgin
On Mon, Feb 19, 2018 at 10:01 AM, Kenneth Brotman <
kenbrot...@yahoo.com.invalid> wrote:

>
> >> Cluster wide management should be a big theme in any next major release.
> >>
> >Na. Stability and testing should be a big theme in the next major release.
> >
>
> Double Na on that one Jeff.  I think you have a concern there about the
> need to test sufficiently to ensure the stability of the next major
> release.  That makes perfect sense for every release, especially the
> major ones.  Continuous improvement is not a phase of development, for
> example.  CI should be in everything, in every phase.  Stability and
> testing are a part of every release, not just one.  A major release should
> be a nice step from the previous major release, though.
>

I guess what Jeff refers to is the tick-tock release cycle experiment,
which has proven to be a complete disaster by popular opinion.

There's also the "materialized views" feature which failed to materialize
in the end (pun intended) and had to be declared experimental retroactively.

Another prominent example is incremental repair, which was introduced as the
default option in 2.2 and is now not recommended because of the many
corner cases where it can fail.  So, again, experimental as an afterthought.

Not to mention that even if you are aware of the incremental default and go
with full repair instead, you're still in for a sad surprise:
anti-compaction will be triggered despite the "full" repair.
Anti-compaction is only disabled in the case of sub-range repair (don't ask
why), so you need to use something advanced like Reaper if you want to
avoid that.  I don't think you'll ever find this in the documentation.
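
For anyone wondering what "sub-range repair" looks like without a tool, here
is a rough Python sketch that splits one token range into contiguous pieces
and prints a full repair command for each piece, using nodetool's -st/-et
start-token and end-token options. The range (0, 1000], the number of pieces
and the keyspace name my_keyspace are made up for illustration; in practice
the ranges would come from the ring, and a tool like Reaper does this
bookkeeping for you.

```python
def split_token_range(start, end, parts):
    """Split the token range (start, end] into `parts` contiguous subranges."""
    if parts < 1 or start >= end:
        raise ValueError("need parts >= 1 and start < end")
    width = end - start
    # Integer arithmetic keeps the bounds exact even for 64-bit tokens.
    bounds = [start + width * i // parts for i in range(parts)] + [end]
    return list(zip(bounds[:-1], bounds[1:]))


def repair_commands(keyspace, start, end, parts):
    """Build one full sub-range repair command per subrange."""
    return [
        f"nodetool repair -full -st {st} -et {et} {keyspace}"
        for st, et in split_token_range(start, end, parts)
    ]


# Hypothetical range and keyspace, for illustration only.
subranges = split_token_range(0, 1000, 4)
commands = repair_commands("my_keyspace", 0, 1000, 4)
for command in commands:
    print(command)
```

Each subrange ends where the next one starts, so the pieces cover the
original range exactly once.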

Honestly, for an eventually-consistent system like Cassandra, anti-entropy
repair is one of the most important pieces to get right.  And Cassandra
fails really badly on that one: the feature is not really well designed,
and it is poorly implemented and under-documented.

In summary, IMO, Cassandra is a poor implementation of some good ideas.
It is a collection of hacks, not features.  They sometimes play together
accidentally, and rarely by design.

Regards,
--
Alex