Re: Implicit Casts for Arithmetic Operators

2018-11-21 Thread Jonathan Haddad
I can’t agree more. We should be able to make changes in a manner that
improves the DB in the long term, rather than live with the technical debt
of arbitrary decisions made by a handful of people.

I also agree that putting a knob in place to let people migrate over is a
reasonable decision.

Jon

On Wed, Nov 21, 2018 at 4:54 PM Benedict Elliott Smith 
wrote:

> The goal is simply to agree on a set of well-defined principles for how we
> should behave.  If we don’t like the implications that arise, we’ll have
> another vote?  A democracy cannot bind itself, so I never understood this
> fear of a decision.
>
> A database also has a thousand toggles.  If we absolutely need to, we can
> introduce one more.
>
> We should be doing this upfront a great deal more often.  Doing it
> retrospectively sucks, but in my opinion it's a bad reason to bind
> ourselves to whatever made it in.
>
> Do we anywhere define the principles of our current behaviour?  I couldn’t
> find it.
>
>
> > On 21 Nov 2018, at 21:08, Sylvain Lebresne  wrote:
> >
> > On Tue, Nov 20, 2018 at 5:02 PM Benedict Elliott Smith <
> bened...@apache.org>
> > wrote:
> >
> >> FWIW, my meaning of arithmetic in this context extends to any features
> we
> >> have already released (such as aggregates, and perhaps other built-in
> >> functions) that operate on the same domain.  We should be consistent,
> after
> >> all.
> >>
> >> Whether or not we need to revisit any existing functionality we can
> figure
> >> out after the fact, once we have agreed what our behaviour should be.
> >>
> >
> > I'm not sure I correctly understand the process suggested, but I don't
> > particularly like/agree with what I understand. What I understand is a
> > suggestion for voting on agreeing to be ANSI SQL 92 compliant, with no
> real
> > evaluation of what that entails (at least I haven't seen one), and that
> > this vote, if passed, would imply we'd then make any backward
> incompatible
> > change necessary to achieve compliance ("my meaning of arithmetic in this
> > context extends to any features we have already released" and "Whether or
> > not we need to revisit any existing functionality we can figure out after
> > the fact, once we have agreed what our behaviour should be").
> >
> > This might make sense for a new product, but at our stage that seems
> > backward to me. I think we owe it to our users to first make the effort of
> > identifying what "inconsistencies" our existing arithmetic has[1] and
> > _then_ consider what options we have to fix those, with their pros and
> cons
> > (including how badly they break backward compatibility). And if _then_
> > getting ANSI SQL 92 compliant proves to not be disruptive (or at least
> > acceptably so), then sure, that's great.
> >
> > [1]: one possibly efficient way to do that could actually be to compare
> our
> > arithmetic to ANSI SQL 92. Not that all differences found would imply
> > inconsistencies/wrongness of our arithmetic, but still, it should be
> > helpful. And I guess my whole point is that we should do that analysis
> first,
> > and then maybe decide that being ANSI SQL 92 is a reasonable option, not
> > decide first and live with the consequences no matter what they are.
> >
> > --
> > Sylvain
> >
> >
> >> I will make this more explicit for the vote, but just to clarify the
> >> intention so that we are all discussing the same thing.
> >>
> >>
> >>> On 20 Nov 2018, at 14:18, Ariel Weisberg  wrote:
> >>>
> >>> Hi,
> >>>
> >>> +1
> >>>
> >>> This is a public API so we will be much better off if we get it right
> >> the first time.
> >>>
> >>> Ariel
> >>>
> >>>> On Nov 16, 2018, at 10:36 AM, Jonathan Haddad
> >> wrote:
> >>>>
> >>>> Sounds good to me.
> >>>>
> >>>> On Fri, Nov 16, 2018 at 5:09 AM Benedict Elliott Smith <
> >> bened...@apache.org>
> >>>> wrote:
> >>>>
> >>>>> So, this thread somewhat petered out.
> >>>>>
> >>>>> There are still a number of unresolved issues, but to make progress I
> >>>>> wonder if it would first be helpful to have a vote on ensuring we are
> >> ANSI
> >>>>> SQL 92 compliant for our arithmetic?  This seems like a sensible
> >> baseline,
> >>>>> since we will hopefully minimise surprise to operators this way.
> >>>>>
> >>>>> If people largely agree, I will call a vote, and we can pick up a
> >> couple
> >>>>> of more focused discussions afterwards on how we interpret the leeway
> >> it
> >>>>> gives.
> >>>>>
> >>>>>
> >>>>>> On 12 Oct 2018, at 18:10, Ariel Weisberg  wrote:
> >>>>>>
> >>>>>> Hi,
> >>>>>>
> >>>>>> From reading the spec. Precision is always implementation defined.
> The
> >>>>> spec specifies scale in several cases, but never precision for any
> >> type or
> >>>>> operation (addition/subtraction, multiplication, division).
> >>>>>>
> >>>>>> So we don't implement anything remotely approaching precision and
> >> scale
> >>>>> in CQL when it comes to numbers I think? So we aren't going to follow
> >> the
> >>>>> spec for scale. We are already pretty far down that road so I would
> >

Re: Implicit Casts for Arithmetic Operators

2018-11-21 Thread Benedict Elliott Smith
The goal is simply to agree on a set of well-defined principles for how we 
should behave.  If we don’t like the implications that arise, we’ll have 
another vote?  A democracy cannot bind itself, so I never understood this fear 
of a decision.

A database also has a thousand toggles.  If we absolutely need to, we can 
introduce one more.

We should be doing this upfront a great deal more often.  Doing it 
retrospectively sucks, but in my opinion it's a bad reason to bind ourselves to 
whatever made it in.

Do we anywhere define the principles of our current behaviour?  I couldn’t find 
it.


> On 21 Nov 2018, at 21:08, Sylvain Lebresne  wrote:
> 
> On Tue, Nov 20, 2018 at 5:02 PM Benedict Elliott Smith 
> wrote:
> 
>> FWIW, my meaning of arithmetic in this context extends to any features we
>> have already released (such as aggregates, and perhaps other built-in
>> functions) that operate on the same domain.  We should be consistent, after
>> all.
>> 
>> Whether or not we need to revisit any existing functionality we can figure
>> out after the fact, once we have agreed what our behaviour should be.
>> 
> 
> I'm not sure I correctly understand the process suggested, but I don't
> particularly like/agree with what I understand. What I understand is a
> suggestion for voting on agreeing to be ANSI SQL 92 compliant, with no real
> evaluation of what that entails (at least I haven't seen one), and that
> this vote, if passed, would imply we'd then make any backward incompatible
> change necessary to achieve compliance ("my meaning of arithmetic in this
> context extends to any features we have already released" and "Whether or
> not we need to revisit any existing functionality we can figure out after
> the fact, once we have agreed what our behaviour should be").
> 
> This might make sense for a new product, but at our stage that seems
> backward to me. I think we owe it to our users to first make the effort of
> identifying what "inconsistencies" our existing arithmetic has[1] and
> _then_ consider what options we have to fix those, with their pros and cons
> (including how badly they break backward compatibility). And if _then_
> getting ANSI SQL 92 compliant proves to not be disruptive (or at least
> acceptably so), then sure, that's great.
> 
> [1]: one possibly efficient way to do that could actually be to compare our
> arithmetic to ANSI SQL 92. Not that all differences found would imply
> inconsistencies/wrongness of our arithmetic, but still, it should be
> helpful. And I guess my whole point is that we should do that analysis first,
> and then maybe decide that being ANSI SQL 92 is a reasonable option, not
> decide first and live with the consequences no matter what they are.
> 
> --
> Sylvain
> 
> 
>> I will make this more explicit for the vote, but just to clarify the
>> intention so that we are all discussing the same thing.
>> 
>> 
>>> On 20 Nov 2018, at 14:18, Ariel Weisberg  wrote:
>>> 
>>> Hi,
>>> 
>>> +1
>>> 
>>> This is a public API so we will be much better off if we get it right
>> the first time.
>>> 
>>> Ariel
>>> 
>>>> On Nov 16, 2018, at 10:36 AM, Jonathan Haddad
>> wrote:
>>>>
>>>> Sounds good to me.
>>>>
>>>> On Fri, Nov 16, 2018 at 5:09 AM Benedict Elliott Smith <
>> bened...@apache.org>
>>>> wrote:
>>>>
>>>>> So, this thread somewhat petered out.
>>>>>
>>>>> There are still a number of unresolved issues, but to make progress I
>>>>> wonder if it would first be helpful to have a vote on ensuring we are
>> ANSI
>>>>> SQL 92 compliant for our arithmetic?  This seems like a sensible
>> baseline,
>>>>> since we will hopefully minimise surprise to operators this way.
>>>>>
>>>>> If people largely agree, I will call a vote, and we can pick up a
>> couple
>>>>> of more focused discussions afterwards on how we interpret the leeway
>> it
>>>>> gives.
>>>>>
>>>>>
>>>>>> On 12 Oct 2018, at 18:10, Ariel Weisberg  wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> From reading the spec. Precision is always implementation defined. The
>>>>> spec specifies scale in several cases, but never precision for any
>> type or
>>>>> operation (addition/subtraction, multiplication, division).
>>>>>>
>>>>>> So we don't implement anything remotely approaching precision and
>> scale
>>>>> in CQL when it comes to numbers I think? So we aren't going to follow
>> the
>>>>> spec for scale. We are already pretty far down that road so I would
>> leave
>>>>> it alone.
>>>>>>
>>>>>> I don't think the spec is asking for the most approximate type. It's
>>>>> just saying the result is approximate, and the precision is
>> implementation
>>>>> defined. We could return either float or double. I think if one of the
>>>>> operands is a double we should return a double because clearly the
>> schema
>>>>> thought a double was required to represent that number. I would also
>> be in
>>>>> favor of returning a double all the time so that people can expect a
>>>>> consistent type from expressions involving approximate numbers.
>>>>>>
>>

Re: Implicit Casts for Arithmetic Operators

2018-11-21 Thread Sylvain Lebresne
On Tue, Nov 20, 2018 at 5:02 PM Benedict Elliott Smith 
wrote:

> FWIW, my meaning of arithmetic in this context extends to any features we
> have already released (such as aggregates, and perhaps other built-in
> functions) that operate on the same domain.  We should be consistent, after
> all.
>
> Whether or not we need to revisit any existing functionality we can figure
> out after the fact, once we have agreed what our behaviour should be.
>

I'm not sure I correctly understand the process suggested, but I don't
particularly like/agree with what I understand. What I understand is a
suggestion for voting on agreeing to be ANSI SQL 92 compliant, with no real
evaluation of what that entails (at least I haven't seen one), and that
this vote, if passed, would imply we'd then make any backward incompatible
change necessary to achieve compliance ("my meaning of arithmetic in this
context extends to any features we have already released" and "Whether or
not we need to revisit any existing functionality we can figure out after
the fact, once we have agreed what our behaviour should be").

This might make sense for a new product, but at our stage that seems
backward to me. I think we owe it to our users to first make the effort of
identifying what "inconsistencies" our existing arithmetic has[1] and
_then_ consider what options we have to fix those, with their pros and cons
(including how badly they break backward compatibility). And if _then_
getting ANSI SQL 92 compliant proves to not be disruptive (or at least
acceptably so), then sure, that's great.

[1]: one possibly efficient way to do that could actually be to compare our
arithmetic to ANSI SQL 92. Not that all differences found would imply
inconsistencies/wrongness of our arithmetic, but still, it should be
helpful. And I guess my whole point is that we should do that analysis first,
and then maybe decide that being ANSI SQL 92 is a reasonable option, not
decide first and live with the consequences no matter what they are.

--
Sylvain


> I will make this more explicit for the vote, but just to clarify the
> intention so that we are all discussing the same thing.
>
>
> > On 20 Nov 2018, at 14:18, Ariel Weisberg  wrote:
> >
> > Hi,
> >
> > +1
> >
> > This is a public API so we will be much better off if we get it right
> the first time.
> >
> > Ariel
> >
> >> On Nov 16, 2018, at 10:36 AM, Jonathan Haddad 
> wrote:
> >>
> >> Sounds good to me.
> >>
> >> On Fri, Nov 16, 2018 at 5:09 AM Benedict Elliott Smith <
> bened...@apache.org>
> >> wrote:
> >>
> >>> So, this thread somewhat petered out.
> >>>
> >>> There are still a number of unresolved issues, but to make progress I
> >>> wonder if it would first be helpful to have a vote on ensuring we are
> ANSI
> >>> SQL 92 compliant for our arithmetic?  This seems like a sensible
> baseline,
> >>> since we will hopefully minimise surprise to operators this way.
> >>>
> >>> If people largely agree, I will call a vote, and we can pick up a
> couple
> >>> of more focused discussions afterwards on how we interpret the leeway
> it
> >>> gives.
> >>>
> >>>
> >>>> On 12 Oct 2018, at 18:10, Ariel Weisberg  wrote:
> >>>>
> >>>> Hi,
> >>>>
> >>>> From reading the spec. Precision is always implementation defined. The
> >>> spec specifies scale in several cases, but never precision for any
> type or
> >>> operation (addition/subtraction, multiplication, division).
> >>>>
> >>>> So we don't implement anything remotely approaching precision and
> scale
> >>> in CQL when it comes to numbers I think? So we aren't going to follow
> the
> >>> spec for scale. We are already pretty far down that road so I would
> leave
> >>> it alone.
> >>>>
> >>>> I don't think the spec is asking for the most approximate type. It's
> >>> just saying the result is approximate, and the precision is
> implementation
> >>> defined. We could return either float or double. I think if one of the
> >>> operands is a double we should return a double because clearly the
> schema
> >>> thought a double was required to represent that number. I would also
> be in
> >>> favor of returning a double all the time so that people can expect a
> >>> consistent type from expressions involving approximate numbers.
> >>>>
> >>>> I am a big fan of widening for arithmetic expressions in a database to
> >>> avoid having to error on overflow. You can go to the trouble of only
> >>> widening the minimum amount, but I think it's simpler if we always
> widen to
> >>> bigint and double. This would be something the spec allows.
> >>>>
> >>>> Definitely if we can make overflow not occur we should and the spec
> >>> allows that. We should also not return different types for the same
> operand
> >>> types just to work around overflow if we detect we need more precision.
> >>>>
> >>>> Ariel
> >>>>> On Fri, Oct 12, 2018, at 12:45 PM, Benedict Elliott Smith wrote:
> >>>>> If it’s in the SQL spec, I’m fairly convinced.  Thanks for digging
> this
> >>>>> out (and Mike for getting some empirical exam

Re: Request to review feature-freeze proposed tickets

2018-11-21 Thread Joshua McKenzie
> If those tickets were sitting in patch available state prior to the
> freeze they *should* get in.

I assume it's obvious to everyone that this should be taken on a
case-by-case basis. There are at least two that were in that list (one of which
Marcus bumped off PA) that are potentially big and hairy changes that would
disrupt in-flight testing cycles.

On Wed, Nov 21, 2018 at 3:43 AM dinesh.jo...@yahoo.com.INVALID
 wrote:

> Kurt, I don't believe this should be the subject of "heated debate". If those
> tickets were sitting in patch available state prior to the freeze they
> *should* get in.
> Vinay, I can help review the tickets.
> Dinesh
>
> On Tuesday, November 20, 2018, 2:59:18 PM PST, kurt greaves <
> k...@instaclustr.com> wrote:
>
>  Thanks Vinay. While I suspect this will be subject to heated debate, I'm
> also for this. The time to review for this project is incredibly
> demotivating, and it stems from a lack of contributors that are interested
> in the general health of the project. I think this can be quite easily
> remedied by making more committers/PMC, however there is a catch-22 that to
> achieve this our existing set of committers needs to be dedicated to
> reviewing contributions from non-committers.
>
> If we can get dedicated reviewers for the listed tickets I'll take on some
> of the work to get the tickets up to scratch.
>
> On Wed, 21 Nov 2018 at 02:12, Ariel Weisberg  wrote:
>
> > Hi,
> >
> > I would like to get as many of these as is feasible in. Before the
> feature
> > freeze started 1 out of 17 JIRAs that were patch available were reviewed
> > and committed.
> >
> > If you didn’t have access to reviewers and committers, as the one out of the
> > 17 did, it has been essentially impossible to get your problems with
> > Cassandra fixed in 4.0.
> >
> > This is basically the same as saying that despite the fact Cassandra is
> > open source it does you no good because it will be years before the
> issues
> > impacting you get fixed even if you contribute the fixes yourself.
> >
> > Pulling up the ladder after getting “your own” fixes in is a sure fire
> way
> > to fracture the community into a collection of private forks containing
> the
> > fixes people can’t live without, and pushing people to look at
> alternatives.
> >
> > Private forks are a serious threat to the project. The people on them are
> > at risk of getting left behind and Cassandra stagnates for them and
> becomes
> > uncompetitive. Those with the resources to maintain a seriously diverged
> > fork are also the ones better positioned to be active contributors.
> >
> > Regards,
> > Ariel
> >
> > > On Nov 18, 2018, at 9:18 PM, Vinay Chella 
> > wrote:
> > >
> > > Hi,
> > >
> > > We still have 15 Patch Available/ open tickets which were requested for
> > > reviews before the Sep 1, 2018 freeze. I am starting this email thread
> to
> > > resurface and request a review of community tickets as most of these
> > > tickets address vital correctness, performance, and usability bugs that
> > > help avoid critical production issues. I tried to provide context on
> why
> > we
> > > feel these tickets are important to get into 4.0. If you would like to
> > > discuss the technical details of a particular ticket, let's try to do
> > that
> > > in JIRA.
> > >
> > > CASSANDRA-14525: Cluster enters an inconsistent state after bootstrap
> > > failures. (Correctness bug, Production impact, Ready to Commit)
> > >
> > > CASSANDRA-14459: DES sends requests to the wrong nodes routinely. (SLA
> > > breaking latencies, Production impact, Review in progress)
> > >
> > > CASSANDRA-14303 and CASSANDRA-14557: Currently production 3.0+ clusters
> > > cannot be rebuilt after node failure due to 3.0’s introduction of the
> > > system_auth keyspace with rf of 1. These tickets both fix the
> regression
> > > introduced in 3.0 by letting operators configure rf=3 and prevent
> future
> > > outages (Usability bug, Production impact, Patch Available).
> > >
> > > CASSANDRA-14096: Cassandra 3.11.1 Repair Causes Out of Memory. We
> believe
> > > this may also impact 3.0 (Title says it all, Production impact, Patch
> > > Available)
> > >
> > > CASSANDRA-10023: It is impossible to accurately determine local
> > read/write
> > > calls on C*. This patch allows users to detect when they are choosing
> > > incorrect coordinators. (Usability bug (troubleshoot), Review in
> > progress)
> > >
> > > CASSANDRA-10789: There is no way to safely stop bad clients bringing
> down
> > > C* nodes. This patch would give operators a very important tool to use
> > > during production incidents to mitigate impact. (Usability bug,
> > Production
> > > Impact (recovery), Patch Available)
> > >
> > > CASSANDRA-13010: No visibility into which disk is being compacted to.
> > > (Usability bug, Production Impact (troubleshoot), Review in progress)
> > >
> > > CASSANDRA-12783 - Break up large MV mutations to prevent OOMs (Title
> says
> > > it all, Production Impact, Patch InProgress/ Awaiting Fee

Re: Request to review feature-freeze proposed tickets

2018-11-21 Thread dinesh.jo...@yahoo.com.INVALID
Kurt, I don't believe this should be the subject of "heated debate". If those 
tickets were sitting in patch available state prior to the freeze they *should* 
get in.
Vinay, I can help review the tickets.
Dinesh 

On Tuesday, November 20, 2018, 2:59:18 PM PST, kurt greaves 
 wrote:  
 
 Thanks Vinay. While I suspect this will be subject to heated debate, I'm
also for this. The time to review for this project is incredibly
demotivating, and it stems from a lack of contributors that are interested
in the general health of the project. I think this can be quite easily
remedied by making more committers/PMC, however there is a catch-22 that to
achieve this our existing set of committers needs to be dedicated to
reviewing contributions from non-committers.

If we can get dedicated reviewers for the listed tickets I'll take on some
of the work to get the tickets up to scratch.

On Wed, 21 Nov 2018 at 02:12, Ariel Weisberg  wrote:

> Hi,
>
> I would like to get as many of these as is feasible in. Before the feature
> freeze started 1 out of 17 JIRAs that were patch available were reviewed
> and committed.
>
> If you didn’t have access to reviewers and committers, as the one out of the
> 17 did, it has been essentially impossible to get your problems with
> Cassandra fixed in 4.0.
>
> This is basically the same as saying that despite the fact Cassandra is
> open source it does you no good because it will be years before the issues
> impacting you get fixed even if you contribute the fixes yourself.
>
> Pulling up the ladder after getting “your own” fixes in is a sure fire way
> to fracture the community into a collection of private forks containing the
> fixes people can’t live without, and pushing people to look at alternatives.
>
> Private forks are a serious threat to the project. The people on them are
> at risk of getting left behind and Cassandra stagnates for them and becomes
> uncompetitive. Those with the resources to maintain a seriously diverged
> fork are also the ones better positioned to be active contributors.
>
> Regards,
> Ariel
>
> > On Nov 18, 2018, at 9:18 PM, Vinay Chella 
> wrote:
> >
> > Hi,
> >
> > We still have 15 Patch Available/ open tickets which were requested for
> > reviews before the Sep 1, 2018 freeze. I am starting this email thread to
> > resurface and request a review of community tickets as most of these
> > tickets address vital correctness, performance, and usability bugs that
> > help avoid critical production issues. I tried to provide context on why
> we
> > feel these tickets are important to get into 4.0. If you would like to
> > discuss the technical details of a particular ticket, let's try to do
> that
> > in JIRA.
> >
> > CASSANDRA-14525: Cluster enters an inconsistent state after bootstrap
> > failures. (Correctness bug, Production impact, Ready to Commit)
> >
> > CASSANDRA-14459: DES sends requests to the wrong nodes routinely. (SLA
> > breaking latencies, Production impact, Review in progress)
> >
> > CASSANDRA-14303 and CASSANDRA-14557: Currently production 3.0+ clusters
> > cannot be rebuilt after node failure due to 3.0’s introduction of the
> > system_auth keyspace with rf of 1. These tickets both fix the regression
> > introduced in 3.0 by letting operators configure rf=3 and prevent future
> > outages (Usability bug, Production impact, Patch Available).
> >
> > CASSANDRA-14096: Cassandra 3.11.1 Repair Causes Out of Memory. We believe
> > this may also impact 3.0 (Title says it all, Production impact, Patch
> > Available)
> >
> > CASSANDRA-10023: It is impossible to accurately determine local
> read/write
> > calls on C*. This patch allows users to detect when they are choosing
> > incorrect coordinators. (Usability bug (troubleshoot), Review in
> progress)
> >
> > CASSANDRA-10789: There is no way to safely stop bad clients bringing down
> > C* nodes. This patch would give operators a very important tool to use
> > during production incidents to mitigate impact. (Usability bug,
> Production
> > Impact (recovery), Patch Available)
> >
> > CASSANDRA-13010: No visibility into which disk is being compacted to.
> > (Usability bug, Production Impact (troubleshoot), Review in progress)
> >
> > CASSANDRA-12783 - Break up large MV mutations to prevent OOMs (Title says
> > it all, Production Impact, Patch InProgress/ Awaiting Feedback)
> >
> > CASSANDRA-14319 - nodetool rebuild from DC lets you pass invalid
> > datacenters (Usability bug, Production impact, Patch available)
> >
> > CASSANDRA-13841 - Smarter nodetool rebuild. Kind of a bug but would be
> nice
> > to get it in 4.0. (Production Impact (recovery), Patch Available)
> >
> > CASSANDRA-9452: Cleanup of old configuration, confusing to new C*
> > operators. (Cleanup, Patch Available)
> >
> > CASSANDRA-14309: Hint window persistence across the record. This way
> hints
> > that are accumulated over a period of time when nodes are creating are
> less
> > likely to take down the entire cluster. (Potential Production I