If we don’t reject by default, but log by default, my fear is that we’ll simply be alerting the operator to something that has already gone very wrong that they may not be in any position to ever address.

On Sep 12, 2024, at 12:44 PM, Jordan West <jw...@apache.org> wrote:


I’m +1 on enabling rejection by default on all branches. We have been bitten by silent data loss (due to other bugs like the schema issues in 4.1) from lack of rejection on several occasions, and short of writing extremely specialized tooling, it’s unrecoverable. While both lack of availability and data loss are critical, I will always pick lack of availability over data loss. It’s better to fail a write that will be lost than silently lose it.

Of course, a change like this requires very good communication in NEWS.txt and elsewhere, but I think it’s well worth it. While it may surprise some users, I think they would be more surprised to learn that they were silently losing data.

Jordan 

On Thu, Sep 12, 2024 at 10:22 Mick Semb Wever <m...@apache.org> wrote:
Thanks for starting the thread Caleb, it is a big and impacting patch.

Appreciate the criticality; in a new major release, rejection by default is obvious. Otherwise, the logging and metrics are an important addition to help users validate the existence and degree of any problem.

Also worth mentioning that rejecting writes can cause degraded availability in situations that pose no problem. This is a coordination problem on a probabilistic design, so it's choose your evil: unnecessary degraded availability or mislocated data (eventual data loss). Logging and metrics make alerting on and handling the data mislocation possible, i.e. they avoid data loss with manual intervention. (Logging and metrics also face the same problem with false positives.)

I'm +0 for rejection default in 5.0.1, and +1 for only logging default in 4.x


On Thu, 12 Sept 2024 at 18:56, Jeff Jirsa <jji...@gmail.com> wrote:
This patch is so hard for me.

The safety it adds is critical and should have been added a decade ago.
Also it’s a huge patch, and touches “everything”.

It definitely belongs in 5.0. I’d probably reject by default in 5.0.1. 

4.0 / 4.1 - if we treat this like a fix for latent opportunity for data loss (which it implicitly is), I guess?



> On Sep 12, 2024, at 9:46 AM, Brandon Williams <dri...@gmail.com> wrote:
>
> On Thu, Sep 12, 2024 at 11:41 AM Caleb Rackliffe
> <calebrackli...@gmail.com> wrote:
>>
>> Are you opposed to the patch in its entirety, or just rejecting unsafe operations by default?
>
> I had the latter in mind.  Changing any default in a patch release is
> a potential surprise for operators and one of this nature especially
> so.
>
> Kind Regards,
> Brandon
