I am +1 on enabling rejection by default. We have run into similar situations where we lost data silently, which led us to create a patch that trades availability for protection against data loss. While I agree that it might surprise operators, I think it is worth it, given good communication in NEWS.txt and explicit logging of the exceptions. That way, we create a one-time surprise instead of silently losing data over and over.
On Thu, Sep 12, 2024 at 10:44 AM Jordan West <jw...@apache.org> wrote:

> I’m +1 on enabling rejection by default on all branches. We have been bit
> by silent data loss (due to other bugs like the schema issues in 4.1) from
> lack of rejection on several occasions and short of writing extremely
> specialized tooling its unrecoverable. While both lack of availability and
> data loss are critical, I will always pick lack of availability over data
> loss. Its better to fail a write that will be lost than silently lose it.
>
> Of course, a change like this requires very good communication in NEWS.txt
> and elsewhere but I think its well worth it. While it may surprise some
> users I think they would be more surprised that they were silently losing
> data.
>
> Jordan
>
> On Thu, Sep 12, 2024 at 10:22 Mick Semb Wever <m...@apache.org> wrote:
>
>> Thanks for starting the thread Caleb, it is a big and impacting patch.
>>
>> Appreciate the criticality, in a new major release rejection by default
>> is obvious. Otherwise the logging and metrics is an important addition to
>> help users validate the existence and degree of any problem.
>>
>> Also worth mentioning that rejecting writes can cause degraded
>> availability in situations that pose no problem. This is a coordination
>> problem on a probabilistic design, it's choose your evil: unnecessary
>> degraded availability or mislocated data (eventual data loss). Logging
>> and metrics makes alerting on and handling the data mislocation possible,
>> i.e. avoids data loss with manual intervention. (Logging and metrics also
>> face the same problem with false positives.)
>>
>> I'm +0 for rejection default in 5.0.1, and +1 for only logging default in
>> 4.x
>>
>>
>> On Thu, 12 Sept 2024 at 18:56, Jeff Jirsa <jji...@gmail.com> wrote:
>>
>>> This patch is so hard for me.
>>>
>>> The safety it adds is critical and should have been added a decade ago.
>>> Also it’s a huge patch, and touches “everything”.
>>>
>>> It definitely belongs in 5.0. I’d probably reject by default in 5.0.1.
>>>
>>> 4.0 / 4.1 - if we treat this like a fix for latent opportunity for data
>>> loss (which it implicitly is), I guess?
>>>
>>>
>>>
>>> > On Sep 12, 2024, at 9:46 AM, Brandon Williams <dri...@gmail.com> wrote:
>>> >
>>> > On Thu, Sep 12, 2024 at 11:41 AM Caleb Rackliffe
>>> > <calebrackli...@gmail.com> wrote:
>>> >>
>>> >> Are you opposed to the patch in its entirety, or just rejecting
>>> >> unsafe operations by default?
>>> >
>>> > I had the latter in mind. Changing any default in a patch release is
>>> > a potential surprise for operators and one of this nature especially
>>> > so.
>>> >
>>> > Kind Regards,
>>> > Brandon