To everyone following this e-mail thread.

The Project Management Committees have discussed the matter and would like to 
draw attention to "Statement from the Solr and Lucene PMC regarding recent Code 
of Conduct violations" posted to this list today, and linked below:

https://lists.apache.org/thread.html/r9875b53aeaebca8678ee0127562d8a35c7938906fbd318ac17ba011d%40%3Cdev.solr.apache.org%3E
 
Jan Høydahl
Solr PMC Chair

> On 21 May 2021, at 05:52, David Smiley <[email protected]> wrote:
> 
> I removed [email protected] from my 
> response here.  Please everyone do the same and don't email both Lucene & 
> Solr at the same time.  I recall that's an old best practice / rule in 
> general -- never address an email to more than one list.
> 
> I agree 100% with Erick.  It's shameful and looks bad on our community and 
> it's just so not necessary.  It's a clear code-of-conduct violation.  I hope 
> Andrzej is "okay" emotionally; I'd be a mess in his shoes.  At least the 
> apologies are very reasonable to me; I was expecting Ishan/Noble to dig their 
> heels in (as I witnessed some months ago) and I'm relieved not to see that.  
> 
> The internal complexity of Solr (esp. SolrCloud) is very high; it's difficult 
> to make changes and not have some worry that maybe a change has some ill 
> effect.  Yet we can't simply not touch it.  The irony here is that the change 
> in question was targeted directly at improving the quality of Solr; I love 
> those types of changes, honestly.
> 
> Perhaps Solr getting its own Docker images as part of the project may lead 
> to automated Solr-upgrade testing to catch compatibility bugs?  Maybe that 
> might be done at the K8S Solr Operator level integration tests since I'm 
> guessing the Operator facilitates upgrades already?
> 
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley 
> 
> On Tue, May 18, 2021 at 8:54 AM Ishan Chattopadhyaya 
> <[email protected]> wrote:
> I apologize for the harsh words, and personally to Andrzej for hurting your 
> feelings. I had no such intentions. 
> 
> > You conveniently don’t mention that I WITHDREW my objection, and instead 
> > proposed a lenient validation (but validation nonetheless!).
> Yes, let me mention that you agreed in principle to reduce the impact of the 
> change (even though not to revert it completely). I welcome that and thank you 
> for that. By the time you replied on JIRA, I had already sent this mail.
> 
> > I see no urgency at all in this matter. This can be handled as day-to-day 
> > bug fixing as usual.
> I think this requires an immediate notification to all users to be aware of 
> this situation before upgrading. Also, an immediate breakfix should be 
> helpful for them. 
> 
> > My feelings are hurt, and I'm greatly disappointed in your words, quick 
> > attacking off the cuff regularly rude (IMO) because you happened to have a 
> > bad day.
> I apologize.
> 
> How I saw things is that we have a commitment to our users to give them good 
> quality software that they can rely on. My intention was not to attack 
> Andrzej personally, but to bring about collective awareness regarding this 
> problem: that we, as a community, don't care enough for our users. We need to 
> get better at testing, get better at reviews, better at benchmarks, etc. 
> Individually, we all have the best of intentions, and obviously so does 
> Andrzej. However, we need to get better, and I wanted this to be a starting 
> point in that conversation. Clearly, I was carried away and I apologize for 
> that.
> 
> On Tue, May 18, 2021 at 5:52 PM Andrzej Białecki <[email protected]> wrote:
> Ishan, as I pointed out in Jira I don’t care for you implying that I have 
> evil intentions, I resent also your implication that I’m behaving 
> irrationally or don’t care for the users. Those of you who are interested may 
> read the comments in Jira and judge for themselves.
> 
> You conveniently don’t mention that I WITHDREW my objection, and instead 
> proposed a lenient validation (but validation nonetheless!). It’s easy to 
> scream “revert! revert!” but it actually takes some consideration to properly 
> address the original purpose of this change - that is, detecting and avoiding 
> the corruption of replica state. Let’s focus on this and not on pointing 
> fingers.
> 
> As for the production outage - I’m sorry this happened to you. As I hope you 
> and Noble and others are sorry for other inadvertently introduced bugs, which 
> I’m sure brought down many clusters at inconvenient hours... 
> 
> 
>> On 18 May 2021, at 13:26, Ishan Chattopadhyaya <[email protected]> wrote:
>> 
>> https://issues.apache.org/jira/browse/SOLR-14245 
>> 
>> There was a production outage at odd hours at my (and Noble's) client, due 
>> to this above change in Solr 8.5 onwards by Andrzej Bialecki.
>> 
>> In short, there is some bug in Solr where a replica gets "null" as the 
>> node_name (upon invocation of a collection API command). On the rare 
>> occasions where we encountered such situations in the past, the replica 
>> would be unavailable and the system would work fine overall. However, this 
>> change (which introduces strict validation of errors while *reading* Replica 
>> objects) now means that if such a situation arises (where one of Solr's own 
>> APIs results in node_name being null in a state.json), all SolrJ clients 
>> and all Solr nodes will go for a toss (possibly crash, and not start back 
>> up).
>> 
>> This change was rushed in, without any discussions or review, without 
>> extensive testing for the failures it will cause on existing systems where 
>> cluster state is messed up but the system is running, and without any 
>> consideration for the impact on users.
>> 
>> Noble and I are of the opinion that this change should be reverted 
>> immediately, considering the impact to users. However, there is strong 
>> disagreement on Andrzej's part.
>> 
>> Mistakes happen, but doubling down on them irrationally [1] will destroy the 
>> reputation of the project, let alone the peace of mind of those who are 
>> running Solr in production.
>> 
>> Does someone have any thoughts or opinions?
>> 
>> [1] - 
>> https://issues.apache.org/jira/browse/SOLR-14245?focusedCommentId=17346758&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17346758
