Everyone, please stop for a minute, read our CoC 
https://solr.apache.org/community.html#code-of-conduct, take a deep breath and 
continue in a constructive way.

I see no urgency at all in this matter. This can be handled as ordinary 
day-to-day bug fixing.

The fix in SOLR-14245 (which has been out for more than a year) was done in 
good faith, and solved a real issue. It may not be perfect, and I'm sure it 
helped uncover other bugs. I have no idea how a 'null' value made it into 
state.json, but I'm sure it is possible to find that bug, commit a fix, 
celebrate and move on!

Jan

> On 18 May 2021, at 13:26, Ishan Chattopadhyaya <[email protected]> wrote:
> 
> https://issues.apache.org/jira/browse/SOLR-14245 
> 
> There was a production outage at odd hours at my (and Noble's) client, due to 
> the above change in Solr 8.5 onwards by Andrzej Bialecki.
> 
> In short, there is some bug in Solr where a replica gets "null" as the 
> node_name (upon invocation of a collection API command). On the rare 
> occasions where we encountered such situations in the past, the replica would 
> be unavailable and the system would work fine overall. However, this change 
> (which introduces strict validation of errors while *reading* Replica 
> objects) now means that if such a situation arises (where one of Solr's own 
> APIs results in node_name being null in a state.json), all SolrJ clients 
> and all Solr nodes will go for a toss (possibly crash, and not start back up).
> 
> This change was rushed in, without any discussion or review, without 
> extensive testing for the failures it will cause on existing systems where 
> the cluster state is messed up but the system is running, and without any 
> consideration for the impact on users.
> 
> Noble and I are of the opinion that this change should be reverted 
> immediately, considering the impact to users. However, there is strong 
> disagreement on Andrzej's part.
> 
> Mistakes happen, but doubling down on them irrationally [1] will destroy the 
> reputation of the project, let alone the peace of mind of those who are 
> running Solr in production.
> 
> Does someone have any thoughts or opinions?
> 
> [1] - 
> https://issues.apache.org/jira/browse/SOLR-14245?focusedCommentId=17346758&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17346758
