Everyone, please stop for a minute, read our CoC https://solr.apache.org/community.html#code-of-conduct, take a deep breath and continue in a constructive way.
I see no urgency at all in this matter. This can be handled as day-to-day bug fixing as usual. The fix in SOLR-14245 (which has been out for more than a year) was done in good faith, and solved a real issue. It may not be perfect, and I'm sure it helped uncover other bugs. I have no idea how a 'null' value made it into state.json, but I'm sure it is possible to find that bug, commit a fix, celebrate and move on!

Jan

> On 18 May 2021 at 13:26, Ishan Chattopadhyaya <[email protected]> wrote:
>
> https://issues.apache.org/jira/browse/SOLR-14245
>
> There was a production outage at odd hours at my (and Noble's) client, due to this above change in Solr 8.5 onwards by Andrzej Bialecki.
>
> In short, there is some bug in Solr where a replica gets "null" as the node_name (upon invocation of a collection API command). On the rare occasions where we encountered such situations in the past, the replica would be unavailable and the system would work fine overall. However, this change (which introduces strict validation of errors while *reading* Replica objects) now means that if such a situation arises (where some Solr's APIs itself results in node_name being null in a state.json), all SolrJ clients and all Solr nodes will go for a toss (possibly crash, and not start back up).
>
> This change was rushed in, without any discussions or review, without extensive testing for the failures it will cause on existing systems where cluster state is messed up but system is running, and without any consideration for the impact on users.
>
> Noble and I are of the opinion that this change should be reverted immediately, considering the impact to users. However, there is strong disagreement on Andrzej's part.
>
> Mistakes happen, but doubling down on them irrationally [1] will destroy the reputation of the project, let alone the peace of mind of those who are running Solr in production.
>
> Does someone have any thoughts or opinions?
>
> [1] - https://issues.apache.org/jira/browse/SOLR-14245?focusedCommentId=17346758&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17346758
