With respect to the effect on JSON output, I'm for that. A single-dimensional array encoding a 2D array to communicate key/value pairs is very contrary to any programmer's expectations (at least within this century). So improving the rendering of our JSON at some point: definitely +1. While we are at it, let's fix the cases that emit JSON objects with duplicate keys!
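To make the complaint concrete, here is a minimal sketch in plain Java of the two renderings being discussed. It uses no Solr classes: `NamedListJsonSketch`, `kv`, `toFlat`, and `toObject` are names invented for illustration and are not Solr's actual writer code. It only shows what a flat key/value array and an object with a repeated key look like to a consumer.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class NamedListJsonSketch {

    // Hypothetical helper: one ordered key/value pair (keys may repeat).
    static Map.Entry<String, Object> kv(String key, Object value) {
        return new AbstractMap.SimpleImmutableEntry<>(key, value);
    }

    // Flat style: a single array alternating key, value, key, value, ...
    static String toFlat(List<Map.Entry<String, Object>> pairs) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < pairs.size(); i++) {
            if (i > 0) sb.append(',');
            sb.append(quote(pairs.get(i).getKey()))
              .append(',')
              .append(value(pairs.get(i).getValue()));
        }
        return sb.append(']').toString();
    }

    // Object style: a JSON object; repeated keys are written as-is,
    // which yields duplicate member names.
    static String toObject(List<Map.Entry<String, Object>> pairs) {
        StringBuilder sb = new StringBuilder("{");
        for (int i = 0; i < pairs.size(); i++) {
            if (i > 0) sb.append(',');
            sb.append(quote(pairs.get(i).getKey()))
              .append(':')
              .append(value(pairs.get(i).getValue()));
        }
        return sb.append('}').toString();
    }

    // Strings are quoted; everything else (numbers, booleans) is printed bare.
    static String value(Object v) {
        return v instanceof String ? quote((String) v) : String.valueOf(v);
    }

    // Minimal escaping, enough for this sketch (backslash and double quote only).
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Object>> pairs = new ArrayList<>();
        pairs.add(kv("name", "solr"));
        pairs.add(kv("count", 3));
        pairs.add(kv("count", 4)); // repeated key
        System.out.println(toFlat(pairs));
        System.out.println(toObject(pairs));
    }
}
```

With the pairs in `main`, the flat form prints `["name","solr","count",3,"count",4]` and the object form prints `{"name":"solr","count":3,"count":4}`. The second is syntactically valid JSON, but many parsers will keep only one of the two `count` values.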
The next major thought I have is that this sort of effort should be coupled with a very clear, modern demonstration of the actual performance benefits of NamedList/SimpleOrderedMap. As I have noted elsewhere <https://issues.apache.org/jira/browse/SOLR-912?page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel&focusedCommentId=17897044#comment-17897044>, all of its other claimed benefits seem to be untrue.

I would pit SimpleOrderedMap against java.util.HashMap. I have a strong suspicion that the only case where SOM might win is that of entirely novel keys, where the string hashcode needs to be calculated fresh for each insertion. In any case where a string is used multiple times, that hashcode is cached, and I expect the JVM wizards at Sun/Oracle will have strategies for identifying and interning strings that are used frequently. But what's the magnitude of that benefit? Is it worth it? Will that benefit outweigh optimizations Jackson might have for handling maps (certainly a core use case for them)?

Even if NamedList was a GREAT idea in January of 2006 when it was added, almost 20 years of library and JVM development may well have made it obsolete (or not! The point is we don't know). So to me the first step of this is verifying that SOM is really what we want in the first place.

On Fri, Dec 20, 2024 at 7:01 PM David Smiley <dsmi...@apache.org> wrote:

> Problem: Today, lots of Solr code will create a NamedList and forget to
> consider creating a SimpleOrderedMap (subclassing NamedList) instead. The
> vast majority of NamedLists I've seen *should* be a SimpleOrderedMap but
> were not created as such. The distinction is highly subtle, affecting how
> Solr serializes a NamedList to JSON -- whether it should render it as a Map
> or another strategy dependent on the json.nl parameter. The subtle-ness
> means lack of testing and ease of breaking compatibility.
> And not using SimpleOrderedMap when we should is annoying to a JSON consumer
> who then has to parse it weirdly, maybe flip-flopping with json.nl.
>
> Proposal: Strongly differentiate creation of NamedList instances between
> SimpleOrderedMap (exists), and a new subclass to convey that keys may
> repeat. NamedList will become abstract and gain some factory methods to
> instantiate one of these concisely that basically everyone will use. The
> exact naming is TBD for JIRA. Adding the factory methods and the type can
> come to 9.9 if a user wants to start using them, but making it abstract and
> making a sweeping change across the codebase is 10 only. The sweeping
> changes will *not* change any declared parameters/fields/variables to be
> different from NamedList to a specific type, thus the change won't be too
> huge. Fortunately, most/all of the javabin consuming code won't have a
> compatibility problem since NamedLists that become a SimpleOrderedMap are
> still nonetheless a NamedList. But users requesting JSON will in many
> cases find the JSON structure of many of Solr's APIs to have changed, and
> I'm not sure we can enumerate them. This can only be done in a major
> release -- Solr 10, especially so sweeping of a change. Are we okay with
> this?
>
> I know many of us have expressed a distaste for NamedList generally. There
> are several old JIRA issues about switching away from NamedList in many
> places. I imagine a distant Solr 11 with V2 complete (100% new v2 not old
> v2) and V1 gone, there will be far fewer NamedLists running around thanks
> to embracing annotated classes with JSON serialization instead. Still,
> NamedList should not be frozen, waiting for that future to unfold.
>
> For reference: all unresolved JIRA issues with "NamedList" in the summary:
>
> https://issues.apache.org/jira/issues/?jql=project%20%3D%20SOLR%20AND%20summary%20~%20NamedList%20AND%20resolution%20is%20EMPTY%20ORDER%20BY%20created%20DESC
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>

--
http://www.needhamsoftware.com (work)
https://a.co/d/b2sZLD9 (my fantasy fiction book)