Hello,

I was asked to draw attention to, and summarize, the crossroads we have reached with the current Solr 9 blocker issue about the "schemaless mode".
Basically, the current schemaless mode is broken in two main ways, one familiar and one a bit more obscure:

1) Familiar: Schemaless mode updates the schema the first time it sees a value for a field, guessing the type from that single value. Indexing can therefore fail on the second value if it does not fit the guessed type. The workaround is to pre-register the troublesome fields (e.g. in our films example) and/or remove the errant fields from the updated schema.

2) Obscure: Even when schemaless mode works, we currently tell users to disable it when going into production, and we provide instructions for that. But schemaless mode does not only update the schema, it also does custom parsing. This is especially obvious for Date parsing, where we allow 7 different formats rather than just one. So, when schemaless mode is disabled as we recommend, indexing that worked before will start failing for those cases in very non-obvious ways (see https://github.com/apache/solr/blob/9eaefdf6b6178042bd26420fede08c0db65c45f3/solr/server/solr/configsets/_default/conf/solrconfig.xml#L1001-L1011).

This has been discussed in SOLR-14701, SOLR-11741 and even SOLR-6939, with most suggestions focusing on alternatives to 1) above but ignoring 2). The most popular idea is to generate schema-altering commands, JSON, or similar.

I tried to build a solution that deals with both of the above points in https://github.com/apache/lucene-solr/pull/1863 . I felt I was implementing Hoss' proposal from SOLR-6939 by batching the documents' mappings and then doing the type widening and final schema creation on commit. I do not have a specific client that needs this, so I was trying to fix the pain points rather than completely reimagine the feature. My solution is still not fantastic but, I feel, it does address both issues above.

However, the PR discussion, which boomeranged back into SOLR-14701, has gotten stuck, as multiple people were not able to get on the same page about specific implementation points.
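To make problem 1) and the "batch, then widen" idea a bit more concrete, here is a toy Python sketch. It is not Solr code and not the code from the PR: the type names, the widening order (long, double, string), the helper functions and the sample "rating" field are all assumptions I made up for illustration. It only contrasts locking in a type on the first value with picking the widest type seen across a batch.

```python
# Toy model only: this is NOT Solr code. The type names and the widening
# order below are assumptions made for illustration.
WIDENING_ORDER = ["long", "double", "string"]  # narrowest to widest


def guess_type(value: str) -> str:
    """Guess the narrowest toy type that can hold a single raw value."""
    for type_name, parser in (("long", int), ("double", float)):
        try:
            parser(value)
            return type_name
        except ValueError:
            continue
    return "string"


def accepts(field_type: str, value: str) -> bool:
    """A field accepts a value whose guessed type is not wider than its own."""
    return WIDENING_ORDER.index(guess_type(value)) <= WIDENING_ORDER.index(field_type)


def first_value_schema(docs):
    """Guess on first sight: lock the type in, then reject values that do not fit."""
    schema, rejected = {}, []
    for doc in docs:
        for field, value in doc.items():
            schema.setdefault(field, guess_type(value))
            if not accepts(schema[field], value):
                rejected.append((field, value))
    return schema, rejected


def widened_schema(docs):
    """Batch, then widen: keep the widest type seen for each field."""
    schema = {}
    for doc in docs:
        for field, value in doc.items():
            seen = guess_type(value)
            current = schema.get(field, seen)
            schema[field] = max(current, seen, key=WIDENING_ORDER.index)
    return schema


# Hypothetical sample data: the first value looks like an integer,
# the later ones do not.
docs = [{"rating": "4"}, {"rating": "4.5"}, {"rating": "n/a"}]

print(first_value_schema(docs))
# ({'rating': 'long'}, [('rating', '4.5'), ('rating', 'n/a')])
print(widened_schema(docs))
# {'rating': 'string'}
```

In the first approach the field is fixed as "long" after one document and the next two values are rejected. In the second, nothing is decided until the whole batch has been seen, so the field ends up wide enough for every value. The real implementation obviously has to deal with many more types, multi-valued fields and the commit lifecycle.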
So, I marked the issue as a blocker to help us recognize that we are at a crossroads. The options I see are:

1) We can unmark the issue and just keep shipping the broken implementation with the knowingly wrong advice, probably for a very long time, since (just as my own data point) it took a lot of effort even to understand the logic of the current AddSchemaFieldsUpdateProcessorFactory.

2) We can rip out this mode altogether and/or move it into a plugin.

3) We can adopt my solution, possibly with some minor adjustments (and deprecate/remove the other one from the config).

4) Somebody other than me can do something completely different. I failed to understand the alternative proposals, despite trying very hard, and my own 'alternative' proposal is very, very different.

I, myself, no longer have a preferred position on this issue. But I was asked to bring it back to the community anyway, just in case the passage of time and this summary help to move it forward.

Regards,
   Alex.