Gregory Chanan created SOLR-6250:
------------------------------------

             Summary: Schemaless parsing does not work on a consistent schema
                 Key: SOLR-6250
                 URL: https://issues.apache.org/jira/browse/SOLR-6250
             Project: Solr
          Issue Type: Improvement
          Components: Schema and Analysis
            Reporter: Gregory Chanan


See this comment 
(https://issues.apache.org/jira/browse/SOLR-6137?focusedCommentId=14044366&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14044366),
 reproduced here:

bq. One small issue I noticed is that there is a race between parsing and 
schema addition. The AddSchemaFieldsUpdateProcessor handles this by working 
only on a fixed schema, so the schema doesn't change underneath it. If it 
decides on a schema addition and that addition fails (because another addition 
beat it), it grabs the latest schema and retries. The parsers don't do that, so 
the core's schema can change in the middle of parsing. It may make sense to 
defend against that by moving the retry code from the 
AddSchemaFieldsUpdateProcessor to a processor that runs before all the parsers. 
The downside is that if the schema addition fails, all the parsers have to be 
rerun, but that may be a minor concern.
bq. This may not actually matter. Consider the case tested at the end of the 
test: two documents are inserted simultaneously, and the same field holds a 
Long value in one and a Date value in the other. Assume the Date wins the 
schema "race" and is added first. While parsing the Long, each parser may see 
the schema as having either a date field or no field. If a valid parser (that 
is, one that can modify the field value) sees a date field, it won't make any 
modifications because shouldMutate will fail, leaving the object in whatever 
state the serializer left it (either Long or String). If it sees no field, it 
will mutate the object to create a Long object. In either case, we should get 
an error at the point where we actually create the Lucene document, because 
neither a Long nor a String representation of a long can be stored in a Date 
field. This is pretty difficult to reason about, though.
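
The proposal in the quoted comment (take one fixed schema snapshot, run the 
parsers against it, and rerun everything if the schema addition loses the race) 
can be sketched roughly as below. This is only an illustration under stated 
assumptions, not Solr code: the class SchemaRetrySketch, the Schema type, the 
LIVE reference, and the guessType and parseAndAdd methods are all hypothetical 
names, and a compare-and-set on an AtomicReference stands in for whatever 
mechanism the real schema update uses to detect a competing addition.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical sketch; none of these types are Solr classes. */
class SchemaRetrySketch {

    /** Immutable schema snapshot: field name -> type name. */
    static final class Schema {
        final Map<String, String> fields;

        Schema(Map<String, String> fields) {
            this.fields = Collections.unmodifiableMap(new HashMap<>(fields));
        }

        /** Returns a new snapshot with one more field; this one is unchanged. */
        Schema withField(String name, String type) {
            Map<String, String> copy = new HashMap<>(fields);
            copy.put(name, type);
            return new Schema(copy);
        }
    }

    /** The "live" schema that concurrent updates race to replace (assumption). */
    static final AtomicReference<Schema> LIVE =
            new AtomicReference<>(new Schema(new HashMap<>()));

    /** Stand-in for the chain of parsers: guesses a type from the raw value. */
    static String guessType(String raw) {
        try {
            Long.parseLong(raw);
            return "long";
        } catch (NumberFormatException e) {
            return "string";
        }
    }

    /**
     * Runs the parsing and the schema addition against one fixed snapshot.
     * If another update replaced the schema in the meantime, the loop grabs
     * the latest schema and reruns the parsers. Returns the type the field
     * has in the schema afterwards.
     */
    static String parseAndAdd(String field, String raw) {
        while (true) {
            Schema snapshot = LIVE.get();           // fixed view for this attempt
            String existing = snapshot.fields.get(field);
            if (existing != null) {
                return existing;                    // field already defined: do not mutate
            }
            String type = guessType(raw);           // parsers only ever see `snapshot`
            Schema updated = snapshot.withField(field, type);
            if (LIVE.compareAndSet(snapshot, updated)) {
                return type;                        // our addition won the race
            }
            // Lost the race: loop again with the latest schema.
        }
    }

    public static void main(String[] args) {
        System.out.println(parseAndAdd("count", "42"));
        System.out.println(parseAndAdd("count", "2014-07-15T00:00:00Z"));
    }
}
```

In this sketch the parsing step and the schema addition always agree on which 
schema they saw, which is the property the comment says the current parsers 
lack. The cost is the one already noted: a lost race reruns the parsing step.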




