Gregory Chanan created SOLR-6250:
------------------------------------
Summary: Schemaless parsing does not work on a consistent schema
Key: SOLR-6250
URL: https://issues.apache.org/jira/browse/SOLR-6250
Project: Solr
Issue Type: Improvement
Components: Schema and Analysis
Reporter: Gregory Chanan
See this comment
(https://issues.apache.org/jira/browse/SOLR-6137?focusedCommentId=14044366&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14044366),
reproduced here:
bq. One small issue I noticed is that there is a race between parsing and
schema addition. The AddSchemaFieldsUpdateProcessor handles this by only
working on a fixed schema, so the schema doesn't change underneath it. If it
decides on a schema addition and that fails (because another addition beat it),
it will grab the latest schema and retry. But the parsers don't do that so the
core's schema can change in the middle of parsing. It may make sense to defend
against that by moving the retry code from the AddSchemaFieldsUpdateProcessor
to some processor that runs before all the parsers. The downside is if the
schema addition fails, you have to rerun all the parsers, but that may be a
minor concern.
bq. This may not actually matter. Consider the case tested at the end of the
test: two documents are simultaneously inserted with the same field having a
Long and Date value. Assume the Date wins the schema "race" and is updated
first. While parsing the Long, each parser may see the schema as having a date
field or no field. If a valid parser (that is, one that can modify the field
value) sees a date field, it won't do any modifications because shouldMutate
will fail, leaving the object in whatever state the serializer left it (either
Long or String). If it sees no field, it will mutate the object to create a
Long object. In either case, we should get an error at the point we actually
create the lucene document, because neither a Long nor
String-representation-of-a-long can be stored in a Date field. This is pretty
difficult to reason about though.
--
This message was sent by Atlassian JIRA
(v6.2#6252)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]