[jira] [Commented] (SOLR-15018) Atomic update deletes child documents if schema has catch-all ignore field
[ https://issues.apache.org/jira/browse/SOLR-15018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17259253#comment-17259253 ]

David Smiley commented on SOLR-15018:
-------------------------------------

Perhaps we should have a more structured schema that reflects child doc relations, which would solve this. This idea has been thrown around a bit in conversations about nested docs. In retrospect, I've also been torn on whether the child documents ought to have been put into a map "to the side" of the normal field values. Pros/cons. Perhaps that should have been done? It's not too late for a refactor of that nature; 9.0 is on the horizon. Or maybe there could be an easier way to iterate so that you pre-filter them out if you don't want them? Shrug.

In the meantime, "patches welcome": either a fix here, or code to detect the problem and throw an exception instead of silently deleting the child documents.

> Atomic update deletes child documents if schema has catch-all ignore field
> --------------------------------------------------------------------------
>
>                 Key: SOLR-15018
>                 URL: https://issues.apache.org/jira/browse/SOLR-15018
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public (Default Security Level. Issues are Public)
>          Components: update
>    Affects Versions: 8.6.3
>            Reporter: Andreas Hubold
>            Priority: Major
>              Labels: AtomicUpdate, ChildDocuments, NestedDocuments
>
> Nested child documents disappear when some unrelated fields of a parent
> document are atomically updated, if the schema contains a catch-all dynamic
> field to ignore unknown fields like:
> {noformat}
>
> class="solr.StrField" /> {noformat}
> {{DistributedUpdateProcessor#getUpdatedDocument}} tries to reconstruct the
> original document, but it does not receive nested documents from
> {{RealTimeGetComponent#getInputDocument}}. Nested documents are correctly found
> in the index but get lost when {{RealTimeGetComponent#toSolrInputDocument}}
> creates a SolrInputDocument for them. The problematic code is:
> {code:java}
> SchemaField sf = schema.getFieldOrNull(fname);
> if (sf != null) {
>   if ((!sf.hasDocValues() && !sf.stored()) || schema.isCopyFieldTarget(sf))
>     continue;
> }
> {code}
> The code finds the "ignored" SchemaField as the matching field for the nested
> document name (loaded from _nest_path_). Because of that, the nested documents
> are not added to the SolrInputDocument.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
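
The skip check quoted in the issue description can be reproduced in isolation. The following is only a sketch under stated assumptions: Field and Schema are hypothetical stand-ins written for this example (they are not Solr's SchemaField or IndexSchema), the copy-field-target half of the condition is left out, and "children" is an assumed name for the nested-document key. It shows that once a "*" field that is neither stored nor docValues exists, every key without an explicit definition resolves to it and is skipped.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Stand-in types for illustration only; these are NOT the Solr classes.
class CatchAllSketch {

    /** Minimal stand-in for a schema field: only the flags the check reads. */
    static final class Field {
        final String name;
        final boolean stored;
        final boolean docValues;

        Field(String name, boolean stored, boolean docValues) {
            this.name = name;
            this.stored = stored;
            this.docValues = docValues;
        }
    }

    /** Minimal stand-in for a schema: explicit fields plus an optional "*" dynamic field. */
    static final class Schema {
        private final Map<String, Field> explicit = new LinkedHashMap<>();
        private final Field catchAll; // null when the schema has no "*" dynamic field

        Schema(List<Field> fields, Field catchAll) {
            for (Field f : fields) {
                explicit.put(f.name, f);
            }
            this.catchAll = catchAll;
        }

        /** Explicit fields win; otherwise the catch-all (if any) matches every name. */
        Field getFieldOrNull(String name) {
            Field f = explicit.get(name);
            return f != null ? f : catchAll;
        }
    }

    /** Returns the keys that survive the skip check (copy-field targets omitted). */
    static List<String> keptKeys(Schema schema, List<String> keys) {
        List<String> kept = new ArrayList<>();
        for (String fname : keys) {
            Field sf = schema.getFieldOrNull(fname);
            if (sf != null && !sf.docValues && !sf.stored) {
                continue; // neither stored nor docValues: the key is dropped
            }
            kept.add(fname);
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Field> fields = List.of(new Field("id", true, false), new Field("title", true, false));
        Field ignored = new Field("*", false, false);
        // "children" is a hypothetical nested-document key with no schema field.
        List<String> keys = List.of("id", "title", "children");

        System.out.println(keptKeys(new Schema(fields, null), keys));    // [id, title, children]
        System.out.println(keptKeys(new Schema(fields, ignored), keys)); // [id, title]
    }
}
```

Without the catch-all, the lookup for "children" returns null and the key is kept; with it, the lookup returns the ignored field and the key is dropped, which matches the behaviour described in the report.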
[jira] [Commented] (SOLR-15018) Atomic update deletes child documents if schema has catch-all ignore field
[ https://issues.apache.org/jira/browse/SOLR-15018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17240483#comment-17240483 ]

David Smiley commented on SOLR-15018:
-------------------------------------

FWIW, I don't recommend a "*" ignored catch-all dynamic field except maybe in some early prototyping scenario as a sort of TODO before you lock down the schema.
[jira] [Commented] (SOLR-15018) Atomic update deletes child documents if schema has catch-all ignore field
[ https://issues.apache.org/jira/browse/SOLR-15018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17238818#comment-17238818 ]

Andreas Hubold commented on SOLR-15018:
---------------------------------------

Also mentioned on the solr-user mailing list: [https://mail-archives.apache.org/mod_mbox/lucene-solr-user/202011.mbox/%3Cad098efe-c895-bf5d-078d-f214731f841d%40coremedia.com%3E]

A possible workaround is to remove the catch-all "ignored" field and replace it with a custom UpdateRequestProcessor that removes unknown fields (except those that are used for nested documents).

I'm not sure, but another possible workaround might be to add an unused schema field (stored="true") with the name of the nested document. I didn't test that because it would be quite confusing, as the reference guide also says [https://lucene.apache.org/solr/guide/8_6/indexing-nested-documents.html#schema-configuration]
{quote}Even though child documents are provided as field values syntactically and with SolrJ, it’s a matter of syntax and it isn’t an actual field in the schema. Consequently, the field need not be defined in the schema and probably shouldn’t be as it would be confusing. There is no child document field type, at least not yet.
{quote}
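
The first workaround in the comment above (dropping unknown fields in a custom processor while sparing nested documents) might look roughly like the sketch below. It is written against plain Java maps so that it runs on its own: a Map stands in for a Solr input document, and the names UnknownFieldFilterSketch, isChildDocValue, dropUnknownFields and the field names are all hypothetical. In a real UpdateRequestProcessor the same filtering would presumably go into processAdd and test for SolrInputDocument values rather than Map, since atomic-update operations are themselves expressed as maps; that part is an assumption and was not tested here.

```java
import java.util.Collection;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Plain-Java sketch; a Map stands in for a Solr input document.
class UnknownFieldFilterSketch {

    /** True if the value is a nested document or a non-empty collection of nested documents. */
    static boolean isChildDocValue(Object value) {
        if (value instanceof Map) {
            return true;
        }
        if (value instanceof Collection) {
            Collection<?> items = (Collection<?>) value;
            if (items.isEmpty()) {
                return false;
            }
            for (Object item : items) {
                if (!(item instanceof Map)) {
                    return false;
                }
            }
            return true;
        }
        return false;
    }

    /** Removes top-level entries whose name is unknown, unless they hold child documents. */
    static void dropUnknownFields(Map<String, Object> doc, Set<String> knownFields) {
        Iterator<Map.Entry<String, Object>> it = doc.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Object> entry = it.next();
            boolean known = knownFields.contains(entry.getKey());
            if (!known && !isChildDocValue(entry.getValue())) {
                it.remove();
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Object> child = new LinkedHashMap<>();
        child.put("id", "1.1");

        Map<String, Object> parent = new LinkedHashMap<>();
        parent.put("id", "1");
        parent.put("typo_field", "x");          // unknown plain field: removed
        parent.put("children", List.of(child)); // unknown key holding child docs: kept

        dropUnknownFields(parent, Set.of("id", "title"));
        System.out.println(parent.keySet()); // [id, children]
    }
}
```

The sketch only filters the top level; whether unknown fields inside the child documents should be filtered as well is left open.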