[jira] [Commented] (SOLR-15018) Atomic update deletes child documents if schema has catch-all ignore field

2021-01-05 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-15018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17259253#comment-17259253
 ] 

David Smiley commented on SOLR-15018:
-

Perhaps we should have a more structured schema that reflects child doc 
relations, which would solve this.  This idea has been thrown around a bit in 
conversations around nested docs.  In retrospect, I've also been torn on 
whether the child documents ought to have been put into a map "to the side" of 
the normal field values.  There are pros and cons.  It's not too late for a 
refactor of that nature; 9.0 is on the horizon.  Or maybe we need an easier 
way to iterate so that you pre-filter them out if you don't want them?  Shrug.

In the meantime, "patches welcome" – either a fix here or code to detect the 
problem and throw an exception instead of silently deleting.
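
As a rough illustration of the "detect and throw" option, the sketch below 
fails fast when a value about to be skipped turns out to hold child documents. 
It is a plain-Java model, not Solr code: ChildDocGuard, ChildDoc, 
holdsChildDocs and beforeSkipping are all hypothetical names, and where such a 
check would actually belong in Solr is left open.

```java
import java.util.Arrays;
import java.util.Collection;

/** Hypothetical fail-fast check; every name here is illustrative, not a Solr API. */
final class ChildDocGuard {

  /** Stand-in for a nested child document. */
  static final class ChildDoc {
    final String id;

    ChildDoc(String id) {
      this.id = id;
    }
  }

  /** True if the value is a child document or a collection containing one. */
  static boolean holdsChildDocs(Object value) {
    if (value instanceof ChildDoc) {
      return true;
    }
    if (value instanceof Collection) {
      for (Object o : (Collection<?>) value) {
        if (o instanceof ChildDoc) {
          return true;
        }
      }
    }
    return false;
  }

  /** Call where a value is about to be skipped because its matched field is not retrievable. */
  static void beforeSkipping(String name, Object value) {
    if (holdsChildDocs(value)) {
      throw new IllegalStateException(
          "Refusing to drop child documents under '" + name
              + "': the name matches a non-stored schema field");
    }
  }

  public static void main(String[] args) {
    beforeSkipping("junk", "some ignored value"); // ordinary value: no exception
    try {
      beforeSkipping("comments", Arrays.asList(new ChildDoc("c1")));
    } catch (IllegalStateException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

The point of the design is only that the update fails loudly rather than 
losing data; an actual fix would keep the child documents instead.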

> Atomic update deletes child documents if schema has catch-all ignore field
> --
>
> Key: SOLR-15018
> URL: https://issues.apache.org/jira/browse/SOLR-15018
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: update
>Affects Versions: 8.6.3
>Reporter: Andreas Hubold
>Priority: Major
>  Labels: AtomicUpdate, ChildDocuments, NestedDocuments
>
> Nested child documents disappear when some unrelated fields of a parent 
> document are atomically updated, if the schema contains a catch-all dynamic 
> field to ignore unknown fields like:
> {noformat}
> 
>  class="solr.StrField" /> {noformat}
> {{DistributedUpdateProcessor#getUpdatedDocument}} tries to reconstruct the 
> original document, but it does not receive nested documents from 
> {{RealTimeGetComponent#getInputDocument}}. Nested documents are correctly 
> found in the index but get lost when 
> {{RealTimeGetComponent#toSolrInputDocument}} creates a SolrInputDocument for 
> the parent. The problematic code is:
> {code:java}
> SchemaField sf = schema.getFieldOrNull(fname);
> if (sf != null) {
>   if ((!sf.hasDocValues() && !sf.stored()) || schema.isCopyFieldTarget(sf)) {
>     continue;
>   }
> }
> {code}
> The code finds the "ignored" SchemaField as the matching field for the nested 
> document name (loaded from _nest_path_). Because that field is neither stored 
> nor docValues, the nested documents are skipped and never added to the 
> SolrInputDocument.
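
The effect described in the report can be reproduced in a small plain-Java 
model. This is only a sketch under simplifying assumptions: CatchAllDemo, 
Field, Schema, rebuild and sampleDoc are hypothetical stand-ins rather than 
Solr classes, the copy-field condition from the quoted check is left out, and 
the child documents are represented by plain strings.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Simplified, hypothetical model of the field-skipping logic; not Solr code. */
final class CatchAllDemo {

  /** Stand-in for a schema field: only the two flags the check reads. */
  static final class Field {
    final boolean stored;
    final boolean docValues;

    Field(boolean stored, boolean docValues) {
      this.stored = stored;
      this.docValues = docValues;
    }
  }

  /** Stand-in for the schema: explicit fields plus an optional "*" dynamic field. */
  static final class Schema {
    final Map<String, Field> explicit = new HashMap<>();
    Field catchAll; // null when the schema has no "*" dynamic field

    Field getFieldOrNull(String name) {
      Field f = explicit.get(name);
      return f != null ? f : catchAll;
    }
  }

  /** Drops every entry whose matched field is neither stored nor docValues. */
  static Map<String, Object> rebuild(Map<String, Object> fromIndex, Schema schema) {
    Map<String, Object> out = new LinkedHashMap<>();
    for (Map.Entry<String, Object> e : fromIndex.entrySet()) {
      Field sf = schema.getFieldOrNull(e.getKey());
      if (sf != null && !sf.docValues && !sf.stored) {
        continue; // a child-document key matched by "*" is skipped here
      }
      out.put(e.getKey(), e.getValue());
    }
    return out;
  }

  /** A parent with one real field and one key that holds (stand-ins for) child documents. */
  static Map<String, Object> sampleDoc() {
    Map<String, Object> doc = new LinkedHashMap<>();
    doc.put("id", "p1");
    doc.put("comments", Arrays.asList("child-1", "child-2"));
    return doc;
  }

  public static void main(String[] args) {
    Schema schema = new Schema();
    schema.explicit.put("id", new Field(true, false));
    System.out.println(rebuild(sampleDoc(), schema).keySet()); // [id, comments]

    schema.catchAll = new Field(false, false); // add the ignored "*" field
    System.out.println(rebuild(sampleDoc(), schema).keySet()); // [id]
  }
}
```

Without the catch-all, the lookup for "comments" returns null and the entry is 
kept; with it, the same lookup returns the ignored field and the entry is 
dropped.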



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (SOLR-15018) Atomic update deletes child documents if schema has catch-all ignore field

2020-11-29 Thread David Smiley (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-15018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17240483#comment-17240483
 ] 

David Smiley commented on SOLR-15018:
-

FWIW, I don't recommend a "*" ignored catch-all dynamic field except maybe in 
some early prototyping scenario as a sort of TODO before you lock down the 
schema.




[jira] [Commented] (SOLR-15018) Atomic update deletes child documents if schema has catch-all ignore field

2020-11-25 Thread Andreas Hubold (Jira)


[ 
https://issues.apache.org/jira/browse/SOLR-15018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17238818#comment-17238818
 ] 

Andreas Hubold commented on SOLR-15018:
---

Also mentioned on the solr-user mailing list: 
[https://mail-archives.apache.org/mod_mbox/lucene-solr-user/202011.mbox/%3Cad098efe-c895-bf5d-078d-f214731f841d%40coremedia.com%3E]

A possible workaround is to remove the catch-all "ignored" field, and replace 
it with a custom UpdateRequestProcessor that removes unknown fields (except 
those that are used for nested documents).
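
The filtering rule for this workaround might look roughly like the following. 
It is a plain-Java sketch of the logic only, not wired to Solr: 
UnknownFieldFilter, ChildDoc, isChildList and filter are hypothetical names, 
the document is modelled as a Map, and the set of known field names is assumed 
to be supplied by the caller. A real processor would presumably apply the same 
rule to a SolrInputDocument inside a custom UpdateRequestProcessor.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical model of the workaround's filtering rule; not Solr code. */
final class UnknownFieldFilter {

  /** Stand-in for a nested child document. */
  static final class ChildDoc {
    final Map<String, Object> fields = new LinkedHashMap<>();
  }

  /** True for a non-empty collection that contains only child documents. */
  static boolean isChildList(Object value) {
    if (!(value instanceof Collection) || ((Collection<?>) value).isEmpty()) {
      return false;
    }
    for (Object o : (Collection<?>) value) {
      if (!(o instanceof ChildDoc)) {
        return false;
      }
    }
    return true;
  }

  /** Removes unknown fields in place, but keeps and recurses into child documents. */
  static void filter(Map<String, Object> doc, Set<String> known) {
    Iterator<Map.Entry<String, Object>> it = doc.entrySet().iterator();
    while (it.hasNext()) {
      Map.Entry<String, Object> e = it.next();
      Object v = e.getValue();
      if (v instanceof ChildDoc) {
        filter(((ChildDoc) v).fields, known);
      } else if (isChildList(v)) {
        for (Object o : (Collection<?>) v) {
          filter(((ChildDoc) o).fields, known);
        }
      } else if (!known.contains(e.getKey())) {
        it.remove(); // unknown, non-child field: drop it
      }
    }
  }

  public static void main(String[] args) {
    ChildDoc child = new ChildDoc();
    child.fields.put("id", "c1");
    child.fields.put("junk2", "y");

    Map<String, Object> parent = new LinkedHashMap<>();
    parent.put("id", "p1");
    parent.put("junk", "x");
    parent.put("comments", Collections.singletonList(child));

    filter(parent, Collections.singleton("id"));
    System.out.println(parent.keySet());       // [id, comments]
    System.out.println(child.fields.keySet()); // [id]
  }
}
```

Child documents are recognised by the type of the value rather than by field 
name, so the nested-document keys do not need to be listed among the known 
fields.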

I'm not sure, but another possible workaround might be to add an unused schema 
field (stored="true") with the name of the nested document. I didn't test that 
because it would be quite confusing, as the reference guide also notes at 
[https://lucene.apache.org/solr/guide/8_6/indexing-nested-documents.html#schema-configuration]
{quote}Even though child documents are provided as field values syntactically 
and with SolrJ, it’s a matter of syntax and it isn’t an actual field in the 
schema. Consequently, the field need not be defined in the schema and probably 
shouldn’t be as it would be confusing. There is no child document field type, 
at least not yet.
{quote}
