[
https://issues.apache.org/jira/browse/SOLR-17120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17808217#comment-17808217
]
Christine Poerschke commented on SOLR-17120:
--------------------------------------------
Thanks [~casmith] for the detailed report on the user mailing list and for
proactively proceeding to open this issue!
Here are some notes on how I'm reading/interpreting the issue and the code:
* You mentioned the stacktrace is from 8.11.2, where we see the NPE at
UpdateLog.java:962 i.e.
[https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.11.2/solr/core/src/java/org/apache/solr/update/UpdateLog.java#L962]
If {{olderDoc}} were null we'd have gotten an NPE at line 959 already, so
{{olderDoc.getFieldValues(fieldName)}} must have returned null, as you
mentioned.
* {{SolrInputDocument.getFieldValues}} will return null if the field is not set
*
[https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.11.2/solr/solrj/src/java/org/apache/solr/common/SolrInputDocument.java#L121-L127]
*
[https://solr.apache.org/guide/solr/latest/indexing-guide/partial-document-updates.html#atomic-updates]
documents setting a field to null to remove its value.
* You mention setting a field to null.
* Here's some nearby code also calling {{SolrInputDocument.getFieldValues}}
*
[https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.11.2/solr/core/src/java/org/apache/solr/update/processor/AtomicUpdateDocumentMerger.java#L338-L339]
*
[https://github.com/apache/lucene-solr/blob/releases/lucene-solr/8.11.2/solr/core/src/java/org/apache/solr/update/processor/CloneFieldUpdateProcessorFactory.java#L430-L431]
Based on the analysis above I think {{newerDoc.addField(fieldName, null)}}
would be incorrect, i.e. {{newerDoc}} doesn't have the field and it's not
supposed to get it, so the loop should skip {{fieldName}} rather than add
{{null}}, e.g.
{code:java}
- for (Object val : olderDoc.getFieldValues(fieldName)) {
+ Collection<Object> values = olderDoc.getFieldValues(fieldName);
+ if (values == null) continue;
+ for (Object val : values) {
{code}
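For trying the idea outside of Solr, here is a minimal self-contained sketch of the skip-on-null variant. It is only an illustration under stated assumptions: plain {{Map<String, Collection<Object>>}} instances stand in for both documents (not the real {{SolrInputDocument}} / {{SolrDocumentBase}} classes), and the class name {{MergeSketch}} and the {{title}} field are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the merge in UpdateLog.applyOlderUpdates; plain maps
// replace the real Solr document classes so the sketch runs on its own.
class MergeSketch {

  /** Copies fields from olderDoc into newerDoc unless newerDoc already has them. */
  static void applyOlderUpdates(
      Map<String, Collection<Object>> newerDoc,
      Map<String, Collection<Object>> olderDoc,
      Set<String> mergeFields) {
    for (String fieldName : olderDoc.keySet()) {
      if (!newerDoc.containsKey(fieldName)
          && (mergeFields == null || mergeFields.contains(fieldName))) {
        Collection<Object> values = olderDoc.get(fieldName);
        if (values == null) continue; // field present but without values: skip it
        newerDoc.computeIfAbsent(fieldName, k -> new ArrayList<>()).addAll(values);
      }
    }
  }

  public static void main(String[] args) {
    Map<String, Collection<Object>> older = new LinkedHashMap<>();
    older.put("camera_unit", null); // models the earlier {"set": null} update
    older.put("title", new ArrayList<>(Arrays.asList("old title"))); // hypothetical field

    Map<String, Collection<Object>> newer = new LinkedHashMap<>();
    newer.put("playlist", new ArrayList<>(Arrays.asList(12345)));

    applyOlderUpdates(newer, older, null);
    System.out.println(newer.keySet()); // camera_unit is not copied over
  }
}
```

With the skip in place the merged document simply never gains {{camera_unit}}, whereas the add-null alternative would give it a field entry that holds no value.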
Having said all that ... I'm not very familiar with the partial update
functionality and what puzzles me slightly is that {{olderDoc.getFieldNames()}}
returned the {{fieldName}} but then {{olderDoc.getFieldValues()}} returned no
corresponding value ... though maybe that's something to do with multiple
partial updates to the same document in succession and corresponding update log
entries etc. etc. – would love to hear insights from others on this.
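One way the names/values mismatch could arise, stated as an assumption rather than verified behaviour of the update log path: if setting a field to null still creates a field entry whose value is null, then the name is listed while the values lookup yields null. The sketch below models that with hypothetical stand-in classes ({{FieldSketch}}, {{DocSketch}}), not the real SolrJ classes.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-ins, NOT the real SolrJ classes: a field keeps its name
// even when the value assigned to it is null.
class FieldSketch {
  final Object value;

  FieldSketch(Object value) {
    this.value = value;
  }

  /** Returns the values as a collection, or null when no value is held. */
  @SuppressWarnings("unchecked")
  Collection<Object> getValues() {
    if (value instanceof Collection) return (Collection<Object>) value;
    if (value == null) return null;
    Collection<Object> single = new ArrayList<>(1);
    single.add(value);
    return single;
  }
}

class DocSketch {
  private final Map<String, FieldSketch> fields = new LinkedHashMap<>();

  void setField(String name, Object value) {
    fields.put(name, new FieldSketch(value)); // an entry is created even for null
  }

  Collection<String> getFieldNames() {
    return fields.keySet();
  }

  Collection<Object> getFieldValues(String name) {
    FieldSketch field = fields.get(name);
    return field == null ? null : field.getValues();
  }

  public static void main(String[] args) {
    DocSketch olderDoc = new DocSketch();
    olderDoc.setField("camera_unit", null); // models {"set": null}
    System.out.println(olderDoc.getFieldNames().contains("camera_unit"));
    System.out.println(olderDoc.getFieldValues("camera_unit") == null);
  }
}
```

Under that assumption both checks print {{true}}, which would match a loop over the field names reaching a field whose values are null.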
> NullPointerException in UpdateLog.applyOlderUpdates in solr 6-6.9.4 involving
> partial updates
> ---------------------------------------------------------------------------------------------
>
> Key: SOLR-17120
> URL: https://issues.apache.org/jira/browse/SOLR-17120
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Components: update
> Affects Versions: 6.6.2, 8.11.2, 9.4
> Environment: The issue occurred on Linux, CentOS 7.9, with the
> following JDK version:
> {noformat}
> openjdk version "11.0.20" 2023-07-18 LTS
> OpenJDK Runtime Environment (Red_Hat-11.0.20.0.8-1.el7_9) (build
> 11.0.20+8-LTS)
> OpenJDK 64-Bit Server VM (Red_Hat-11.0.20.0.8-1.el7_9) (build 11.0.20+8-LTS,
> mixed mode, sharing){noformat}
> Reporter: Calvin Smith
> Priority: Major
>
> I mailed the solr-users mailing list about this issue, but didn't get any
> responses there, so am creating this issue. The subject of the email thread
> for additional context was "NullPointerException in
> UpdateLog.applyOlderUpdates under solr 8&9 involving partial updates and high
> update load" - link:
> [https://lists.apache.org/thread/n9zm4gocl7cf073syy1159dy6ojjrywl]
> I'm seeing a Solr HTTP 500 error when performing a partial update of a
> document that turns out to be triggered by there having been a recent update of
> the same document that included a partial update that set a field to
> {{null}}. I've observed the behavior in versions 6.6.2, 8.11.2, and
> 9.4.0, which are the only 3 versions I've tried.
> To give an example, an update doc like
>
> {code:java}
> {
> "id": "123",
> "camera_unit": {"set": null}
> }{code}
>
> was followed shortly thereafter, after some other updates had happened for
> different documents, by another update of the same document (I'm not sure of
> the exact timing, but I was using a {{commitWithin}} of 600s and the
> subsequent updates were less than 20 seconds later), like
>
> {code:java}
> {
> "id": "123",
> "playlist": {
> "set": [
> 12345
> ]
> },
> "playlist_index_321": {
> "set": 0
> }
> }{code}
>
> This later update may, but doesn't always, cause the
> {{NullPointerException}}, so there is some other factor such as the state
> of the {{tlog}} that also has to be satisfied for the error to occur.
> The exception is thrown by the following code in {{UpdateLog.java}}
> ({{org.apache.solr.update.UpdateLog}}):
>
> {code:java}
> /** Add all fields from olderDoc into newerDoc if not already present in newerDoc */
> private void applyOlderUpdates(
>     SolrDocumentBase<?, ?> newerDoc, SolrInputDocument olderDoc, Set<String> mergeFields) {
>   for (String fieldName : olderDoc.getFieldNames()) {
>     // if the newerDoc has this field, then this field from olderDoc can be ignored
>     if (!newerDoc.containsKey(fieldName)
>         && (mergeFields == null || mergeFields.contains(fieldName))) {
>       for (Object val : olderDoc.getFieldValues(fieldName)) {
>         newerDoc.addField(fieldName, val);
>       }
>     }
>   }
> }{code}
>
> The exception is due to the inner for statement trying to iterate over the
> {{null}} value being returned by {{olderDoc.getFieldValues(fieldName)}}.
> When I change that method to the following:
>
> {code:java}
> /** Add all fields from olderDoc into newerDoc if not already present in newerDoc */
> private void applyOlderUpdates(
>     SolrDocumentBase<?, ?> newerDoc, SolrInputDocument olderDoc, Set<String> mergeFields) {
>   for (String fieldName : olderDoc.getFieldNames()) {
>     // if the newerDoc has this field, then this field from olderDoc can be ignored
>     if (!newerDoc.containsKey(fieldName)
>         && (mergeFields == null || mergeFields.contains(fieldName))) {
>       Collection<Object> values = olderDoc.getFieldValues(fieldName);
>       if (values == null) {
>         newerDoc.addField(fieldName, null);
>       } else {
>         for (Object val : values) {
>           newerDoc.addField(fieldName, val);
>         }
>       }
>     }
>   }
> }{code}
>
> Then after rebuilding the solr-core JAR with {{./gradlew devFull}} and
> restarting Solr with that custom jar file, I can no longer reproduce the
> error.
> I'm not familiar with the Solr codebase though and am not at all sure that
> {{newerDoc.addField(fieldName, null)}} is what should be done there.