[
https://issues.apache.org/jira/browse/SOLR-17120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17808335#comment-17808335
]
Calvin Smith commented on SOLR-17120:
-------------------------------------
Thanks Christine for noting on the mailing list that I created the issue and
for the very helpful summary and analysis above.
My reasoning for setting the field to {{null}} was that if the {{olderDoc}} for
the first partial update doc that was added was
{
"id": "123",
"camera_unit": {"set": null}
}
then the {{null}} value reflects that the field should be removed from the
document (which is why the name is returned by {{olderDoc.getFieldNames()}}).
The comment on the {{applyOlderUpdates}} method suggests that its purpose is to
merge all the fields from the older doc into the newer one unless they're
already present, so I thought that the field to be removed should also be
merged into the newer doc, or else the fact that it should be removed from the
document that is ultimately saved might get lost. I don't know this code at all
though, so it may be that fields set to {{null}} in the older doc don't
actually need to be merged into the newer doc, as you suggested.
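To make the two options concrete, the following is a minimal sketch and not Solr code. It assumes plain java.util maps as stand-ins for {{SolrDocumentBase}} and {{SolrInputDocument}}, with a key mapped to {{null}} modelling a field whose {{getFieldValues}} call returns {{null}}. The class and method names ({{MergeNullSketch}}, {{mergeOriginal}}, {{mergeKeepingNulls}}, {{mergeSkippingNulls}}) are hypothetical, and the {{mergeFields}} filter is left out for brevity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in: maps replace the Solr document classes.
// A key mapped to null models a field that was set to null by a partial update.
class MergeNullSketch {

  /** Builds the older doc from the example: an id plus a field set to null. */
  static Map<String, Collection<Object>> exampleOlderDoc() {
    Map<String, Collection<Object>> doc = new LinkedHashMap<>();
    doc.put("id", Arrays.<Object>asList("123"));
    doc.put("camera_unit", null);
    return doc;
  }

  /** Same loop shape as the original method: iterating a null collection throws. */
  static void mergeOriginal(
      Map<String, Collection<Object>> newerDoc, Map<String, Collection<Object>> olderDoc) {
    for (String fieldName : olderDoc.keySet()) {
      if (!newerDoc.containsKey(fieldName)) {
        // NullPointerException here when the older doc maps fieldName to null
        for (Object val : olderDoc.get(fieldName)) {
          newerDoc.computeIfAbsent(fieldName, k -> new ArrayList<>()).add(val);
        }
      }
    }
  }

  /** Option A: carry the null marker over into the newer doc. */
  static void mergeKeepingNulls(
      Map<String, Collection<Object>> newerDoc, Map<String, Collection<Object>> olderDoc) {
    for (String fieldName : olderDoc.keySet()) {
      if (newerDoc.containsKey(fieldName)) {
        continue;
      }
      Collection<Object> values = olderDoc.get(fieldName);
      if (values == null) {
        newerDoc.put(fieldName, null);
      } else {
        newerDoc.put(fieldName, new ArrayList<>(values));
      }
    }
  }

  /** Option B: ignore fields whose value in the older doc is null. */
  static void mergeSkippingNulls(
      Map<String, Collection<Object>> newerDoc, Map<String, Collection<Object>> olderDoc) {
    for (String fieldName : olderDoc.keySet()) {
      Collection<Object> values = olderDoc.get(fieldName);
      if (!newerDoc.containsKey(fieldName) && values != null) {
        newerDoc.put(fieldName, new ArrayList<>(values));
      }
    }
  }

  public static void main(String[] args) {
    Map<String, Collection<Object>> keep = new LinkedHashMap<>();
    mergeKeepingNulls(keep, exampleOlderDoc());
    Map<String, Collection<Object>> skip = new LinkedHashMap<>();
    mergeSkippingNulls(skip, exampleOlderDoc());
    System.out.println("keeping nulls:  " + keep.keySet());
    System.out.println("skipping nulls: " + skip.keySet());
  }
}
```

In this stand-in, option A leaves {{camera_unit}} present in the newer doc with a null value, while option B leaves it out entirely. Which of the two is right for Solr depends on how the caller treats a field that is missing from the merged doc, and the sketch does not settle that.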
I'll try your change and see if I can confirm that the field is still removed
as it's supposed to be, although that's a bit difficult to test because the
error isn't reliably reproducible: I'll have to catch the document after the
error would have occurred and before some later update of the same document
has set the nulled field to a non-null value, which may require some luck.
> NullPointerException in UpdateLog.applyOlderUpdates in solr 6.6-9.4 involving
> partial updates
> ---------------------------------------------------------------------------------------------
>
> Key: SOLR-17120
> URL: https://issues.apache.org/jira/browse/SOLR-17120
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Components: update
> Affects Versions: 6.6.2, 8.11.2, 9.4
> Environment: The issue occurred on Linux, CentOS 7.9, with the
> following JDK version:
> {noformat}
> openjdk version "11.0.20" 2023-07-18 LTS
> OpenJDK Runtime Environment (Red_Hat-11.0.20.0.8-1.el7_9) (build
> 11.0.20+8-LTS)
> OpenJDK 64-Bit Server VM (Red_Hat-11.0.20.0.8-1.el7_9) (build 11.0.20+8-LTS,
> mixed mode, sharing){noformat}
> Reporter: Calvin Smith
> Priority: Major
>
> I mailed the solr-users mailing list about this issue, but didn't get any
> responses there, so am creating this issue. The subject of the email thread
> for additional context was "NullPointerException in
> UpdateLog.applyOlderUpdates under solr 8&9 involving partial updates and high
> update load" - link:
> [https://lists.apache.org/thread/n9zm4gocl7cf073syy1159dy6ojjrywl]
> I'm seeing a Solr HTTP 500 error when performing a partial update of a
> document that turns out to be triggered by a recent update of the same
> document that included a partial update setting a field to {{null}}. I've
> observed the behavior in versions 6.6.2, 8.11.2, and 9.4.0, which are the
> only 3 versions I've tried.
> To give an example, an update doc like
>
> {code:java}
> {
> "id": "123",
> "camera_unit": {"set": null}
> }{code}
>
> followed shortly thereafter (I'm not sure of the exact timing, but I was
> using a {{commitWithin}} of 600s and the subsequent updates came less than
> 20 seconds later), after some other updates to different documents, by
> another update of the same document, like
>
> {code:java}
> {
> "id": "123",
> "playlist": {
> "set": [
> 12345
> ]
> },
> "playlist_index_321": {
> "set": 0
> }
> }{code}
>
> This later update may, but doesn't always, cause the
> {{NullPointerException}}, so there is some other factor, such as the state
> of the {{tlog}}, that also has to be satisfied for the error to occur.
> The exception is thrown by the following code in {{UpdateLog.java}}
> ({{org.apache.solr.update.UpdateLog}}):
>
> {code:java}
> /** Add all fields from olderDoc into newerDoc if not already present in newerDoc */
> private void applyOlderUpdates(
>     SolrDocumentBase<?, ?> newerDoc, SolrInputDocument olderDoc, Set<String> mergeFields) {
>   for (String fieldName : olderDoc.getFieldNames()) {
>     // if the newerDoc has this field, then this field from olderDoc can be ignored
>     if (!newerDoc.containsKey(fieldName)
>         && (mergeFields == null || mergeFields.contains(fieldName))) {
>       for (Object val : olderDoc.getFieldValues(fieldName)) {
>         newerDoc.addField(fieldName, val);
>       }
>     }
>   }
> }{code}
>
> The exception is due to the inner for statement trying to iterate over the
> {{null}} value being returned by {{olderDoc.getFieldValues(fieldName)}}.
> When I change that method to the following:
>
> {code:java}
> /** Add all fields from olderDoc into newerDoc if not already present in newerDoc */
> private void applyOlderUpdates(
>     SolrDocumentBase<?, ?> newerDoc, SolrInputDocument olderDoc, Set<String> mergeFields) {
>   for (String fieldName : olderDoc.getFieldNames()) {
>     // if the newerDoc has this field, then this field from olderDoc can be ignored
>     if (!newerDoc.containsKey(fieldName)
>         && (mergeFields == null || mergeFields.contains(fieldName))) {
>       Collection<Object> values = olderDoc.getFieldValues(fieldName);
>       if (values == null) {
>         newerDoc.addField(fieldName, null);
>       } else {
>         for (Object val : values) {
>           newerDoc.addField(fieldName, val);
>         }
>       }
>     }
>   }
> }{code}
>
> Then after rebuilding the solr-core JAR with {{./gradlew devFull}} and
> restarting Solr with that custom jar file, I can no longer reproduce the
> error.
> I'm not familiar with the Solr codebase though and am not at all sure that
> {{newerDoc.addField(fieldName, null)}} is what should be done there.