[
https://issues.apache.org/jira/browse/SOLR-15417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17349916#comment-17349916
]
Takashi Sasaki commented on SOLR-15417:
---------------------------------------
{quote} # In your first example, you call it Batch request. Sorry I'm very new
to solr so does Batch request mean, more than 1 doc added into 1 request?{quote}
Sorry, "batch" is not a term defined in the Solr Reference Guide. I use it
here for convenience to mean a single HTTP request that carries multiple
documents.
{quote} # doc.addField("hasUserAssertions", new HashMap<String, Object>() {{
put("set", false); }}); // this makes sure update only succeeds when record
with specified id exists
You should set it to true to see how many succeeded.
{quote}
Oops, my bad. I posted the wrong code.
Here is the correct code.
Code:
{code:java}
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.request.UpdateRequest;
import org.apache.solr.client.solrj.response.UpdateResponse;
import org.apache.solr.common.SolrInputDocument;

import static java.lang.System.*;

public class Reproduce {
    public static void main(String[] args) throws Exception {
        SolrClient solrClient = new HttpSolrClient.Builder()
                .withBaseSolrUrl("http://localhost:8983/solr/techproducts").build();
        // Alternative client:
        // SolrClient solrClient = new ConcurrentUpdateSolrClient.Builder("http://localhost:8983/solr/techproducts")
        //         .withThreadCount(4).withQueueSize(500).build();

        List<String> idList = List.of("TWINX2048-3200PRO", "VS1GB400C3",
                "VDBDB1A16", "MA147LL/A", "F8V7067-APL-KIT");

        List<SolrInputDocument> batch = new ArrayList<>();
        for (int idx = 1; idx <= idList.size(); idx++) {
            SolrInputDocument doc = new SolrInputDocument();
            // The 3rd document gets an id that does not exist in the index.
            if (idx == 3) {
                doc.addField("id", idList.get(idx - 1) + "_invalid");
            } else {
                doc.addField("id", idList.get(idx - 1));
            }
            // Atomic update: set hasUserAssertions to true.
            doc.addField("hasUserAssertions", new HashMap<String, Object>() {{
                put("set", true);
            }});
            // _version_ = 1 makes sure the update only succeeds when a record
            // with the specified id exists.
            doc.addField("_version_", 1);
            out.println("Added solr doc for record: " + doc.get("id"));
            batch.add(doc);
        }

        // Send all documents in a single update request.
        UpdateRequest updateRequest = new UpdateRequest();
        updateRequest.setAction(UpdateRequest.ACTION.COMMIT, false, false);
        updateRequest.setParam("failOnVersionConflicts", "false");
        updateRequest.add(batch); // List<SolrInputDocument> batch
        updateRequest.lastDocInBatch();

        try {
            UpdateResponse process = updateRequest.process(solrClient);
            out.println("xhk205 process = " + process.toString());
        } catch (Exception e) {
            out.println("Failed to update solr doc, error message: " + e.getMessage());
        }
    }
}
{code}
Output:
{code:java}
Added solr doc for record: id=TWINX2048-3200PRO
Added solr doc for record: id=VS1GB400C3
Added solr doc for record: id=VDBDB1A16_invalid
Added solr doc for record: id=MA147LL/A
Added solr doc for record: id=F8V7067-APL-KIT
Failed to update solr doc, error message: Error from server at
http://localhost:8983/solr/techproducts: Document not found for update.
id=VDBDB1A16_invalid
{code}
Query:
[http://localhost:8983/solr/techproducts/select?fq=hasUserAssertions:true&q=*:*]
{code:java}
{responseHeader: {status: 0,QTime: 0,params: {q: "*:*",fq:
"hasUserAssertions:true"}},response: {numFound: 0,start: 0,docs: [ ]}}
{code}
{quote} # There's a failOnVersionConflicts=false so theoretically only the
invalid id update will fail.{quote}
I understand the problem now.
The behavior you expect is indeed what the reference guide describes.
I'll look into it in more detail when I get some time.
[https://solr.apache.org/guide/8_2/updating-parts-of-documents.html]
{quote}When documents are added/updated in batches even a single version
conflict may lead to rejecting the entire batch. Use the parameter
{{failOnVersionConflicts=false}} to avoid failure of the entire batch when
version constraints fail for one or more documents in a batch.
{quote}
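To make the quoted rule concrete, here is a minimal sketch in plain Java (no SolrJ) of the documented {{_version_}} constraint and of what {{failOnVersionConflicts=false}} is described as doing for a batch. It is a toy model, not Solr's implementation: the class name, method names, and the in-memory map standing in for the index are all hypothetical, and treating a rejected batch as all-or-nothing is an assumption of the model. It shows the documented intent, not the behavior observed in this issue.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy model; none of these names exist in Solr or SolrJ.
public class VersionConstraintSketch {

    /**
     * Documented _version_ rules: 0 = no check, negative = document must not exist,
     * 1 = document must exist, greater than 1 = indexed version must match exactly.
     * 'indexed' is null when the document is not in the (simulated) index.
     */
    static boolean constraintHolds(long requested, Long indexed) {
        if (requested == 0) return true;
        if (requested < 0) return indexed == null;
        if (requested == 1) return indexed != null;
        return indexed != null && indexed == requested;
    }

    /**
     * Returns the ids whose constraint holds. With skipConflicts == false a single
     * failed constraint rejects the whole batch (modelled here as an exception);
     * with skipConflicts == true the failing document is skipped.
     */
    static List<String> applyBatch(Map<String, Long> index,
                                   Map<String, Long> batch,
                                   boolean skipConflicts) {
        List<String> applied = new ArrayList<>();
        for (Map.Entry<String, Long> e : batch.entrySet()) {
            if (constraintHolds(e.getValue(), index.get(e.getKey()))) {
                applied.add(e.getKey());
            } else if (!skipConflicts) {
                throw new IllegalStateException("version conflict for id=" + e.getKey());
            }
        }
        return applied;
    }

    public static void main(String[] args) {
        Map<String, Long> index = Map.of("A", 100L, "B", 101L);
        Map<String, Long> batch = new LinkedHashMap<>();
        batch.put("A", 1L);        // exists, so the constraint holds
        batch.put("missing", 1L);  // does not exist, so the constraint fails
        batch.put("B", 1L);        // exists, so the constraint holds

        System.out.println(applyBatch(index, batch, true)); // prints [A, B]
        try {
            applyBatch(index, batch, false);
        } catch (IllegalStateException ex) {
            System.out.println(ex.getMessage()); // prints version conflict for id=missing
        }
    }
}
```

In this model, the reproduction above corresponds to the {{skipConflicts == true}} case, where one would expect the four valid ids to be updated and only the invalid one to be skipped.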
> exception in updateRequest caused all subsequent update fail
> ------------------------------------------------------------
>
> Key: SOLR-15417
> URL: https://issues.apache.org/jira/browse/SOLR-15417
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Components: UpdateRequestProcessors
> Affects Versions: 8.5.1
> Reporter: xuanyu huang
> Priority: Minor
>
> Hi there,
> I'm using SolrJ 8.8.2 with a Solr 8.5.1 server. I have a list of records, and
> in a for loop I construct an updateRequest to update each record.
> The code looks like this:
> {code:java}
> int idx = 1; // assumed: the counter is declared before the loop
> for (Map<String, Object> map : maps) {
>     if (map.containsKey("record_uuid")) {
>         UpdateRequest updateRequest = new UpdateRequest();
>         updateRequest.setAction(UpdateRequest.ACTION.COMMIT, false, false);
>         SolrInputDocument doc = new SolrInputDocument();
>         // The 3rd record gets an id that does not exist in the index.
>         if (idx == 3) {
>             doc.addField("id", map.get("record_uuid") + "_invalid");
>         } else {
>             doc.addField("id", map.get("record_uuid"));
>         }
>         idx++;
>         doc.addField("hasUserAssertions", new HashMap<String, Object>() {{
>             put("set", true);
>         }});
>         // this makes sure update only succeeds when record with specified
>         // id exists
>         doc.addField("_version_", 1);
>         logger.debug("Added solr doc for record: " + doc.get("id"));
>         updateRequest.add(doc);
>         try {
>             updateRequest.setParam("failOnVersionConflicts", "false");
>             UpdateResponse process = updateRequest.process(solrClient);
>             System.out.println("xhk205 process = " + process.toString());
>         } catch (Exception e) {
>             logger.error("Failed to update solr doc, error message: " +
>                     e.getMessage(), e);
>         }
>     }
> }
> {code}
> There are 5 requests in total, and I intentionally set the id in the 3rd
> request to an invalid id so that the updateRequest for the 3rd record fails.
> (This mimics the situation where the record to be updated no longer exists in
> Solr. I only want updates with a valid id to succeed; updates with an invalid
> id should be rejected instead of creating a new record in Solr, so I used
> _version_=1.)
>
> I also used the partial update syntax.
> The variable doc looks like this:
> {code:java}
> {
> "id":"2d4b625d-8809-461f-b19b-d0c963e038ed",
> "hasUserAssertions":{"set":true}
> }
> {code}
>
> {color:#de350b}Since each update is put into its own request, I expected only
> the 3rd request to fail, because there is no record with that id and I set
> _version_ to 1. In reality, only the first 2 records were updated and the
> other 3 were not.{color}
> {color:#de350b}When I queried in the Solr admin console after the update, with
> [http://localhost:8983/solr/biocache/select?fq=hasUserAssertions:true&q=*:*]
> only 2 records were returned instead of 4.{color}
>
> Below is the log of IntelliJ IDEA:
>
> {code:java}
> - Added solr doc for record: id=429cfa88-2e18-46b0-ab9f-f4efd9e36c3c
> xhk205 process = {NOTE=the request is processed in a background stream}
> - Added solr doc for record: id=5a80561b-a68d-46a3-a59b-03d267f35d0e
> xhk205 process = {NOTE=the request is processed in a background stream}
> - Added solr doc for record: id=ff2dcbee-9c05-491f-91a8-9f1fec348546_invalid
> xhk205 process = {NOTE=the request is processed in a background stream}
> - Added solr doc for record: id=baf7af1f-1525-403a-95bf-e28e432f1b12
> xhk205 process = {NOTE=the request is processed in a background stream}
> - Added solr doc for record: id=4ea76605-c262-409b-845e-213f11ea4e34
> xhk205 process = {NOTE=the request is processed in a background stream}{code}
> {code:java}
> 2021-05-19 14:12:16,827 ERROR: [ConcurrentUpdateSolrClient] - error
> org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error
> from server at http://localhost:8983/solr/biocache: Conflict request:
> http://localhost:8983/solr/biocache/update?commit=true&softCommit=false&waitSearcher=false&failOnVersionConflicts=false&wt=javabin&version=2
> Remote error message: Document not found for update.
> id=ff2dcbee-9c05-491f-91a8-9f1fec348546_invalid at
> org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.sendUpdateStream(ConcurrentUpdateSolrClient.java:394)
> at
> org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.run(ConcurrentUpdateSolrClient.java:191)
> at
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0{code}
>
>
> {color:#de350b}The 3rd update obviously caused an exception. But why didn't
> the 4th and 5th updates succeed? Is it possible that this exception left the
> Solr client or server in an unusable state, so that all subsequent updates
> failed?{color}
>
>
>