[ 
https://issues.apache.org/jira/browse/SOLR-12708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16650196#comment-16650196
 ] 

Mano Kovacs commented on SOLR-12708:
------------------------------------

[~varunthacker], thank you for the explanation! I am a still a bit behind, I 
think I don't understand one part.

bq. CreateShardCmd is one core admin API call . So the response is either a 
success or failure. Hence the if-else block covers it.
I don't understand this part. I used this command as base, as similarly to the 
restore command; create shard operation involves adding multiple replicas. I 
think this addReplica command in {{CreateShardCmd}} is called multiple due to 
[this 
for-cycle|http://github.mtv.cloudera.com/CDH/lucene-solr/blob/cdh6.x/solr/core/src/java/org/apache/solr/cloud/api/collections/CreateShardCmd.java#L91].
 I assume this is the reason why multiple failures could be catched.

bq. This way we process requests and responses are very complicated for some 
reason and we should improve it in general . But do you see what I am seeing 
here?
Not sure. I think it is more beneficial to see every failure, instead of the 
first/last one. Especially since they are executed parallel and might have 
side-effects that require cleanup.

> Async collection actions should not hide failures
> -------------------------------------------------
>
>                 Key: SOLR-12708
>                 URL: https://issues.apache.org/jira/browse/SOLR-12708
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: Admin UI, Backup/Restore
>    Affects Versions: 7.4
>            Reporter: Mano Kovacs
>            Assignee: Varun Thacker
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Async collection API may hide failures compared to sync version. 
> [OverseerCollectionMessageHandler::processResponses|https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/cloud/api/collections/OverseerCollectionMessageHandler.java#L744]
>  structures errors differently in the response, that hides failures from most 
> evaluators. RestoreCmd did not receive, nor handle async addReplica issues.
> Sample create collection sync and async result with invalid solrconfig.xml:
> {noformat}
> {
> "responseHeader":{
> "status":0,
> "QTime":32104},
> "failure":{
> "localhost:8983_solr":"org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error
>  from server at http://localhost:8983/solr: Error CREATEing SolrCore 
> 'name4_shard1_replica_n1': Unable to create core [name4_shard1_replica_n1] 
> Caused by: The content of elements must consist of well-formed character data 
> or markup.",
> "localhost:8983_solr":"org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error
>  from server at http://localhost:8983/solr: Error CREATEing SolrCore 
> 'name4_shard2_replica_n2': Unable to create core [name4_shard2_replica_n2] 
> Caused by: The content of elements must consist of well-formed character data 
> or markup.",
> "localhost:8983_solr":"org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error
>  from server at http://localhost:8983/solr: Error CREATEing SolrCore 
> 'name4_shard1_replica_n2': Unable to create core [name4_shard1_replica_n2] 
> Caused by: The content of elements must consist of well-formed character data 
> or markup.",
> "localhost:8983_solr":"org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error
>  from server at http://localhost:8983/solr: Error CREATEing SolrCore 
> 'name4_shard2_replica_n1': Unable to create core [name4_shard2_replica_n1] 
> Caused by: The content of elements must consist of well-formed character data 
> or markup."}
> }
> {noformat}
> vs async:
> {noformat}
> {
> "responseHeader":{
> "status":0,
> "QTime":3},
> "success":{
> "localhost:8983_solr":{
> "responseHeader":{
> "status":0,
> "QTime":12}},
> "localhost:8983_solr":{
> "responseHeader":{
> "status":0,
> "QTime":3}},
> "localhost:8983_solr":{
> "responseHeader":{
> "status":0,
> "QTime":11}},
> "localhost:8983_solr":{
> "responseHeader":{
> "status":0,
> "QTime":12}}},
> "myTaskId2709146382836":{
> "responseHeader":{
> "status":0,
> "QTime":1},
> "STATUS":"failed",
> "Response":"Error CREATEing SolrCore 'name_shard2_replica_n2': Unable to 
> create core [name_shard2_replica_n2] Caused by: The content of elements must 
> consist of well-formed character data or markup."},
> "status":{
> "state":"completed",
> "msg":"found [myTaskId] in completed tasks"}}
> {noformat}
> Proposing adding failure node to the results, keeping backward compatible but 
> correct result.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Reply via email to