[ 
https://issues.apache.org/jira/browse/SOLR-15221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17296329#comment-17296329
 ] 

Michael Gibney commented on SOLR-15221:
---------------------------------------

I was initially (and still am) inclined to think that this should be resolved 
in favor of consistently propagating errors to the client. But my first stab at 
"fixing" it (I tried several different ways of adding the relevant {{error}} to 
{{errorsForClient}} in 
[DistributedZkUpdateProcessor.doDistribFinish()|https://github.com/apache/lucene-solr/blob/99a4bbf3a0ab93/solr/core/src/java/org/apache/solr/update/processor/DistributedZkUpdateProcessor.java#L1075-L1100]),
 while mostly successful, reliably breaks exactly one test 
([HttpPartitionOnCommitTest.test|https://github.com/apache/lucene-solr/blob/99a4bbf3a0ab93/solr/core/src/test/org/apache/solr/cloud/HttpPartitionOnCommitTest.java#L181-L197])
 in the existing test suite. This makes me think that I'm missing something 
about replica recovery (or a related mechanism), so I'm not sure how to proceed.

I've attached [^SOLR-15221-initial-tests.patch] with several tests that 
demonstrate the existing behavior as-is (these pass) and one {{AwaitsFix}} test 
that fails when asserting consistency in responses across cases where different 
replicas throw errors.
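
To make the inconsistency concrete, here is a toy model of the two behaviors 
under discussion. It is only a sketch and not Solr code: the class, the method 
names, and the use of {{500}} as the error status are all hypothetical, chosen 
purely for illustration.

```java
import java.util.List;

/** Toy model only; this is not Solr code and every name here is hypothetical. */
final class CommitStatusSketch {

    /** Behavior described in the issue: only the "local" core's outcome reaches the client. */
    public static int currentStatus(boolean localOk, List<Boolean> remoteOk) {
        // Remote commit outcomes are ignored entirely.
        return localOk ? 200 : 500; // 500 is an assumed error status
    }

    /** One possible consistent behavior: a failure on any core reaches the client. */
    public static int consistentStatus(boolean localOk, List<Boolean> remoteOk) {
        boolean allRemoteOk = remoteOk.stream().allMatch(Boolean::booleanValue);
        return (localOk && allRemoteOk) ? 200 : 500;
    }
}
```

With a successful local commit and remote outcomes of (fail, succeed), 
{{currentStatus}} returns {{200}} while {{consistentStatus}} returns {{500}}; 
that gap is what the {{AwaitsFix}} test is meant to expose.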

> Distributed commit errors are not propagated to the initiating client 
> ----------------------------------------------------------------------
>
>                 Key: SOLR-15221
>                 URL: https://issues.apache.org/jira/browse/SOLR-15221
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: SolrCloud
>    Affects Versions: master (9.0)
>            Reporter: Michael Gibney
>            Priority: Minor
>         Attachments: SOLR-15221-initial-tests.patch
>
>
> Distributed commit errors are not currently propagated back to the client 
> that initially issued the commit command. So, any commit (e.g., issued via 
> {{CloudSolrClient}}, {{curl}} to HTTP API, etc.) responds with HTTP status 
> code {{200}}, API status {{0}}, as long as the commit to the "local" core 
> arbitrarily associated with the request succeeds. This happens no matter how 
> many distributed commits succeed or fail (at least, to other leader replicas 
> -- I've only tested w/ replication factor 1 at the moment).
> Inconsistency -- i.e. an error on an arbitrarily-determined "local" replica 
> propagates to the client, but an error on all other replicas does 
> not -- is the focus of this issue; but this issue is raised with no 
> preconceived notions wrt _how_ the inconsistency should be resolved.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

