[ 
https://issues.apache.org/jira/browse/SOLR-18188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18091284#comment-18091284
 ] 

Chris M. Hostetter commented on SOLR-18188:
-------------------------------------------

{quote}The JettySolrRunner proxy ordering change from the June 5 commit 
(8c49326e71b) is a red herring — it only affects JettySolrRunner.stop(), not 
test level proxy.close() calls.
{quote}
For these tests, I was actually more suspicious of the June-9 
[31405c513d3|https://gitbox.apache.org/repos/asf?p=solr.git;h=31405c513d3] 
change (and its June-16 backport to 10x): "Remove Apache HttpClient usage from 
test infra and tests"

Grepping [the archive of my daily 
reports|https://fucit.org/solr-jenkins-reports/reports/archive/daily/] ...
 * the first failure of ReplicationFactorTest on main was 2026-06-10, and it 
didn't fail on 10x until 2026-06-17
 * the first failure of RecoveryAfterSoftCommitTest on main was 2026-06-11, and 
it didn't fail on 10x until 2026-06-18

 
----
{quote}RandomizingCloudSolrClientBuilder randomizes shardLeadersOnly. When 
randomized to false, the CloudSolrClient falls through to LBSolrClient for 
routing, which can pick the partitioned replica's endpoint.
{quote}
Given that the purpose of both of these tests is to see how Solr behaves when a 
non-leader replica can't be reached by the leader, I would think it certainly 
makes sense (in the long term) to change them to use a CloudSolrClient with 
{{shardLeadersOnly=true}} explicitly set (or a client that explicitly sends 
directly to the leader).

... *_BUT_* ....

If the reason these seeds aren't reproducible is in fact because the closed 
proxy is *sometimes* getting picked by the client when 
{{shardLeadersOnly==false}}, that implies that these tests could be 
(temporarily) patched to use an {{UpdateRequest}} that has:
 * {{setRequestType(QUERY)}} to return {{QUERY}}
 * {{setPreferredNodes(nodeWithClosedProxy)}}

...IIUC a patch like that applied to main should cause the "current" code to 
fail reliably with ClosedChannelException, which would make it possible to 
debug exactly what change in the code base caused the test failures, and to 
figure out why the client used to retry in these situations and no longer 
does?

 
----
{quote} I'm very skeptical that ClosedChannelException can be considered to 
only occur for a connection opening, which is what 
{{org.apache.solr.client.solrj.impl.LBSolrClient#isConnectException}} is 
supposed to be limited to.
{quote}
Hmmm... except I think it's important to consider the "intent" of 
{{isConnectException}} – particularly given the time/context it was written: 
HTTP1.

IIUC the "goal" of that method was to say "this is something that failed at the 
HTTP connection level, before we could even send the request, therefore it is 
always retryable" – with HTTP1 that was basically just 
{{java.net.ConnectException}}. But with HTTP2's added complexity, where 
"channel != connection", maybe there are some situations where it _does_ make 
sense to retry on {{ClosedChannelException}}? ... I'm not sure.
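
To make that distinction concrete, here is a minimal Java sketch of the kind of classification being discussed. It is not the actual {{LBSolrClient}} code: the {{RetryClassifier}} class, its method names, and the {{requestNotYetSent}} flag are all hypothetical, and it assumes the caller can somehow tell whether any request bytes were written before the failure.

```java
import java.net.ConnectException;
import java.nio.channels.ClosedChannelException;

/** Hypothetical helper; illustrates the idea only, not LBSolrClient's logic. */
final class RetryClassifier {
  private RetryClassifier() {}

  /** True if the throwable, or anything in its cause chain, is of the given type. */
  static boolean hasCause(Throwable t, Class<? extends Throwable> type) {
    int depth = 0; // bound the walk in case of a cyclic cause chain
    for (Throwable cur = t; cur != null && depth < 32; cur = cur.getCause(), depth++) {
      if (type.isInstance(cur)) {
        return true;
      }
    }
    return false;
  }

  /** The HTTP1-era notion: the connection could not be opened at all. */
  static boolean isConnectFailure(Throwable t) {
    return hasCause(t, ConnectException.class);
  }

  /**
   * A connect failure is always treated as retryable. A closed channel is
   * treated as retryable only when the caller asserts (assumption) that the
   * request had not been sent yet.
   */
  static boolean isRetryable(Throwable t, boolean requestNotYetSent) {
    if (isConnectFailure(t)) {
      return true;
    }
    return requestNotYetSent && hasCause(t, ClosedChannelException.class);
  }
}
```

The open question above is essentially whether a reliable equivalent of {{requestNotYetSent}} exists for HTTP2; without one, treating every {{ClosedChannelException}} as retryable could re-send a request that was already (partially) processed.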

> solr-test-framework: Remove Apache HttpClient usages
> ----------------------------------------------------
>
>                 Key: SOLR-18188
>                 URL: https://issues.apache.org/jira/browse/SOLR-18188
>             Project: Solr
>          Issue Type: Task
>          Components: test-framework, Tests
>            Reporter: David Smiley
>            Assignee: David Smiley
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 10.1
>
>          Time Spent: 6h 50m
>  Remaining Estimate: 0h
>
> As of this writing, the last usages of Apache HttpClient are in Solr's tests. 
>  This issue aims to remove it completely.  But it's a lot of work.
> Some possible steps:
>  * Remove tests for our HttpSolrClient & friends (fundamentally based on 
> Apache HttpClient)
>  * Replace usages of HttpSolrClient.getHttpClient with 
> HttpJettySolrClient.getHttpClient
>  * Replace usages of HttpSolrClient.getBaseURL by introducing a new base 
> client that has this method.  Or access similarly from Jetty when easily 
> available.
>  * of course, stop using HttpSolrClient & friends.  Maybe class-by-class.


