Thanks for documenting this David, we actually ran into an issue with #A this week, so it's great to see y'all are already working on it.
I agree with your recommendations, and think we can pare things down quite a bit. The local-node request idea is also cool! - Houston On Thu, Sep 5, 2024 at 2:26 PM David Smiley <dsmi...@apache.org> wrote: > There are too many ways to submit an intra-cluster request within Solr. > It’s been gnawing at me so I did an audit and made recommendations: > > There are too many ways to submit an intra-cluster request within Solr. > It’s been gnawing at me so I did an audit and made recommendations: > > A. Creating new Http SolrClient (with or without Cloud) from > org.apache.solr.update.UpdateShardHandler#getDefaultHttpClient — very > popular, 14 usages. > Recommendation: This is a poor place for such a client; will be moved > shortly in SOLR-16503 to CoreContainer and return an Http2SolrClient > instead of HttpClient. > > B. org.apache.solr.client.solrj.cloud.SolrCloudManager#request (default > impl is essentially a CloudLegacySolrClient). 3 usages. > I’m suspicious we need this; either embrace or abandon IMO. > Recommendation: embrace but migrate to getCloudSolrClient and add > getHttpSolrClient too, the latter sourced from CoreContainer. Then allows > requestAsync usage and hopefully will also have a URL-specific method for > SOLR-17256 > right next to it, there’s an httpRequest method that is now unused. > Recommendation: remove > > C. org.apache.solr.update.UpdateShardHandler#getUpdateOnlyHttpClient for > “updates” only. 2 usages > > D. org.apache.solr.update.UpdateShardHandler#getRecoveryOnlyHttpClient for > “recovery” only. 4 usages. > > E. org.apache.solr.client.solrj.io.SolrClientCache wow, 86 usages!! 68 are > in solrj-streaming since this is where this thing originated from, thus > leaving 18 usages elsewhere but I’m over-counting. The unique classes > using (excluding streaming, excluding a benchmark that tests streaming too) > are 6 usages (classes). > > * Some usages are generic to a configurable SolrCloud cluster > (solrj-streaming definitely): > ReindexCollection and SolrReporter (metrics). > This is the primary purpose/value of SolrClientCache; should continue. > * Some of the other cases just need a cluster-local CloudSolrClient: > SchemaDesigner (2 classes), CollectionsRepairEventListener. > Recommendation: Use SolrCloudManager. > * The other 4 usages are to get an Http SolrClient for a URL. > Recommendation: Use CoreContainer.getDefaultHttpSolrClient moving from > UpdateShardHandler (or similar recommended in SolrCloudManager). Requires > solution to SOLR-17256 like an Http2SolrClient specific request method > specifying an additional URL parameter. > > F. org.apache.solr.handler.component.HttpShardHandler / ShardHandler. > Originally built for distributed-search, which requires supporting multiple > replicas that are equivalent to serve a request for a shard. Later a > number of Overseer/cluster commands started using/abusing it. There are > some API tech-debt / issues I see here as a result of this. Now that > Http2SolrClient has async support (it didn’t when HttpShardHandler was > first created), I wonder if some Overseer usages should use that instead. > I think it could be simpler. > Recommendation: Add javadocs warning users to avoid using it outside of > distributed search. Point out Http2SolrClient.requestAsync. > > As an aside, I want to mention that EmbeddedSolrServer is pretty cool; I > could imagine expanding its use within Solr in some cases to avoid an HTTP > layer and disconnected caller chain. Like to issue an admin call, even a > SolrCloud one (e.g. to move a shard. But it doesn't do collection > routing. It could be neat if the node-local CloudSolrClient proposed above > could somehow detect such an admin call and route through locally to the > node. > > ~ David Smiley > Apache Lucene/Solr Search Developer > http://www.linkedin.com/in/davidwsmiley >