Hi all,

Recently in a 7.4.0 test cluster, we ran into SOLR-12814 <https://issues.apache.org/jira/browse/SOLR-12814>, which we fixed by slightly increasing the request header size. However, there were some other log messages alongside the "URI size >8192" message that we thought were related, but they have not abated since we increased the header size. A full shutdown of the Solr processes and bringing them back up one at a time did not solve the issue.

The overseer node seems to not be authenticating any of the requests to /solr/admin/metrics on any node (including itself). Every minute, there are two warnings per node:

10/17/2018, 7:53:45 AM WARN SolrClientNodeStateProvider could not get tags from node host1:port1_solr
10/17/2018, 7:53:45 AM WARN SolrClientNodeStateProvider could not get tags from node host1:port1_solr
10/17/2018, 7:53:46 AM WARN SolrClientNodeStateProvider could not get tags from node host2:port2_solr
10/17/2018, 7:53:46 AM WARN SolrClientNodeStateProvider could not get tags from node host2:port2_solr
10/17/2018, 7:53:46 AM WARN SolrClientNodeStateProvider could not get tags from node host3:port3_solr
10/17/2018, 7:53:46 AM WARN SolrClientNodeStateProvider could not get tags from node host3:port3_solr

There are two slightly different stack traces that appear with each pair: https://pastebin.com/2Z1C5rXr

The warning message possibly comes from solrj.impl.SolrClientNodeStateProvider.fetchMetrics, which both of the attempted requests call in their stack trace.

However, we already have a 7.4.0 production cluster running that also has security enabled, with similar replica density, where we have not seen this issue.

*Test:*
-- 10 collections (9 with 2 shards, 1 with 43 shards)
-- replication factor of 2 for all collections
-- 3 hosts with 40 or 41 replicas each

*Production:*
-- 9 collections with 14 shards
-- replication factor of 2 for all collections
-- 7 hosts with 36 replicas each

I've enabled TRACE logging in our test environment on most options related to metrics and authentication. So far the only new message I've gotten is the challenge from the target server for the necessary credentials, right before the warning and stack trace:

2018-10-17 12:20:46.368 DEBUG (MetricsHistoryHandler-8-thread-1) [ ] o.a.h.i.a.HttpAuthenticator Authentication required
2018-10-17 12:20:46.368 DEBUG (MetricsHistoryHandler-8-thread-1) [ ] o.a.h.i.a.HttpAuthenticator host3:port3 requested authentication

I suspect the creation and balancing of the large collection on test might have something to do with it, since the problem started happening after that. Are there any other specific log settings I should turn on that might produce some useful information?

Thanks,
Chris
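
For reference, one way to see whether a node challenges a request to /solr/admin/metrics outside of Solr itself is to issue the request by hand, once without and once with credentials. This is only a minimal sketch under assumptions that are not from the cluster above: it assumes plain HTTP and Basic authentication, and the port, user name, and password in the usage note are hypothetical placeholders.

```python
import base64
import urllib.error
import urllib.parse
import urllib.request


def metrics_url(host, port, group="node"):
    """Build the metrics endpoint URL for one node (plain HTTP is an assumption)."""
    query = urllib.parse.urlencode({"group": group, "wt": "json"})
    return f"http://{host}:{port}/solr/admin/metrics?{query}"


def basic_auth_header(user, password):
    """Return the value of an HTTP Basic Authorization header (Basic auth is an assumption)."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def probe(url, user=None, password=None, timeout=10):
    """GET the URL and return the HTTP status code.

    A 401 means the node asked for credentials, which would match the
    "requested authentication" DEBUG lines. Connection failures are not
    caught here and will raise urllib.error.URLError.
    """
    request = urllib.request.Request(url)
    if user is not None:
        request.add_header("Authorization", basic_auth_header(user, password))
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses arrive as exceptions; the status code is still useful.
        return err.code
```

Calling `probe(metrics_url("host3", 8983))` and then `probe(metrics_url("host3", 8983), "someuser", "somepassword")` against each node would show whether the endpoint rejects only unauthenticated requests or valid credentials as well.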