Re: CloudSolrClient getDocCollection

2019-02-10 Thread Erick Erickson
"But I would assume it should still be ok. The number of watchers should still not be gigantic." This assumption would need to be rigorously tested before I'd be comfortable. I've spent quite a bit of time with unhappy clients chasing down issues in the field where 1> it takes hours to

Re: CloudSolrClient getDocCollection

2019-02-10 Thread Hendrik Haddorp
I have now opened https://issues.apache.org/jira/browse/SOLR-13239 for the problem I observed. Well, who can really be sure about those things? But I would assume it should still be ok. The number of watchers should still not be gigantic. I have setups with about 2000 collections each but far less

Re: CloudSolrClient getDocCollection

2019-02-09 Thread Erick Erickson
Jason's comments are exactly why there _is_ a state.json per collection rather than the single clusterstate.json in the original implementation. Hendrik: yes, please do open a JIRA for the condition you observed, especially if you can point to the suspect code. There have been intermittent issues

Re: CloudSolrClient getDocCollection

2019-02-08 Thread Hendrik Haddorp
Hi Jason, thanks for your answer. Yes, you would need one watch per state.json and thus one watch per collection. That should, however, not really be a problem with ZK. I would assume that the Solr server instances need to monitor those nodes to be up to date on the cluster state. Using

Re: CloudSolrClient getDocCollection

2019-02-08 Thread Jason Gerlowski
Hi Hendrik, I'll try to answer, and let others correct me if I stray. I wasn't around when CloudSolrClient was written, so take this with a grain of salt: "Why does the client need that timeout? Wouldn't it make sense to use a watch?" You could probably write a CloudSolrClient that uses