Here are some rough notes after running the unit tests, reviewing some of the code (though not understanding it), and reviewing the wiki page (http://wiki.apache.org/solr/SolrCloud):
- We need a protocol in the URL, otherwise it's inflexible.
- I'm overwhelmed by all the "??" question areas of the document. The page is huge, which signals to me that maybe we're trying to do too much.
- Revamping distributed search could be done in a different branch (this includes partial results).
- Having a single solrconfig and schema for each core/shard in a collection won't work for me. I need to define each core externally, and I don't want Solr-Cloud to manage this. How will this scenario work?
- A host is about the same as a node. I don't see the difference, or not enough of one.
- Cluster resizing and rebalancing can and should be built externally, hopefully after an initial release that does the basics well.
- Is a collection a group of cores?
- I like the model -> reality system. However, how does the versioning work? We need to know what the conversion progress is. How will the queuing of in-progress alterations work? This seems hard, and I'd rather focus on it and make it work well than mess with other things like load balancing in the first release. If this doesn't work well, Solr-Cloud isn't production ready for me.
- Shard identification: this falls under "too ambitious right now", IMO.

I think we need a wiki page covering just the basics of core/shard management. We should implement that, then build all the rest of the features on top. Otherwise this thing feels like it's going to be a nightmare to test and deploy in production.