Well, I have created issue SOLR-4114 on the subject. Patch coming up.

Regards, Per Steffensen

Per Steffensen wrote:
Mark Miller wrote:
The Collections API was fairly rushed - so that 4.0 had something easier than 
the CoreAdmin API.
Yes I see. Our collection-creation code is more sophisticated than yours. We probably would like to migrate to the Solr Collection API now anyway - to be using it already when features are added later.
Due to that, it has a variety of limitations:

1. It only picks instances for a collection one way - randomly from the list of 
live instances. This means it's no good for multiple shards on the same 
instance. You should have enough instances to satisfy numShards X 
replicationFactor (although just being short on replicationFactor will 
currently just use what is there)
Well, I think it shuffles the list of live nodes and then begins assigning shards from one end. That is OK for us for now. But it will not start over in the list of live nodes when there are more shards (shards * replicas) than instances. This could easily be achieved without building a very fancy allocation algorithm.
2. It randomly chooses which instances to use rather than allowing manual 
specification or looking at existing cores.
A manual spec would be nice, to be able to control everything if you really want to. But you probably also want to provide different built-in shard-allocation strategies that can be used out of the box, e.g. an "AlwaysAssignNextShardToInstanceWithFewestShardsAlready" strategy. There are also other concerns that people might find more interesting to have built into assignment algorithms, e.g. a rack-aware algorithm that assigns replicas of the same slice to instances running on different "racks".
3. You cannot get responses of success or failure other than polling for the 
expected results later.
Well we do that anyway, and will keep doing that in our own code for now.
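
To make the strategy ideas under point 2 concrete, here is a rough sketch of a "fewest shards already" allocation with an optional rack preference. It is plain Python and not Solr code: the function name, the core labels such as shard1_replica1, and the existing_load and rack_of inputs are all assumptions made up for illustration.

```python
def assign_least_loaded(live_nodes, num_shards, replication_factor,
                        existing_load=None, rack_of=None):
    """Hypothetical sketch: place each replica on the least-loaded node,
    preferring nodes and racks that do not yet hold a replica of the shard."""
    if not live_nodes:
        raise ValueError("at least one live node is required")
    # Cores each node already hosts (assumed input; defaults to zero).
    load = {n: (existing_load or {}).get(n, 0) for n in live_nodes}
    rack_of = rack_of or {}
    placement = {}
    for shard in range(1, num_shards + 1):
        used_nodes, used_racks = set(), set()
        for replica in range(1, replication_factor + 1):
            def cost(node):
                # Without rack information, treat each node as its own rack.
                rack = rack_of.get(node, node)
                # Tuple order: avoid reusing a node, then a rack, then pick
                # the lowest load; the node name breaks ties deterministically.
                return (node in used_nodes, rack in used_racks,
                        load[node], node)
            node = min(live_nodes, key=cost)
            placement[f"shard{shard}_replica{replica}"] = node
            load[node] += 1
            used_nodes.add(node)
            used_racks.add(rack_of.get(node, node))
    return placement
```

With nodes a and b on rack r1 and node c on rack r2, one shard with two replicas ends up on a and c, so the replicas sit on different racks. A node that already hosts many cores is skipped until the others catch up, which also means several shards may share one lightly loaded instance.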
Someone has a patch up for 3 that I hope to look at soon - others have 
contributed bug fixes that will be in 4.1. We still need to add the ability to 
control placement in other ways though.

I would say there are def plans, but I don't personally know exactly when I'll 
find the time for it, if others don't jump in.
Well, I would like to jump in with respect to adding support for running several shards of the same collection on the same instance. It is just so damn hard to get you to commit stuff :-) and we really don't want to have too many differences between our Solr and Apache Solr (we have enough already - SOLR-3178 etc.). It seems like running several shards on the same instance is the only feature missing from the Collection API before we can "live with it".
- Mark
Regards, Per Steffensen
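
For reference, the "start over in the list of live nodes" idea from point 1 could look roughly like the sketch below. This is plain Python and not the actual Collections API code: the function name, the seed parameter and the core labels are assumptions chosen for illustration only.

```python
import random


def assign_with_wraparound(live_nodes, num_shards, replication_factor,
                           seed=None):
    """Hypothetical sketch: shuffle the live nodes, then hand out cores in
    order, wrapping back to the start of the list when the nodes run out."""
    if not live_nodes:
        raise ValueError("at least one live node is required")
    nodes = list(live_nodes)
    random.Random(seed).shuffle(nodes)  # seed only makes the sketch repeatable
    placement = {}
    slot = 0
    for shard in range(1, num_shards + 1):
        for replica in range(1, replication_factor + 1):
            # The modulo is the wrap-around: slot may exceed len(nodes).
            placement[f"shard{shard}_replica{replica}"] = nodes[slot % len(nodes)]
            slot += 1
    return placement
```

With 3 live nodes, 4 shards and a replication factor of 2, all 8 cores get a home: two nodes host 3 cores each and one hosts 2. Because replicas of a shard take consecutive slots, they land on different nodes as long as there are at least as many nodes as replicas per shard.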
