Mark Miller wrote:
The Collections API was fairly rushed - so that 4.0 had something easier than 
the CoreAdmin API.
Yes, I see. Our own collection-creation code is more sophisticated than the Collections API's. But we would probably like to migrate to the Solr Collections API now anyway - so that we are already using it when new features are added later.
Due to that, it has a variety of limitations:

1. It only picks instances for a collection one way - randomly from the list of 
live instances. This means it's no good for multiple shards on the same 
instance. You should have enough instances to satisfy numShards X 
replicationFactor (although just being short on replicationFactor will 
currently just use what is there)
Well, I think it shuffles the list of live nodes and then begins assigning shards from one end. That is OK for us for now. But it will not wrap around in the list of live nodes when there are more cores (shards * replicas) than instances. This could easily be achieved without making a very fancy allocation algorithm - see the sketch below.
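
A minimal sketch of the wrap-around I mean, in Java. All names here (liveNodes, assignCore, etc.) are made up for illustration, not Solr internals:

    import java.util.Arrays;
    import java.util.Collections;
    import java.util.List;

    public class WrapAroundAssignment {
      // Hypothetical helper - in real life this would create a core on the node.
      static void assignCore(String shardId, String node) {
        System.out.println(shardId + " -> " + node);
      }

      public static void main(String[] args) {
        List<String> liveNodes = Arrays.asList("solr1", "solr2", "solr3", "solr4");
        Collections.shuffle(liveNodes);
        int numShards = 8, replicationFactor = 1;
        int cores = numShards * replicationFactor;
        for (int i = 0; i < cores; i++) {
          // Wrap around the live-node list instead of stopping when it runs out.
          assignCore("shard" + (i % numShards + 1),
              liveNodes.get(i % liveNodes.size()));
        }
      }
    }
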
2. It randomly chooses which instances to use rather than allowing manual 
specification or looking at existing cores.
A manual spec would be nice, to be able to control everything if you really want to. But you probably also want to provide different built-in shard-allocation strategies that can be used out of the box - e.g. an "AlwaysAssignNextShardToInstanceWithFewestShardsAlready" strategy. There are also other concerns that might be more interesting for people to have built into assignment algorithms - e.g. a rack-aware algorithm that assigns replicas of the same slice to instances running on different "racks". Something like the sketch below.
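
Roughly what I imagine, as a sketch - all names are made up, nothing like this exists in Solr today:

    import java.util.List;
    import java.util.Map;

    // Hypothetical pluggable strategy: decides which live node gets the next shard.
    interface ShardAssignmentStrategy {
      String pickNode(List<String> liveNodes, Map<String, Integer> shardsPerNode);
    }

    // The "AlwaysAssignNextShardToInstanceWithFewestShardsAlready" strategy.
    class FewestShardsStrategy implements ShardAssignmentStrategy {
      public String pickNode(List<String> liveNodes, Map<String, Integer> shardsPerNode) {
        String best = null;
        int bestCount = Integer.MAX_VALUE;
        for (String node : liveNodes) {
          Integer c = shardsPerNode.get(node);
          int count = (c == null) ? 0 : c;
          if (count < bestCount) {
            best = node;
            bestCount = count;
          }
        }
        return best;
      }
    }
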
3. You cannot get responses of success or failure other than polling for the 
expected results later.
Well we do that anyway, and will keep doing that in our own code for now.
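
E.g. something like this polling loop. The collectionLooksComplete check is hypothetical - in our code it inspects clusterstate.json in ZooKeeper, it is not a real Solr call:

    public class WaitForCollection {
      // Hypothetical: would read clusterstate.json and verify that the
      // collection has the expected number of active shards.
      static boolean collectionLooksComplete(String collection, int expectedShards) {
        return false; // placeholder
      }

      public static void main(String[] args) throws InterruptedException {
        long deadline = System.currentTimeMillis() + 60 * 1000;
        while (System.currentTimeMillis() < deadline) {
          if (collectionLooksComplete("mycollection", 8)) {
            System.out.println("Collection is up");
            return;
          }
          Thread.sleep(1000);
        }
        throw new IllegalStateException("Timed out waiting for collection");
      }
    }
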

Someone has a patch up for 3 that I hope to look at soon - others have 
contributed bug fixes that will be in 4.1. We still need to add the ability to 
control placement in other ways though.

I would say there are definitely plans, but I don't personally know exactly when I'll 
find the time for it, if others don't jump in.
Well, I would like to jump in with respect to adding support for running several shards of the same collection on the same instance - it is just so damn hard to get you to commit stuff :-) and we really don't want to have too many differences between our Solr and Apache Solr (we have enough already - SOLR-3178 etc.). It seems like this feature, several shards on the same instance, is the only missing feature of the Collections API before we can "live with it".
- Mark
Regards, Per Steffensen
On Nov 26, 2012, at 4:57 AM, Per Steffensen <[email protected]> wrote:

Hi

Before upgrading to Solr 4.0.0 we used to handle collection creation 
ourselves, by creating each shard through the low-level CoreAdmin API. We used 
to create multiple shards under the same collection on each Solr server. 
Performance tests have shown that this is a good idea, and it is also a good 
idea for easy elasticity later on - it is much easier to move an entire 
existing shard from one Solr server to another that has just joined the cluster 
than it is to split an existing shard between the Solr server that used to run 
it and the new one.
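
For illustration, this is roughly the kind of request our creation code issues per shard via the CoreAdmin API (host, core and collection names are just examples):

    import java.net.HttpURLConnection;
    import java.net.URL;

    public class CreateShardViaCoreAdmin {
      public static void main(String[] args) throws Exception {
        // One CoreAdmin CREATE call per shard - here shard1 of "mycoll" on solr1.
        String url = "http://solr1:8983/solr/admin/cores"
            + "?action=CREATE&name=mycoll_shard1&collection=mycoll&shard=shard1";
        HttpURLConnection con = (HttpURLConnection) new URL(url).openConnection();
        System.out.println("CoreAdmin CREATE returned HTTP " + con.getResponseCode());
        con.disconnect();
      }
    }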

Now we are trying to migrate to the Solr Collections API for creating collections, but 
it seems like it will not accept multiple shards under the same collection running on the 
same Solr server. E.g. if we have 4 Solr servers and ask to have a collection created 
with 8 shards, all 8 shards will be "created" but only 4 of them will actually 
run - one on each Solr server.
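
For reference, the request we issue looks like this (host and collection name are just examples):

    http://solr1:8983/solr/admin/collections?action=CREATE&name=mycoll&numShards=8

With 4 live Solr servers, that registers 8 slices but only 4 of them end up with a running core.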

Is there a good reason why Solr does not allow multiple shards under the same collection to 
run on the same Solr server, or is it just made this way "by coincidence"? In 
general I seek info on the matter - is it planned for later? etc.

Thanks!

Regards, Per Steffensen
