On 1/10/2011 5:03 PM, Dennis Gearon wrote:
What I seem to see suggested here is to use different cores for the things you
suggested:
   different types of documents
   Access Control Lists

I wonder how sharding would work in that scenario?

Sharding has nothing to do with that scenario at all. Different cores are essentially _entirely separate_. While it can be convenient to use different cores like this, it means you don't get ANY searches that 'join' over multiple 'kinds' of data in different cores.

Solr is not great at handling heterogeneous data like that. Putting it in separate cores is one solution, although then they are entirely separate. If that works, great. Another solution is putting them in the same index, but using mostly different fields, and perhaps having a 'type' field shared amongst all of your 'kinds' of data, and then always querying with an 'fq' for the right 'kind'. Or if the fields they use are entirely different, you don't even need the fq, since a query on a certain field will only match a certain 'kind' of document.
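
To make the 'type' field plus 'fq' idea concrete, here is a rough sketch in Python of building such a request. The host, the field name 'type', the value 'book' and the helper name build_select_url are all made up for illustration; only the q and fq request parameters are Solr's own.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def build_select_url(base_url, user_query, doc_type):
    """Build a Solr select URL restricted to one 'kind' of document."""
    params = [
        ("q", user_query),
        # Filter query on the shared field. The field name 'type' is an
        # assumption; use whatever your schema calls it.
        ("fq", f'type:"{doc_type}"'),
    ]
    return f"{base_url}/select?{urlencode(params)}"

# Hypothetical host and values, for illustration only.
url = build_select_url("http://localhost:8983/solr", "title:lucene", "book")
decoded = parse_qs(urlsplit(url).query)
print(decoded["q"], decoded["fq"])
```

Keeping the 'kind' restriction in fq rather than folding it into q means it does not affect scoring, and Solr can cache the filter separately from the main query.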

Solr is not great at handling complex queries over data with heterogeneous schemata. Solr wants you to flatten all your data into one single set of documents.

Sharding is a way of splitting up a single index (multiple cores are _multiple indexes_) amongst several hosts for performance reasons, mostly when you have a very large index. That is it. The end. If you have multiple cores, that's the same as having multiple Solr indexes (which may or may not happen to be on the same machine). Any one or more of those cores could be sharded if you want. This is a separate issue.
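
For what it's worth, here is a sketch of what querying one sharded index looks like from the client side, assuming Solr's distributed search with the 'shards' request parameter. The host names and the helper name build_sharded_url are invented for the example. Note that every shard listed is a piece of the _same_ logical index, not a different 'kind' of data.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def build_sharded_url(coordinator_url, user_query, shard_hosts):
    """Build a select URL that fans one query out over several shards."""
    params = [
        ("q", user_query),
        # Comma-separated list of the shards holding pieces of one index.
        ("shards", ",".join(shard_hosts)),
    ]
    return f"{coordinator_url}/select?{urlencode(params)}"

# Hypothetical hosts, for illustration only.
shards = ["host1:8983/solr", "host2:8983/solr"]
sharded_url = build_sharded_url("http://host1:8983/solr", "title:lucene", shards)
sharded_params = parse_qs(urlsplit(sharded_url).query)
print(sharded_params["shards"])
```

The query itself is unchanged; the only difference from a non-sharded request is the extra parameter telling Solr where the pieces of the index live.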
