Re: Improving Solr performance

Jonathan Rochkind Mon, 10 Jan 2011 14:37:14 -0800

On 1/10/2011 5:03 PM, Dennis Gearon wrote:

What I seem to see suggested here is to use different cores for the things you
suggested:
   different types of documents
   Access Control Lists


I wonder how sharding would work in that scenario?

Sharding has nothing to do with that scenario at all. Different cores are essentially _entirely separate_. While it can be convenient to use different cores like this, it means you don't get ANY searches that 'join' over multiple 'kinds' of data in different cores.

Solr is not great at handling heterogeneous data like that. Putting it in separate cores is one solution, although then they are entirely separate. If that works, great. Another solution is putting them in the same index, but using mostly different fields, and perhaps having a 'type' field shared amongst all of your 'kinds' of data, and then always querying with an 'fq' for the right 'kind'. Or if the fields they use are entirely different, you don't even need the fq, since a query on a certain field will only match a certain 'kind' of document.
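To make the single-index approach concrete, here is a rough sketch of building a query that always carries an 'fq' on a shared 'type' field. The base URL, the field name 'type', and the value 'book' are made up for illustration; substitute whatever your own schema uses. It only builds the URL and does not talk to Solr.

```python
from urllib.parse import urlencode


def build_select_url(base_url, user_query, kind=None, rows=10):
    """Build a Solr /select URL, optionally restricted to one 'kind' of document.

    'type' is an assumed field name shared by all documents in the index.
    """
    params = [("q", user_query), ("rows", str(rows))]
    if kind is not None:
        # The filter query narrows results to one 'kind' without
        # affecting relevance scoring of the main query.
        params.append(("fq", f"type:{kind}"))
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical host and field values, for illustration only.
url = build_select_url("http://localhost:8983/solr", "title:performance", kind="book")
print(url)
```

If the 'kinds' use entirely different fields, call it without kind and the fq is simply left off.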

Solr is not great at handling complex queries over data with heterogeneous schemata. Solr wants you to flatten all your data into one single set of documents.

Sharding is a way of splitting up a single index (multiple cores are _multiple indexes_) amongst several hosts for performance reasons, mostly when you have a very large index. That is it. The end. If you have multiple cores, that's the same as having multiple Solr indexes (which may or may not happen to be on the same machine). Any one or more of those cores could be sharded if you want. This is a separate issue.
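For what it's worth, a sharded search in Solr's distributed search is requested by passing a 'shards' parameter listing the hosts that each hold a piece of the one logical index. A minimal sketch of building such a request follows; the host names are invented, and it assumes every shard shares the same schema. It only constructs the URL.

```python
from urllib.parse import urlencode


def build_sharded_url(entry_point, user_query, shard_hosts):
    """Build a /select URL that asks one Solr instance to fan the query
    out over several shards of the same logical index."""
    params = {
        "q": user_query,
        # Comma-separated host:port/path entries, without the http:// prefix.
        "shards": ",".join(shard_hosts),
    }
    return f"{entry_point}/select?{urlencode(params)}"


# Hypothetical hosts, for illustration only.
shards = ["solr1.example.com:8983/solr", "solr2.example.com:8983/solr"]
url = build_sharded_url("http://solr1.example.com:8983/solr", "title:performance", shards)
print(url)
```

Note that nothing here mentions document 'kinds' or cores: the shard list is purely about where the pieces of one index live.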
