Hi,

If your index is supposed to handle only public information, i.e. public RSS feeds, then I don't see a need for multiple cores.
I would probably try to handle this on the query side only. Imagine this scenario:

User A registers RSS-X and RSS-Y (the application starts pulling and indexing these feeds)
User B registers RSS-Z (the application starts pulling feed Z)
User C registers RSS-X and RSS-Z (the application does nothing, as these are already being indexed)

When searching, add a filter to each user's queries. Solr will handle MANY terms in such a filter, and it is not likely that a human user subscribes to more than, say, a few hundred feeds.

So for user C, the query would look like
.../solr/select?q=foo bar&fq=feedID:(RSS-X OR RSS-Z)

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

On 10. nov. 2010, at 03.00, Adam Estrada wrote:

> Thanks a lot for all the tips, guys! I think that we may explore both
> options just to see what happens. I'm sure that scalability will be a huge
> mess with the core-per-user scenario. I like the idea of creating a user ID
> field and agree that it's probably the best approach. We'll see...I will be
> sure to let the list know what I find! Please don't stop posting your
> comments everyone ;-) My inquiring mind wants to know...
>
> Adam
>
> On Tue, Nov 9, 2010 at 7:34 PM, Jonathan Rochkind <rochk...@jhu.edu> wrote:
>
>> If storing in a single index (possibly sharded if you need it), you can
>> simply include a solr field that specifies the user ID of the saved thing.
>> On the client side, in your application, simply ensure that there is an fq
>> parameter limiting to the current user, if you want to limit to the current
>> user's stuff. Relevancy ranking should work just as if you had 'separate
>> cores', there is no relevancy issue.
>>
>> It IS true that when your index gets very large, commits will start taking
>> longer, which can be a problem. I don't mean commits will take longer just
>> because there is more stuff to commit -- the larger the index, the longer an
>> update to a single document will take to commit.
>>
>> In general, I suspect that having dozens or hundreds (or thousands!) of
>> cores is not going to scale well, it is not going to make good use of your
>> cpu/ram/hd resources. Not really the intended use case of multiple cores.
>>
>> However, you are probably going to run into some issues with the single
>> index approach too. In general, how to deal with "multi-tenancy" in Solr is
>> an oft-asked question that there doesn't seem to be any "just works and does
>> everything for you without needing to think about it" solution for in solr.
>> Judging from past threads. I am not a Solr developer or expert.
>>
>> ________________________________________
>> From: Markus Jelsma [markus.jel...@openindex.io]
>> Sent: Tuesday, November 09, 2010 6:57 PM
>> To: solr-user@lucene.apache.org
>> Cc: Adam Estrada
>> Subject: Re: Using Multiple Cores for Multiple Users
>>
>> Hi,
>>
>>> All,
>>>
>>> I have a web application that requires the user to register and then login
>>> to gain access to the site. Pretty standard stuff...Now I would like to
>>> know what the best approach would be to implement a "customized" search
>>> experience for each user. Would this mean creating a separate core per
>>> user? I think that this is not possible without restarting Solr after each
>>> core is added to the multi-core xml file, right?
>>
>> No, you can dynamically manage cores and parts of their configuration.
>> Sometimes you must reindex after a change, the same is true for reloading
>> cores. Check the wiki on this one [1].
>>
>>>
>>> My use case is this...User A would like to index 5 RSS feeds and User B
>>> would like to index 5 completely different RSS feeds and he is not
>>> interested at all in what User A is interested in. This means that they
>>> would have to be separate index cores, right?
>>
>> If you view documents within an rss feed as separate documents, you can
>> assign a user ID to those documents, creating a multi user index with rss
>> documents per user, or group or whatever.
>>
>> Having a core per user isn't a good idea if you have many users. It takes up
>> additional memory and disk space, doesn't share caches etc. There is also
>> more maintenance and you need some support scripts to dynamically create new
>> cores - Solr currently doesn't create a new core directory structure.
>>
>> But, reindexing a very large index takes up a lot more time and resources and
>> relevancy might be an issue depending on the rss feeds' contents.
>>
>>>
>>> What is the best approach for this kind of thing?
>>
>> I'd usually store the feeds in a single index and shard if it's too many for a
>> single server with your specifications. Unless the demands are too specific.
>>
>>>
>>> Thanks in advance,
>>> Adam
>>
>> [1]: http://wiki.apache.org/solr/CoreAdmin
>>
>> Cheers
>>
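
PS: The query-side filtering described at the top of this message could be sketched roughly as follows in Python. This is only an illustration under stated assumptions: the field name feedID is taken from the example URL above, while the subscriptions dictionary and the functions feeds_to_index and build_select_query are hypothetical names made up for this sketch. Quoting each feed ID inside the filter is my own choice to keep characters such as '-' literal; the example URL above leaves them unquoted.

```python
from urllib.parse import urlencode

# Hypothetical in-memory registry of which feeds each user registered.
# A real application would keep this in its own database.
subscriptions = {
    "A": {"RSS-X", "RSS-Y"},
    "B": {"RSS-Z"},
    "C": {"RSS-X", "RSS-Z"},
}


def feeds_to_index(subs):
    """Return the union of all registered feeds.

    Each feed is pulled and indexed once, no matter how many users
    registered it.
    """
    return set().union(*subs.values())


def build_select_query(user, text, subs, field="feedID"):
    """Return a URL-encoded query string for .../solr/select.

    The fq parameter restricts results to the feeds this user registered.
    """
    feed_ids = sorted(subs[user])
    # Quote every ID so it is matched literally (an assumption of this sketch).
    terms = " OR ".join('"{}"'.format(feed_id) for feed_id in feed_ids)
    fq = "{}:({})".format(field, terms)
    return urlencode({"q": text, "fq": fq})


if __name__ == "__main__":
    print(sorted(feeds_to_index(subscriptions)))
    print(build_select_query("C", "foo bar", subscriptions))
```

The resulting string would be appended after the "?" of the select URL. Since the filter is built per request from the user's registrations, registering or dropping a feed changes what that user sees without touching the index.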