We're trying to implement a large-scale, domain-specific web email application, and so far Solr's performance on the search side has been very good for us.
There are two limitations that I can't seem to get around, however, and I was hoping for some advice.

1. We would like to do bulk tagging on large query result sets (i.e., if you have 1M emails, you run a search and then want to apply a tag to the whole result set of, say, 250k results). I've tried many approaches, but the closest support I could find was the update-field functionality in SOLR-139. Is there any other way to keep the very dynamic metadata (tags and other fields) separate from the static documents themselves? I've researched joining against a metadata database, but unfortunately the join logic for large result sets is just too bulky to perform well at scale. We have even looked at PostgreSQL's tsearch2, but that also breaks down with a large number of emails.

2. We're assuming we'll have thousands of users with independent data; is there any good way to partition multiple indexes with Solr? With Lucene we could just save each user's index in its own directory and cache the index while the user's session is active. I saw some Tomcat configurations that would allow multiple instances, but that's probably not practical for lots of concurrent users.

Thanks for any tips; we would love to use Solr (or Lucene), but haven't been able to get around issue 1 yet for large numbers of emails with a timely response. We've really run the gamut here, including Solr, Lucene, PostgreSQL (tsearch2), Sphinx, Xapian, CouchDB(!), and more.

ab
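
P.S. To make issue 1 concrete, here is a rough Python sketch of the behaviour we're after: tags live in a side structure keyed by document ID, so tagging a 250k-result set never touches the indexed emails. This is only an illustration of the requirement, not anything Solr or Lucene provides; `TagStore`, `bulk_tag`, and `filter_by_tag` are made-up names, and a real version would need persistence and integration with the search results.

```python
from collections import defaultdict


class TagStore:
    """Hypothetical side store mapping tag name -> set of document IDs.

    The emails themselves stay in the search index untouched; only this
    mapping changes when a tag is applied.
    """

    def __init__(self):
        self._docs_by_tag = defaultdict(set)

    def bulk_tag(self, tag, doc_ids):
        """Apply `tag` to every ID in a result set.

        Returns the number of documents that were newly tagged.
        """
        tagged = self._docs_by_tag[tag]
        before = len(tagged)
        tagged.update(doc_ids)
        return len(tagged) - before

    def filter_by_tag(self, doc_ids, tag):
        """Keep only the IDs from `doc_ids` that carry `tag`, in order."""
        tagged = self._docs_by_tag.get(tag, set())
        return [doc_id for doc_id in doc_ids if doc_id in tagged]
```

The idea would be to call something like `store.bulk_tag("followup", result_ids)` with the IDs returned by a search, and later use `filter_by_tag` to restrict another result set to tagged emails. The open question is how to get this kind of cheap bulk update while still being able to search on the tags efficiently.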