Short version: Is there a way to either do partial updates to documents (update/add one or two fields only), or to search across multiple documents grouped by a (non-unique) key stored in a field?
Long version: I've run into an issue with the way I'm indexing documents for a new product, and I figure somebody else has run into the same problem.

In a nutshell, we're building a system that deals with a lot of incoming and outgoing text documents (email, Word docs, short comments, etc.), grouped together by some common factor (basically, email threads), and we want to do full-text search across those threads. We've settled on Solr, of course. :)

Right now, I'm adding each new incoming/outgoing message as a new document, and I can search just fine unless I want to look for multiple terms that span documents. So, "foo" is in the first document, "bar" is in the second, and although they both have a 'thread_id' field identifying them as belonging to the same group, searching for "+foo +bar" doesn't yield results (which is not surprising).

Now, I can modify the code to store one document for each group of messages without too much work. But as I understand it, this means that for every new message coming in, I need to hand an aggregate of all previous messages to the indexer, because Solr will re-create the document (re-indexing the entire group of messages) when I do update/add. Since there can be some fairly large files sitting in there (50-100M in some cases), I'd rather not have to shove that down Solr's pipe every time something changes.

So, first question: is what I think I know about update/add correct? Second, if so, is there a way to update single-valued fields and append new values to multivalued fields without having to re-index the whole document? Third, am I just totally wrong about the way I'm trying to do this, and is there a better way?

Thanks in advance!
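
P.S. To make the failing case concrete, here is a small sketch in plain Python (no Solr involved) of what I see today versus what I'm after. Everything in it is made up for illustration: the sample messages, the 'id' and 'body' field names, and the helper functions. The second function shows one client-side workaround I could imagine, collecting the matching 'thread_id' values per term and intersecting them, but I'd rather the index handled this for me.

```python
from collections import defaultdict

# Hypothetical sample messages; only 'thread_id' comes from my real schema.
MESSAGES = [
    {"id": "m1", "thread_id": "t1", "body": "foo appears in the first message"},
    {"id": "m2", "thread_id": "t1", "body": "bar appears in the second message"},
    {"id": "m3", "thread_id": "t2", "body": "foo only and nothing else"},
]


def tokenize(text):
    """Naive whitespace tokenizer, standing in for a real analyzer."""
    return set(text.lower().split())


def search_per_document(messages, terms):
    """Mimics '+foo +bar' today: every term must occur in the SAME message."""
    return [m["id"] for m in messages
            if all(t in tokenize(m["body"]) for t in terms)]


def search_per_thread(messages, terms):
    """What I want: every term must occur SOMEWHERE in the same thread.

    Done here by collecting the thread ids that match each term and
    intersecting those sets on the client side.
    """
    threads_by_term = defaultdict(set)
    for m in messages:
        words = tokenize(m["body"])
        for t in terms:
            if t in words:
                threads_by_term[t].add(m["thread_id"])
    sets = [threads_by_term[t] for t in terms]
    return sorted(set.intersection(*sets)) if sets else []


if __name__ == "__main__":
    print(search_per_document(MESSAGES, ["foo", "bar"]))  # no single message has both
    print(search_per_thread(MESSAGES, ["foo", "bar"]))    # thread t1 has both
```

With real data this would mean one query per term plus an intersection step outside Solr, which is why I'm hoping there's a better way.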