Thanks Mark. That's sort of what I was thinking of doing.

On Thu, Apr 8, 2010 at 10:33 AM, Mark Miller <markrmil...@gmail.com> wrote:
> On 04/08/2010 09:23 AM, Rich Cariens wrote:
>
>> Are there any best practices or built-in support for keeping track of what's
>> been indexed in a Solr application so as to support a full rebuild? I'm not
>> indexing from a single source, but from many, sometimes arbitrary, sources
>> including:
>>
>>    1. A document repository that fires events (containing a URL) when new
>>       documents are added to the repo;
>>    2. A book-marking service that fires events containing URLs when users of
>>       that service bookmark a URL;
>>    3. More services that raise events that make Solr update docs indexed via
>>       (1) or (2) with additional metadata (think user comments, tagging, etc).
>>
>> I'm looking at ~200M documents for the initial launch, with around 30K new
>> docs every day, and many thousands of metadata events every day.
>>
>> Do any of you Solr gurus have any suggestions or guidance you can share with
>> me?
>>
>> Thanks in advance,
>> Rich
>
> Pump everything through an UpdateProcessor that writes out SolrXML as docs
> go by?
>
> --
> - Mark
>
> http://www.lucidimagination.com
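
For anyone reading this thread later, here is a minimal sketch of the "write out SolrXML as docs go by" idea. It is plain Python, not an actual Solr UpdateProcessor, and the function names (to_solr_xml, log_doc), the example field names, and the log path are all made up for illustration. The only thing it assumes about Solr is the standard <add><doc><field name="..."> XML update message format.

```python
import xml.etree.ElementTree as ET


def to_solr_xml(doc):
    """Serialize one document as a Solr XML <add> message.

    `doc` is a dict mapping field names to a single value or a list of
    values (a list becomes one <field> element per value, which is how
    multi-valued fields are written in Solr XML).
    """
    add = ET.Element("add")
    doc_el = ET.SubElement(add, "doc")
    for name, value in doc.items():
        values = value if isinstance(value, (list, tuple)) else [value]
        for v in values:
            field = ET.SubElement(doc_el, "field", {"name": name})
            field.text = str(v)  # ElementTree escapes &, <, > on output
    return ET.tostring(add, encoding="unicode")


def log_doc(doc, path):
    """Append the serialized document to a replay log (hypothetical path).

    Messages are separated by a newline; a field value that itself
    contains newlines will span several lines in the log.
    """
    with open(path, "a", encoding="utf-8") as f:
        f.write(to_solr_xml(doc) + "\n")


if __name__ == "__main__":
    # Hypothetical document, as it might look after a bookmark event.
    log_doc({"id": "http://example.com/a", "tag": ["solr", "search"]},
            "replay-log.xml")
```

A full rebuild would then amount to posting each logged message back to Solr's XML update handler. Metadata events (comments, tags) would presumably need to be logged as complete, re-serialized documents so that replaying the log in order ends with the latest version of each doc.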