Thanks Mark.  That's sort of what I was thinking of doing.
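
A rough sketch of the idea Mark suggests (quoted below), in Python for brevity rather than as an actual Solr UpdateProcessor: each document is serialized as a Solr XML <add><doc> update message and appended to a journal file as it goes by, so a full rebuild could replay the journal. The function names (to_solr_xml, journal, replay), the one-message-per-line file layout, and the dict-shaped documents are all assumptions made for illustration, not anything Solr provides.

```python
import xml.etree.ElementTree as ET


def to_solr_xml(doc):
    """Render one document (dict of field name -> value or list of values)
    as a single-line Solr XML <add><doc>...</doc></add> update message."""
    add = ET.Element("add")
    doc_el = ET.SubElement(add, "doc")
    for name, value in doc.items():
        # Multi-valued fields become repeated <field> elements.
        values = value if isinstance(value, (list, tuple)) else [value]
        for v in values:
            field = ET.SubElement(doc_el, "field", name=name)
            field.text = str(v)
    xml = ET.tostring(add, encoding="unicode")
    # Encode line breaks as character references so that each message
    # stays on one line of the journal (hypothetical layout).
    return xml.replace("\r", "&#13;").replace("\n", "&#10;")


def journal(doc, path):
    """Append the update message for `doc` to the journal file at `path`."""
    xml = to_solr_xml(doc)
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(xml + "\n")
    return xml


def replay(path):
    """Yield journaled update messages in the order they were written."""
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:
                yield line
```

For a rebuild, each message yielded by replay() would be posted back to Solr's XML update handler. The metadata events from (3) would presumably need to be journaled the same way, in arrival order, for the rebuilt index to match.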

On Thu, Apr 8, 2010 at 10:33 AM, Mark Miller <markrmil...@gmail.com> wrote:

> On 04/08/2010 09:23 AM, Rich Cariens wrote:
>
>> Are there any best practices or built-in support for keeping track of
>> what's been indexed in a Solr application so as to support a full
>> rebuild?  I'm not indexing from a single source, but from many, sometimes
>> arbitrary, sources including:
>>
>>    1. A document repository that fires events (containing a URL) when new
>>       documents are added to the repo;
>>    2. A book-marking service that fires events containing URLs when users
>>       of that service bookmark a URL;
>>    3. More services that raise events that make Solr update docs indexed
>>       via (1) or (2) with additional metadata (think user comments,
>>       tagging, etc).
>>
>> I'm looking at ~200M documents for the initial launch, with around 30K new
>> docs every day, and many thousands of metadata events every day.
>>
>> Do any of you Solr gurus have any suggestions or guidance you can share
>> with me?
>>
>> Thanks in advance,
>> Rich
>>
>
> Pump everything through an UpdateProcessor that writes out SolrXML as docs
> go by?
>
> --
> - Mark
>
> http://www.lucidimagination.com
>
