Re: Distributed Indexing

Lance Norskog Tue, 01 Feb 2011 19:52:37 -0800

Another use case: N indexers operate independently, all pulling data
from the same database. Each runs its own query to fetch the documents
that fall under its policy.
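
A minimal sketch of that setup, under stated assumptions: the table name
`documents`, its columns `id` and `body`, and the modulo-on-primary-key
policy are all hypothetical, chosen only to illustrate "one query per
indexer". Any predicate that splits the rows into non-overlapping sets
would do. The in-memory SQLite database stands in for the shared database.

```python
import sqlite3

NUM_INDEXERS = 3  # hypothetical N


def policy_query(indexer_no, num_indexers):
    """Return (sql, params) selecting only the rows owned by one indexer.

    Assumed policy: a row belongs to indexer k when id % N == k.
    """
    sql = "SELECT id, body FROM documents WHERE id % ? = ? ORDER BY id"
    return sql, (num_indexers, indexer_no)


def fetch_for_indexer(conn, indexer_no, num_indexers=NUM_INDEXERS):
    """Run one indexer's query and return its rows as (id, body) tuples."""
    sql, params = policy_query(indexer_no, num_indexers)
    return conn.execute(sql, params).fetchall()


def make_demo_db():
    """Build a throwaway in-memory database holding ten sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE documents (id INTEGER PRIMARY KEY, body TEXT)")
    conn.executemany(
        "INSERT INTO documents VALUES (?, ?)",
        [(i, f"doc {i}") for i in range(1, 11)],
    )
    return conn


if __name__ == "__main__":
    conn = make_demo_db()
    for k in range(NUM_INDEXERS):
        ids = [row[0] for row in fetch_for_indexer(conn, k)]
        print(f"indexer {k} would index ids {ids}")
```

Because the predicates do not overlap and together cover every row, no
indexer needs to coordinate with the others.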


On Tue, Feb 1, 2011 at 12:38 PM, Upayavira <[email protected]> wrote:
>
> On Tue, 01 Feb 2011 19:04 +0000, "Alex Cowell" <[email protected]> wrote:
>
> I noticed there is a comment in the
> org.apache.solr.servlet.DirectSolrConnection class which reads, "//Find a
> way to turn List<ContentStream> into File/SolrDocument". Did anyone find a
> way to do this?
>
> Turns out that comment was left over from some experimenting one of our team
> was doing. But I suppose the question still stands.
>
> Addressing the "retrieve the unique ID from the document" issue, does it
> matter if the unique ID you do the hash on is the actual uniqueKey of the
> document? Surely as long as you generate some value unique for each document
> to index (for example, the name of the doc/stream + the current time) it
> would still distribute the documents as we expect?
>
>
> Well, one requirement I've heard for this is for it to be deterministic.
> That is, a document will always go to the same shard, and you can work out
> at any point in time where a particular document is.
>
> Once you've parsed the document to a SolrInputDocument, surely you can get
> the ID/uniqueKey out? I'll do some digging tomorrow AM.
>
> Upayavira
>
> ---
> Enterprise Search Consultant at Sourcesense UK,
> Making Sense of Open Source
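
On the deterministic-routing requirement quoted above, a rough sketch of
hashing the uniqueKey to pick a shard. This is an illustration only, not
how Solr does it: the function name `shard_for` and the choice of SHA-256
are assumptions. A value built from the document name plus the current
time would hash differently on every run, so it could not tell you later
where a document went.

```python
import hashlib


def shard_for(unique_key, num_shards):
    """Map a document's uniqueKey to a shard number in [0, num_shards).

    A stable digest is used instead of Python's built-in hash(), which is
    randomized per process for strings and so is not repeatable.
    """
    digest = hashlib.sha256(unique_key.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an unsigned integer, then reduce it.
    return int.from_bytes(digest[:8], "big") % num_shards


if __name__ == "__main__":
    for key in ("doc-1", "doc-2", "doc-3"):
        print(key, "-> shard", shard_for(key, 4))
```

The same key always maps to the same shard as long as the shard count
stays fixed. Changing the number of shards remaps most keys.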



-- 
Lance Norskog
[email protected]
