On Tue, 01 Feb 2011 19:04 +0000, "Alex Cowell" <[email protected]> wrote:
> I noticed there is a comment in the org.apache.solr.servlet.DirectSolrConnection class which reads, "//Find a way to turn List<ContentStream> into File/SolrDocument". Did anyone find a way to do this?
>
> Turns out that comment was left over from some experimenting one of our team was doing. But I suppose the question still stands.
>
> Addressing the "retrieve the unique ID from the document" issue, does it matter if the unique ID you do the hash on is the actual uniqueKey of the document? Surely as long as you generate some value unique for each document to index (for example, the name of the doc/stream + the current time) it would still distribute the documents as we expect?

Well, one requirement I've heard for this is for it to be deterministic. That is, a document will always go to the same shard, and you can work out at any point in time where a particular document is.

Once you've parsed the document to a SolrInputDocument, surely you can get the ID/uniqueKey out?

I'll do some digging tomorrow AM.

Upayavira
---
Enterprise Search Consultant at Sourcesense UK,
Making Sense of Open Source
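
To illustrate the determinism requirement, here is a minimal sketch of hashing a document's uniqueKey to pick a shard. It is not taken from Solr: the class name ShardRouter and the method shardFor are hypothetical, and CRC32 is just one convenient stable hash. The point is that the same key always maps to the same shard, which a key built from the stream name plus the current time would not guarantee on a re-index.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical helper, not part of Solr: maps a uniqueKey to a shard index.
final class ShardRouter {

    // Returns a shard index in [0, numShards) that depends only on the key,
    // so the same document is always routed to the same shard.
    static int shardFor(String uniqueKey, int numShards) {
        if (numShards <= 0) {
            throw new IllegalArgumentException("numShards must be positive");
        }
        CRC32 crc = new CRC32();
        crc.update(uniqueKey.getBytes(StandardCharsets.UTF_8));
        // getValue() is an unsigned 32-bit value held in a long, so it is never negative.
        return (int) (crc.getValue() % numShards);
    }
}
```

With this approach, anyone holding the uniqueKey and the shard count can recompute where a document lives. Note that changing the number of shards changes the mapping, so this sketch assumes a fixed shard count.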
