On Tue, Nov 9, 2010 at 10:39 AM, Christopher Gross <cogr...@gmail.com> wrote:
> I'm trying to use Solr to store information from a few different sources in
> one large index.  I need to create a unique key for the Solr index that will
> be unique per document.  If I have 3 systems, and they all have a document
> with id=1, then I need to create a "uniqueId" field in my schema that
> contains both the system name and that id, along the lines of: "sysa1",
> "sysb1", and "sysc1".  That way, each document will have a unique id.
>
> I added this to my schema.xml:
>
>  <copyField source="source" dest="uniqueId"/>
>  <copyField source="id" dest="uniqueId"/>
>
>
> However, after trying to insert, I got this:
> java.lang.Exception: ERROR: multiple values encountered for non multiValued
> copy field uniqueId: sysa
>
> So instead of just appending to the uniqueId field, it tried to do a
> multiValued.  Does anyone have an idea on how I can make this work?
>
> Thanks!
>
> -- Chris
>

Chris,

How you create your unique field depends on how you insert your
documents into SOLR. If you are POSTing the data via HTTP, then you
are responsible for building the unique id yourself (i.e., your
program would use string concatenation to build the unique id and add
it to the output before it gets to the update handler in SOLR). If
you're using the DataImportHandler, then you can use the
TemplateTransformer
(http://wiki.apache.org/solr/DataImportHandler#TemplateTransformer) to
dynamically build your unique id at document insertion time.
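
As a rough sketch of the HTTP case, the update message your program
sends might look something like the following. The field names come
from your description; the values are only illustrative, and the
uniqueId value is assumed to have been concatenated by your own code
before the POST:

```xml
<!-- Sketch of an XML update message for the update handler. -->
<!-- uniqueId is built client-side as source + id; SOLR only stores it. -->
<add>
  <doc>
    <field name="uniqueId">sysa1</field>
    <field name="source">sysa</field>
    <field name="id">1</field>
  </doc>
</add>
```

With this approach the two copyField lines in schema.xml are no longer
needed, since uniqueId arrives already populated.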

For example, we here at bizjournals use SOLR and the DataImportHandler
to index our documents. Like you, we run the risk of two or more ids
clashing, and thus of one document overwriting a different type of
document. As such, we take two or three different fields and combine
them using the TemplateTransformer to generate an id that is unique
across every document we index.
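
A minimal sketch of what that could look like in a DataImportHandler
data-config.xml follows. This is not our actual configuration: the
entity name, table name, and query are made up for illustration, and
the dataSource definition is omitted, so adjust all of them to your
own setup:

```xml
<!-- Hypothetical entity: "doc", "documents", and the query are assumptions. -->
<document>
  <entity name="doc"
          transformer="TemplateTransformer"
          query="SELECT id, source FROM documents">
    <!-- Builds uniqueId from two columns of the current row, e.g. sysa + 1 -->
    <field column="uniqueId" template="${doc.source}${doc.id}"/>
  </entity>
</document>
```

If the system name is not available as a column, a literal prefix in
the template (for example template="sysa${doc.id}") should work the
same way.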

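With respect to the multiValued option, that is used more for an
array-like structure within a field. A copyField does not append to
an existing value; each one adds its source as a separate value in the
destination field, which is why pointing two of them at a
single-valued uniqueId raises that error. multiValued is meant for
cases like a blog entry with multiple tag keywords, where you would
probably want a field in SOLR that can contain the various tag
keywords for each blog entry.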
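
To make the distinction concrete, here is a hedged sketch of how the
two kinds of field might be declared in schema.xml. The "tags" field
is a made-up example, and the attributes shown are assumptions to
adapt to your own schema:

```xml
<!-- Hypothetical multi-valued field: one document, many tag values -->
<field name="tags" type="string" indexed="true" stored="true"
       multiValued="true"/>

<!-- Single-valued key field; the value is built before indexing -->
<field name="uniqueId" type="string" indexed="true" stored="true"
       required="true"/>

<uniqueKey>uniqueId</uniqueKey>
```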

I hope that this helps to clarify things for you.

- Ken Stanley
