The unique key is an auto-incremented int in the db.  Sorry for having
given the impression that user_id is the unique key per document.  This is
a table of events that are happening as users interact with our system.
It just so happens that we were inserting individual records for each user
before we even began to think about using something like Solr.  Now,
however, it seems to me that we should be able to ask questions like "give
me all records for user '2002' that have the string value 'more' in data2,
across this timestamp range [ .... ]".  Several simultaneously inserted
rows into the db are exactly the same aside from the user_ids.  I just want
to know beforehand if I can still maintain exact matches for a user if the
user_id becomes a string of concatenated user id values.

From what you are saying it sounds like the "user_id_str" is really all I
need.  It is tokenized and allows for partial searches.  I just want to
make sure that "2002 15000 45" when tokenized doesn't allow "20" to
partially match the token "2002".
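
To make the concern concrete, here is a rough sketch in plain Python (not
Solr itself) of what I am hoping for: the value is split on whitespace and a
query term has to equal a whole token to match.  The function names and the
sample documents are made up for illustration, and the real behavior depends
on the analyzer configured for the field.  My understanding is that a plain
term query only matches whole indexed tokens unless wildcards or n-gram
filters are involved.

```python
def tokenize(value):
    # Split on whitespace only, loosely imitating a whitespace tokenizer.
    return value.split()


def build_index(docs):
    # Map each whole token to the set of document ids that contain it.
    index = {}
    for doc_id, user_ids in docs.items():
        for token in tokenize(user_ids):
            index.setdefault(token, set()).add(doc_id)
    return index


def term_query(index, term):
    # Exact lookup: the term must equal an indexed token, not a prefix of one.
    return sorted(index.get(term, set()))


# Hypothetical documents: id -> concatenated user ids.
docs = {1: "2002 15000 45", 2: "20 777"}
index = build_index(docs)

print(term_query(index, "2002"))  # [1]
print(term_query(index, "20"))    # [2]
print(term_query(index, "200"))   # []
```

In this sketch "20" finds only the document that really contains the token
"20", and never the one containing "2002".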

On Fri, Jun 7, 2013 at 12:57 PM, Jack Krupansky <j...@basetechnology.com> wrote:

> In that case, you will need to keep two copies of the user ID, one which
> is a single, complete string, and one which is a tokenized field
> text/TextField so that you can do a keyword search against it. Use the
> string/StrField as the main copy and then use a <copyField> directive in
> the schema to copy from the main copy to the other copy.
>
> So, maybe "user_id" is the full unique key - you would have to specify
> the full exact key to query against it, or use wildcards for partial
> matches, and "user" or "user_id_str" would be the tokenized text version
> that would allow a simple search by partial value, such as "2002".
>
> Even so, I'm still not convinced that you have given us your complete
> requirements. Is the user_id in fact the unique key for the documents?
>
>
