Re: Are docs updated based on comparing the id before analysis?

Shawn Heisey Thu, 05 Feb 2015 10:23:10 -0800

On 2/5/2015 10:57 AM, Erick Erickson wrote:
> Thanks for confirming I'm not completely crazy.
>
> I don't think it's A Good Thing to _require_ that all ID normalization
> be done on the client, it'd have to be done both at index and query
> time, too much chance for things to get out of sync. Although I guess
> this is _actually_ what happens with the string type. Hmmmm.  So I'm
> -1 on <2> above as it would require this.
>
> And having <uniqueKey>s that are text fields _is_ fraught with danger
> if you tokenize it, but KeywordTokenizer doesn't.


<snip>

> Personally I feel like this is a JIRA, but I can see arguments the
> other way as I'm not entirely sure what you'd do if multiple tokens
> came out of the analysis chain. Maybe fail the document at index time?
>
> What _is_ unreasonable IMO is that we allow this surprising behavior,
> so regardless of the above I'm +1 on keeping users from being
> surprised by this behavior....

My earlier statements were written with the assumption that the current
behavior exists because it is difficult to allow the desired behavior. 
I believe that if it were easy to do, it would have already been done.

If it's possible to allow what we both think is a rational user
expectation (case-insensitive uniqueKey values), I agree that we need to
allow it.  Whether it's readily achievable is the question.
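
To make the behavior under discussion concrete, here is a rough toy
sketch in Python.  It is not Solr code: a plain dict stands in for the
index, and normalize_id and add_doc are hypothetical helpers invented
for illustration.  It assumes, as this thread suggests, that documents
are matched on the raw id value rather than on the analyzed one.

```python
# Toy model only: a dict stands in for the index. This is an assumption
# made for illustration, not how Solr is implemented internally.

def normalize_id(raw_id: str) -> str:
    """Hypothetical client-side normalization: trim and lowercase."""
    return raw_id.strip().lower()


def add_doc(index: dict, doc: dict, normalize: bool = False) -> None:
    """Add or overwrite a document, keyed on its (optionally normalized) id."""
    key = normalize_id(doc["id"]) if normalize else doc["id"]
    index[key] = doc


docs = [{"id": "ABC-1", "title": "first"}, {"id": "abc-1", "title": "second"}]

raw_index = {}         # keyed on the id exactly as the client sent it
normalized_index = {}  # keyed on the normalized id
for d in docs:
    add_doc(raw_index, d)
    add_doc(normalized_index, d, normalize=True)

print(len(raw_index))         # both docs kept: the ids differ only in case
print(len(normalized_index))  # one doc kept: the second replaced the first
```

The catch Erick points out still applies: the same normalization would
have to be applied to every id sent at query time as well, or the two
sides drift out of sync.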

Thanks,
Shawn


