Erick Erickson created SOLR-7085:
------------------------------------
Summary: Add a comment to the schema.xml file(s) warning against
applying analysis chains to the <uniqueKey> field.
Key: SOLR-7085
URL: https://issues.apache.org/jira/browse/SOLR-7085
Project: Solr
Issue Type: Improvement
Reporter: Erick Erickson
Assignee: Erick Erickson
Priority: Minor
If you apply index-time transformations to the <uniqueKey> field, very
interesting things happen, all of them bad.
1> The doc doesn't get updated.
2> Docs are routed to shards based on the original form of the ID field.
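As a rough illustration of point 2, here is a toy model in Python. It is not
Solr code: analyze() stands in for a lowercasing analysis chain and route()
is a made-up hash rather than Solr's real router. Both names are assumptions
for the sketch. It only shows that two IDs can collapse to the same indexed
term while still being routed by their original, untransformed form.

```python
def analyze(doc_id: str) -> str:
    """Stand-in for an index-time analysis chain (here: lowercasing only)."""
    return doc_id.lower()


def route(doc_id: str, num_shards: int) -> int:
    """Toy router, NOT Solr's real hash: it only sees the raw ID it is given."""
    return sum(ord(ch) for ch in doc_id) % num_shards


NUM_SHARDS = 3
raw_a, raw_b = "Doc1", "doc1"

# Both IDs analyze to the same indexed term...
same_term = analyze(raw_a) == analyze(raw_b)

# ...but routing is computed from the original form of each ID.
shard_a = route(raw_a, NUM_SHARDS)
shard_b = route(raw_b, NUM_SHARDS)

print(same_term, shard_a, shard_b)
```

In this toy setup the two IDs share one indexed term but land on different
shards, which is the kind of mismatch described above.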
I stopped looking there. There are much bigger fish to fry than trying to make
an index-time analysis chain work on the <uniqueKey>, so a comment in the
schema.xml seems to be all that is necessary.
Trying to change this at a code level would be a nightmare, I suspect. Consider
routing by a secondary field, for instance, and the N+1 other places this would
pop out.
Limited _query_-time transformations are OK; they just have to match the
indexing program's transformations. About the only one I'd recommend is
lowercasing, but others are possible if you're brave.
My "rule of thumb" here is that "anything a human enters in your search app
should not be case-sensitive when searching", and it can be enforced easily
enough.
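A minimal sketch of that idea in Python, assuming the indexing program and the
search app share one normalization helper. The names normalize_id, build_doc
and lookup are hypothetical, and a plain dict stands in for the index, so no
real Solr API is called here.

```python
def normalize_id(raw_id: str) -> str:
    """The single transformation both sides agree on: lowercasing."""
    return raw_id.lower()


def build_doc(raw_id: str, title: str) -> dict:
    """Indexing program: normalize the ID before the document is sent."""
    return {"id": normalize_id(raw_id), "title": title}


def lookup(index: dict, user_input: str):
    """Search side: apply the same normalization to what the user typed."""
    return index.get(normalize_id(user_input))


# A dict keyed by ID stands in for the index in this sketch.
index = {}
for raw_id, title in [("ABC-123", "first version"), ("abc-123", "second version")]:
    doc = build_doc(raw_id, title)
    index[doc["id"]] = doc  # same key, so the second add replaces the first

result = lookup(index, "Abc-123")
print(len(index), result["title"])
```

Because the ID is already lowercased before it reaches the index, the
<uniqueKey> field itself needs no index-time analysis chain.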
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)