[
https://issues.apache.org/jira/browse/LUCENE-5620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13974977#comment-13974977
]
Manuel Lenormand commented on LUCENE-5620:
------------------------------------------
My answer concerns a Solr use case, but since it uses the Lucene filters I think
it can contribute to the discussion.
On one of our morphology projects we discussed the field-splitting issue. We
wanted to enable both stemmed and non-stemmed search for the different languages,
mainly for advanced users who wish to control their search terms.
The drawbacks of field splitting were:
1) QParser flexibility - with a single field we are not forced to use a dismax
defType in order to query multiple fields in a single query.
2) "Readability" - with a single field the developer / user can see in one
place, via the admin UI, all the terms a query could match in an indexed
document, without having to understand a parsedQuery string or the qf param.
3) Term position - enabling a phrase query that would match "originalTerm
stemmedTerm". Enabling it in a split field would mean storing the original
term (dictionary and postings) twice.
4) Perf (more of an anecdote) - as the terms were generally suffix-stemmed, we
had a good chance of loading the same term block and posting list into memory,
as they should be sequential.
I do agree a PreserveOriginalSnapshot could be a good resolution.
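
To make the term position point (3) concrete, here is a rough sketch in plain
Python. It is not Lucene or Solr code: the Token type, toy_stem,
analyze_preserving_original and phrase_matches are all hypothetical names made
up for illustration, and the stemmer is a toy. It only models the idea of
emitting the original and the stemmed term at the same position in one field.

```python
from typing import Dict, List, NamedTuple, Set


class Token(NamedTuple):
    text: str
    position: int  # absolute position of the token within the field


def toy_stem(term: str) -> str:
    """Hypothetical suffix stemmer (an assumption, not a real analyzer)."""
    for suffix in ("ing", "ed", "s"):
        if term.endswith(suffix) and len(term) > len(suffix) + 2:
            return term[: -len(suffix)]
    return term


def analyze_preserving_original(text: str) -> List[Token]:
    """Emit each original term and, if different, its stem at the same position."""
    tokens: List[Token] = []
    for position, term in enumerate(text.lower().split()):
        tokens.append(Token(term, position))
        stem = toy_stem(term)
        if stem != term:
            # Same position as the original, i.e. a position increment of 0.
            tokens.append(Token(stem, position))
    return tokens


def phrase_matches(tokens: List[Token], phrase: List[str]) -> bool:
    """True if the phrase terms occur at consecutive positions in the field."""
    if not phrase:
        return False
    positions: Dict[str, Set[int]] = {}
    for tok in tokens:
        positions.setdefault(tok.text, set()).add(tok.position)
    return any(
        all(start + i in positions.get(term, set()) for i, term in enumerate(phrase))
        for start in positions.get(phrase[0], set())
    )


tokens = analyze_preserving_original("running dogs")
# A mixed phrase of an original term followed by a stemmed term can match,
# because both variants share positions inside the single field.
mixed_phrase_ok = phrase_matches(tokens, ["running", "dog"])
```

With two separate fields, the stemmed field would also have to carry the
original terms for such a mixed phrase to match, which is the duplication
mentioned in point 3.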
> LowerCaseFilter.preserveOriginal
> --------------------------------
>
> Key: LUCENE-5620
> URL: https://issues.apache.org/jira/browse/LUCENE-5620
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Mike Sokolov
> Attachments: LUCENE-5620.patch
>
>
> Following closely the model of LUCENE-5437 (which worked on
> ASCIIFoldingFilter), this patch adds the ability to preserve the original
> token to LowerCaseFilter. This is useful if you want an all-lowercase search
> term to match without regard to case, while search terms with uppercase
> letters match in a case-sensitive manner.
--
This message was sent by Atlassian JIRA
(v6.2#6252)