[ https://issues.apache.org/jira/browse/OAK-9145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17238213#comment-17238213 ]
Thomas Mueller commented on OAK-9145:
-------------------------------------

So there is a workaround (I'm not saying this means the issue doesn't need to be fixed).

I see the following problem with the patch:
* I'm afraid the change would result in backward compatibility issues... changing the default behavior is problematic. What about using a different version number to enable the new behavior?
* There is currently no test case.

> OakAnalyzer applies LowerCaseFilter and WordDelimiterFilter in wrong order
> --------------------------------------------------------------------------
>
>                 Key: OAK-9145
>                 URL: https://issues.apache.org/jira/browse/OAK-9145
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: indexing, jcr, lucene
>         Environment: Discovered while performing DAM searches in Adobe Experience Manager.
>            Reporter: Dave Hughes
>            Assignee: Thomas Mueller
>            Priority: Minor
>              Labels: easyfix, pull-request-available
>
> I believe OakAnalyzer applies LowerCaseFilter and WordDelimiterFilter in the
> wrong order. WordDelimiterFilter is invoked with the GENERATE_WORD_PARTS
> flag, which splits camelCase/PascalCase into multiple terms, but since the
> LowerCaseFilter is applied first, the mixed-case is lost and the terms can't
> be split.
> Searching for savings, the damAssetLucene index (which uses the default
> OakAnalyzer) does not find an asset named savingsAccount.svg.
> Upon configuring the index's analyzers (/oak:index/damAssetLucene/analyzers)
> to apply WordDelimiterFilter before LowerCaseFilter, the correct behaviour
> was seen.
> {noformat}
> {
>   "jcr:primaryType": "nt:unstructured",
>   "default": {
>     "jcr:primaryType": "nt:unstructured",
>     "tokenizer": {
>       "jcr:primaryType": "nt:unstructured",
>       "name": "Standard"
>     },
>     "filters": {
>       "jcr:primaryType": "nt:unstructured",
>       "WordDelimiter": {"jcr:primaryType": "nt:unstructured"},
>       "LowerCase": {"jcr:primaryType": "nt:unstructured"}
>     }
>   }
> }
> {noformat}

--
This message was sent by Atlassian Jira (v8.3.4#803005)
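
The ordering problem described in the issue can be illustrated with a small sketch. This is not Lucene or Oak code: word_delimiter and lower_case below are hypothetical, heavily simplified stand-ins for WordDelimiterFilter (with GENERATE_WORD_PARTS) and LowerCaseFilter, and the input is assumed to arrive from the tokenizer as the single token savingsAccount.svg. The real filters have more rules and options than shown here.

```python
import re


def lower_case(tokens):
    """Simplified stand-in for LowerCaseFilter: lowercase every token."""
    return [t.lower() for t in tokens]


def word_delimiter(tokens):
    """Simplified stand-in for WordDelimiterFilter with GENERATE_WORD_PARTS.

    Splits each token on non-alphanumeric characters and on
    lower-to-upper case transitions (camelCase boundaries).
    """
    parts = []
    for token in tokens:
        for chunk in re.split(r"[^A-Za-z0-9]+", token):
            # Zero-width split where a lowercase letter meets an uppercase one.
            parts.extend(p for p in re.split(r"(?<=[a-z])(?=[A-Z])", chunk) if p)
    return parts


# Assumed tokenizer output for the asset name from the issue.
tokens = ["savingsAccount.svg"]

# Order reported in the issue: lowercase first, then word delimiter.
# The case transition is already gone, so "savings" never becomes a term.
reported_order = word_delimiter(lower_case(tokens))

# Order used in the workaround: word delimiter first, then lowercase.
workaround_order = lower_case(word_delimiter(tokens))

print(reported_order)
print(workaround_order)
```

In this sketch only the second ordering produces a standalone "savings" term, which matches the search behaviour reported in the issue.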