[ https://issues.apache.org/jira/browse/OAK-4575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dominique Jäggi closed OAK-4575.
--------------------------------
Bulk close for 1.0.33
> Oak 1.0.x fulltext search with ideographic space (U+3000) as separator
> ----------------------------------------------------------------------
>
> Key: OAK-4575
> URL: https://issues.apache.org/jira/browse/OAK-4575
> Project: Jackrabbit Oak
> Issue Type: Bug
> Components: query
> Affects Versions: 1.0.32
> Reporter: Thomas Mueller
> Assignee: Thomas Mueller
> Fix For: 1.0.33
>
>
> In Oak 1.0, the Lucene index uses its own tokenizer. That tokenizer doesn't
> support the ideographic space (U+3000) as a word separator.
> In Oak 1.2 and later, the Lucene tokenizer is used, which works as expected.
> Backporting all relevant changes from Oak 1.2 to the 1.0 branch would be a
> lot of changes, and the risk of regression would be high (too high in my
> view). An alternative is to add support for the ideographic space in the
> query engine (replace it with a regular space character). Please note that
> the behavior is still not exactly the same as with Oak 1.2, but for this
> exact use case it is expected to work correctly.
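
The replacement approach described above could look roughly like the sketch below. This is only an illustration, not the actual Oak patch: the class and method names (IdeographicSpaceSketch, normalize, words) are hypothetical, and where exactly the query engine applies the replacement is not stated in this issue.

```java
// Hypothetical sketch: map the ideographic space (U+3000) to a regular
// space before the fulltext condition is split into words.
class IdeographicSpaceSketch {

    static final char IDEOGRAPHIC_SPACE = '\u3000';

    // Replace every U+3000 with an ASCII space; other characters are untouched.
    static String normalize(String text) {
        if (text == null) {
            return null;
        }
        return text.replace(IDEOGRAPHIC_SPACE, ' ');
    }

    // Split a search expression into words after normalizing it.
    // Note: Java's default "\\s" does not match U+3000, which is why the
    // replacement has to happen first.
    static String[] words(String text) {
        return normalize(text).trim().split("\\s+");
    }
}
```

With this, a search expression such as "hello\u3000world" would be treated as the two words "hello" and "world", the same as if it had been written with a regular space.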