[ 
https://issues.apache.org/jira/browse/OAK-4575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dominique Jäggi closed OAK-4575.
--------------------------------

bulk close 1.0.33

> Oak 1.0.x fulltext search with ideographic space (U+3000) as separator
> ----------------------------------------------------------------------
>
>                 Key: OAK-4575
>                 URL: https://issues.apache.org/jira/browse/OAK-4575
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: query
>    Affects Versions: 1.0.32
>            Reporter: Thomas Mueller
>            Assignee: Thomas Mueller
>             Fix For: 1.0.33
>
>
> In Oak 1.0, the Lucene index uses its own tokenizer. That tokenizer doesn't
> support the ideographic space (U+3000) as a word separator.
> In Oak 1.2 and later, the Lucene tokenizer is used, which works as expected.
> Backporting all relevant changes from Oak 1.2 to the 1.0 branch would require
> many changes, and the risk of regression would be high (too high in my
> view). An alternative is to add support for the ideographic space in the
> query engine (replace it with a regular space character). Please note that the
> behavior is still not exactly the same as with Oak 1.2, but for this exact
> use case it is expected to work correctly.
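
The workaround described above (replacing the ideographic space with a
regular space before the fulltext expression is tokenized) could be sketched
roughly as follows. This is only an illustration of the idea, not the actual
Oak patch: the class name IdeographicSpaceNormalizer and the method
normalize are hypothetical and do not exist in Oak.

```java
// Hypothetical helper, not part of Oak: illustrates the replacement idea only.
final class IdeographicSpaceNormalizer {

    // U+3000 IDEOGRAPHIC SPACE, commonly used as a word separator in CJK text.
    private static final char IDEOGRAPHIC_SPACE = '\u3000';

    private IdeographicSpaceNormalizer() {
        // static utility, no instances
    }

    /**
     * Returns the given fulltext expression with every ideographic space
     * replaced by a regular space (U+0020). A null input is returned as-is.
     */
    static String normalize(String fulltextExpression) {
        if (fulltextExpression == null) {
            return null;
        }
        // String.replace(char, char) replaces all occurrences.
        return fulltextExpression.replace(IDEOGRAPHIC_SPACE, ' ');
    }
}
```

With such a step in place, a search term like "hello" + U+3000 + "world"
would reach the tokenizer as "hello world" and be split into two words.
As noted in the description, this does not make the 1.0 behavior identical
to that of Oak 1.2 in general.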



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
