[ https://issues.apache.org/jira/browse/OAK-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15271950#comment-15271950 ]
Tommaso Teofili commented on OAK-4348:
--------------------------------------
PoC: https://github.com/tteofili/jackrabbit-oak/tree/joshua
> Cross language search via SMT
> -----------------------------
>
> Key: OAK-4348
> URL: https://issues.apache.org/jira/browse/OAK-4348
> Project: Jackrabbit Oak
> Issue Type: New Feature
> Components: query
> Reporter: Tommaso Teofili
> Assignee: Tommaso Teofili
> Fix For: 1.6
>
>
> It would be interesting to investigate using statistical machine
> translation (SMT) toolkits (such as Apache Joshua) to enable cross-language
> search, so that a query can be expanded to also search over translated
> terms.
> Example:
> - enable Spanish-to-English translation
> - perform a full-text search for "hola"
> - the query engine looks up translations for "hola"
> - SMT returns "hello"
> - the query engine adds an additional (UNION) clause for the translated term
> - the query performed by Oak becomes "hello OR hola"
> - results for both the English and the Spanish terms are returned
> This should of course be configurable.
> Note that the integration could also happen via Apache Tika, which provides a
> Translator API.
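
The example flow in the description (translate "hola", then OR the translation
into the query) could be sketched roughly as follows. This is only an
illustration and not the PoC code: the class name CrossLanguageQueryExpander
and the translator hook are hypothetical, and the dictionary lookup stands in
for a real SMT backend such as Apache Joshua or a Tika translator.

```java
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.function.Function;

/** Hypothetical sketch of term expansion; not part of Oak's query engine API. */
class CrossLanguageQueryExpander {

    // Assumed translation hook: returns a translation for a term, if one is known.
    private final Function<String, Optional<String>> translator;

    CrossLanguageQueryExpander(Function<String, Optional<String>> translator) {
        this.translator = translator;
    }

    /** Returns "translation OR term", or just the term when no distinct translation exists. */
    String expand(String term) {
        // LinkedHashSet keeps insertion order and drops a translation equal to the original.
        Set<String> terms = new LinkedHashSet<>();
        translator.apply(term).filter(t -> !t.isBlank()).ifPresent(terms::add);
        terms.add(term);
        return String.join(" OR ", terms);
    }

    public static void main(String[] args) {
        // Toy Spanish-to-English dictionary standing in for an SMT toolkit.
        Map<String, String> esToEn = Map.of("hola", "hello");
        CrossLanguageQueryExpander expander =
            new CrossLanguageQueryExpander(t -> Optional.ofNullable(esToEn.get(t)));

        System.out.println(expander.expand("hola")); // term with a known translation
        System.out.println(expander.expand("oak"));  // term without a translation
    }
}
```

Passing the translator in as a function keeps the expansion step independent of
the translation backend, which would make it straightforward to switch it off
or swap it through configuration.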
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)