Tommaso Teofili commented on OAK-4348:

+1 Vikas, it's still a work in progress, and I definitely agree that checking 
{{IndexDefinition}} would save us some unnecessary calls.
Other than that, the cost is usually quite low (I ran a test today with a 
Spanish to English language pack) and I couldn't notice any difference while 
performing queries (so I expect it to be on the order of a few milliseconds).

> Cross language search via SMT
> -----------------------------
>                 Key: OAK-4348
>                 URL: https://issues.apache.org/jira/browse/OAK-4348
>             Project: Jackrabbit Oak
>          Issue Type: New Feature
>          Components: query
>            Reporter: Tommaso Teofili
>            Assignee: Tommaso Teofili
>             Fix For: 1.6
> It would be interesting to investigate the use of statistical machine 
> translation toolkits (like Apache Joshua) in order to enable cross language 
> search, so that a query can eventually be expanded to search over translated 
> terms too.
> Example: 
> - enable Spanish to English translation
> - perform full text search for "hola" 
> - query engine looks for translations of "hola"
> - SMT returns "hello"
> - query engine adds an additional (UNION) clause for the translated term
> - the query performed by Oak becomes "hello OR hola"
> - results for both the English and the Spanish terms get returned
> This of course should be configurable.
> Note that the integration may also happen via Apache Tika, which provides a 
> Translator API.
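
The expansion steps in the example above can be sketched roughly as follows. This is only an illustration under stated assumptions: {{TermTranslator}}, {{dictionary}} and {{expand}} are hypothetical names made up for this sketch and are not part of Oak, Apache Joshua or Apache Tika, and a toy dictionary stands in for the SMT decoder. The {{enabled}} flag stands for the "should be configurable" requirement.

```java
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

public class CrossLanguageQueryExpansion {

    /** Hypothetical stand-in for an SMT backend; NOT the Joshua or Tika API. */
    interface TermTranslator {
        Optional<String> translate(String term);
    }

    /** Toy dictionary-backed translator, used only for illustration. */
    static TermTranslator dictionary(Map<String, String> entries) {
        return term -> Optional.ofNullable(entries.get(term.toLowerCase()));
    }

    /**
     * Returns "translated OR original" when a translation is found.
     * Returns the original term unchanged when expansion is disabled
     * or when the translator has no translation for the term.
     */
    static String expand(String term, TermTranslator translator, boolean enabled) {
        if (!enabled) {
            return term;
        }
        // LinkedHashSet keeps insertion order and drops a translation
        // that is identical to the original term.
        Set<String> alternatives = new LinkedHashSet<>();
        translator.translate(term).ifPresent(alternatives::add);
        alternatives.add(term);
        return String.join(" OR ", alternatives);
    }

    public static void main(String[] args) {
        TermTranslator esToEn = dictionary(Map.of("hola", "hello"));
        System.out.println(expand("hola", esToEn, true));   // expansion enabled
        System.out.println(expand("hola", esToEn, false));  // expansion disabled
        System.out.println(expand("adios", esToEn, true));  // no translation known
    }
}
```

A real integration would presumably build the additional clause on the query engine's own query representation rather than by joining strings; the string form is used here only to mirror the "hello OR hola" wording of the example.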

This message was sent by Atlassian JIRA