[ 
https://issues.apache.org/jira/browse/LUCENE-1284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12996413#comment-12996413
 ] 

Kevin Brubeck Unhammer commented on LUCENE-1284:
------------------------------------------------

A little update: The Java port of lttoolbox has been complete for some time 
now, and the port of apertium-tagger at least does disambiguation (training of 
models is not supported yet though):

{noformat}
$ echo 'jeg' |apertium-destxt-j |lt-proc-j  nb-nn.automorf.bin | 
apertium-tagger-j -g nb-nn.prob -f
^jeg/jeg<prn><p1><mf><sg><nom>/jeg<n><nt><sg><ind>$^./.<sent><clb>$[][
]
{noformat}

The GsoC student Stephen Tigner is working at the moment on making sure they 
are all usable as libraries; from what I understand there is just minor cleanup 
work left on that. 

I can't say anything on license issue though. Other than Stephen Tigner, the 
most active contributor on the port is Jacob Nordfalk.

> Set of Java classes that allow the Lucene search engine to use morphological 
> information developed for the Apertium open-source machine translation 
> platform (http://www.apertium.org)
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-1284
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1284
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: contrib/analyzers
>         Environment: New feature developed under GNU/Linux, but it should 
> work in any other Java-compliance platform
>            Reporter: Felipe Sánchez Martínez
>            Assignee: Otis Gospodnetic
>         Attachments: apertium-morph.0.9.0.tgz
>
>
> Set of Java classes that allow the Lucene search engine to use morphological 
> information developed for the Apertium open-source machine translation 
> platform (http://www.apertium.org). Morphological information is used to 
> index new documents and to process smarter queries in which morphological 
> attributes can be used to specify query terms.
> The tool makes use of morphological analyzers and dictionaries developed for 
> the open-source machine translation platform Apertium (http://apertium.org) 
> and, optionally, the part-of-speech taggers developed for it. Currently there 
> are morphological dictionaries available for Spanish, Catalan, Galician, 
> Portuguese, 
> Aranese, Romanian, French and English. In addition new dictionaries are being 
> developed for Esperanto, Occitan, Basque, Swedish, Danish, 
> Welsh, Polish and Italian, among others; we hope more language pairs to be 
> added to the Apertium machine translation platform in the near future.

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to