On 26 October 2012 13:57, Raymond HS <[email protected]> wrote:
> Hi Jim,
>
> For the Antara website, I think most of their stories are not translations
> (more like comparable than parallel). But I believe there are some of them
> that are direct translations. Actually it will be good if Bitextor can use
> some linguistic information (like bilingual dictionary) during the alignment
> process. :)

IIRC, Bitextor only uses document structure. If you already have a set
of aligned documents, Hunalign can use a bilingual dictionary to
improve the sentence alignment, and maligna can additionally train IBM
Model 1 models.
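
To make the "dictionary as an alignment signal" idea concrete, here's a
rough Python sketch. It is not Hunalign's actual algorithm (that also
uses sentence lengths and dynamic programming); it only scores sentence
pairs by dictionary overlap. The function names and the tiny
Indonesian-English dictionary are made up for illustration.

```python
def overlap_score(src_tokens, tgt_tokens, dictionary):
    """Fraction of source tokens that have a dictionary translation in the target."""
    if not src_tokens:
        return 0.0
    tgt_set = set(tgt_tokens)
    hits = sum(1 for tok in src_tokens if dictionary.get(tok, set()) & tgt_set)
    return hits / len(src_tokens)


def best_matches(src_sentences, tgt_sentences, dictionary):
    """For each source sentence, return (index of best target sentence, score).

    Assumes tgt_sentences is non-empty. Tokenisation is a naive
    lowercase whitespace split, which is an assumption of this sketch.
    """
    results = []
    for src in src_sentences:
        src_tokens = src.lower().split()
        scores = [
            overlap_score(src_tokens, tgt.lower().split(), dictionary)
            for tgt in tgt_sentences
        ]
        best = max(range(len(scores)), key=scores.__getitem__)
        results.append((best, scores[best]))
    return results


# Toy dictionary, hypothetical and far too small for real use.
TOY_DICT = {
    "presiden": {"president"},
    "mengunjungi": {"visits", "visited"},
    "pasar": {"market"},
    "harga": {"price", "prices"},
    "beras": {"rice"},
    "naik": {"rise", "rises", "rose"},
}

src = ["presiden mengunjungi pasar", "harga beras naik"]
tgt = ["rice prices rose", "the president visited the market"]
matches = best_matches(src, tgt, TOY_DICT)
print(matches)
```

A real aligner would combine a score like this with length ratios and
enforce monotonic ordering, rather than picking each best match
independently.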

Finding parallel document pairs in comparable corpora is a
less-researched problem, but Felipe's doctrans project
(http://code.google.com/p/doctrans/) happily does that - you'll need a
phrase table from Moses to use it, though.


-- 
<Sefam> Are any of the mentors around?
<jimregan> yes, they're the ones trolling you

_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
