Dear All,

I implemented a data selection tool for domain adaptation based on
the Invitation Model, as described in:
Hoang, Cuong and Sima'an, Khalil (2014): Latent Domain Translation Models
in Mix-of-Domains Haystack, Proceedings of COLING 2014, the 25th
International Conference on Computational Linguistics,
http://www.aclweb.org/anthology/C14-1182.pdf

The tool is available in the following GitHub repository:

https://github.com/amirkamran/InvitationModel


The invitation-based data selection approach exploits in-domain data (both
monolingual and bilingual) as a prior to guide word alignment and phrase-pair
estimation in the large mixed-domain corpus. As a by-product, it produces
accurate estimates of P(D|e,f) for the mixed-domain sentence pairs (with D
being either in-domain or out-of-domain), which can be used to rank the
sentence pairs in the mixed-domain corpus by their relevance to the
in-domain corpus.
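
To make the ranking step concrete, here is a minimal Python sketch. It is not
the tool's interface: the function names and the per-pair log-likelihood
inputs are hypothetical, and it assumes you already have log P(e,f|D) for
both values of D from some model. It only shows how a posterior P(D|e,f)
follows from Bayes' rule and how sentence pairs could then be sorted by it.

```python
import math

def posterior_in_domain(logp_in, logp_out, prior_in=0.5):
    """P(D=in | e, f) from the log-likelihoods log P(e, f | D), by Bayes' rule.

    logp_in, logp_out and prior_in are hypothetical inputs for illustration.
    """
    a = math.log(prior_in) + logp_in
    b = math.log(1.0 - prior_in) + logp_out
    m = max(a, b)  # subtract the max before exponentiating to avoid underflow
    return math.exp(a - m) / (math.exp(a - m) + math.exp(b - m))

def rank_by_relevance(pairs, scores, prior_in=0.5):
    """Sort (e, f) pairs by decreasing P(D=in | e, f).

    pairs:  list of (source, target) sentence pairs
    scores: list of (logp_in, logp_out) tuples, aligned with pairs
    """
    scored = [
        (posterior_in_domain(lp_in, lp_out, prior_in), e, f)
        for (e, f), (lp_in, lp_out) in zip(pairs, scores)
    ]
    return sorted(scored, key=lambda item: item[0], reverse=True)
```

Taking the top of the ranked list, or every pair above a chosen posterior
threshold, would then give the selected training subset.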

This work was conducted at the ILLC (Institute for Logic, Language and
Computation, University of Amsterdam), https://www.illc.uva.nl, as part of
the project "Data-Powered Domain-Specific Translation Services On Demand",
supported by the grant "STW Open Technologieprogramma".


Regards
Amir Kamran
Research Programmer
Institute for Logic, Language and Computation (ILLC)
University of Amsterdam
_______________________________________________
Mt-list site list
[email protected]
http://lists.eamt.org/mailman/listinfo/mt-list