The ACCURAT project (http://www.accurat-project.eu/) is pleased to announce the 
release of the ACCURAT Toolkit, a collection of tools for collecting comparable 
corpora and for multi-level alignment and information extraction from them.



By using the ACCURAT Toolkit, users may obtain:

- Comparable corpora from the Web (current news corpora, filtered Wikipedia 
corpora, and narrow-domain focussed corpora);

- Comparable document alignments;

- Semi-parallel sentence/phrase mappings from comparable corpora (for SMT 
training or other tasks);

- Translated terminology extracted and mapped from bilingual comparable corpora;

- Translated named entities extracted and mapped from bilingual comparable 
corpora.



The toolkit is open source and freely available. It can be downloaded from the 
ACCURAT Web Site at http://www.accurat-project.eu/ under the terms of the 
Apache 2.0 licence.



The ACCURAT project has received funding from the European Community’s Seventh 
Framework Programme (FP7/2007-2013) under Grant Agreement n° 248347.



=-=-=-= REFERENCES =-=-=-=

ACCURAT D2.6 (2012). Toolkit for multi-level alignment and information extraction 
from comparable corpora. 
(http://www.accurat-project.eu/uploads/Deliverables/ACCURAT%20D2.6%20Toolkit%20for%20multi-level%20alignment%20and%20information%20extraction%20from%20comparable%20corpora%20v3.0.pdf).

ACCURAT D3.5 (2012). Tools for building comparable corpus from the Web. 
(http://www.accurat-project.eu/uploads/Deliverables/ACCURAT%20D3.5%20Tools%20for%20building%20comparable%20corpus%20from%20the%20Web%20v3.0.pdf).

Pinnis, M., Ion, R., Ştefănescu, D., Su, F., Skadiņa, I., Vasiļjevs, A., & 
Babych, B. (2012). ACCURAT Toolkit for Multi-Level Alignment and Information 
Extraction from Comparable Corpora. Proceedings of the ACL 2012 System 
Demonstrations (pp. 91–96). Association for Computational Linguistics. Jeju, 
South Korea.

Best regards,
Mārcis Pinnis
Researcher, Tilde
www.tilde.eu
