In case anyone is interested:
In 2010, I wrote my master's thesis on a subset of the data:
"Post-Editing of Statistical Machine Translation – A crosslinguistic
analysis of the temporal, technical and cognitive effort"
https://www.ukp.tu-darmstadt.de/fileadmin/user_upload/Group_UKP/beinborn_mscthesis_final.pdf
The edit categorization scheme on page 47 might be of interest for
further analyses.
Best regards,
Lisa
--------------------------------------------------------------------
Lisa Beinborn
Doctoral Researcher
Ubiquitous Knowledge Processing (UKP) Lab
FB 20 / Computer Science Department
Technische Universität Darmstadt
Hochschulstr. 10, D-64289 Darmstadt, Germany
phone [+49] (0)6151 16-7477, room S2/02/B117
www.ukp.tu-darmstadt.de
Web Research at TU Darmstadt (WeRC): www.werc.tu-darmstadt.de
GRK 1994: Adaptive Preparation of Information from Heterogeneous Sources
(AIPHES): www.aiphes.tu-darmstadt.de
--------------------------------------------------------------------
On 27.04.2015 09:06, "Венцислав Жечев (Ventsislav Zhechev)" wrote:
Dear all,
It is my pleasure to announce the release of the Autodesk Post-Editing
Data corpus with the ISLRN 290-859-676-529-5
(http://www.islrn.org/resources/identify_islrn/).
This resource contains parallel English source–MT/TM target segments
post-edited into several languages (Simplified and Traditional
Chinese, Czech, French, German, Hungarian, Italian, Japanese, Korean,
Polish, Brazilian Portuguese, Russian, Spanish) with between 30,000 and
410,000 segments per language. Its main intended use is for research in
automatic quality estimation of Machine Translation output. The
provided data are predominantly software user manual content with some
segments coming from marketing and education materials. They cover
the portfolio of Autodesk products from various domains, notably
architecture, engineering, civil engineering, simulation, computer
graphics, media and entertainment. The content was translated in the
period 2012.11.12 to 2014.09.23.
The corpus is available from
https://autodesk.box.com/Autodesk-PostEditing and more information is
available in the included Readme file. The data are released under
a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0
International License (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Regards,
Dr. Ventsislav Zhechev
Computational Linguist, Certified ScrumMaster®
Platform Architecture and Technologies
Localisation Services
MAIN +41 32 723 91 22
FAX +41 32 723 93 99
http://VentsislavZhechev.eu
Autodesk, Inc.
Rue de Puits-Godet 6
2000 Neuchâtel, Switzerland
www.autodesk.com
--
_______________________________________________
Moses-support mailing list
http://mailman.mit.edu/mailman/listinfo/moses-support