Hi everyone, I would like to ask for your general opinion on the different systems for the English-German pair, and on the tree-based models in particular. I translated a test set with four different systems: phrase-based, hierarchical phrase-based, string-to-tree, and tree-to-tree. I was hoping to improve on the phrase-based results by using tree-based models, as I thought the verbs would be aligned better and placed more accurately in the German translations. Instead, the BLEU scores got progressively worse, and the tree-to-tree system almost halved the original (phrase-based) score.
Does this make sense in general, or does it look like I am doing something wrong? (The systems were built with default settings; the data was annotated with BitPar for German and the Collins parser for English.) I also noticed that the rule table of the tree-to-tree model was almost one third the size of the string-to-tree model's rule table. Do you think this is normal? Before I delve into articles and ideas for improving German translations in more experimental ways, do you have any tips for this language pair in general? Thank you in advance.
Regards,
Arda Tezcan
_______________________________________________ Moses-support mailing list [email protected] http://mailman.mit.edu/mailman/listinfo/moses-support
