Re: [Moses-support] Factored models and xml-input
Update: it still works with any number of language models over factor 0. As soon as I add a single language model over factor 1, it fails.

___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
Re: [Moses-support] Factored models and xml-input
I think you're onto something here, Marcin. If I remove all my language models and leave just the translation model, it works for me. Just for testing: what happens if you remove the second phrase table and add a language model for factor 1? That kind of setup usually fails for me with xml-input, regardless of whether I add factors to the XML option or not.
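For anyone trying to reproduce this: the setup being discussed is Moses' XML markup for externally supplied translations, combined with a factored model. As I understand the factored XML markup (please correct me if the syntax below is off; the tag name and factor layout here are illustrative), output factors in the translation attribute are separated by |:

```
das <x translation="house|NN">Haus</x> ist klein
```

decoded with something like `moses -f moses.ini -xml-input exclusive < input.txt`. The failure described above occurs once a language model is attached to factor 1 (here, the NN part).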
Re: [Moses-support] When to truecase
Hi,

we also have an experiment on truecasing; see Table 1 in http://www.statmt.org/wmt13/pdf/WMT08.pdf

What works best for us is relying on the casing as guessed by the lemmatizer. (Our lemmatizer recognizes names as separate lemmas and keeps the lemma upcased, which we then cast onto the token in the sentence.) The Moses recaser was the worst option; it was actually better to lowercase only the source side of the parallel data, i.e. have the main search also pick the casing.

Cheers, O.

----- Original Message -----
From: Lane Schwartz dowob...@gmail.com
To: Philipp Koehn p...@jhu.edu
Cc: moses-support@mit.edu
Sent: Wednesday, 20 May, 2015 20:50:41
Subject: Re: [Moses-support] When to truecase

Got it. So then, how was casing handled in the mbr/mp column? Was all of the data lowercased, then models trained, then recasing applied after decoding? Or something else?

On Wed, May 20, 2015 at 1:30 PM, Philipp Koehn p...@jhu.edu wrote:

Hi, no, the changes are made incrementally, so the recased baseline is the previous mbr/mp column. -phi

On Wed, May 20, 2015 at 2:01 PM, Lane Schwartz dowob...@gmail.com wrote:

Philipp, in Table 2 of the WMT 2009 paper, are the baseline and truecased columns directly comparable? In other words, do the two columns indicate identical conditions other than a single variable (how and/or when casing was handled)? In the baseline condition, how and when was casing handled? Thanks, Lane

On Wed, May 20, 2015 at 12:43 PM, Philipp Koehn p...@jhu.edu wrote:

Hi, see Section 2.2 in our WMT 2009 submission: http://www.statmt.org/wmt09/pdf/WMT-0929.pdf One practical reason to avoid recasing is the need for a second large cased language model. But there is of course also the practical issue of having a unique truecasing scheme for each data condition, handling of headlines, all-caps emphasis, etc. It would be worth revisiting this issue under different data conditions / language pairs. Both options are readily available in EMS. Each of the two alternative methods could be improved as well. See for instance: http://www.aclweb.org/anthology/N06-1001 -phi

On Wed, May 20, 2015 at 12:31 PM, Lane Schwartz dowob...@gmail.com wrote:

Philipp (and others), I'm wondering what people's experience is regarding when truecasing is applied. One option is to truecase the training data, then train your TM and LM using that truecased data. Another option would be to lowercase the data, train the TM and LM on the lowercased data, and then perform truecasing after decoding. I assume that the former gives better results, but the latter approach has an advantage in terms of extensibility (namely, if you get more data and update your truecase model, you don't have to re-train all of your TMs and LMs). Does anyone have any insights they would care to share on this? Thanks, Lane

--
When a place gets crowded enough to require ID's, social collapse is not far away. It is time to go elsewhere. The best thing about space travel is that it made it possible to go elsewhere. -- R.A. Heinlein, Time Enough For Love

--
Ondrej Bojar (mailto:o...@cuni.cz / bo...@ufal.mff.cuni.cz) http://www.cuni.cz/~obo
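For readers following this thread: the "truecase the training data" option relies on a truecasing model which, in its simplest form, just remembers the most frequent surface casing of each word outside sentence-initial position (roughly the idea behind Moses' train-truecaser.perl). A minimal self-contained sketch of that idea, on a toy corpus (this is an illustration, not the Moses implementation):

```python
from collections import Counter, defaultdict

def train_truecaser(sentences):
    """Learn the most frequent casing of each token, skipping
    sentence-initial positions (whose casing is unreliable)."""
    counts = defaultdict(Counter)
    for sent in sentences:
        tokens = sent.split()
        for tok in tokens[1:]:  # ignore the sentence-initial token
            counts[tok.lower()][tok] += 1
    return {low: c.most_common(1)[0][0] for low, c in counts.items()}

def truecase(sentence, model):
    """Apply the learned casing; unseen sentence-initial tokens
    are left as given, other unseen tokens stay lowercased."""
    out = []
    for i, tok in enumerate(sentence.split()):
        low = tok.lower()
        out.append(model.get(low, tok if i == 0 else low))
    return " ".join(out)

corpus = [
    "The house is in Paris .",
    "Paris is lovely in spring .",
    "We visited the house .",
]
model = train_truecaser(corpus)
print(truecase("the house is in paris .", model))
# prints: the house is in Paris .
```

The extensibility concern raised above is visible even in this toy version: the model is a plain dictionary learned from the training corpus, so truecasing the training data bakes it into the TM/LM, whereas applying it after decoding keeps it swappable.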
Re: [Moses-support] When to truecase
Hi,

If your system output is lowercased, you could try SRILM's `disambig` tool for predicting the correct casing in a postprocessing step: http://www.speech.sri.com/projects/srilm/manpages/disambig.1.html

Cheers, Matthias

On Fri, 2015-05-22 at 11:20 +0200, Ondrej Bojar wrote:
> Hi, we also have an experiment on truecasing; see Table 1 in
> http://www.statmt.org/wmt13/pdf/WMT08.pdf
> [...]

--
The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.
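For anyone who wants to try the disambig route: a sketch of how the recasing step could be wired up. All file names here are made up; the map file lists, for each lowercased word, its possible cased variants (see the man page linked above for the exact format and the optional per-variant probabilities):

```shell
# case.map: one line per lowercased word, followed by cased variants, e.g.
#   paris Paris
#   the The the
disambig -text lowercased.out \
         -map case.map \
         -lm cased.lm \
         -order 3 \
         -keep-unk > recased.out
```

The cased language model (`cased.lm`) is exactly the "second large cased language model" Philipp mentions earlier in the thread as the practical cost of post-decoding recasing.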
Re: [Moses-support] keep some features fixed when tuning
Thank you all. Can you explain further what it means that MERT won't know that the feature exists? Does that mean that the tuneable feature weights are optimized assuming that all non-tuneable feature weights are equal to zero?

In my understanding this should lead to a dramatic decrease of the BLEU score on the tuning set, but that is not what I observed in my tests, at least in some cases (a decrease of just 0.19 BLEU when adding tuneable=false to PhrasePenalty and Distortion, on a tuning set of 574 segments).

Vito M.

2015-05-20 14:38 GMT+02:00 Rico Sennrich rico.sennr...@gmx.ch:

Matthias Huck mhuck@... writes:
> Hi Vito, tuneable=false should work.

Just my usual caveat: if you use 'tuneable=false', the feature score(s) won't be reported to the n-best list, and MERT/MIRA/PRO won't even know that the feature exists. This is appropriate in some cases (keeping a feature weight at 0, or giving a high penalty to some glue rules to ensure that they are only used if no translation is possible without them), but in other cases, hiding important features causes the optimizer to search the wrong space.

best wishes, Rico

--
M. Vito MANDORINO -- Chief Scientist
The Translation Trustee
1, Place Charles de Gaulle, 78180 Montigny-le-Bretonneux
Tel : +33 1 30 44 04 23 Mobile : +33 6 84 65 68 89
Email : vito.mandor...@linguacustodia.com / massinissa.ah...@linguacustodia.com
Website : www.linguacustodia.com - www.thetranslationtrustee.com
Re: [Moses-support] keep some features fixed when tuning
Hi Vito,

Yes, that's basically what happens, and you're right that tuneable=false can be harmful to MERT, hence my warning. I've heard of people trying to keep the weights of a language model fixed this way, and it didn't work at all.

MERT (but not MIRA) also supports the option -o to define which features to optimize, and mert-moses.perl accepts the option --activate-features to do the same (it then passes the list of features to MERT). This may be more suitable for some cases. Code contributions for a better way of having fixed weights (one that also works for PRO and MIRA) are welcome.

best wishes, Rico

On 22.05.2015 17:07, Vito Mandorino wrote:
> Thank you all. Can you explain further what it means that MERT won't
> know that the feature exists? [...]
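For concreteness, this is roughly what the relevant part of a moses.ini looks like with fixed features, as in Vito's experiment. The feature names and weight values below are placeholders; with tuneable=false the weight is still read from the ini, it just stays at that value during tuning:

```
[feature]
PhrasePenalty tuneable=false
Distortion tuneable=false

[weight]
PhrasePenalty0= 0.2
Distortion0= 0.3
```

The alternative Rico describes keeps all features visible to the decoder's n-best output and instead restricts which ones MERT optimizes, e.g. passing `--activate-features` to mert-moses.perl.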
[Moses-support] inc-giza Installation Issue - Incremental Training
Hello,

I'm trying to do incremental training with Moses, training with multiple translated files. I've been following the steps at http://www.statmt.org/moses/?n=Advanced.Incremental , but I'm stuck at installing incremental GIZA (https://code.google.com/p/inc-giza-pp/). I've gone through these steps:

svn checkout http://inc-giza-pp.googlecode.com/svn/trunk/ inc-giza-pp-read-only
cd inc-giza-pp-read-only
make

But it fails (after some successful compilations) with:

g++ -Wall -W -Wno-deprecated -O3 -DNDEBUG -DWORDINDEX_WITH_4_BYTE -DBINARY_SEARCH_FOR_TTABLE optimized/Parameter.o optimized/myassert.o optimized/Perplexity.o optimized/model1.o optimized/model2.o optimized/model3.o optimized/getSentence.o optimized/TTables.o optimized/ATables.o optimized/AlignTables.o optimized/main.o optimized/NTables.o optimized/model2to3.o optimized/collCounts.o optimized/alignment.o optimized/vocab.o optimized/MoveSwapMatrix.o optimized/transpair_model3.o optimized/transpair_model5.o optimized/transpair_model4.o optimized/utility.o optimized/parse.o optimized/reports.o optimized/model3_viterbi.o optimized/model3_viterbi_with_tricks.o optimized/Dictionary.o optimized/model345-peg.o optimized/hmm.o optimized/HMMTables.o optimized/ForwardBackward.o -L/usr/local/lib -lxmlrpc_server_abyss++ -lxmlrpc_server++ -lxmlrpc_server_abyss -lxmlrpc_server -lxmlrpc_abyss -lpthread -lxmlrpc++ -lxmlrpc -lxmlrpc_util -lxmlrpc_xmlparse -lxmlrpc_xmltok -static -o GIZA++
/usr/local/lib/libxmlrpc_abyss.a(conf.o): In function `parseUser':
/home/hieu/xmlrpc-c-1.33.17/lib/abyss/src/conf.c:279: warning: Using 'getpwnam' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
/usr/local/lib/libxmlrpc_util.a(lock_pthread.o): In function `destroy':
/home/hieu/xmlrpc-c-1.33.17/lib/libutil/lock_pthread.c:41: undefined reference to `pthread_mutex_destroy'
/usr/local/lib/libxmlrpc_util.a(lock_pthread.o): In function `xmlrpc_lock_create_pthread':
/home/hieu/xmlrpc-c-1.33.17/lib/libutil/lock_pthread.c:63: undefined reference to `pthread_mutex_init'
collect2: error: ld returned 1 exit status
make[1]: *** [GIZA++] Error 1
make[1]: Leaving directory `/home/hieu/inc-giza-pp-read-only/GIZA++-v2'
make: *** [gizapp] Error 2

I've already installed the xmlrpc-c-1.33.17 library, and I tried changing the position of -lpthread in the Makefile in the GIZA++-v2 directory, with some hope, but it didn't work. My system is Ubuntu 14.04. Am I missing something?

Thanks in advance,
Barış

___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
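For what it's worth, undefined pthread references during a -static link are typically an archive-ordering problem: GNU ld scans static archives left to right, so -lpthread has to appear after every library that references it (here the xmlrpc libraries), and fully static pthreads often additionally need the whole archive pulled in. A hedged sketch of what the end of the link flags in the GIZA++-v2 Makefile might look like, untested against inc-giza-pp, so adapt the variable name to the actual Makefile:

```make
# Assumption: put -lpthread after all xmlrpc libraries, and force the
# whole libpthread archive in for a fully static link (GNU ld behavior)
LDFLAGS = -L/usr/local/lib \
  -lxmlrpc_server_abyss++ -lxmlrpc_server++ -lxmlrpc_server_abyss \
  -lxmlrpc_server -lxmlrpc_abyss -lxmlrpc++ -lxmlrpc -lxmlrpc_util \
  -lxmlrpc_xmlparse -lxmlrpc_xmltok \
  -Wl,--whole-archive -lpthread -Wl,--no-whole-archive \
  -static
```

Alternatively, simply removing -static from the link line often sidesteps the problem entirely on Ubuntu, at the cost of a dynamically linked GIZA++ binary.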