Re: [Moses-support] Reg. Giza++ crash reported by EMS
Hi Sriram (and all Mac OS X users),

Are you using a Mac OS X machine? If so, the error is probably caused by the OS's default case-insensitive filesystem, on which the A3.final files are wrongly overwritten by the a3.final ones.

The easiest way to solve this issue is the following (found on the web, but I've lost the link):

1) Edit the GIZA source file GIZA++-v2/model3.cpp and change lines 321-322 from

    alignfile = Prefix + ".A3." + number ;
    test_alignfile = Prefix + ".tst.A3." + number ;

   to

    alignfile = Prefix + ".UA3." + number ;
    test_alignfile = Prefix + ".tst.UA3." + number ;

2) Recompile GIZA.

3) Call the Moses training script train-model.perl with the following additional parameter:

    --giza-extension UA3.final

This works for giza-pp-v1.0.2.tar.gz, but similar changes can be made for other versions.

best regards,
Nicola Bertoldi

On Mar 17, 2011, at 5:17 PM, Barry Haddow wrote:
> Hi Sriram
>
> GIZA has output an error message, which may mean your alignments are faulty. You should search for 'error' in TRAINING_run-giza.3.STDERR, and remember that it may appear in uppercase in this file.
>
> If you want to try continuing with the alignments that were produced, then you can force EMS to use them by adding something like
>
>     giza-alignment = $working-dir/training/giza.12
>     giza-alignment-inverse = $working-dir/training/giza-inverse.12
>
> to the TRAINING section.
>
> best regards - Barry
>
> On Thursday 17 March 2011 15:57, Sriram V wrote:
> > Hello,
> >
> > When I run ems/experiment.perl, giza++ runs well in both directions and produces the corresponding *.A3.final.gz files. However, those steps are reported as crashed, and the subsequent components do not run. Any ideas about what could have gone wrong here?
> >
> >     TRAINING_run-giza.3.STDERR.digest error error
> >     TRAINING_run-giza-inverse.3.STDERR.digest error error
> >
> > Here are the last few lines of the file TRAINING_run-giza.3.STDERR:
> >
> >     7 8 9 NTable contains 286060 parameter.
> >     Executing: rm -f ...working-dir//training/giza-inverse.3/de-en.A3.final.gz
> >     Executing: gzip .../working-dir//training/giza-inverse.3/de-en.A3.final
> >
> > Regards, Sriram
>
> --
> The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.

___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
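For anyone unsure whether the case-insensitivity problem described above applies to their machine, here is a minimal Python sketch (not part of GIZA++ or Moses). The helper names `colliding_names` and `is_case_insensitive` are made up for illustration. The first lists file names that would collapse into one file on a case-insensitive filesystem; the second probes a directory by creating a temporary file and looking it up under an upper-cased name.

```python
import os
import tempfile
from collections import defaultdict


def colliding_names(names):
    """Group file names that differ only by letter case.

    On a case-insensitive filesystem every group returned here would map to a
    single file, so the later write silently overwrites the earlier one.
    """
    groups = defaultdict(list)
    for name in names:
        groups[name.lower()].append(name)
    return [sorted(group) for group in groups.values() if len(set(group)) > 1]


def is_case_insensitive(directory):
    """Return True if `directory` lives on a filesystem that ignores case.

    A temporary file with a lowercase prefix is created, then the same name
    is looked up in upper case. The lookup succeeds only when the filesystem
    treats both spellings as the same file. The temporary file is removed
    when the `with` block exits.
    """
    with tempfile.NamedTemporaryFile(prefix="casecheck_", dir=directory) as handle:
        base = os.path.basename(handle.name)
        probe = os.path.join(directory, base.upper())
        return os.path.exists(probe)
```

Calling `colliding_names` on a listing of the giza output directory, or `is_case_insensitive` on the working directory, should indicate whether the UA3 renaming is needed; on a typical Linux install the probe is expected to report a case-sensitive filesystem.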
Re: [Moses-support] Reg. Giza++ crash reported by EMS
Thanks Nicola. I am using it on Fedora. So, I guess the problem is elsewhere.

Regards, Sriram

On Fri, Mar 18, 2011 at 10:08 AM, Nicola Bertoldi berto...@fbk.eu wrote:
> Hi Sriram (and all Mac OS X users), are you using a Mac OS X machine?
Re: [Moses-support] Reg. Giza++ crash reported by EMS
Hi Sriram

You have a version of GIZA++ which doesn't support cooccurrence files. To add support for cooccurrence files, you need to edit the GIZA++ Makefile and add the flag -DBINARY_SEARCH_FOR_TTABLE to CFLAGS_OPT. Then you should rebuild GIZA++ and rerun the alignment.

I'm not sure why cooccurrence file support is switched off by default in GIZA++.

best regards - Barry

On Friday 18 March 2011 11:02, Sriram V wrote:
> Thanks Barry.
>
> I located the error in TRAINING_run-giza.3.STDERR. It was:
>
>     ERROR: parameter 'coocurrencefile' does not exist.
>     WARNING: ignoring unrecognized option: -CoocurrenceFile
>     ERROR: parameter 'optmldatdbuserssriramdeenmosesexptsworkingdirtraininggiza1endecooc' does not exist.
>
> I am trying to see how to get past this. Any suggestions ?
>
> Regards, Sriram
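Locating GIZA++ error messages in a long STDERR file, as discussed in this thread, comes down to a case-insensitive search. Below is a minimal Python sketch of that scan; the function name `find_error_lines` is made up for illustration and nothing here is part of EMS.

```python
import re

# GIZA++ may print "ERROR" in upper case, so the match ignores case.
ERROR_PATTERN = re.compile(r"error", re.IGNORECASE)


def find_error_lines(lines):
    """Return (line_number, text) pairs for every line mentioning 'error'.

    `lines` can be any iterable of strings, for example an open file handle.
    Line numbers start at 1 to match what a text editor shows.
    """
    hits = []
    for number, line in enumerate(lines, start=1):
        if ERROR_PATTERN.search(line):
            hits.append((number, line.rstrip("\n")))
    return hits
```

To use it on a real run, open the step's STDERR file (for example TRAINING_run-giza.3.STDERR in the EMS steps directory) and pass the file handle to `find_error_lines`.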
[Moses-support] Confusion Networks and Moses Decoders: Questions
Hi,

I have been trying to use confusion network input with the Moses decoder. Although I was able to run an example using the documentation available on the website, I could not find answers to a few questions that came up.

1. The input format suggests putting a word and its probability on a single line. Is it possible to put multiple sentences in a single input file?

2. What is the significance of the weight-i option that is needed on the command line? How does it actually affect the selection or the scores?

It would be great if you could provide me with the answers or, better still, point me towards any material covering them.

Thanks and regards,
Pratyush Banerjee
Re: [Moses-support] Reg. Giza++ crash reported by EMS
It works! Thanks very much.

- Sriram

On Fri, Mar 18, 2011 at 12:11 PM, Barry Haddow bhad...@inf.ed.ac.uk wrote:
> Hi Sriram
>
> You have a version of GIZA++ which doesn't support cooccurrence files. To add support for cooccurrence files, you need to edit the GIZA++ Makefile and add the flag -DBINARY_SEARCH_FOR_TTABLE to CFLAGS_OPT. Then you should rebuild GIZA++ and rerun the alignment.
Re: [Moses-support] Confusion Networks and Moses Decoders: Questions
Pratyush Banerjee pbanerjee@... writes:

> The input format suggest putting a word and its probability in a single line. Is it possible to put in multiple sentences in a single input file ?

I'm only familiar with the lattice format, which might also serve your purpose. Its format is typically one sentence per line.

> What is the significance of the weight-i option that is needed in the command line, how does it actually affect the selection or the scores ?

This is a normal feature weight. You can also include it in your moses config and tune it through MERT.
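To make "a normal feature weight" a bit more concrete, here is a toy Python sketch that assumes the usual log-linear formulation, where a hypothesis score is the weighted sum of its feature values. The feature names `tm` and `input` and all the numbers are made up for illustration; this is not Moses code, and the real decoder combines many more features.

```python
def hypothesis_score(features, weights):
    """Weighted sum of feature values (log-linear model score)."""
    return sum(weights[name] * value for name, value in features.items())


def best_path(paths, weights):
    """Return the name of the highest-scoring path under the given weights."""
    return max(paths, key=lambda name: hypothesis_score(paths[name], weights))


# Two competing paths through a confusion network (values are log scores):
# path A uses a likely input word but has a worse translation score,
# path B uses an unlikely input word but has a better translation score.
PATHS = {
    "A": {"tm": -3.0, "input": -0.25},
    "B": {"tm": -2.0, "input": -2.0},
}

# With the input weight at zero the input probabilities are ignored.
IGNORE_INPUT = {"tm": 1.0, "input": 0.0}
# With a positive input weight, unlikely input words are penalised.
USE_INPUT = {"tm": 1.0, "input": 1.0}
```

Under `IGNORE_INPUT` path B wins on its translation score alone; under `USE_INPUT` the low input probability of B outweighs that advantage and path A wins. Tuning the weight with MERT amounts to choosing where that trade-off sits.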
[Moses-support] Trying to do fancy things with LMs; need some advice.
Hello all,

I am trying to do something rather fancy with Moses by modifying the way Moses uses LMs. What I want to do is somewhat akin to the LanguageModelSkip.h code that is in the repository, in that I want to score sequences over only certain factors from the string (to extend the reach and, hopefully, the approximation to syntactic or dependency LMs).

What I have is a way of getting a single label for each entry in the phrase table (yes, sounds crazy, but I managed to pull it off). I have distributed this label (identically) to each word in the MT phrase, and so I want to feed the LM the syntactic label factor of (1) the first word in the current phrase and (2) the first words of the n-1 previous *phrases* (NOT *words*) in the search hypothesis that the current phrase is extending. This will essentially tell it the syntactic labels of the n phrases that make up the current search hypothesis.

This seems like it should be straightforward. I know I'll need to override the Evaluate and CalcScore member functions of the LanguageModel class in LanguageModel.cpp (they compute the inter-phrase and intra-phrase LM scores, right?), but I also see from some comments in the code that I shouldn't access previous hypotheses directly from the Evaluate function. This apparently will get me in trouble. Instead, I need to pass the n-1 previous phrases into the FFState argument to the Evaluate function. (These comments come from the online code documentation, which isn't in my checked-out repos, so they could be out of date.)

This is similar to what the IRST LM asynchronous LM idea buys you, but without limiting what is fed to the LM by a fixed-length *word* window (the lmmacroSize parameter in the IRST LM chunkLM config file). The way I plan to implement things, IRST LM and SRILM will both be possible LMs to use on the back end -- all of the work will be done by tracking what the n-1 previous phrases are in each hypothesis.

My question, then, is (at least) two-fold:

(1) Is this the best way to go about this (where "this" is my whole crazy idea)?

(2) If so, am I right in thinking that (in addition to adding an LM type to the LanguageModelFactory class) all I need to do is override Evaluate and CalcScore? Or am I completely off-base? (Or is this not really even possible at all?)

Any help is much appreciated.

Best,
D.N.
Re: [Moses-support] Trying to do fancy things with LMs; need some advice.
I think you'd be better off implementing your own StatefulFeatureFunction, bypassing LanguageModel.{h,cpp} which mostly handles n-grams crossing phrase boundaries, and calling the LanguageModelImplementation as the backend. You'll probably want larger beams too.

Kenneth

On 03/18/11 13:38, Dennis Mehay wrote:
> Hello all, I am trying to do something rather fancy with Moses by modifying the way Moses uses LMs.
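The idea discussed in this thread, carrying the labels of the n-1 previous phrases in the feature state instead of reading them from earlier hypotheses, can be sketched outside of Moses. The Python below is only an illustration under that assumption: the class `PhraseLabelFeature`, its methods, the `<s>` start symbol, and the toy scorer are all made up and do not mirror the actual StatefulFeatureFunction or FFState interfaces.

```python
from typing import Callable, Tuple

State = Tuple[str, ...]


class PhraseLabelFeature:
    """Scores one label per phrase against the labels of previous phrases."""

    def __init__(self, order: int, score_ngram: Callable[[State, str], float]):
        if order < 2:
            raise ValueError("order must be at least 2 to carry any context")
        self.order = order
        # Backend scorer: (context labels, new label) -> log score.
        self.score_ngram = score_ngram

    def empty_state(self) -> State:
        """State of the empty hypothesis: just a sentence-start symbol."""
        return ("<s>",)

    def evaluate(self, prev_state: State, phrase_label: str) -> Tuple[float, State]:
        """Score the new phrase label and return the state for the extension.

        Only the last `order - 1` labels are kept, so two hypotheses that end
        in the same label history get equal states.
        """
        context = prev_state[-(self.order - 1):]
        score = self.score_ngram(context, phrase_label)
        new_state = (prev_state + (phrase_label,))[-(self.order - 1):]
        return score, new_state


# Toy backend: a lookup table of log scores with a flat penalty otherwise.
TOY_TABLE = {
    (("<s>",), "NP"): -0.5,
    (("<s>", "NP"), "VP"): -0.25,
}


def toy_scorer(context: State, label: str) -> float:
    return TOY_TABLE.get((context, label), -4.0)


def score_sequence(feature: PhraseLabelFeature, labels):
    """Apply the feature phrase by phrase, threading the state through."""
    state = feature.empty_state()
    total = 0.0
    for label in labels:
        score, state = feature.evaluate(state, label)
        total += score
    return total, state
```

Because everything the scorer needs is in the state, `evaluate` never has to look at earlier hypotheses. In a real implementation `toy_scorer` would be replaced by a call into whichever LM backend is used.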
[Moses-support] about moses server
Hi all,

I have one question about moses server. I have two systems, phrase-based and hierarchical, that translate from English to French. Can I combine the two systems into one config file and use moses server to demo them?

-- Thu.