Hi, just to say that I have now also proposed a --help option via a pull request. Hopefully it will be integrated too. :-)
Jehan

On Sat, Nov 26, 2011 at 1:05 AM, Jehan Pages <[email protected]> wrote:
> Hi,
>
> sorry if I was not clear. The current version, the one you likely got
> from the Moses repository, is indeed SRILM only. The -lm option I wrote
> about is brand new. I wrote it today (then made a pull request to the
> upstream Moses repository), and I see it was merged into the main
> repository about an hour ago.
>
> So now, if you pull the latest code, you'll have this option. When you
> wrote your email, it was not yet available. Hence no need to apologize
> (note that even if the option had existed before, there would be no
> need to apologize either, by the way! Plus, I am only discovering Moses
> and its possibilities as well).
>
> Also, I see you compare it with -lm in train-model.perl. The one I
> wrote has a different syntax.
>
> And yes, I had also been thinking that a --help would be useful in
> train-recaser. Maybe I'll write one too, unless someone does it
> first! :-)
>
> Jehan
>
> On Fri, Nov 25, 2011 at 10:12 PM, Daniel Schaut <[email protected]> wrote:
>> Hi Jehan,
>>
>> That's a nice idea, and thanks for the trick. :) I thought the lm switch
>> could only be used in connection with train-model. Apologies for the
>> lack of knowledge. ;)
>> So, can all switches found in the reference
>> http://www.statmt.org/moses/?n=FactoredTraining.TrainingParameters
>> be called with train-recaser, too? If yes, this could be mentioned in
>> the manual by adding a line.
>>
>> It would be nice to add a help switch for train-recaser, too.
>>
>> Daniel
>>
>> -----Original Message-----
>> From: [email protected] [mailto:[email protected]]
>> On Behalf Of Jehan Pages
>> Sent: Friday, 25 November 2011 03:54
>> To: <[email protected]>
>> Subject: Re: [Moses-support] Train recasing model using IRSTLM
>>
>> Hi all,
>>
>> rather than having everyone search through the email archive, as I guess
>> we are not the only ones who won't use SRILM because it is proprietary
>> (or for some other reason), I thought it best to modify the existing
>> script so that it can switch to IRSTLM when desired. I have just made a
>> pull request on the Moses repository updating the train-recaser.perl
>> script.
>>
>> Description:
>> Note that by default the script will still use SRILM, which prevents
>> breakage of any existing script calling the current version of
>> train-recaser.perl.
>> To use IRSTLM instead of SRILM, adding "-lm irstlm" on the command line
>> is enough.
>> In case build-lm.sh is not in $PATH, there is also a new option,
>> -build-lm, which allows one to specify the path of the script to use
>> (with build-lm.sh command line syntax).
>>
>> I think this should be better in the long term. :-)
>>
>> Jehan
>>
>> On Sun, Nov 13, 2011 at 12:58 AM, Daniel Schaut <[email protected]> wrote:
>>> Dear all,
>>>
>>> I’m having some difficulty training the recasing model with IRSTLM.
>>> I changed the train-recaser script according to
>>>
>>> http://www.mail-archive.com/[email protected]/msg01934.html
>>>
>>> but this results in an error which I don’t know how to fix.
>>>
>>> Error log:
>>>
>>> -----------------------------------------------------------------------
>>> (4) Training recasing model @ Sat Nov 12 14:49:06 CET 2011
>>>
>>> /home/user/mosestools/scripts-20111024-1127/training/train-model.perl
>>> --root-dir /home/user/moses/work/recaser --model-dir
>>> /home/user/moses/work/recaser --first-step 4 --alignment a --corpus
>>> /home/user/moses/work/recaser/aligned --f lowercased --e cased
>>> --max-phrase-length 1 --lm
>>> 0:3:/home/user/moses/work/recaser/cased.irstlm.gz:1 -scripts-root-dir
>>> /home/user/moses/mosestools/scripts-20111024-1127
>>>
>>> Can't exec
>>> "/home/user/mosestools/scripts-20111024-1127/training/train-model.perl":
>>> No such file or directory at ./train-recaser.perl line 95.
>>>
>>> (11) Cleaning up @ Sat Nov 12 14:49:06 CET 2011
>>> -----------------------------------------------------------------------
>>>
>>> Then, instead of using build-lm.sh, I gave it another try, calling
>>> compile-lm directly:
>>>
>>> my $cmd = "/home/user/moses/mosestools/irstlm-5.60.03/bin/compile-lm
>>> $CORPUS /dev/stdout | gzip -c > $DIR/cased.irstlm.gz";
>>>
>>> where $CORPUS is a gzipped iARPA file.
>>>
>>> Error log:
>>>
>>> -----------------------------------------------------------------------
>>> (3) Preparing data for training recasing model @ Sat Nov 12 15:11:26 CET 2011
>>>
>>> /home/nexoc/moses/work/recaser/aligned.lowercased
>>>
>>> utf8 "\x8B" does not map to Unicode at ./train-recaser.perl line 64,
>>> <CORPUS> line 1.
>>>
>>> Malformed UTF-8 character (fatal) at ./train-recaser.perl line 70,
>>> <CORPUS> line 1.
>>> -----------------------------------------------------------------------
>>>
>>> Please see the full error logs attached for more information.
>>>
>>> Could anyone give me a hint on how to train a recasing model with
>>> either build-lm.sh or compile-lm? Help is very much appreciated.
>>>
>>> Thanks,
>>>
>>> Daniel
>>>
>>> _______________________________________________
>>> Moses-support mailing list
>>> [email protected]
>>> http://mailman.mit.edu/mailman/listinfo/moses-support
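
A note on the second error log quoted above: 0x8B is the second byte of the gzip magic number (1F 8B), so the `utf8 "\x8B" does not map to Unicode` message is consistent with train-recaser.perl reading a gzip-compressed file as if it were UTF-8 text. This is only one reading of the log, not a confirmed diagnosis. The sketch below shows a way to check a file before passing it on and to decompress it when needed. It assumes a shell with `head -c`, `od`, `tr` and `gzip` available; `is_gzip` and `cat_corpus` are hypothetical helper names, not part of Moses or IRSTLM.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only (not part of Moses or IRSTLM).

# is_gzip FILE: succeed if FILE starts with the gzip magic bytes 1f 8b.
is_gzip() {
    # Dump the first two bytes as hex and strip the whitespace od adds.
    magic=$(head -c 2 "$1" | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "1f8b" ]
}

# cat_corpus FILE: write FILE to stdout as plain text,
# decompressing it first when it is gzip-compressed.
cat_corpus() {
    if is_gzip "$1"; then
        gzip -dc "$1"
    else
        cat "$1"
    fi
}
```

For example, `cat_corpus cased.txt.gz > cased.txt` would give a plain-text file that a script expecting UTF-8 input can read (the file names here are placeholders).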
