[Moses-support] How to get 0.91v

2012-11-14 Thread Henry Hu
Hi folks, We are planning to build a system on Moses version 0.91. When I got to the GitHub page for 0.91 by clicking the link on the Moses site, I was confused about how to download this version. First I cloned it from git://github.com/moses-smt/mosesdecoder.git and got 112M of files in the mosesdecoder directory.

[Moses-support] corpus tokenisation error

2012-11-14 Thread OYELEKE ODOJE
Hi Barry, I downloaded the French-English Europarl corpus for the translation system from the provided link: wget http://www.statmt.org/wmt12/training-parallel.tgz. However, the training folder contains a German-English corpus. The files in the training folder are

Re: [Moses-support] SRILM vs IRSTLM

2012-11-14 Thread Philipp Koehn
Hi, I encountered the same problem when using msb and pruned singletons on large corpora (Europarl). SRILM's ngram complains about "no bow for prefix of ngram". Here is a Czech example: grep 'schválení těchto' /home/pkoehn/experiment/wmt12-en-cs/lm/europarl.lm.38 -2.35639 schválení těchto

[Moses-support] Two salaried PhD positions in corpus-based machine translation at Saarland University - Marie Curie ITN project EXPERT

2012-11-14 Thread Mihaela Vela
Department 4.6 Applied Linguistics, Translation and Interpreting, Saarland University, Germany is inviting applications for *two* three-year Early Stage Researcher pre-doctoral positions in corpus-based approaches to machine translation. The positions are part of the new EU

Re: [Moses-support] SRILM vs IRSTLM

2012-11-14 Thread Nicola Bertoldi
Modified ShiftBeta (aka modified Kneser-Ney) does not consider the real counts for computing probabilities, but the corrected counts, which basically are the number of different successors of an n-gram. Hence in this case your bigram schválení těchto occurs always before zpráv, and hence it

Re: [Moses-support] SRILM vs IRSTLM

2012-11-14 Thread Kenneth Heafield
Yep it's a pain and I've had to write a fair amount of code to work around this. By default, SRI prunes n-grams of order 3 or above if the adjusted count is 1. For the highest order, the adjusted count is the raw count. For all other orders, the adjusted count is the number of unique words
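The adjusted counts described in this message can be illustrated with a small sketch. The preview is cut off after "unique words", so the code assumes the standard Kneser-Ney definition, in which a lower-order n-gram's adjusted count is the number of distinct words that precede it. The function names, the toy corpus, and the rule that n-grams starting with `<s>` keep their raw count are illustrative assumptions, not code from SRILM, IRSTLM, or KenLM.

```python
from collections import Counter, defaultdict


def ngrams(tokens, n):
    """Return all n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def adjusted_counts(sentences, order=3):
    """Toy Kneser-Ney adjusted counts for n-gram orders 1..order.

    Assumption: lower-order adjusted count = number of distinct
    words seen immediately before the n-gram.
    """
    raw = {n: Counter() for n in range(1, order + 1)}
    for sent in sentences:
        tokens = ["<s>"] + list(sent) + ["</s>"]
        for n in range(1, order + 1):
            raw[n].update(ngrams(tokens, n))

    # Highest order: the adjusted count is the raw count.
    adjusted = {order: Counter(raw[order])}

    for n in range(1, order):
        # Collect the distinct left-context words of each n-gram
        # by looking at the (n+1)-grams that end with it.
        left = defaultdict(set)
        for gram in raw[n + 1]:
            left[gram[1:]].add(gram[0])

        adjusted[n] = Counter()
        for gram, count in raw[n].items():
            if gram[0] == "<s>":
                # Nothing can precede <s>, so keep the raw count
                # (an assumption of this sketch).
                adjusted[n][gram] = count
            else:
                adjusted[n][gram] = len(left[gram])
    return adjusted


if __name__ == "__main__":
    corpus = [["a", "b", "c"], ["a", "b", "c"], ["x", "b", "d"]]
    adj = adjusted_counts(corpus, order=3)
    print(adj[3][("a", "b", "c")])  # raw trigram count
    print(adj[2][("a", "b")])       # distinct predecessors of "a b"
    print(adj[1][("b",)])           # distinct predecessors of "b"
```

In this toy corpus the bigram "a b" occurs twice but is always preceded by `<s>`, so its adjusted count is 1. This shows how an n-gram that is not a singleton in the raw data can still look like one after adjustment.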

Re: [Moses-support] SRILM vs IRSTLM

2012-11-14 Thread Jonathan Clark
Nicola, On an unrelated note, could you say why the smoothing technique is called Modified ShiftBeta in IRSTLM? I know it was originally called Improved Kneser-Ney and sometimes Simplified Kneser-Ney (Interspeech 2008), which hinted that it varied from the original description of Modified

Re: [Moses-support] corpus tokenisation error

2012-11-14 Thread Philipp Koehn
Hi, mrodoje@ubuntu:~/corpus$ ~/mosesdecoder/scripts/tokenizer/tokenizer.perl -l en ~/corpus/training/europal-v7.de-en.en \ ~/corpus/europal-v7.de-en.tok.en but got the following error bash: /home/mrodoje/corpus/training/europal-v7.de-en.en: No such file or directory How do I solve the

[Moses-support] word alignments

2012-11-14 Thread Hieu Hoang
The decoder flags for word alignments have been cleaned up a little. They were overlapping or didn't work. These are the ones that work: -print-alignment-info-in-n-best [true/false] -alignment-output-file [filename] The following were deleted: -use-alignment-info Can be inferred if

Re: [Moses-support] word alignments

2012-11-14 Thread Marcin Junczys-Dowmunt
Hi Hieu, Is the boolean parameter corresponding to -use-alignment-info in StaticData still present? On 14.11.2012 20:56, Hieu Hoang wrote: The decoder flags for word alignments have been cleaned up a little. They were overlapping or didn't work. These are the ones that work:

[Moses-support] howto disable distortion model in MoSES

2012-11-14 Thread saeed farzi
Dear members, I want to add my own reordering model to Moses and I need to disable the distortion model. Would you please tell me how to do this? Thanks in advance, -- S.Farzi, Ph.D. Student Natural Language Processing Lab, School of Electrical and Computer Eng., Tehran University

Re: [Moses-support] word alignments

2012-11-14 Thread Hieu Hoang
Hi Marcin, no. There's a similar boolean, m_needAlignmentInfo, but it's not quite the same. It won't be turned on if outputting word alignment is not requested. I've changed your compact pt code to use this but you may want to check it. If you can give me a small set of commands and examples