Hi folks,
We are planning to build a system on Moses version 0.91. When I went to
the GitHub page for 0.91 by clicking the link on the Moses site, I was
confused about how to download this version. First I cloned it from
git://github.com/moses-smt/mosesdecoder.git
and got 112M of files in the mosesdecoder directory.
Hi Barry,
I downloaded the French-English Europarl corpus for the translation
system from the provided link with wget
http://www.statmt.org/wmt12/training-parallel.tgz.
However, the training folder contains a German-English corpus.
The files in the training folder are
Hi,
I encountered the same problem when using msb and
pruned singletons on large corpora (Europarl).
SRILM's ngram complains about no bow for prefix of ngram.
Here is a Czech example:
grep 'schválení těchto' /home/pkoehn/experiment/wmt12-en-cs/lm/europarl.lm.38
-2.35639	schválení těchto
Department 4.6
Applied Linguistics, Translation and Interpreting, Saarland
University, Germany
is inviting applications for *two* three-year Early Stage Researcher
pre-doctoral positions in corpus-based approaches to machine
translation. The positions are part of the new EU
Modified ShiftBeta (aka modified Kneser-Ney) does not consider the real
counts for computing probabilities, but the corrected counts, which basically
are the number of different successors of an n-gram.
Hence in this case your bigram schválení těchto always occurs before zpráv,
and hence it
Yep it's a pain and I've had to write a fair amount of code to work
around this. By default, SRI prunes n-grams of order 3 or above if the
adjusted count is 1. For the highest order, the adjusted count is the
raw count. For all other orders, the adjusted count is the number of
unique words
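
The adjusted-count rule described above can be sketched in a few lines of Python. This is only an illustration, not SRILM's or IRSTLM's actual code: the function name `adjusted_counts` is made up, and it assumes the standard Kneser-Ney convention that the adjusted count of a lower-order n-gram is the number of distinct words seen immediately to its left, with n-grams that start with `<s>` keeping their raw counts.

```python
from collections import Counter, defaultdict


def adjusted_counts(sentences, order):
    """Return {n: Counter} of adjusted counts for n = 1..order (hypothetical helper)."""
    # Raw counts for every order, with sentence boundary markers added.
    raw = defaultdict(Counter)
    for words in sentences:
        padded = ["<s>"] + list(words) + ["</s>"]
        for n in range(1, order + 1):
            for i in range(len(padded) - n + 1):
                raw[n][tuple(padded[i:i + n])] += 1

    # Highest order: the adjusted count is simply the raw count.
    adjusted = {order: Counter(raw[order])}

    # Lower orders: count the distinct words that extend the n-gram to the left
    # (assumption: this is the "number of unique words" meant above).
    for n in range(1, order):
        left_words = defaultdict(set)
        for longer in raw[n + 1]:
            left_words[longer[1:]].add(longer[0])
        adjusted[n] = Counter({g: len(ws) for g, ws in left_words.items()})
        # Assumption: n-grams starting with <s> cannot be extended to the left,
        # so they keep their raw counts.
        for g, c in raw[n].items():
            if g[0] == "<s>":
                adjusted[n][g] = c
    return adjusted


toy = [["a", "b", "c"], ["d", "b", "c"], ["a", "b", "e"]]
adj = adjusted_counts(toy, 3)
print(adj[2][("a", "b")])  # raw count is 2, but only "<s>" ever precedes it
```

With this toy data the bigram `a b` occurs twice but is only ever preceded by `<s>`, so its adjusted count is 1 even though its raw count is 2.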
Nicola,
On an unrelated note, could you say why the smoothing technique is
called Modified ShiftBeta in IRSTLM? I know it was originally called
Improved Kneser-Ney and sometimes Simplified Kneser-Ney (Interspeech
2008), which hinted that it varied from the original description of
Modified
Hi,
mrodoje@ubuntu:~/corpus$ ~/mosesdecoder/scripts/tokenizer/tokenizer.perl
-l en < ~/corpus/training/europal-v7.de-en.en \
> ~/corpus/europal-v7.de-en.tok.en
but got the following error
bash: /home/mrodoje/corpus/training/europal-v7.de-en.en: No such file or
directory
How do I solve the
The decoder flags for word alignments have been cleaned up a little. They
were overlapping or didn't work.
These are the ones that work:
-print-alignment-info-in-n-best [true/false]
-alignment-output-file [filename]
The following were deleted:
-use-alignment-info
Can be inferred if
Hi Hieu,
Is the boolean parameter corresponding to -use-alignment-info in
StaticData still present?
On 14.11.2012 at 20:56, Hieu Hoang wrote:
The decoder flags for word alignments have been cleaned up a little.
They were overlapping or didn't work.
These are the ones that work:
Dear members,
I want to add my own reordering model to Moses, and I need to disable the
distortion model. Could you please tell me how to do this?
Thanks in advance,
--
S.Farzi, Ph.D. Student
Natural Language Processing Lab,
School of Electrical and Computer Eng.,
Tehran University
Hi Marcin,
No. There's a similar boolean,
m_needAlignmentInfo
but it's not quite the same. It won't be turned on if outputting word
alignment is not requested.
I've changed your compact pt code to use this, but you may want to check it.
If you can give me a small set of commands and examples