Then which numbers do I use for IRSTLM and SRILM?

On Thu, 29 Apr 2021 at 7:10 PM Hieu Hoang wrote:
> On 4/29/2021 5:27 AM, Marwa Gaser wrote:
> Hello,
> In the baseline training, what do the numbers in the below line represent?
> 3 for the 3-gram?
> yes
> How about 0 and 8?
> 0
0 means that the LM is over the surface words. If your output has other
factors, e.g. Je|PRO suis|VB etudiant|ADJ, you can
Hello,
In the baseline training, what do the numbers in the below line represent?
3 for the 3-gram? How about 0 and 8?
-lm 0:3:$HOME/lm/news-commentary-v8.fr-en.blm.en:8
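For reference, that colon-separated -lm specification reads factor:order:file:type (0 = surface-word factor, 3 = trigram, and the final number selects the LM implementation; 8 is the KenLM type used in the baseline — check the Moses documentation for the SRILM/IRSTLM type numbers). A minimal bash sketch that splits the spec apart, using the baseline's path:

```shell
# Split the -lm spec into its four fields (factor:order:file:type).
spec="0:3:$HOME/lm/news-commentary-v8.fr-en.blm.en:8"
IFS=: read -r factor order file type <<< "$spec"
echo "factor=$factor order=$order type=$type"
# -> factor=0 order=3 type=8
```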
___
Moses-support mailing list
Moses-support@mit.edu
Thanks Philipp and Kenneth!
So, does this mean that finding the weights and log-linear interpolation of
LMs is actually implemented in KenLM, but there is no ready-made,
higher-level script to use this functionality, as there is for SRILM
(interpolate-lm.perl)?
@Kenneth Since KenLM is already
It's new. There are some rough edges like memory budgeting. Also, I'd
argue there is less need for a script since there is one integrated
program that takes models, tunes, and generates the combined model
(though you can split it into steps if you'd like).
Another thing to note: you'll need to
Oh also, use a small -S argument to the interpolate program because it
doesn't quite budget memory properly yet.
On 06/28/2016 05:08 PM, Kenneth Heafield wrote:
> Log-linear interpolation is in KenLM in the lm/interpolate directory.
> You'll want to get KenLM from github.com/kpu/kenlm and compile
Log-linear interpolation is in KenLM in the lm/interpolate directory.
You'll want to get KenLM from github.com/kpu/kenlm and compile with Eigen.
Tuning log-linear weights is super slow, but applying them is reasonably
fast. In total the tuning + applying weights time is comparable to SRILM.
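As a side note on what log-linear interpolation computes: the combined model multiplies the component probabilities raised to their weights and renormalises. A toy awk sketch with made-up unigram numbers over a two-word vocabulary (this is only an illustration of the math, not the KenLM implementation):

```shell
# Log-linearly combine two made-up unigram models with weights 0.5 each,
# then renormalise so the combined probabilities sum to 1.
awk 'BEGIN {
  p1["a"]=0.8; p1["b"]=0.2      # model 1 (made-up numbers)
  p2["a"]=0.4; p2["b"]=0.6      # model 2 (made-up numbers)
  l1=0.5; l2=0.5                # log-linear weights
  split("a b", v, " ")
  for (i=1; i<=2; i++) { w=v[i]; q[w]=p1[w]^l1 * p2[w]^l2; Z+=q[w] }
  for (i=1; i<=2; i++) { w=v[i]; printf "%s %.4f\n", w, q[w]/Z }
}'
# -> a 0.6202
#    b 0.3798
```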
Hi all
I have trained several language models and would like to combine them with
interpolate-lm.perl:
https://github.com/moses-smt/mosesdecoder/blob/master/scripts/ems/support/interpolate-lm.perl
As the language model tool, I always use KenLM, but looking at the code of
interpolate-lm.perl, it
Hi all,
I have a question regarding LMs.
Let's take the example of news.2014.shuffle.en
When we process it through punctuation normalization for the English
language, it will for instance put a space before an apostrophe:
"it is'nt" => "it is 'nt"
BUT it contains some noise, for instance there is
Hi,
I tend to fix it in the tokenization script, or I would solve this in some
pre-processing scripts if there are any obvious patterns in the noise.
--
Dingyuan
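A minimal sed sketch of the kind of pre-processing rule being discussed, reproducing the apostrophe example from the question (the pattern is assumed; adapt it to the noise you actually see in your corpus):

```shell
# Put a space before each apostrophe, matching the normalisation example
echo "it is'nt" | sed "s/'/ '/g"
# -> it is 'nt
```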
On 26 November 2015 at 21:09, "Vincent Nguyen" wrote:
> Hi all,
>
> I have a question regarding LMs.
>
> Let's take the example of
Make a copy of LM/SkeletonLM.*, look at the code, and change it to do
whatever you want.
On 02/11/2015 08:57, Vu Thuong Huyen wrote:
Dear Hieu,
I want to integrate a new language model (e.g. a recurrent neural network
language model) into Moses. Could you tell me how to do it?
I’m new in
when you compile with IRSTLM, you must get the latest version. The
latest version is 5.80.08, from
http://sourceforge.net/projects/irstlm/files/
On 01/08/2015 12:17, kalu mera wrote:
Dear Members,
I am trying to create a language model. I entered this command:
kalumera@kalumera-Satellite-C50-A534:~/mosesdecoder$ ./bjam
--with-boost=~/workspace/temp/boost_1_55_0 -j4
but the build failed
Please check the attachment for the command I entered and the error, and
please help.
Hi Janez,
Seth suggested the right fix to you.
I just checked the IRSTLM documentation
http://sourceforge.net/apps/mediawiki/irstlm/index.php?title=Estimating_gigantic_models
and the correct notation is reported there.
Could you please tell me where you got the wrong information, so that I
can correct it?
Nicola
(on behalf of
Hi Nicola
When I tried with irstlm 5.80.03, the version mentioned on the Moses
baseline page (http://www.statmt.org/moses/?n=Moses.Baseline), it did
not like the "yes". Has there been a change in irstlm? I can check again.
There has been some history with this argument. You can see in the wiki
Hi Janez
In my opinion there are two things that need to be somehow described
or corrected in the Moses baseline:
1. Notify the user about the location of the Giza++ utilities
(mosesdecoder/tools or mosesdecoder/giza++) and the need to rename the
folders to the ones used in the commands.
The
Hi,
I'm a beginner in Linux... so I like to see things happening in the
foreground :)
If there is no need to change the command... don't do that... ;)
Obviously I overlooked and did not go through the section of copying the
giza++ utilities to the tools directory.
Thank you for your help.
The first four commands were executed successfully. The last one failed.
Here is the result after entering the following command line:

zzz@zzz-laptop:~/lm$ ~/moses/irstlm/bin/compile-lm --text \
    news-commentary-v8.fr-en.lm.en.gz news-commentary-v8.fr-en.arpa.en
inpfile:
Hello!
We are moving slowly through the Moses MT preparation task. We came to the
Language model Training. We are following the Moses Baseline.
The language model (LM) is used to ensure fluent output, so it is built
with the target language (i.e. English in this case). The IRSTLM
documentation
maybe you should run the Moses wrapper script
scripts/generic/trainlm-irst2.perl
which executes the irstlm script for you
On 29 June 2013 14:32, Mehndi Bhargava mehndi.bharg...@gmail.com wrote:
When I run the following commands:

~/irstlm/bin/add-start-end.sh < ~/corpus/news-commentary-v7.fr-en.true.en \
    > news-commentary-v7.fr-en.sb.en
export IRSTLM=$HOME/irstlm
~/irstlm/bin/build-lm.sh -i news-commentary-v7.fr-en.sb.en -t ./tmp -p \
    -s improved-kneser-ney -o news-commentary-v7.fr-en.lm.en
Hi,
I am hereby attaching a file. I'm trying to work on that data.
Up to the truecasing step, it worked correctly.
But in language model training step, language model file is not created.
When I tried to work with the data provided by moses website, it
worked correctly.
Can you please tell me, why
It looks like you have too little data to build a language model. If
you continue to have the problem after using more data, please post the
command you ran and the output. There are at least four different ways
to build a language model described in
Hello Guys,
In order to build a language model, am I supposed to use a different
corpus from the one I will use in training the translation model? Thank
you in advance.
is OK.

From: moses-support-boun...@mit.edu [mailto:moses-support-boun...@mit.edu]
On Behalf Of sara hamza
Sent: 19 April 2012 17:17
To: Moses-support@mit.edu
Subject: [Moses-support] Language Model

Hello Guys,
In order to build a language model, am I supposed
I don't know how to build a language model including POS only. I don't want
to include the surface and lemma forms.
Dear Moses,
Can I interest you in an ARPA language model filter?
http://kheafield.com/code/mt/filter.html . It enforces phrase and
sentence-level constraints, not just vocabulary. You might have to
modify your perl scripts.
Kenneth
Thank you very much for your answer.
With regard to this I have a few more questions:
1) How is the conditional probability of an n-gram calculated?
2) If some n-gram is not present in the language model, does it mean that
its conditional probability is 0?
3) What are backoff weights?
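For what it's worth, the standard back-off answer to questions 2) and 3): an unseen n-gram does not get probability 0; the model backs off to the shorter history, scaled by the stored back-off weight. With a trigram model, for example:

```latex
P(w_3 \mid w_1, w_2) =
\begin{cases}
P(w_3 \mid w_1, w_2) & \text{if the trigram is listed in the model} \\
b(w_1, w_2)\, P(w_3 \mid w_2) & \text{otherwise}
\end{cases}
```

where b(w1, w2) is the back-off weight stored with the bigram "w1 w2". In ARPA files both the probabilities and the back-off weights are stored as log10 values.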
Hi,
Could you please explain about the format of .lm file generated by the
script ngram-count. For example, I got .lm file that starts with:
\data\
ngram 1=76288
ngram 2=1644644
ngram 3=1410926
ngram 4=1393383
ngram 5=1071864
\1-grams:
-2.815075 ! -1.648233
-3.10526
Hi,
here is a description of the ARPA format used for language models:
http://www.speech.sri.com/projects/srilm/manpages/ngram-format.5.html
Michael Zuckerman wrote:
Hi,
Could you please explain about the format of .lm file generated by the
script ngram-count. For example, I got .lm file
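To make the linked format concrete: after the \data\ header with the n-gram counts, each line in an \N-grams: section is log10(probability), the n-gram itself, and optionally a log10(back-off weight), tab-separated. A self-contained sketch using the 1-gram line from the message above:

```shell
# One 1-gram line: log10(prob) <TAB> token <TAB> log10(backoff weight)
printf '%s\t%s\t%s\n' '-2.815075' '!' '-1.648233' |
awk -F'\t' '{print "log10 P =", $1, "| token =", $2, "| log10 backoff =", $3}'
# -> log10 P = -2.815075 | token = ! | log10 backoff = -1.648233
```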
Michael Zuckerman wrote:
Could you please explain about the format of .lm file generated by
the script ngram-count.
http://www.speech.sri.com/projects/srilm/manpages/ngram-format.5.html
- John D. Burger
MITRE
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Jie Wu
Sent: 05 December 2007 16:42
To: moses-support@mit.edu
Subject: [Moses-support] Language Model in Moses under VS 05
Hi,
I have two questions:
1. I am studying Moses and found out that in VS05, Moses uses the internal
Subject: Re: [Moses-support] Language Model in Moses under VS 05
Hi, Hieu,
Thanks for the reply. But I found out that if the Windows version of Moses
only supports the internal language model, that implies that loading
language models in binary format is not supported. I am kind of
confused
=FactoredTraining.HomePage
Hieu Hoang
www.hoang.co.uk/hieu
-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Jie Wu
Sent: 05 December 2007 16:42
To: moses-support@mit.edu
Subject: [Moses-support] Language Model in Moses under VS 05
Hi,
I have two questions