Hi Krishna,
On 04/10/2012 18:16, krishna nanda wrote:
Hello Hieu,
I have some questions on getting moses to work on native windows:
1. I see that under other builds, there is a moses visual studio
project. Is that project up to date?
I'm not sure. I was told it was working about a month ago, so it should be
reasonably up to date.
2. From this thread:
http://comments.gmane.org/gmane.comp.nlp.moses.user/6991
I understand that moses scripts for training/tuning have not been
ported to native windows
Correct. The training/tuning scripts require Cygwin.
And from here: http://www.statmt.org/moses/?n=Moses.FAQ#ntoc9
I understand that moses decoder works fine in native windows.
The visual studio project compiles the decoder natively. See Q1.
So, if I build a binarized language model, binarized phrase table and
binarized reordering table in linux, would it be ok to use these
binarized models with the decoder in native windows?
Phrase table and lexical reordering table: no problem. LM: only KenLM. There
is no Visual Studio project for IRSTLM.
3. Lastly, I also saw in the above thread that you are porting KenLM
to windows. That would be very helpful, as I was looking to build
binarized language models in native windows.
It's included.
NB: Both KenLM and IRSTLM use memory mapping. Therefore, to use a binary
LM bigger than 2GB, a 64-bit OS has to be used.
NB2: Cygwin is 32-bit, even on 64-bit Windows.
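A minimal Python sketch of the constraint in the two notes above: a memory-mapped file has to fit into the address space of the process that maps it, so the width of the process matters (which is why a 32-bit Cygwin is a problem even on 64-bit Windows). The function names and the 2GB threshold for 32-bit processes are assumptions taken from the figure quoted above, not anything Moses or KenLM provides.

```python
import struct

def pointer_bits():
    """Pointer width of the running Python process, in bits."""
    return struct.calcsize("P") * 8

def can_map_whole_file(size_bytes, bits, limit_32bit=2 * 1024**3):
    """Rough check (assumption): a 32-bit process can map less than limit_32bit bytes."""
    if bits >= 64:
        return True
    return size_bytes < limit_32bit

# Hypothetical 3GB binary LM, checked against a 32-bit and a 64-bit process.
lm_size = 3 * 1024**3
print(can_map_whole_file(lm_size, 32))
print(can_map_whole_file(lm_size, 64))
print(pointer_bits())
```

The same check can be run against the real file size with os.path.getsize() before trying to load a binary LM.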
Thank you Hieu
Krishna
On Mon, Sep 24, 2012 at 7:31 AM, krishna nanda wrote:
Hello Hieu,
It was more out of curiosity. Since we specify one of the four
language models (Ken/IRST/SRI/Rand) when we build Moses, I was
wondering how easy it is to modify Moses to accept some other
language model.
When you say, "The Moses framework allows anyone to write their
own LM wrapper so that it can be used in the decoder,"
do you mean that if I have an ARPA file created from some other
language model, I can use it with Moses the same way I use IRSTLM,
even though the ARPA file was not created with IRSTLM?
I am experimenting with creating a small ARPA file from raw n-gram data
and feeding it to Moses. I have n-grams (orders 1, 2 and 3) with their
probabilities and backoff weights. I formatted them as required by the
ARPA format
(http://www.speech.sri.com/projects/srilm/manpages/ngram-format.5.html).
But when I try to binarize the ARPA file in Moses, I get the
following error:
"The context of every 3-gram should appear as a 2-gram"
Since the 2-grams and 3-grams are extracted from the same data, I
am not sure why this condition would not hold. I
traced the error to the file "lm/search_hashed.cc". The ARPA
formatting itself seems OK to me. I am not sure what I might be
missing.
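One way to narrow this down is to check the ARPA file directly. The error message says that for every 3-gram "a b c", the context "a b" must itself be listed as a 2-gram; n-grams taken from raw count data do not always satisfy this, for example when lower-order entries were pruned. The following is a minimal Python sketch of such a check. The function names are hypothetical, and the parser assumes the plain ARPA layout (log probability, words, optional backoff weight) rather than reproducing what KenLM does internally.

```python
from collections import defaultdict

def parse_arpa_ngrams(lines):
    """Collect the n-grams listed in each N-grams section of an ARPA file."""
    ngrams = defaultdict(set)  # order -> set of word tuples
    order = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("\\"):
            # Section markers such as \data\, \1-grams:, \2-grams:, \end\
            if line.endswith("-grams:"):
                order = int(line[1:line.index("-")])
            else:
                order = None
            continue
        if order is None:
            continue  # header lines such as "ngram 1=3"
        fields = line.split()
        # Layout: log10 probability, then `order` words, then optional backoff.
        words = tuple(fields[1:1 + order])
        if len(words) == order:
            ngrams[order].add(words)
    return ngrams

def missing_contexts(ngrams):
    """Return n-grams (n >= 2) whose first n-1 words are not an (n-1)-gram."""
    missing = []
    for order in sorted(ngrams):
        if order < 2:
            continue
        lower = ngrams.get(order - 1, set())
        for words in sorted(ngrams[order]):
            if words[:-1] not in lower:
                missing.append(words)
    return missing

# Toy ARPA file: the 3-gram "a b c" is listed, but the 2-gram "a b" is not.
sample = """\\data\\
ngram 1=3
ngram 2=1
ngram 3=1

\\1-grams:
-1.0\ta\t-0.3
-1.0\tb\t-0.3
-1.0\tc

\\2-grams:
-0.5\tb c\t-0.2

\\3-grams:
-0.2\ta b c

\\end\\
""".splitlines()

print(missing_contexts(parse_arpa_ngrams(sample)))  # [('a', 'b', 'c')]
```

For a real file, pass the open file object instead of the toy sample. Any n-gram the check reports would need its context added to the lower-order section (or the n-gram itself dropped) before binarizing.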
Thanks for your time Hieu
Krishna
On Tue, Sep 18, 2012 at 5:31 PM, Hieu Hoang wrote:
Glad to hear that it's working with Cygwin. If anyone out
there is willing to occasionally test Moses on Cygwin and
report any problems, I will be very grateful.
I'm curious, what is the benefit of the MS Web LM and MITLM?
The Moses framework allows anyone to write their own LM
wrapper so that it can be used in the decoder. If these other
LMs have advantages, it would be good to incorporate them.
On 18/09/2012 04:36, krishna nanda wrote:
Hello Hieu,
Thanks for your reply. Yes, I managed to run moses in cygwin
without any problems. No changes were required.
I have another question:
I saw that moses has support for Ken/IRST/SRI/Rand Language
models.
But is there an easy way to consume other language models
like the Microsoft web language model or MITLM in moses?
Thank you
Krishna
On Fri, Aug 24, 2012 at 10:01 PM, Hieu Hoang wrote:
The last 2 numbers (4 31) are not scores and are ignored.
They are the counts that were used to calculate the probabilities.
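To make the layout concrete, here is a minimal Python sketch that splits the phrase-table line quoted further down in this thread into its fields. The function name is hypothetical, and the field order (source, target, scores, counts) is assumed from that sample line only; other Moses versions may place additional fields between the scores and the counts.

```python
def parse_phrase_table_line(line):
    """Split one phrase-table line into source, target, scores and counts."""
    fields = [f.strip() for f in line.split("|||")]
    if len(fields) < 3:
        raise ValueError("expected at least: source ||| target ||| scores")
    source, target = fields[0], fields[1]
    scores = [float(x) for x in fields[2].split()]
    # Assumption: a fourth field, if present, holds the counts (not scores).
    counts = [int(x) for x in fields[3].split()] if len(fields) > 3 else []
    return source, target, scores, counts

line = ",au cours de ||| ,as of ||| .25 2.36556e-06 0.0333581 0.00128335 2.718 ||| 4 31"
source, target, scores, counts = parse_phrase_table_line(line)
print(len(scores), counts)  # 5 [4, 31]
```

So the sample line carries the 5 documented scores, and the trailing "4 31" sits in its own field.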
There's no option to calculate joint probabilities. I
suppose you would need to calculate p(s) or p(t), which can be
done but may require a lot of memory. Try it yourself
and add it to Moses if it works.
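As a rough illustration of why p(s) is the missing piece: the joint probability factors as p(s,t) = p(t|s) * p(s), so it can be recovered from a conditional score once a marginal is available. The Python sketch below estimates all three by relative frequency from a toy list of extracted phrase-pair instances. The function name and the data are made up for illustration, and this is not how Moses scores phrases internally; the memory cost mentioned above comes from keeping a count for every distinct source phrase.

```python
from collections import Counter

def joint_and_conditional(pairs):
    """Estimate p(s), p(t|s) and p(s,t) by relative frequency."""
    n = len(pairs)
    pair_counts = Counter(pairs)
    source_counts = Counter(s for s, _ in pairs)  # one entry per source phrase
    table = {}
    for (s, t), c in pair_counts.items():
        p_s = source_counts[s] / n
        p_t_given_s = c / source_counts[s]
        table[(s, t)] = {
            "p_s": p_s,
            "p_t_given_s": p_t_given_s,
            "p_joint": p_t_given_s * p_s,  # p(s,t) = p(t|s) * p(s)
        }
    return table

# Toy phrase-pair instances (hypothetical data).
pairs = [
    ("au cours de", "during"),
    ("au cours de", "during"),
    ("au cours de", "as of"),
    ("la maison", "the house"),
]
for key, probs in joint_and_conditional(pairs).items():
    print(key, probs)
```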
So you managed to run Moses in Cygwin all the way to
getting a BLEU score? Was there anything you needed to change?
On 23/08/2012 16:57, krishna nanda wrote:
Hello Hoang and Barry,
Thanks a lot for your reply. I was able to install moses
and run it in cygwin. I have some quick questions:
I found the format (contents) of the phrase table here:
http://www.statmt.org/moses/?n=FactoredTraining.ScorePhrases
according to which, there are 5 scores for a phrase pair.
But, the phrase table I generated from the news corpus
has 7 scores for a phrase pair like this:
,au cours de ||| ,as of ||| .25 2.36556e-06 0.0333581 0.00128335 2.718 ||| 4 31
I was not clear on the above format.
Secondly, is there an option to also generate joint
probabilities of phrase pairs?
Thank you
Krishna
On Sun, Aug 19, 2012 at 11:12 PM, Hieu Hoang wrote:
Running Moses on Cygwin should be the same as
running it on Linux or Mac. If you have any
problems, please get back to us.
The document you pointed to is old now; I've changed
the website to reflect that.
On 17/08/2012 08:45, krishna nanda wrote:
Hello,
I am looking to install Moses in Cygwin. However, I
found the document on the website under "windows
installation" to be out of date:
http://www.statmt.org/moses/?n=Development.GetStarted
It still uses "regenerate-makefiles.sh", which is
not in the latest source from GitHub. Is
there an updated version?
Thank you
Krishna
_______________________________________________
Moses-support mailing list
[email protected] <mailto:[email protected]>
http://mailman.mit.edu/mailman/listinfo/moses-support