Dear Rico,
I tried using KenLM and NPLM for three language
pairs and came across a series of questions, which I am listing
one by one. It would be great if you could guide me.
1) I tested NPLM with different vocabulary sizes and
training epochs, but the BLEU score I obtained from NPLM
combined with KenLM is lower than the one I got with
KenLM alone. In all three language pairs I see a consistent
difference of about three points.
E.g.: English to Hindi (KenLM 17.43, NPLM+KenLM 14.27)
Tamil to Hindi (KenLM 16.66, NPLM+KenLM 13.53)
Marathi to Hindi (KenLM 29.42, NPLM+KenLM 25.76)
The sentence count is 103502 and the unigram count is 89919. I tried
vocabulary sizes of 89000, 89700, and 89850 with validation sizes of
200, 200, and 100 respectively, and with different learning rates and
epochs. However, the BLEU score of NPLM+KenLM remains lower.
2) The model with a perplexity of about 385 has a higher BLEU
score than the one with a perplexity of around 564. Is this the
right model? I mean, the model with the lower perplexity seems to
give the better BLEU score. Where am I going wrong?
3) I used the query script on the KenLM model and found a perplexity
of 3.4xx. The BLEU score of KenLM alone in the decoding phase is
16.66 for Tamil to Hindi MT, but when combined with NPLM I get
only 13.53.
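One possible source of confusion in questions 2 and 3: language model scores are often reported as average log10 probabilities per word, while perplexity is 10 raised to the average negative log10 probability, so the two numbers live on very different scales. A lower perplexity means the model finds the text less surprising, so "lower perplexity, higher BLEU" in question 2 is the expected behaviour. A minimal sketch of the conversion, with toy numbers rather than values from the experiments above:

```python
import math

# Toy per-word log10 probabilities for a 5-word sentence
# (hypothetical numbers, in the shape a LM query tool reports).
log10_probs = [-2.1, -3.0, -2.4, -2.9, -2.5]

# Average negative log10 probability per word.
avg_neg_log10 = -sum(log10_probs) / len(log10_probs)

# Perplexity is 10 to that power; lower means the model
# predicts the text better.
perplexity = 10 ** avg_neg_log10

print(round(avg_neg_log10, 2))  # 2.58
print(round(perplexity, 2))
```

So an average negative log10 probability near 2.58 corresponds to a perplexity near 380, which is the order of magnitude mentioned in question 2.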
On Sun, Sep 20, 2015 at 8:07 PM, Sanjanashree Palanivel
<[email protected] <mailto:[email protected]>> wrote:
Dear Rico,
Thanks a lot for your excellent guidance.
On Sat, Sep 19, 2015 at 9:10 PM, Rico Sennrich
<[email protected] <mailto:[email protected]>> wrote:
Hi Sanjanasri,
we have seen improvements in BLEU from having both KenLM
and NPLM in our system. Things can go wrong during
training though (e.g. a bad choice of hyperparameters
such as vocabulary size or number of training epochs). I
recommend using a development set during NPLM training,
and comparing perplexity scores with those obtained from
KenLM.
maybe somebody else can help you with the phrase table
binarization. NPLM doesn't have binarization.
best wishes,
Rico
On 19/09/15 08:11, Sanjanashree Palanivel wrote:
Dear Rico,
I made the necessary changes and trained the
language model successfully. The NPLM language model
gives me a lower BLEU score compared to KenLM. But
when I use the two models together, the accuracy is greater than
with NPLM alone, though still lower than with KenLM. I am
trying to tune it by changing the parameters. So
far the accuracy is improving, but it is not close to
the KenLM accuracy. Is it worth continuing, given that it takes
quite a long time to train?
I also tried to binarize the phrase table following
http://www.statmt.org/moses/?n=Advanced.RuleTables#ntoc3, and
compilation with Moses completed successfully. But when I run
processPhraseTableMin -threads 3 -in train/model/phrase-table.gz
-nscores 4 -out binarised-model/phrase-table
I get a segmentation fault. I don't know what is wrong. Does it
have something to do with threads?
Also, how do I binarize the NPLM model?
On Fri, Sep 18, 2015 at 11:27 AM, Sanjanashree Palanivel
<[email protected] <mailto:[email protected]>>
wrote:
Dear Rico,
Thanks a lot. I will make the necessary
changes.
On Thu, Sep 17, 2015 at 1:54 PM, Rico Sennrich
<[email protected] <mailto:[email protected]>>
wrote:
Hi Sanjanasri,
if you first compiled Moses without the option
'--with-nplm' and then added the option later,
the build system isn't smart enough to know
which files it needs to recompile. If you change
one of the compile options, use the option '-a'
to force recompilation from scratch.
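Concretely, the full rebuild described above might look like the following, using the NPLM path mentioned later in this thread; the '-j4' flag is an optional addition that just parallelizes the build over 4 jobs:

```shell
# Force a full recompile after changing compile options:
# '-a' rebuilds all targets from scratch, '-j4' runs 4 parallel jobs.
./bjam --with-nplm=/home/sanjana/Documents/SMT/NPLM/nplm -a -j4
```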
best wishes,
Rico
On 16/09/15 06:30, Sanjanashree Palanivel wrote:
Dear Rico,
I did the following steps
1. Installed NPLM and trained a language model
2. I compiled Moses with the command:
./bjam --with-nplm=/home/sanjana/Documents/SMT/NPLM/nplm
Tip: install tcmalloc for faster threading.
See BUILD-INSTRUCTIONS.txt for more
information.
warning: No toolsets are configured.
warning: Configuring default toolset "gcc".
warning: If the default is wrong, your
build may not work correctly.
warning: Use the "toolset=xxxxx" option to
override our guess.
warning: For more configuration options,
please consult
warning:
http://boost.org/boost-build2/doc/html/bbv2/advanced/configuration.html
NOT BUILDING MOSES SERVER!
Performing configuration checks
- Shared Boost : yes (cached)
- Static Boost : yes (cached)
...patience...
...patience...
...found 4823 targets...
SUCCESS
3. I added the following lines to the
moses.ini file:
NeuralLM factor=0 name=LM1 order=5
path=/path/to/nplmmodel
LM1= 0.5
Then I did testing and ended up with the error.
On Tue, Sep 15, 2015 at 8:43 PM, Rico Sennrich
<[email protected]
<mailto:[email protected]>> wrote:
Hi Sanjanasri,
this error occurs when Moses was compiled
without the option '--with-nplm'.
best wishes,
Rico
On 15.09.2015 15:08,
Sanjanashree Palanivel wrote:
Dear Rico,
I updated Moses and NPLM has
been compiled successfully with Moses.
However, when I perform decoding I am
getting an error.
Defined parameters (per moses.ini or
switch):
config:
/home/sanjana/Documents/SMT/ICON15/Health/BL/Ta_H/model/moses.ini
distortion-limit: 6
feature: UnknownWordPenalty
WordPenalty PhrasePenalty
PhraseDictionaryMemory
name=TranslationModel0 num-features=4
path=/home/sanjana/Documents/SMT/ICON15/Health/BL/Ta_H/model/phrase-table.gz
input-factor=0 output-factor=0
Distortion KENLM lazyken=0 name=LM0
factor=0
path=/home/sanjana/Documents/SMT/LM/Hindi/monolin80k.hi1.bin
order=3 NeuralLM factor=0 name=LM1
order=3
path=/home/sanjana/Documents/SMT/LM/Hindi/hin_out.txt
input-factors: 0
mapping: 0 T 0
weight: Distortion0= 0.136328 LM0=
0.135599 LM1= 0.5 WordPenalty0=
-0.488892 PhrasePenalty0= 0.0826147
TranslationModel0= 0.0104273 0.0663914
0.0254094 0.0543384
UnknownWordPenalty0= 1
line=UnknownWordPenalty
FeatureFunction: UnknownWordPenalty0
start: 0 end: 0
line=WordPenalty
FeatureFunction: WordPenalty0 start: 1
end: 1
line=PhrasePenalty
FeatureFunction: PhrasePenalty0 start:
2 end: 2
line=PhraseDictionaryMemory
name=TranslationModel0 num-features=4
path=/home/sanjana/Documents/SMT/ICON15/Health/BL/Ta_H/model/phrase-table.gz
input-factor=0 output-factor=0
FeatureFunction: TranslationModel0
start: 3 end: 6
line=Distortion
FeatureFunction: Distortion0 start: 7
end: 7
line=KENLM lazyken=0 name=LM0 factor=0
path=/home/sanjana/Documents/SMT/LM/Hindi/monolin80k.hi1.bin
order=3
FeatureFunction: LM0 start: 8 end: 8
line=NeuralLM factor=0 name=LM1
order=3
path=/home/sanjana/Documents/SMT/LM/Hindi/hin_out.txt
Exception: moses/FF/Factory.cpp:349 in
void
Moses::FeatureRegistry::Construct(const string&,
const string&) threw
UnknownFeatureException because `i ==
registry_.end()'.
Feature name NeuralLM is not registered.
I added the following 2 lines to my moses.ini file:
NeuralLM factor=0 name=LM1 order=5
path=/path/to/nplmmodel
LM1= 0.5
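For reference, in a current moses.ini these two lines live under separate sections: the feature line under [feature] and the weight under [weight]. A sketch of how that might look (the model path is a placeholder from the lines above, not a real file):

```ini
[feature]
NeuralLM factor=0 name=LM1 order=5 path=/path/to/nplmmodel

[weight]
LM1= 0.5
```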
On Tue, Sep 15, 2015 at 5:06 PM,
Sanjanashree Palanivel
<[email protected]
<mailto:[email protected]>> wrote:
Thank you for your earnest response. I
will update Moses and try again.
On Tue, Sep 15, 2015 at 4:22 PM, Rico
Sennrich <[email protected]
<mailto:[email protected]>> wrote:
Hello Sanjanasri,
this looks like a version mismatch
between Moses and NPLM.
Specifically, you're using an
older Moses commit that is only
compatible with nplm 0.2 (or
specifically, Kenneth's fork at
https://github.com/kpu/nplm ).
If you use the latest Moses
version from
https://github.com/moses-smt/mosesdecoder
, and the latest nplm version from
https://github.com/moses-smt/nplm
, it should work.
best wishes,
Rico
On 15.09.2015 08:24,
Sanjanashree Palanivel wrote:
Dear all,
I tried building a language model
using NPLM. The language model was
built successfully, but when I
tried to compile NPLM with Moses
using "./bjam
--with-nplm=path/to/nplm" I got
an error. I am using
Boost 1.55. I am attaching the
log file for reference. I don't
know where I went wrong. Any help
would be appreciated.
--
Thanks and regards,
Sanjanasri J.P
_______________________________________________
Moses-support mailing list
[email protected]
<mailto:[email protected]>
http://mailman.mit.edu/mailman/listinfo/moses-support
--
Thanks and regards,
Sanjanasri J.P
--
Thanks and regards,
Sanjanasri J.P
--
Thanks and regards,
Sanjanasri J.P
--
Thanks and regards,
Sanjanasri J.P