Hello again,
I have checked the encoding of all my LM, TM, tuning and test set files, and
they are all UTF-8.
So, encoding does not seem to be the issue.
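
A check of this kind can be sketched in Python as follows. This is only an
illustration, assuming that every file is expected to decode strictly as
UTF-8; first_invalid_utf8 and check_files are hypothetical helper names and
are not part of Moses, KenLM or NPLM.

```python
def first_invalid_utf8(path):
    """Return (line_number, byte_offset_in_line) of the first line that
    is not valid UTF-8, or None if the whole file decodes cleanly."""
    # Read raw bytes so that Python does not silently replace bad sequences.
    with open(path, "rb") as handle:
        for line_number, raw in enumerate(handle, start=1):
            try:
                raw.decode("utf-8", errors="strict")
            except UnicodeDecodeError as err:
                return line_number, err.start
    return None


def check_files(paths):
    """Map each path to the result of first_invalid_utf8 (None means OK)."""
    return {str(path): first_invalid_utf8(path) for path in paths}
```

Running check_files over the LM training data, the TM training corpus and the
tuning and test sets should give None for every file if all of them are valid
UTF-8. This only shows that the bytes are valid UTF-8; it does not rule out
differences in Unicode normalization (for example full-width versus
half-width forms) between the files.
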
I also verified that the source and target sides were not mixed up while
building and using the LMs.
I am retraining all my model files and will try this again.
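
The source/target check mentioned above can also be approximated
automatically. The Python sketch below is a rough heuristic, not a definitive
test: it assumes that Japanese text contains hiragana or katakana on most
lines while Chinese text almost never does. The helper names are hypothetical.

```python
import re

# Hiragana letters (U+3041-U+3096) and katakana letters (U+30A1-U+30FA).
# Punctuation such as the katakana middle dot is deliberately excluded.
KANA = re.compile("[\u3041-\u3096\u30a1-\u30fa]")


def kana_line_fraction(lines):
    """Fraction of non-empty lines containing at least one kana character."""
    total = 0
    with_kana = 0
    for line in lines:
        if not line.strip():
            continue
        total += 1
        if KANA.search(line):
            with_kana += 1
    return with_kana / total if total else 0.0


def looks_swapped(source_lines, target_lines):
    """True if the supposed Chinese (source) side has a higher kana
    fraction than the supposed Japanese (target) side."""
    return kana_line_fraction(source_lines) > kana_line_fraction(target_lines)
```

For a Chinese-Japanese setup, kana_line_fraction should be close to 1 on the
Japanese side and close to 0 on the Chinese side. The lines can come from any
iterable, for example a file opened with encoding="utf-8". Chinese abstracts
that quote Japanese names may contain a little kana, so the fractions are
indicative only.
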

Regards.

On Tue, Jul 7, 2015 at 12:43 PM, Raj Dabre <[email protected]> wrote:

> Hello Rico,
> Now that you mention it I also performed an additional test.
> I took a translation and obtained its perplexity by querying KenLM and
> NPLM from the command line. In this case the difference between the scores
> was not that large.
> It might be an encoding issue.
> I will check again and let you know.
>
> However, the data I am using to train the LMs (KenLM, NPLM and BILM) is
> the same as the data I am using to train the TM. I should also mention that
> I did no tokenization etc. before training the LMs and the TM.
> Thanks for your replies.
> Regards.
>
> On Tue, Jul 7, 2015 at 1:18 AM, Rico Sennrich <[email protected]>
> wrote:
>
>>  Hi Raj,
>>
>> the information you provide is pretty vague, so I'm just making some wild
>> guesses here:
>>
>> it could be a user error, for instance an inconsistency between the
>> training sets used for training BilingualNPLM and the phrase table. Check
>> that the same version of the corpus (including tokenization, truecasing
>> etc.) was used for training, and that you did not mix up source and target
>> language. Also check that the settings during training are consistent with
>> those in the moses.ini file.
>>
>> it's possible that some of the settings (vocabulary size, number of
>> training epochs, or similar) are unsuitable for your task. For example,
>> since you have a relatively small training corpus, you may need more epochs
>> of training to get good results (use a validation set to see if model
>> perplexity converges).
>>
>> please double-check that there were no problems with the unicode-handling
>> of Japanese/Chinese characters, and that the encoding of your vocabulary
>> files matches that of the translation model, and the decoder input. We have
>> never experienced such problems, but they could arise for some system
>> configurations.
>>
>> best wishes,
>> Rico
>>
>>
>>
>> On 06.07.2015 16:31, Raj Dabre wrote:
>>
>>    Hello Rico,
>>  I trained both monolingual and bilingual LMs.
>>  Both seemed ineffective.
>>  As I mentioned before, I am working with Chinese-Japanese and the domain
>> is paper abstracts.
>>  I did check the n-best lists and I saw a significant difference between
>> the LM scores when comparing the runs for KenLM and NPLM.
>>  What could have gone wrong during the training?
>>  Regards.
>>
>> On Mon, Jul 6, 2015 at 10:53 PM, Rico Sennrich <[email protected]>
>> wrote:
>>
>>>  Hello Raj,
>>>
>>> can you please clarify if you tried to train a monolingual LM
>>> (NeuralLM), a bilingual LM (BilingualNPLM), or both? Our previous
>>> experiences with BilingualNPLM are mixed, and we observed improvements for
>>> some tasks and language pairs, but not for others. See for instance:
>>>
>>> Alexandra Birch, Matthias Huck, Nadir Durrani, Nikolay Bogoychev and
>>> Philipp Koehn. 2014. Edinburgh SLT and MT System Description for the IWSLT
>>> 2014 Evaluation. Proceedings of IWSLT 2014.
>>>
>>> To help debugging, you can check the scores in the n-best lists of the
>>> tuning runs. If the NPLM features give much higher costs than KenLM
>>> (trained on the same data), this can indicate that something went wrong
>>> during training.
>>>
>>> best wishes,
>>> Rico
>>>
>>> On 06.07.2015 14:29, Raj Dabre wrote:
>>>
>>>      Dear all,
>>>  I have checked out the latest version of moses and nplm and compiled
>>> moses successfully with the --with-nplm option.
>>>  I got a ton of warnings during compilation but in the end it all worked
>>> out and all the desired binaries were created. Simply executing the moses
>>> binary told me that the BilingualNPLM and NeuralLM features were available.
>>>
>>>  I trained an NPLM model based on the instructions here:
>>> http://www.statmt.org/moses/?n=FactoredTraining.BuildingLanguageModel#ntoc33
>>>  The corpus size I used was about 600k lines (for Chinese-Japanese;
>>> Target is Japanese)
>>>
>>>  I then integrated the resulting language model (after 10 iterations)
>>> into the decoding process via moses.ini.
>>>
>>>  I initiated tuning (standard parameters) and I got no errors, which
>>> means that the neural language model (NPLM) was recognized and queried
>>> appropriately.
>>>  I also ran tuning without a language model.
>>>
>>>  The strange thing is that the tuning and test BLEU scores for both
>>> these cases are almost the same. I checked the weights and saw that the LM
>>> was assigned a very low weight.
>>>
>>>  On the other hand, when I used KenLM on the same data, I had
>>> comparatively higher BLEU scores.
>>>
>>>  Am I missing something? Am I using the NeuralLM in an incorrect way?
>>>
>>>  Thanks in advance.
>>>
>>>
>>>
>>> --
>>>   Raj Dabre.
>>> Doctoral Student,
>>>  Graduate School of Informatics,
>>> Kyoto University.
>>>  CSE MTech, IITB., 2011-2014
>>>
>>>
>>>
>>> _______________________________________________
>>> Moses-support mailing list
>>> [email protected]
>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>>
>>>
>>
>>
>> --
>>   Raj Dabre.
>> Doctoral Student,
>>  Graduate School of Informatics,
>> Kyoto University.
>>  CSE MTech, IITB., 2011-2014
>>
>>
>>
>>
>>
>
>
> --
> Raj Dabre.
> Doctoral Student,
> Graduate School of Informatics,
> Kyoto University.
> CSE MTech, IITB., 2011-2014
>
>


-- 
Raj Dabre.
Doctoral Student,
Graduate School of Informatics,
Kyoto University.
CSE MTech, IITB., 2011-2014
