Hi Arththika,

(1) In language modelling,
   how does IRSTLM split the dictionary extracted from the corpus into 3 
dictionaries?
   How are the n-gram counts calculated?



I would like to answer your first question,
as the person responsible for the IRSTLM toolkit.

If anything is unclear, please reply privately to me only.


I suppose you are using the build-lm.sh script from IRSTLM.

The script splits the dictionary, sorted by 1-gram frequency,
in such a way that the total frequency of each part is balanced.

In this way the corresponding partitions of the n-grams are balanced as well.
The n-gram partition is built by taking the first token of each n-gram into consideration.
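
To illustrate the idea, here is a rough Python sketch. It is not the actual build-lm.sh code: the function names split_dictionary and partition_ngrams, the tie-breaking rule and the exact cut points are assumptions made only for this example.

```python
from collections import Counter


def split_dictionary(tokens, parts=3):
    """Cut the frequency-sorted word list into `parts` contiguous slices
    whose total 1-gram frequencies are roughly equal (illustrative only)."""
    counts = Counter(tokens)
    # Sort by descending 1-gram frequency; ties broken alphabetically (assumption).
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    total = sum(counts.values())
    slices = [[] for _ in range(parts)]
    running = 0  # cumulative frequency of the words placed so far
    idx = 0      # slice currently being filled
    for word, freq in ordered:
        # Move on once the cumulative frequency reaches this slice's share.
        while idx < parts - 1 and running >= total * (idx + 1) / parts:
            idx += 1
        slices[idx].append(word)
        running += freq
    return slices


def partition_ngrams(tokens, slices, n=2):
    """Count n-grams into one table per dictionary slice,
    choosing the table from the first token of each n-gram."""
    owner = {word: i for i, part in enumerate(slices) for word in part}
    tables = [Counter() for _ in slices]
    for i in range(len(tokens) - n + 1):
        gram = tuple(tokens[i:i + n])
        tables[owner[gram[0]]][gram] += 1
    return tables


if __name__ == "__main__":
    corpus = "a a a a b b b c c d d e".split()
    parts = split_dictionary(corpus, parts=3)
    print(parts)
    for table in partition_ngrams(corpus, parts, n=2):
        print(dict(table))
```

Since every n-gram starting with a given word lands in the same table, each table can be counted independently of the others.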

I am not sure what you mean by the second part of the question.

best regards,
Nicola




On Jan 20, 2014, at 7:34 PM, Arththika Paramanathan wrote:

Hi,

(2) And if English is the foreign language, what do I need to change (in the 
train-model.perl file)?

(3) Can anyone tell me how to use a Perl module? I want to use the module 
named Locale-Maketext-Lexicon-0.97 to extract translatable strings from po 
files.



--
regards,
P.Arththika
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support

