In Moses, English is assumed to be the target language and the other language the source (foreign) language, so we can translate a foreign language into English (in my case, Tamil-English). I want to translate English-Tamil. So, what do I need to change (in the train-model.perl file)?
On Wed, Jan 22, 2014 at 8:37 AM, Arththika Paramanathan <[email protected]> wrote:

> Hi Nicola,
> Thank you for your response.
>
> I think LM building with IRSTLM has 4 or 5 steps.
> In step 1, it splits the corpus into 1-grams with their frequency counts
> (there is no sorting here).
> In step 2, it splits this dictionary into 3 dictionaries (balanced n-gram
> lists). Here, the threshold is approximately the total number of words
> divided by 3. Is that correct?
> In step 3, it collects n-grams for each dictionary, i.e. for each word in
> each split dictionary, it searches for 3-grams and puts them in a separate
> file.
> Then I don't understand the next step (the ARPA file).
> How are these values calculated?
> -3.72202 <s> -0.598275
> -3.17795 illegal -0.60206
> -2.42099 folder -0.500602
> -2.53169 name -0.723104
>
> Can you please explain how to calculate this?
>
> On Tue, Jan 21, 2014 at 10:46 PM, Nicola Bertoldi <[email protected]> wrote:
>
>> Hi Arththika,
>>
>> (1) In language modelling,
>> how does IRSTLM split the dictionary extracted from the corpus into 3
>> dictionaries?
>> how are the n-gram counts calculated?
>>
>> I would like to answer your first question,
>> as the person responsible for the IRSTLM toolkit.
>>
>> If anything is not clear, please reply privately to me only.
>>
>> I suppose you are using the build-lm.sh script from IRSTLM.
>>
>> The script splits the dictionary, sorted according to the 1-gram frequency,
>> in such a way that the global frequency of each part is balanced.
>>
>> In this way the corresponding partitions of the n-grams are balanced as
>> well.
>> The n-gram partition is built by taking into consideration the first
>> token.
>>
>> I am not sure what you mean by the second part of the question.
>>
>> best regards,
>> Nicola
>>
>> On Jan 20, 2014, at 7:34 PM, Arththika Paramanathan wrote:
>>
>> Hi,
>>
>> (2) And, if English is the foreign language, what do I need to change (in
>> the train-model.perl file)?
>>
>> (3) Can anyone tell me how to use a Perl module? I want to use the
>> module named Locale-Maketext-Lexicon-0.97 to extract translatable strings
>> from po files.
>>
>> --
>> regards,
>> P.Arththika
>> _______________________________________________
>> Moses-support mailing list
>> [email protected]
>> http://mailman.mit.edu/mailman/listinfo/moses-support
>
> --
> regards,
> P.Arththika

--
regards,
P.Arththika
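
The balanced dictionary split that Nicola describes above (sort by 1-gram frequency, then cut so that each part carries a similar share of the total frequency) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code used by build-lm.sh: the function name `split_balanced`, the tie-breaking rule, and the "total divided by number of parts" threshold are hypothetical choices made for the example.

```python
from collections import Counter


def split_balanced(counts, parts=3):
    """Split a word -> frequency mapping into `parts` word lists whose
    total frequencies are roughly equal (hypothetical helper, not IRSTLM code)."""
    # Sort by descending frequency; break ties alphabetically so the
    # result is deterministic.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))

    total = sum(counts.values())
    threshold = total / parts  # assumed target mass per part

    chunks = [[] for _ in range(parts)]
    idx = 0       # index of the part currently being filled
    running = 0   # cumulative frequency assigned so far
    for word, freq in ordered:
        # Advance to the next part once the cumulative frequency has
        # reached this part's share; the last part takes the remainder.
        if idx < parts - 1 and running >= threshold * (idx + 1):
            idx += 1
        chunks[idx].append(word)
        running += freq
    return chunks


# Toy corpus: 9 tokens, so each of the 3 parts aims for about 3 tokens.
counts = Counter("a a a b b c c d e".split())
for i, chunk in enumerate(split_balanced(counts, parts=3)):
    print(i, chunk, sum(counts[w] for w in chunk))
```

Because every word lands in exactly one part, n-grams can then be assigned to a partition by looking up their first token, which is the property Nicola mentions.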
