Re: [Moses-support] sgm generation for personalized test sets

2015-09-13 Thread Vincent Nguyen
Hi Tom, If this script is intended exactly and only to generate sgm test/dev files from a txt file, then yes, it needs to be amended. 1) line break characters other than 0A need to be removed prior to the Python execution (byte stream replace) 2) even though the XML standard is to replace ' by &apos; and so on for oth

Re: [Moses-support] Performance issue with Neural LM for English-Hindi SMT

2015-09-13 Thread Raj Dabre
Hi, I think that you misinterpreted what Rico said. He said: "nplm is used in addition to a back-off LM for best results". What he meant is that nplm is not a back-off LM but an additional LM. A KenLM model, which has back-off weights, is an example of a back-off LM. To sum up, Rico says: Use Kenlm a

Re: [Moses-support] Performance issue with Neural LM for English-Hindi SMT

2015-09-13 Thread Rajnath Patel
ram)=11.30 neural-lm(3-gram)=12.10 Thank you. -- Regards: Raj Nath Patel

Re: [Moses-support] Performance issue with Neural LM for English-Hindi SMT

2015-09-13 Thread Raj Dabre
Moses-support mailing list Moses-support@mit.edu http://mailman.mit.edu/mailman/listinfo/moses-support -- Raj Dabre. Doctoral Student, Graduate S

Re: [Moses-support] Performance issue with Neural LM for English-Hindi SMT

2015-09-13 Thread Rajnath Patel
-- Raj Dabre. Doctoral Student, Graduate School of Informatics, Kyoto University. CSE MTech, IITB., 2011-2014

Re: [Moses-support] sgm generation for personalized test sets

2015-09-13 Thread Tom Hoar
Thanks Vincent, Good catch about Python's Unicode processing. This script uses Python's `codecs` library, which treats characters according to their Unicode definitions. So, the function fh.splitlines() splits the string into a list as expected with traditional ASCII cr/lf sequences. In additio
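To illustrate the point about Unicode line splitting, here is a minimal sketch (not taken from makemteval.py; the sample string and variable names are made up for illustration). It compares Python's `str.splitlines()`, which breaks on Unicode line boundaries such as U+2028 as well as CR/LF, with a plain split on LF only.

```python
# Hypothetical sample text: one U+2028 (LINE SEPARATOR), one CR/LF pair, one LF.
text = "first\u2028still first\r\nsecond\n"

# str.splitlines() follows the Unicode definition of a line boundary,
# so U+2028 and CR/LF both end a line here.
unicode_split = text.splitlines()

# Splitting on LF only leaves U+2028 and CR inside the first segment.
lf_split = text.split("\n")

print(len(unicode_split))  # number of lines seen by splitlines()
print(len(lf_split))       # number of segments when splitting on LF only
```

If the input file is meant to hold one segment per LF-terminated line, the first behaviour yields more segments than expected, which is the mismatch discussed in this thread.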

Re: [Moses-support] Moses Training Error - died with signal 6, without coredump

2015-09-13 Thread Jeroen Vermeulen
There was an error about a missing /dev/stderr in there... Did that special file get deleted? Jeroen On September 12, 2015 2:58:31 PM GMT+07:00, Hieu Hoang wrote: >i can't see where the problem is, the commands seems to be ok. > >what operating system are you using? where did you get the >m

Re: [Moses-support] Performance issue with Neural LM for English-Hindi SMT

2015-09-13 Thread Rico Sennrich
Hello Raj, Usually, nplm is used in addition to a back-off LM for best results. That being said, your results indicate that nplm is performing poorly. If you have little training data, a smaller vocabulary size and more training epochs may be appropriate. I would advise to provide a developme

Re: [Moses-support] Performance issue with Neural LM for English-Hindi SMT

2015-09-13 Thread Raj Dabre
Hi, I have had a similar experience with NPLM. Do you perhaps have a small corpus? On Sun, Sep 13, 2015 at 6:51 PM, Rajnath Patel wrote: > Hi all, > > I have tried Neural LM(nplm) with phrase based English-Hindi SMT, but > translation quality is kind of not good as compared to n-gram LM(scores a

[Moses-support] Performance issue with Neural LM for English-Hindi SMT

2015-09-13 Thread Rajnath Patel
Hi all, I have tried a Neural LM (nplm) with phrase-based English-Hindi SMT, but translation quality is worse than with an n-gram LM (scores are given below). I have trained LMs for 3-gram and 5-gram with the default settings (as mentioned on statmt.org/moses). Kindly suggest, If some one has tr

Re: [Moses-support] sgm generation for personalized test sets

2015-09-13 Thread Vincent Nguyen
In order to use makemteval.py we need to remove 0D and E2 80 A8 from the txt files. Python handles them as additional line breaks. On 12/09/2015 22:07, Vincent Nguyen wrote: > Hi, > > What script do you guys use to generate sgm sets based on txt file ? > > I have tried makemteval.py in contrib
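The byte-level cleanup described here could be sketched as follows. This is only an illustration under stated assumptions, not part of makemteval.py: the function names `strip_extra_line_breaks` and `clean_file` are hypothetical, and the input is assumed to be UTF-8, where E2 80 A8 is the encoding of U+2028.

```python
# Bytes named in the post; 0A (LF) is deliberately left untouched.
CR = b"\x0d"                # carriage return
LINE_SEP = b"\xe2\x80\xa8"  # UTF-8 encoding of U+2028 LINE SEPARATOR


def strip_extra_line_breaks(data: bytes) -> bytes:
    """Remove 0D and E2 80 A8 from a UTF-8 byte stream, keeping 0A."""
    return data.replace(LINE_SEP, b"").replace(CR, b"")


def clean_file(src_path: str, dst_path: str) -> None:
    """Hypothetical helper: write a cleaned copy of src_path to dst_path."""
    with open(src_path, "rb") as src:
        data = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(strip_extra_line_breaks(data))
```

Working on raw bytes before any decoding means the script only ever sees 0A as a line break. Note that a file using a lone 0D as its line ending would have its lines joined by this step, so such a file would need 0D replaced by 0A instead.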