Hi Tom,
If this script is intended exactly and only to generate sgm test/dev
files from a txt file, then yes, it needs to be amended:
1) line breakers other than 0A need to be removed prior to the Python
execution (byte-stream replace);
2) even though the XML standard is to replace ' by &apos; and so on for
oth
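A minimal sketch of the byte-stream replace from point 1 above (the function name and sample bytes are made up for illustration): strip CR (0x0D) and the UTF-8-encoded line separator U+2028 (bytes E2 80 A8) so that 0x0A is the only line breaker left in the file.

```python
def keep_only_lf(data: bytes) -> bytes:
    # Remove U+2028 (E2 80 A8 in UTF-8) and CR (0x0D), keeping only 0x0A
    # as a line breaker, before any Python text processing sees the file.
    return data.replace(b"\xe2\x80\xa8", b"").replace(b"\r", b"")

sample = b"one\xe2\x80\xa8still one\r\ntwo\n"
print(keep_only_lf(sample))  # b'onestill one\ntwo\n'
```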
Hi, I think you misinterpreted what Rico said.
He said: nplm is used in addition to a back-off LM for best results.
What he meant is that nplm is not a backoff LM but actually an additional LM.
KenLM, which has backoff weights, is an example of a backoff LM. To sum
up, Rico says: use KenLM a
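For illustration, running both LMs side by side would look roughly like the following moses.ini fragment. This is only a sketch: the paths are placeholders, the weights are arbitrary, and the feature names are assumptions based on the Moses documentation, not taken from this thread.

```ini
# Hedged sketch: KenLM as the back-off LM plus NPLM as an additional LM
# feature, scored jointly with separate tunable weights.
[feature]
KENLM name=LM0 factor=0 path=/path/to/lm.arpa order=5
NeuralLM name=LM1 factor=0 path=/path/to/nplm.model order=5

[weight]
LM0= 0.5
LM1= 0.5
```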
> ram)=11.30
> neural-lm(3-gram)=12.10
>
> Thank you.
>
> --
> Regards:
> Raj Nath Patel
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support
> --
> Raj Dabre.
> Doctoral Student,
> Graduate School of Informatics,
> Kyoto University.
> CSE MTech, IITB., 2011-2014
>
Thanks Vincent,
Good catch about Python's Unicode processing. This script uses Python's
`codecs` library, which treats characters according to their Unicode
definitions. So the function fh.splitlines() splits the string into a
list as expected for the traditional ASCII CR/LF sequences. In additio
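That Unicode-aware splitting can be seen directly; a small demonstration (not part of the original script, and the sample string is made up):

```python
# str.splitlines() breaks on all Unicode line boundaries, including
# U+2028 (LINE SEPARATOR) and a lone CR, not just "\n" -- which is why
# 0D and E2 80 A8 must be stripped if 0A is meant to be the only breaker.
text = "one\u2028two\rthree\nfour"
print(text.splitlines())  # ['one', 'two', 'three', 'four']
```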
There was an error about a missing /dev/stderr in there... Did that special
file get deleted?
Jeroen
On September 12, 2015 2:58:31 PM GMT+07:00, Hieu Hoang
wrote:
>I can't see where the problem is, the commands seem to be ok.
>
>what operating system are you using? where did you get the
>m
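If /dev/stderr really is missing, one shell-side workaround is to redirect to file descriptor 2 instead of opening the special file. This is a general shell idiom, not a fix confirmed anywhere in this thread:

```shell
# Write to standard error via fd 2 rather than the /dev/stderr device node.
echo "warning: model file not found" 1>&2
```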
Hello Raj,
Usually, nplm is used in addition to a back-off LM for best results.
That being said, your results indicate that nplm is performing poorly.
If you have little training data, a smaller vocabulary size and more
training epochs may be appropriate. I would advise providing a
developme
Hi,
I have had a similar experience with NPLM.
Do you perhaps have a small corpus?
On Sun, Sep 13, 2015 at 6:51 PM, Rajnath Patel
wrote:
> Hi all,
>
> I have tried a neural LM (nplm) with phrase-based English-Hindi SMT, but
> the translation quality is not as good as with an n-gram LM (scores a
Hi all,
I have tried a neural LM (nplm) with phrase-based English-Hindi SMT, but
the translation quality is not as good as with an n-gram LM (scores are
given below). I have trained LMs for 3-gram and 5-gram with default
settings (as mentioned on statmt.org/moses). Kindly suggest if someone has
tr
In order to use makemteval.py we need to remove 0D and E2 80 A8 from the
txt files; Python handles them as additional line breakers.
On 12/09/2015 22:07, Vincent Nguyen wrote:
> Hi,
>
> What script do you guys use to generate sgm sets based on a txt file?
>
> I have tried makemteval.py in contrib