Hi Francois,

Thanks for the information. Could you share the scripts you use for Chinese tokenizing and detokenizing?
It seems the default tokenizer script doesn't support the Chinese language code.

Thanks,
Wenlong

2010/8/31 <[email protected]>

> Send Moses-support mailing list submissions to
>     [email protected]
>
> To subscribe or unsubscribe via the World Wide Web, visit
>     http://mailman.mit.edu/mailman/listinfo/moses-support
> or, via email, send a message with subject or body 'help' to
>     [email protected]
>
> You can reach the person managing the list at
>     [email protected]
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of Moses-support digest..."
>
>
> Today's Topics:
>
>    1. Re: No weight.ini in example data for EMS
>       (Sonja PETROVIĆ LUNDBERG)
>    2. Fwd: Re: Tree based models - Eng > Ger general question
>       (Hieu Hoang)
>    3. Re: No weight.ini in example data for EMS (Philipp Koehn)
>    4. How can I know used translation rules? (Lee, Joo-Young)
>    5. Re: No weight.ini in example data for EMS
>       (Sonja PETROVIĆ LUNDBERG)
>    6. Re: Train Moses Engine for EN to ZH_CN (Francois Masselot)
>    7. Re: Train Moses Engine for EN to ZH_CN (Francois Masselot)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Tue, 31 Aug 2010 13:22:27 +0200
> From: Sonja PETROVIĆ LUNDBERG <[email protected]>
> Subject: Re: [Moses-support] No weight.ini in example data for EMS
> To: Philipp Koehn <[email protected]>
> Cc: [email protected]
> Message-ID:
>     <[email protected]<aanlktinswhjkfz5fre52oj%[email protected]>>
> Content-Type: text/plain; charset=UTF-8
>
> weight.ini was missing in that directory, but I've found it in
> another, almost identical, directory (with Moses decoder stuff). I
> have no idea why there were two different EMS directories, and why
> there was no weight.ini in the first one.
>
> Now I experience another problem with the same command "perl
> experiment.perl -config config.toy":
>
> DEFINE STEPS (run with -exec if everything ok)
> Warning: locale not supported by Xlib, locale set to C
> gv: Cannot open file steps/0/graph.0.ps (Inappropriate file type or format)
>
> The locale warning probably appears because I have OS X, but why the
> GhostView problem?
>
> Thank you,
> Sonja
>
>
> 2010/8/31 Philipp Koehn <[email protected]>:
> > Hi,
> >
> > the directory
> > /Users/so/tools/moses-scripts/scripts-20100806-1525/ems/example/data
> > should contain:
> >
> > nc-5k.en
> > nc-5k.fr
> > test-ref.en.sgm
> > test-src.fr.sgm
> > weight.ini
> >
> > At least that is what is in the SVN directory
> > moses/scripts/ems/example/data.
> >
> > -phi
> >
> > 2010/8/31 Sonja PETROVIĆ LUNDBERG <[email protected]>:
> >> Hi!
> >>
> >> I am trying to learn how to use EMS on my computer, but already in the
> >> testing phase, using config.toy that comes with the installation of
> >> Moses, I experience this problem:
> >>
> >> ikso-ho:test so$ perl experiment.perl -config config.toy
> >> STARTING UP AS PROCESS 41208 ON ikso-ho.lan AT Tue Aug 31 11:05:48 CEST 2010
> >> LOAD CONFIG...
> >> find: /Users/so/tools/moses-scripts/scripts-20100806-1525/ems/example/data/weight.ini*:
> >> No such file or directory
> >> TUNING:weight-config: file
> >> /Users/so/tools/moses-scripts/scripts-20100806-1525/ems/example/data/weight.ini
> >> does not exist!
> >> Died at experiment.perl line 355.
> >>
> >> Is weight.ini supposed to be there, or should it be created during the
> >> configuration process?
> >>
> >> Regards,
> >> Sonja
> >> _______________________________________________
> >> Moses-support mailing list
> >> [email protected]
> >> http://mailman.mit.edu/mailman/listinfo/moses-support
>
> ------------------------------
>
> Message: 2
> Date: Tue, 31 Aug 2010 13:01:16 +0100
> From: Hieu Hoang <[email protected]>
> Subject: [Moses-support] Fwd: Re: Tree based models - Eng > Ger
>     general question
> To: [email protected]
> Message-ID: <[email protected]>
> Content-Type: text/plain; charset="iso-8859-1"
>
> hi arda
>
> I think an email by Chris Dyer sums up the issue that it's pretty hard
> to beat the phrase-based BLEU for many language pairs.
>     http://www.mail-archive.com/[email protected]/msg01995.html
> here's Edinburgh's attempt from this year's WMT10:
>     http://aclweb.org/anthology-new/W/W10/W10-1715.pdf
>
> The straightforward way of adding syntax severely reduces BLEU; you have
> to add something extra to get any gains. Off the top of my head, the
> main ways that I've seen so far are
>    1. Add alternative parses, e.g. forest decoding
>    2. Mix up the parse tree, e.g. SAMT
>    3. Soft constraints instead of hard constraints, e.g.
>       http://www.isi.edu/~chiang/papers/acl2010-chiang.pdf
>    4. Occasionally ignoring syntax, e.g.
>       http://aclweb.org/anthology-new/W/W10/W10-1761.pdf
> There are loads of other ways & papers I haven't mentioned
>
> ------------------------------
>
> Message: 3
> Date: Tue, 31 Aug 2010 13:39:08 +0100
> From: Philipp Koehn <[email protected]>
> Subject: Re: [Moses-support] No weight.ini in example data for EMS
> To: Sonja PETROVIĆ LUNDBERG <[email protected]>
> Cc: [email protected]
> Message-ID:
>     <[email protected]<9v_7afe5x8xs9bitu%[email protected]>>
> Content-Type: text/plain; charset=UTF-8
>
> Hi,
>
> ghostview seems to sometimes act funny and not display
> a just recently written file. In this case, you need to manually
> type
>
> % gv steps/0/graph.0.ps
>
> after running experiment.perl
>
> -phi
>
> ------------------------------
>
> Message: 4
> Date: Tue, 31 Aug 2010 22:05:49 +0900
> From: "Lee, Joo-Young" <[email protected]>
> Subject: [Moses-support] How can I know used translation rules?
> To: [email protected]
> Message-ID:
>     <[email protected]<db8a6lt1t3upc-wb6uviu2hwlw%[email protected]>>
> Content-Type: text/plain; charset="iso-8859-1"
>
> Hi all,
>
> I use moses-chart and it works well.
>
> But I want to know and get the translation rules which are used to
> translate a given source sentence at decoding time.
>
> Simply put, I am trying to find a way to know which translation rules are
> selected in each ChartCell of moses-chart.
>
> Is there any method or API?
>
> Best regards.
>
> Joo-Young Lee
>
> ------------------------------
>
> Message: 5
> Date: Tue, 31 Aug 2010 15:15:21 +0200
> From: Sonja PETROVIĆ LUNDBERG <[email protected]>
> Subject: Re: [Moses-support] No weight.ini in example data for EMS
> To: Philipp Koehn <[email protected]>
> Cc: [email protected]
> Message-ID:
>     <[email protected]>
> Content-Type: text/plain; charset=UTF-8
>
> Thanks!
>
> Next problem happens in the first step:
>
> step TRAINING:prepare-data crashed
> number of steps doable or running: 0
>
> I tried rm steps/1/TRAINING_prepare-data.1* and experiment.perl
> -continue 1 -exec, but it crashed again, at the same place.
>
> Sonja
>
> ------------------------------
>
> Message: 6
> Date: Tue, 31 Aug 2010 06:24:22 -0700
> From: Francois Masselot <[email protected]>
> Subject: Re: [Moses-support] Train Moses Engine for EN to ZH_CN
> To: "[email protected]" <[email protected]>
> Message-ID:
>     <0a83ecc8b2b3f342a24b0483f48fead52b22a44...@adsk-namsg-01.mgdadsk.autodesk.com>
> Content-Type: text/plain; charset="iso-8859-1"
>
> Dear Wenlong,
>
> The Moses toolkit is language independent, so there shouldn't be anything
> special to do. The one thing to take care of is to properly tokenize the
> Chinese training corpus. Moses takes as input sentences where words (tokens)
> are space-separated, and usually in Chinese texts, words are not separated
> by spaces. There's nothing else special: I recently created
> English-Chinese and Chinese-English Moses engines, and training and decoding
> work just fine.
> For decoding, you just need to tokenize and detokenize accordingly, i.e.
> tokenize Chinese source sentences, and remove spaces between Chinese words
> when Chinese is the target language.
>
> Regards
> François
>
> ------------------------------
>
> Message: 7
> Date: Tue, 31 Aug 2010 06:27:20 -0700
> From: Francois Masselot <[email protected]>
> Subject: Re: [Moses-support] Train Moses Engine for EN to ZH_CN
> To: "[email protected]" <[email protected]>
> Message-ID:
>     <0a83ecc8b2b3f342a24b0483f48fead52b22a44...@adsk-namsg-01.mgdadsk.autodesk.com>
> Content-Type: text/plain; charset="iso-8859-1"
>
> Dear Wenlong,
>
> The Moses toolkit is language independent, so there shouldn't be anything
> special to do. The one thing to take care of is to properly tokenize the
> Chinese training corpus. Moses takes as input sentences where words (tokens)
> are space-separated, and usually in Chinese texts, words are not separated
> by spaces. There's nothing else special: I recently created
> English-Chinese and Chinese-English Moses engines, and training and decoding
> work just fine.
> For decoding, you just need to tokenize and detokenize accordingly, i.e.
> tokenize Chinese source sentences, and remove spaces between Chinese words
> when Chinese is the target language.
>
> Regards
> François
>
> ------------------------------
>
> End of Moses-support Digest, Vol 46, Issue 42
> *********************************************
_______________________________________________ Moses-support mailing list [email protected] http://mailman.mit.edu/mailman/listinfo/moses-support
