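Hi,

It seems the problem is a lower-case versus upper-case mismatch. Your training corpus is in upper case (e.g. C O N V E R S E S), but the input you pass to the decoder is in lower case (h e l l). The model has never seen the lower-case tokens, so each one is treated as an unknown word and comes out as UNK. Use the same case for decoding as you used for training.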
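As a rough sketch of what I mean (my own illustration in Python, not something that ships with Moses), you could normalise each word into the same form as corpus.gr before piping it to the decoder, and check which tokens the training side has never seen. The helper names `to_grapheme_input` and `unknown_tokens` are hypothetical, and the vocabulary is assumed to be built from the space-separated tokens of the source corpus.

```python
def to_grapheme_input(word):
    """Return `word` as space-separated upper-case graphemes.

    Assumption: the training source side (corpus.gr) is upper case with
    one grapheme per token, as in the excerpt "C O N V E R S I O N ' S".
    """
    # Drop existing whitespace so both "hell" and "h e l l" are accepted.
    chars = [ch for ch in word if not ch.isspace()]
    return " ".join(ch.upper() for ch in chars)


def unknown_tokens(line, vocab):
    """List the tokens of `line` that do not occur in `vocab`.

    `vocab` is assumed to be the set of tokens seen on the source side
    of the training corpus; any token outside it will decode as UNK.
    """
    return [tok for tok in line.split() if tok not in vocab]


def build_vocab(corpus_lines):
    """Collect the set of whitespace-separated tokens in the corpus lines."""
    vocab = set()
    for line in corpus_lines:
        vocab.update(line.split())
    return vocab
```

With a vocabulary built from the corpus.gr lines quoted below, `unknown_tokens("c o n v e r s e", vocab)` flags every token, while `unknown_tokens(to_grapheme_input("converse"), vocab)` comes back empty. The normalised string is what should be sent to moses in place of 'h e l l'.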
Best Regards,
Zeeshan Ahmed

On 9 December 2012 23:52, swapnil jadhav <[email protected]> wrote:

> Hello,
>
> I am trying to model translation system between graphemes to phonemes.
> I made source file and target file using CMU 7 dictionary.
>
> *corpus.ph ( phonemes as target)*
> ...
> K AA1 N V ER0 S AH0 Z
> K AH0 N V ER1 S IH0 NG
> K AH0 N V ER1 ZH AH0 N
> K AH0 N V ER1 ZH AH0 N Z
> K AH0 N V ER1 ZH AH0 N Z
> ...
> *and*
>
> *corpus.gr ( graphemes as source)*
> ...
> C O N V E R S E S
> C O N V E R S I N G
> C O N V E R S I O N
> C O N V E R S I O N ' S
> C O N V E R S I O N S
> ...
>
> Totally there are 133247 lines.
> I followed the method described at following without any errors.
>
> http://www.statmt.org/moses/?n=Development.GetStarted
> http://www.statmt.org/moses/?n=Moses.Baseline
>
> I used 100% dictionary for training and 10% for tuning.
>
> But I am not getting correct answers.
>
> For example.
> saj@Jadhavs:~$* echo 'h e l l' | ~/g2p/mosesdecoder-master/bin/moses -f
> ~/g2p/working/binarised-model/moses.ini*
> Defined parameters (per moses.ini or switch):
> config: /home/saj/g2p/working/binarised-model/moses.ini
> distortion-file: 0-0 wbe-msd-bidirectional-fe-allff 6
> /home/saj/g2p/working/train/model/reordering-table.wbe-msd-bidirectional-fe.gz
>
> distortion-limit: 6
> input-factors: 0
> lmodel-file: 8 0 3 /home/saj/g2p/lm/corpus.blm.ph
> mapping: 0 T 0
> threads: 2
> ttable-file: 0 0 0 5 /home/saj/g2p/working/train/model/phrase-table.gz
> ttable-limit: 20
> weight-d: 0.0155888 -0.198259 0.0255069 0.0189083 0.00337951
> 0.00813773 0.0176252
> weight-l: 0.0393243
> weight-t: 0.0455618 0.0121683 0.453074 -0.033838 0.00337398
> weight-w: 0.125254
> /home/saj/g2p/mosesdecoder-master/bin
> ScoreProducer: Distortion start: 0 end: 1
> ScoreProducer: WordPenalty start: 1 end: 2
> ScoreProducer: !UnknownWordPenalty start: 2 end: 3
> Loading lexical distortion models...have 1 models
> ScoreProducer: LexicalReordering_wbe-msd-bidirectional-fe-allff start: 3
> end: 9
> Creating lexical reordering...
> weights: -0.198 0.026 0.019 0.003 0.008 0.018
> Loading table into memory...done.
> Start loading LanguageModel /home/saj/g2p/lm/corpus.blm.ph : [12.396]
> seconds
> ScoreProducer: LM start: 9 end: 10
> Finished loading LanguageModels : [12.397] seconds
> Start loading PhraseTable
> /home/saj/g2p/working/train/model/phrase-table.gz : [12.397] seconds
> filePath: /home/saj/g2p/working/train/model/phrase-table.gz
> ScoreProducer: PhraseModel start: 10 end: 15
> Finished loading phrase tables : [12.397] seconds
> Start loading phrase table from
> /home/saj/g2p/working/train/model/phrase-table.gz : [12.397] seconds
> Reading /home/saj/g2p/working/train/model/phrase-table.gz
>
> ----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
>
> ****************************************************************************************************
> Finished loading phrase tables : [21.864] seconds
> IO from STDOUT/STDIN
> Created input-output object : [21.864] seconds
> Translating line 0 in thread id 3066051392
> Translating: h e l l
> Line 0: Collecting options took 0.000 seconds
> Line 0: Search took 0.000 seconds
> h e l l
> *BEST TRANSLATION: h|UNK|UNK|UNK e|UNK|UNK|UNK l|UNK|UNK|UNK
> l|UNK|UNK|UNK [1111] **
> [total=-402.355]*core=(0.000,-4.000,-400.000,0.000,0.000,0.000,0.000,0.000,0.000,-47.137,0.000,0.000,0.000,0.000,0.000)
>
> Line 0: Translation took 0.001 seconds total
> user 21.529
> sys 0.316
> VmPeak: 424040 kB
> VmRSS: 400732 kB
> saj@Jadhavs:~$
>
>
> For Each input I am getting UNK.
> I have used full dictionary for training but sizes of parallel corpuses
> are less ( ~2.5 MB).
> The only change I made is I did not use -L option for language and used
> English as default for both the corpuses.
> Is that a factor for wrong answer ?
> If it is then what parallel data and language I should use ?
> I did not skip a step written in pages mentioned above.
> Is grapheme to phoneme conversion possible with moses ?
>
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
