[Moses-support] the best way to design the mapping steps for training the factored translation model(English = German)

2010-10-31 Thread JiaHongwei
Dear all,

I want to train a morphological analysis and generation model for Moses,
on top of which the translation itself goes from English to German.

I have prepared my training data like this:

 % tail -n 1 factored-corpus/proj-syndicate.??

 ==> factored-corpus/proj-syndicate.en <==
 corruption|corruption|nn flourishes|flourish|nns .|.|.

 ==> factored-corpus/proj-syndicate.de <==
 korruption|korruption|nn|nn.fem.cas.sg floriert|florieren|vvfin|vvfin .|.|per|per

Each word is represented not only by its surface form but also by
additional factors.

The factors for both English and German are surface form, lemma, part of
speech, and morphology.
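
As a quick sanity check on the factored format, here is a minimal Python sketch (the function names are made up for illustration) that splits each token on "|" and reports how many factors each side carries. It uses the two sample lines from the tail output above; factors are counted from 0, as in the train-model.perl options.

```python
def parse_factored_line(line, sep="|"):
    """Split a factored sentence into one list of factors per token."""
    return [token.split(sep) for token in line.split()]


def factor_counts(line):
    """Return the set of factor counts seen in a line (a single value if consistent)."""
    return {len(factors) for factors in parse_factored_line(line)}


# Sample lines copied from the corpus excerpt in this message.
en = "corruption|corruption|nn flourishes|flourish|nns .|.|."
de = "korruption|korruption|nn|nn.fem.cas.sg floriert|florieren|vvfin|vvfin .|.|per|per"

print(factor_counts(en))  # {3}
print(factor_counts(de))  # {4}

# Factor 3 (morphology) of the first German token.
print(parse_factored_line(de)[0][3])  # nn.fem.cas.sg
```

On these two sample lines the English side has three factors (0 to 2) and the German side four (0 to 3), so factor 3 is only available on the German side of this excerpt.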

 

Now I would like to know the best way to design the mapping steps for
training the factored translation model. Can you help me?

 

BTW, I have designed a total of four mapping steps, shown below for your
reference:

% train-model.perl \
    --corpus factored-corpus/…… \
    --root-dir morphgen \
    --f de --e en \
    --lm 0:3:factored-corpus/surface.lm:0 \
    --lm 2:3:factored-corpus/pos.lm:0 \
    --translation-factors 1-1+2-2,3 \
    --generation-factors 1-2+1,2,3-0 \
    --decoding-steps t0,g0,t1,g1

This design follows the Tutorial for Using Factored Models on the
website:

http://www.statmt.org/moses/?n=Moses.FactoredTutorial#ntoc4
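
To make the four steps easier to read, here is a small Python sketch that unpacks the factor specifications from the command above. It is only a reading aid, not part of Moses, and the helper names are invented. It assumes the usual syntax: steps are joined with "+", input and output factors are separated by "-", and multiple factors by ",".

```python
def parse_factor_spec(spec):
    """Parse e.g. '1-1+2-2,3' into [([1], [1]), ([2], [2, 3])]."""
    steps = []
    for step in spec.split("+"):
        src, tgt = step.split("-")
        steps.append(([int(f) for f in src.split(",")],
                      [int(f) for f in tgt.split(",")]))
    return steps


def describe(decoding_steps, translation, generation):
    """Turn 't0,g0,t1,g1' into one readable line per mapping step."""
    lines = []
    for name in decoding_steps.split(","):
        kind, idx = name[0], int(name[1:])
        if kind == "t":
            # Translation steps map source-side factors to target-side factors.
            src, tgt = translation[idx]
            lines.append(f"{name}: translate source {src} -> target {tgt}")
        else:
            # Generation steps work on the target side only.
            src, tgt = generation[idx]
            lines.append(f"{name}: generate target {src} -> target {tgt}")
    return lines


translation = parse_factor_spec("1-1+2-2,3")
generation = parse_factor_spec("1-2+1,2,3-0")
for line in describe("t0,g0,t1,g1", translation, generation):
    print(line)
# t0: translate source [1] -> target [1]
# g0: generate target [1] -> target [2]
# t1: translate source [2] -> target [2, 3]
# g1: generate target [1, 2, 3] -> target [0]
```

Read this way, the setup translates lemmas, generates a POS tag from the lemma, translates POS into POS plus morphology, and finally generates the surface form. Note that train-model.perl takes --f as the source language and --e as the target, so with --f de --e en the target side is English; it may be worth double-checking that against the intended English-to-German direction.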

 

Your kind suggestions will be greatly appreciated!

 

Best Regards 

Henry

___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


Re: [Moses-support] about Morph tagging

2010-10-22 Thread JiaHongwei
Thank you very much!
BTW, I’m studying Morphisto now, which is a morphological analyzer for
German.
http://code.google.com/p/morphisto/
I may also use the relevant HFST tools as morphological analyzers for
other languages.
Best Regards
Henry
-----Original Message-----
From: Francis Tyers [mailto:fty...@prompsit.com] 
Sent: 20 October 2010, 18:13
To: JiaHongwei
Cc: moses-support@mit.edu
Subject: Re: [Moses-support] about Morph tagging

You could use the morphological analysers from the Apertium project.

http://wiki.apertium.org/wiki/Using_an_lttoolbox_dictionary
http://wiki.apertium.org/wiki/Lttoolbox
http://wiki.apertium.org/wiki/HFST

Fran

On Wed, 20 Oct 2010 at 17:58 +0800, JiaHongwei wrote:
 Hi,
 
 I need to train a model with POS tags and morphological
 information for Moses involving languages such as German, Spanish,
 French and Italian.
 
 By using TreeTagger, I can get POS tags in the format 'form pos
 lemma'. 
 
 But I want the output processed further into the format 'form
 pos lemma morph'.
 
 So the job is to take 'form pos lemma' as input and produce
 'form pos lemma morph' as output.
 
 Could you recommend a way or a tool to help me do this job
 automatically or in a pipeline?
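
To illustrate the pipeline step being asked about, here is a rough Python sketch that appends a fourth column to 'form pos lemma' lines. It assumes the tagger output is tab-separated, one token per line. The `analyse` callback is a placeholder for whatever morphological analyser is eventually used; `toy_analyse` is a hypothetical stand-in with a one-entry lookup table, not a real tool.

```python
def add_morph_column(lines, analyse):
    """Append a morph column to tab-separated 'form pos lemma' lines."""
    out = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            # Keep empty lines (e.g. sentence boundaries) untouched.
            out.append("")
            continue
        form, pos, lemma = line.split("\t")
        out.append("\t".join([form, pos, lemma, analyse(form, pos, lemma)]))
    return out


def toy_analyse(form, pos, lemma):
    """Hypothetical analyser: look up a morph tag, fall back to the POS tag."""
    table = {("korruption", "nn"): "nn.fem.cas.sg"}
    return table.get((form.lower(), pos.lower()), pos.lower())


tagged = ["Korruption\tNN\tKorruption", "floriert\tVVFIN\tflorieren"]
for row in add_morph_column(tagged, toy_analyse):
    print(row)
```

In a real pipeline the callback would wrap the analyser's output, and some rule would be needed to pick one analysis when the analyser returns several for the same form.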
 
 Thanks in advance!
 
  
 
 Best Regards
 
 Henry
 
 


Re: [Moses-support] best pos tagger for tagging text in German, Italian, Spanish, Italian

2010-10-13 Thread JiaHongwei
Thanks a lot!

 

I’d like to give SVMTool a try.

 

BTW, does it support generating morphological information?

 

And which tools would you recommend if I want a part-of-speech tagger
that also generates morphological information?

 

Best Regards

Henry


From: Jesús Giménez [mailto:jgime...@lsi.upc.edu] 
Sent: 13 October 2010, 17:53
To: JiaHongwei
Cc: moses-support@mit.edu
Subject: Re: [Moses-support] best pos tagger for tagging text in German,
Italian, Spanish, Italian

 

hi Henry,

  I can also recommend SVMTool, a state-of-the-art open source
part-of-speech tagger (and generator of sequential taggers). The latest
development version includes tagging models for English, Spanish and Catalan
(both case-sensitive and case-insensitive models). In the short term we
also plan to include models for French, Romanian, Czech, Italian and
German.

*   svn co svn://biniki.lsi.upc.edu/svmtool/trunk svmtool


  It's implemented in Perl. There is also a C++ version (SVMTool++) that
is about 10 times faster. Models are fully compatible between the two. You
might want to consider using it for massive text processing.

*   svn co svn://biniki.lsi.upc.edu/svmtool++/trunk svmtool++


-- jesus



On 09/10/10 07:22, JiaHongwei wrote: 

Hi,

   I want to use a factored model for translating text in German, Italian,
Spanish and French into English using Moses, and I noticed there is a POS
tagging step before the factored model can be used in decoding. Can you
recommend a good tagger for the language pairs I mentioned?

   I also wonder whether the POS tagging result could be improved by
combining a set of POS taggers. If so, how can I do that?
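
One simple way to combine taggers is a per-token majority vote, sketched below in Python. This is only an illustration of the idea, not something any particular tagger provides. It assumes all taggers were run on the same tokenisation and share one tagset. The tag sequences in the example are invented.

```python
from collections import Counter


def majority_vote(tag_sequences):
    """Combine per-token tags from several taggers by majority vote.

    tag_sequences: one list of tags per tagger, all of the same length.
    Ties go to the earliest tagger in the list whose tag has the top count.
    """
    combined = []
    for tags in zip(*tag_sequences):
        counts = Counter(tags)
        best = max(counts.values())
        combined.append(next(t for t in tags if counts[t] == best))
    return combined


# Invented output of three taggers for a three-token sentence.
tagger_a = ["NN", "VVFIN", "$."]
tagger_b = ["NN", "NN", "$."]
tagger_c = ["NE", "VVFIN", "$."]

print(majority_vote([tagger_a, tagger_b, tagger_c]))  # ['NN', 'VVFIN', '$.']
```

Because ties fall back to list order, it makes sense to put the tagger you trust most first.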

   Your support will be really appreciated!

 

Thanks a lot

Henry

 
 