Hi Barry,

Thank you very much!
I’ll test this tool.
By the way, the download link on the Moses website is unreachable. The project 
site should be:
http://projectile.sv.cmu.edu/research/public/tools/salm/salm.htm#update

From: Barry Haddow [mailto:[email protected]]
Sent: Friday, August 03, 2012 4:03 PM
To: Tan, Jun
Cc: [email protected]
Subject: Re: [Moses-support] help with filtering the noise enties in 
phrase-table.

Hi Jun

It's normal for the phrase table to contain a lot of "noise", and generally the 
decoder can cope with this since it draws on several sources of information, 
i.e. the different phrase scores, the language model and the reordering model. 
Moses also applies a cutoff when it loads translation options, so most of them 
won't be considered.
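
To make the cutoff idea concrete, here is a minimal sketch of keeping only the best few translation options per source phrase. It is an illustration, not the decoder's actual code: the function names are made up for this example, the default limit of 20 is an assumption, and ranking by the third score alone (taken here to be the direct phrase probability) is a simplification, since the decoder ranks options with its full weighted model.

```python
from collections import defaultdict

def parse_entry(line):
    """Split a phrase-table line into (source, target, scores)."""
    fields = [f.strip() for f in line.split("|||")]
    source, target = fields[0], fields[1]
    scores = [float(s) for s in fields[2].split()]
    return source, target, scores

def apply_cutoff(lines, limit=20, score_index=2):
    """Keep at most `limit` targets per source phrase.

    Assumption: options are ranked by a single score column
    (score_index=2); the real decoder uses a weighted combination.
    """
    by_source = defaultdict(list)
    for line in lines:
        source, target, scores = parse_entry(line)
        by_source[source].append((scores[score_index], target))
    return {
        source: [t for _, t in sorted(opts, key=lambda o: o[0], reverse=True)[:limit]]
        for source, opts in by_source.items()
    }
```

With a limit of 2, a source phrase that has three candidate targets would keep only the two highest-scoring ones, which is why low-scoring noise rarely reaches the search.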

You can also try pruning the phrase table using the technique of Johnson et al:
http://www.statmt.org/moses/?n=Moses.AdvancedFeatures#ntoc17
This makes the phrase table much smaller, but doesn't usually change 
translation quality much.
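
The idea behind that pruning is a significance test on co-occurrence counts: a phrase pair is kept only if the source and target phrases appear together more often than chance would explain. The sketch below computes a one-sided Fisher's exact test p-value from four counts. The function and parameter names are hypothetical, the threshold is left to the caller, and this is not the code of the tool linked above; for real corpus sizes the sums would need to be done in log space.

```python
from math import comb, log

def fisher_p_value(c_st, c_s, c_t, n):
    """Probability of seeing at least c_st co-occurrences by chance.

    c_st: sentence pairs containing both source and target phrase
    c_s:  sentence pairs containing the source phrase
    c_t:  sentence pairs containing the target phrase
    n:    total number of sentence pairs
    """
    total = comb(n, c_t)
    upper = min(c_s, c_t)
    tail = sum(
        comb(c_s, k) * comb(n - c_s, c_t - k)
        for k in range(c_st, upper + 1)
    )
    return tail / total

def keep_pair(c_st, c_s, c_t, n, threshold):
    """Keep the pair when -log(p) reaches the chosen threshold."""
    p = fisher_p_value(c_st, c_s, c_t, n)
    if p == 0.0:  # underflow: overwhelmingly significant
        return True
    return -log(p) >= threshold
```

A pair seen exactly once, whose source and target phrases each occur exactly once in `n` sentence pairs, gets p = 1/n, so a threshold slightly above log(n) removes all such singleton pairs.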

cheers - Barry

On 03/08/12 02:57, [email protected]<mailto:[email protected]> wrote:

Hi all,

I'm using Moses as the decoder. After creating the translation model, I 
checked the phrase table. There are a lot of meaningless entries, like the 
ones below:

Can I remove these entries from the phrase table? Will that affect the 
translation quality? If so, what if I delete all the punctuation before 
creating the translation model?




! kadov- > 网络 ||| ! kadov- &gt; Network ||| 1 0.341024 1 0.114004 2.718 ||| ||| 4 4
! kadov- > 网络 参数 ||| ! kadov- &gt; Network parameters ||| 1 0.32747 1 0.0420637 2.718 ||| ||| 2 2
! kadov- > 网络 参数 和 ||| ! kadov- &gt; Network parameters and ||| 1 0.19872 1 0.0395799 2.718 ||| ||| 2 2
! kadov- > 网络 参数 和 安全 ||| ! kadov- &gt; Network parameters and security ||| 1 0.129761 1 0.0179069 2.718 ||| ||| 2 2
! kadov- > 网络 时间 ||| ! kadov- &gt; Network Time ||| 1 0.225683 1 0.00977698 2.718 ||| ||| 2 2
! kadov- > 网络 时间 协议 ( ||| ! kadov- &gt; Network Time Protocol ( ||| 1 0.0968096 1 0.00137715 2.718 ||| ||| 2 2
! kadov- > 网络 时间 协议 ||| ! kadov- &gt; Network Time Protocol ||| 1 0.208681 1 0.00154011 2.718 ||| ||| 2 2
! kadov- > 脱机 LUN ||| ! kadov- &gt; Offline LUN ||| 1 0.307783 1 0.03884 2.718 ||| ||| 1 1
! kadov- > 脱机 LUN 信息 ||| ! kadov- &gt; Offline LUN information ||| 1 0.289121 1 0.0252108 2.718 ||| ||| 1 1
! kadov- > 脱机 LUN 信息 。 ||| ! kadov- &gt; Offline LUN information . ||| 1 0.280973 1 0.0243627 2.718 ||| ||| 1 1
! kadov- > 脱机 ||| ! kadov- &gt; Offline ||| 1 0.313173 1 0.0645311 2.718 ||| ||| 2 2
! kadov- > 脱机 状态 ||| ! kadov- &gt; Offline state ||| 1 0.266374 1 0.0156563 2.718 ||| ||| 1 1
! kadov- > 自动 ||| ! kadov- &gt; Auto ||| 1 0.507327 0.2 0.0200052 2.718 ||| ||| 1 5
! " ||| ! ” ||| 0.0185185 0.00821238 1 0.00102938 2.718 ||| ||| 54 1
! # % ' * + - ||| ! # % &apos; * + - ||| 1 0.00122135 1 0.239671 2.718 ||| ||| 2 2
! # % ' * + ||| ! # % &apos; * + ||| 1 0.00255049 1 0.750455 2.718 ||| ||| 2 2
! # % ' * ||| ! # % &apos; * ||| 1 0.00801901 1 0.773186 2.718 ||| ||| 2 2
! # % ' ||| ! # % &apos; ||| 1 0.0121474 1 0.784364 2.718 ||| ||| 2 2
! # % ||| ! # % ||| 1 0.0426557 1 0.897899 2.718 ||| ||| 2 2
! # ||| ! # ||| 1 0.0550974 1 0.908339 2.718 ||| ||| 2 2
! #$ % ^&* ||| ! # $ % ^ &amp; * ||| 1 5.14084e-07 1 0.00855668 2.718 ||| ||| 2 2
! #$ % ||| ! # $ % ||| 1 6.54489e-05 1 0.23103 2.718 ||| ||| 2 2
! #$ ||| ! # $ ||| 1 8.45389e-05 1 0.233717 2.718 ||| |
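
If one did want to drop the symbol-only entries by hand, a rough sketch could look like the following. It assumes the `|||`-separated layout shown in the sample and treats an entry as symbol-only when neither side contains a letter or digit after entities such as `&apos;` are unescaped. The function names are invented for this illustration. Note that it would not catch the `kadov-` entries, since those contain real words.

```python
import html
import unicodedata

def has_word_char(text):
    """True if any character is a letter or digit (CJK counts as letters)."""
    text = html.unescape(text)  # turn &apos; &gt; &amp; back into symbols
    return any(unicodedata.category(ch)[0] in ("L", "N") for ch in text)

def split_sides(line):
    """Return the (source, target) sides of a phrase-table line."""
    fields = line.split("|||")
    return fields[0].strip(), fields[1].strip()

def drop_symbol_only(lines):
    """Keep only entries where both sides contain a letter or digit."""
    kept = []
    for line in lines:
        source, target = split_sides(line)
        if has_word_char(source) and has_word_char(target):
            kept.append(line)
    return kept
```

Whether such filtering helps translation quality is a separate question; it only shrinks the table.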






_______________________________________________

Moses-support mailing list

[email protected]<mailto:[email protected]>

http://mailman.mit.edu/mailman/listinfo/moses-support
