date:20080619

[Moses-support] Re: encoding for parallel corpus

2008-06-19 Thread Marcello Federico

If you have only a very small corpus at hand,
just use the Witten-Bell smoothing method.
Also, do not go beyond order 3.
Best, Marcello
Marcello Federico
FBK-irst Trento, Italy
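
A minimal sketch of how this advice could translate into an SRILM call, assuming the file names EN_pos.txt and pos.lm from the original message below and an ngram-count binary on the PATH. The NGRAM_COUNT and WB_ARGS variables are illustrative only and not part of SRILM. The -wbdiscount option selects Witten-Bell discounting, and -order 3 caps the model at trigrams:

```shell
# File names taken from the original question; adjust to your own setup.
TEXT=EN_pos.txt
LM=pos.lm
# Illustrative hook: override to point at e.g. /home/srilm/bin/i686/ngram-count.
NGRAM_COUNT=${NGRAM_COUNT:-ngram-count}

# Witten-Bell discounting, trigram model (no -kndiscount, no -interpolate).
WB_ARGS="-order 3 -wbdiscount -text $TEXT -lm $LM"

if command -v "$NGRAM_COUNT" >/dev/null 2>&1 && [ -f "$TEXT" ]; then
  # WB_ARGS is left unquoted on purpose so it splits into separate arguments.
  "$NGRAM_COUNT" $WB_ARGS
else
  # SRILM or the corpus is missing: only show what would be run.
  echo "would run: $NGRAM_COUNT $WB_ARGS"
fi
```

Whether this also removes the decoder failure reported below is not something the thread confirms; it only addresses the discount-estimation problem.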


- Original message -
From: [EMAIL PROTECTED] [EMAIL PROTECTED]
To: 'Philipp Koehn' [EMAIL PROTECTED]
Cc: moses-support@mit.edu moses-support@mit.edu
Sent: Thu Jun 19 03:28:24 2008
Subject: [Moses-support]  encoding for parallel corpus

Hi,
  I have a problem. I downloaded the factored-corpus.tgz corpus from the
Moses page, which contains a file named pos.lm. I want to know how to
train such a file.
I POS-tagged my English sentences, e.g. the|DT light|NN was|VBD red|JJ
.|. and extracted the POS tags to get sentences such as DT NN VBD JJ .
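
The tag-extraction step described here can be sketched as follows, assuming whitespace-separated tokens of the form word|TAG with the POS tag as the last factor. The function name extract_pos is hypothetical and not part of Moses or SRILM:

```shell
# Keep only the last factor (assumed to be the POS tag) of every token.
extract_pos() {
  awk '{
    out = ""
    for (i = 1; i <= NF; i++) {
      n = split($i, parts, "|")                 # "light|NN" -> parts[1]="light", parts[2]="NN"
      out = out (i > 1 ? " " : "") parts[n]     # join the tags with single spaces
    }
    print out
  }'
}

# Example with the sentence from the message:
echo 'the|DT light|NN was|VBD red|JJ .|.' | extract_pos
# prints: DT NN VBD JJ .
```

Running it over a whole tagged file, for instance `extract_pos < EN_tagged.txt > EN_pos.txt` (EN_tagged.txt being a placeholder name), would give one tag sequence per line.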
Then I trained on these POS sentences with SRILM, using the following command:
///
/home/srilm/bin/i686/ngram-count -order 3 -interpolate -kndiscount -text
EN_pos.txt -lm pos.lm
~one of required modified KneserNey count-of-counts is zero
error in discount estimator for order 1
///
In this case no LM file is generated.
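
The error above refers to count-of-counts: the number of distinct n-grams seen exactly once, exactly twice, and so on. A POS tag set is small and most tags are frequent, so some of these values can plausibly be zero. The sketch below checks this for unigrams only (the failing order in the log). The range 1 to 4 is an assumption about what the modified Kneser-Ney estimator needs, and count_of_counts is a hypothetical helper name:

```shell
# Print how many distinct tokens occur exactly 1, 2, 3 and 4 times.
count_of_counts() {
  awk '{
    for (i = 1; i <= NF; i++) freq[$i]++       # token -> frequency
  }
  END {
    for (tok in freq) n[freq[tok]]++           # frequency -> number of tokens with it
    for (k = 1; k <= 4; k++) printf "n%d=%d\n", k, n[k]
  }'
}

# Toy input: DT occurs 4 times, NN 3, VBD 2, JJ and "." once each.
printf 'DT NN VBD JJ .\nDT NN VBD\nDT NN\nDT\n' | count_of_counts
```

If any of the printed values is 0 for the real EN_pos.txt, that would be consistent with the estimator refusing to run on this data.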

When I remove the parameters -interpolate -kndiscount:
///
/home/srilm/bin/i686/ngram-count -order 3  -text EN_pos.txt -lm pos.lm
warning: no singleton counts
GT discounting disabled
warning: discount coeff 1 is out of range: 0.67
warning: discount coeff 2 is out of range: 0.800271
warning: discount coeff 3 is out of range: 0.439665
warning: discount coeff 4 is out of range: 0.918576
warning: discount coeff 6 is out of range: 0.860417
warning: discount coeff 7 is out of range: 0.900741
warning: discount coeff 1 is out of range: 2.25939
warning: discount coeff 3 is out of range: -0.0390595
warning: discount coeff 4 is out of range: 1.6028
warning: discount coeff 5 is out of range: 1.62952
warning: discount coeff 6 is out of range: -0.17675
BOW denominator for context NN is zero; scaling probabilities to sum to 1
BOW denominator for context VB is zero; scaling probabilities to sum to 1
BOW denominator for context IN is zero; scaling probabilities to sum to 1

In this case an LM file is generated, but when I execute the command
///
mert-moses.pl input ref moses/moses-cmd/src/moses model/moses.ini -nbest 200
--working-dir tuning --rootdir
/home/moses_new/bin/moses-scripts/scripts-20080519-1755 
I get the following error:
///
Loading table into memory...done.
Created lexical orientation reordering
Start loading LanguageModel
/home/yqhe/iwslt2007/moses_new/enfactordata/lm/en.lm : [0.000] seconds
Start loading LanguageModel
/home/yqhe/iwslt2007/moses_new/enfactordata/lm/pos.lm : [1.000] seconds
Finished loading LanguageModels : [1.000] seconds
Start loading PhraseTable
/home/yqhe/iwslt2007/moses_new/enfactordata/tuning/filtered/phrase-table.0-0,1.1 : [1.000] seconds
Finished loading phrase tables : [3.000] seconds
Created input-output object : [3.000] seconds
Translating: 哦 那个 航班 是 C 三 零 六 。

moses: LanguageModelSRI.cpp:154: virtual float
LanguageModelSRI::GetValue(const std::vector<const Word*,
std::allocator<const Word*> >&, const void**, unsigned int*) const:
Assertion `(*contextFactor[count-1])[factorType] != __null' failed.
Aborted (core dumped)
Exit code: 134
The decoder died. CONFIG WAS -w 0.00 -lm 0.10 0.10 -d 0.10
0.10 0.10 0.10 0.10 0.10 0.10 -tm 0.03 0.02
0.03 0.02 0.00
///
So I don't know how to train an LM file with SRILM. Can you tell me how you
trained pos.lm, ideally with the exact ngram-count command?


Best regards.

He Yanqing




___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support

[Moses-support] n-best-list with output-factors

2008-06-19 Thread Carlos Henriquez

Hi all.

I was wondering if I can get the output factors in the nbest list. It seems 
that I have only two choices

-report-all-factors To get them for my best translation and
-n-best-list to get the list without factors

I tried them together but the nbest came out without factors and I need them 
for postprocessing the nbest-list.

Thanks in advance for your help.

--
Carlos A. Henríquez Q.
+34-693-278-219
[EMAIL PROTECTED]
[EMAIL PROTECTED]



Re: [Moses-support] n-best-list with output-factors

2008-06-19 Thread Hieu Hoang

Try this:

http://article.gmane.org/gmane.comp.nlp.moses.user/947/match=output+factors

  _  

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
On Behalf Of Carlos Henriquez
Sent: 19 June 2008 10:39
To: moses-support@mit.edu
Subject: [Moses-support] n-best-list with output-factors

Hi all.

I was wondering if I can get the output factors in the nbest list. It seems
that I have only two choices

-report-all-factors To get them for my best translation and
-n-best-list to get the list without factors

I tried them together but the nbest came out without factors and I need them
for postprocessing the nbest-list.

Thanks in advance for your help.

--
Carlos A. Henríquez Q.
+34-693-278-219
[EMAIL PROTECTED]
[EMAIL PROTECTED] 

