Re: [Moses-support] IWSLT 2019: 1st call for papers

2019-07-04 Thread Nicola Bertoldi
Shall we try to participate in the English-Czech task?


Nicola

> On 11 Jun 2019, at 22:14, marco turchi  wrote:
> 
> [Apologies for multiple posting]
> 
> The 16th International Workshop on Spoken Language Translation
> IWSLT 2019 – First Call for Participation
> 
> November 2nd – 3rd, 2019 – Hong Kong
> http://www.iwslt.org 
> IMPORTANT DATES
> 
> Scientific papers:
> September 1st: paper submissions
> October 7th: notification of acceptance
> October 13th: camera-ready paper due
> Evaluation Campaign:
> June: release of train and dev data
> July 1st – September 8th: evaluation period
> Sept 22nd: system description paper due
> October 7th: review feedback
> October 13th: camera-ready paper due
> The International Workshop on Spoken Language Translation (IWSLT) is a yearly 
> scientific workshop, associated with an open evaluation campaign on spoken 
> language translation, where both scientific papers and system descriptions 
> are presented. The 16th IWSLT will take place in Hong Kong, on November 2nd 
> and  3rd, 2019.
> 
> Evaluation Campaign
> 
> IWSLT will feature three evaluation tasks focusing on end-to-end speech 
> translation, multimodal models and spontaneous speech:
> Speech translation of audiovisual content: HowTo and TED and real lectures 
> from English to Portuguese and German
> Clean speech translation of spontaneous, disfluent telephone conversations 
> from Spanish to English
> Text translation on a less resourced language pair: English to Czech
> Training and development data for each task will be released to the 
> participants through the workshop website at the beginning of June 2019. The 
> evaluation period will be from July 1st  to September 8th 2019. 
> 
> Scientific papers
> 
> The IWSLT invites submissions of scientific papers to be published in the 
> workshop proceedings and presented in dedicated technical sessions during the 
> workshop, either in oral or poster format. The workshop welcomes high 
> quality, original contributions covering theoretical and practical issues in 
> the fields of automatic speech recognition and machine translation that are 
> applied to spoken language translation. Possible topics include, but are not 
> limited to:
> 
> MT and SLT approaches
> End-to-End models for SLT
> MT and SLT evaluation
> Language resources for MT and SLT
> Open source software for ASR, MT and SLT
> Multilingual ASR and TTS
> Multimodal speech and text translation
> Architectures for ASR, MT and SLT
> Adaptation for ASR, MT and SLT
> Post- and Pre-processing for ASR, MT and SLT
> Efficiency in ASR, MT and SLT
> 
> ___
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support


___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


Re: [Moses-support] DEBUG_LEVEL:1 Error: lower order count-of-counts cannot be estimated properly

2016-05-20 Thread Nicola Bertoldi
Hi Thomas

This is clearly a problem related to IRSTLM.

Could you please open a ticket in the IRSTLM GitHub repository, adding as 
much information as possible?

Please also include the actual command you ran.

I (as an IRSTLM developer) will reply as soon as possible.

Nicola


> On 20 May 2016, at 11:31, Tomasz Gawryl  wrote:
> 
> Hi,
> I’m trying to build a 10-gram model, but my training pipeline ends with the 
> error: “Error: lower order count-of-counts cannot be estimated properly”. 
> The corpus has 33 million sentences.
> I successfully trained on a much smaller corpus (around 5 million sentences) 
> using the same config file. Could you suggest how to fix this problem?
>  
> Regards,
> Thomas
>  
> --
>  
> # more steps/2/LM_ACROSS-BIGMAMA-OPENSUB2016_train.2.STDERR
>  
> Generating successor statistics
> level 2
> level 3
> level 4
> level 5
> level 6
> level 7
> level 8
> level 9
> level 10
> level 1
> computing statistics
> n1: 1 n2: 0 n3: 0 n4: 0 unover3: 0
> DEBUG_LEVEL:1 Error: lower order count-of-counts cannot be estimated properly
> Hint: use another smoothing method with this corpus.
>  
> EXECUTING rm -rf 
> /home/moses/working/experiments/NGRAM10-A/tmp/irstlm-build-tmp.6920
> FINISH.
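
The counts printed in the log (n1: 1, n2: 0, n3: 0, n4: 0) hint at why the estimation fails. The following is a minimal sketch, assuming the standard modified Kneser-Ney discount formulas from Chen and Goodman (1999); it is not taken from the IRSTLM source, and the function names are hypothetical.

```python
from collections import Counter


def count_of_counts(ngram_counts, max_r=4):
    """Return [n1, ..., n_max_r]; n_r is the number of n-grams seen exactly r times."""
    freq = Counter(ngram_counts.values())
    return [freq.get(r, 0) for r in range(1, max_r + 1)]


def modified_kn_discounts(n1, n2, n3, n4):
    """Chen & Goodman (1999) estimates of the discounts D1, D2 and D3+.

    Assumption: IRSTLM may compute its discounts differently; this only
    illustrates why zero count-of-counts make the estimate impossible.
    """
    if n1 == 0 or n2 == 0 or n3 == 0:
        # Each of n1, n2, n3 appears in a denominator below.
        raise ValueError("count-of-counts too sparse to estimate discounts")
    y = n1 / (n1 + 2 * n2)
    d1 = 1 - 2 * y * n2 / n1
    d2 = 2 - 3 * y * n3 / n2
    d3_plus = 3 - 4 * y * n4 / n3
    return d1, d2, d3_plus
```

With the values from the log, `modified_kn_discounts(1, 0, 0, 0)` raises the error, which is consistent with the hint to use another smoothing method for this corpus.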


Re: [Moses-support] help

2014-06-20 Thread Nicola Bertoldi
Hi Maria
First, I would like to ask you (for next time)
to write to the IRSTLM mailing list (user-irs...@list.fbk.eu)
about any problem related to IRSTLM only.

Now I am going to reply to you privately.


Nicola



On Jun 20, 2014, at 6:47 AM, Maria Marpaung wrote:

 Hello, I need help.
 I'm working on my thesis for form translation system using MOSES MT. I came 
 to the Language model Training. I am having some problems. I have followed 
 some of the steps like following the Moses Baseline. including the following:
 
 1. The language model (LM) is used to ensure fluent output, so it is built
 with the target language (i.e Indonesia language in this case). The IRSTLM
 documentation gives a full explanation of the command-line options, but the
 following will build an appropriate 3-gram language model, removing
 singletons, smoothing with improved Kneser-Ney, and adding sentence
 boundary symbols:
 
  mkdir ~/lm
  cd ~/lm
  ~/irstlm/bin/add-start-end.sh \
  < ~/corpus/news-commentary-v8.fr-en.true.en \
  > news-commentary-v8.fr-en.sb.en
  export IRSTLM=$HOME/irstlm; ~/irstlm/bin/build-lm.sh \
-i news-commentary-v8.fr-en.sb.en  \
-t ./tmp  -p -s improved-kneser-ney -o news-commentary-v8.fr-en.lm.en
  ~/irstlm/bin/compile-lm --text news-commentary-v8.fr-en.lm.en.gz \
news-commentary-v8.fr-en.arpa.en
 
 The first four commands were executed successfully. The last one failed.
 Here is the result after entering the following command line:
 
 maria@maria-Aspire-E1-471:~/lm$ ~/moses/irstlm/bin/compile-lm --text
 news-commentary-v8.fr-en.lm.en.gz news-commentary-v8.fr-en.arpa.en
 
 inpfile: news-commentary-v8.fr-en.lm.en.gz
 outfile: news-commentary-v8.fr-en.arpa.en
 loading up to the LM level 1000 (if any)
 dub: 1000
 Failed to open news-commentary-v8.fr-en.lm.en.gz!
 
 
 2. I add command yes like this:
 maria@maria-Aspire-E1-471:~/lm$ ~/moses/irstlm/bin/compile-lm –text yes
 news-commentary-v8.fr-en.lm.en.gz news-commentary-v8.fr-en.arpa.en
 
 Warning: Too many arguments
 
 
 
 please help me, what should I do?
 
 
 Best regards!
 
 Maria Marpaung
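
In the second attempt quoted above, the option appears as "–text" with an en dash (U+2013) rather than two ASCII hyphens. Whether that was typed or introduced by a mail client is not known, but a program would see such a token as an ordinary argument, which may explain "Too many arguments". A small sketch for spotting this, with hypothetical helper names and the assumption that a typographic dash stood for "--":

```python
# Assumption: an en or em dash at the start of an argument replaced "--".
TYPOGRAPHIC_DASHES = {"\u2013": "--", "\u2014": "--"}


def find_typographic_dashes(args):
    """Return the arguments that start with an en or em dash."""
    return [a for a in args if a[:1] in TYPOGRAPHIC_DASHES]


def normalize_dashes(args):
    """Return a copy of args with a leading typographic dash replaced by '--'."""
    fixed = []
    for a in args:
        if a[:1] in TYPOGRAPHIC_DASHES:
            a = TYPOGRAPHIC_DASHES[a[0]] + a[1:]
        fixed.append(a)
    return fixed
```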


Re: [Moses-support] Moses using other LMs

2014-03-19 Thread Nicola Bertoldi
Hi Zheng

IRSTLM can read and handle Google n-gram data.

Nicola




On 03/18/14 20:39, Hieu Hoang wrote:
Moses supports RandLM and neural network LM which can handle very large
amounts of data, I think.

I'm not sure if IRSTLM or KenLM can handle Google ngram data, but I know
they can handle large amount of data


On 17 March 2014 14:56, Zheng Yuan <yuanzheng_b...@126.com> wrote:

   Hi,

   I am wondering is it possible for Moses to use other kinds of LMs?
   Like some existing Web interface or Google n-gram?

   Regards,
   Zheng





--
Hieu Hoang
Research Associate
University of Edinburgh
http://www.hoang.co.uk/hieu






Re: [Moses-support] Giza++ HMMTable readJumps implementation

2014-03-17 Thread Nicola Bertoldi
Hi Nima,

At FBK, we have recently developed a new (beta) version of MGIZA++ which enables 
online word alignment.

Maybe you could use it directly, or look at the way we load the tables/models.
Do not hesitate to ask us for any clarification.

You can find it here
http://hlt.fbk.eu/technologies/onlinemgiza


cheers,
Nicola




On Mar 17, 2014, at 11:06 AM, Sara Stymne wrote:

Hi,

You might want to have a look at mgiza, and this HOWTO on how to do force 
alignment: http://www.kyloo.net/software/doku.php/mgiza:forcealignment

Best,
Sara


On 03/16/2014 05:49 PM, Nima Pourdamghani wrote:
I need to load my probability tables into Giza++. Currently I am working with 
HMM model. I've managed to load the t-table, but loading the a-table (i.e. 
h-table) is not implemented in the version that I have (the body of readJumps 
function in HMMTables.cpp file is empty).
Is there an implementation of this function available? If not how can one 
implement it?

Cheers





Re: [Moses-support] Language Model Training failed

2014-03-05 Thread Nicola Bertoldi
Hi  Janez,

Seth suggested the right fix.

I just checked the IRSTLM documentation
http://sourceforge.net/apps/mediawiki/irstlm/index.php?title=Estimating_gigantic_models
and the correct notation is reported there.

Could you please tell me where you got the wrong information,
so that I can correct it?


Nicola
(on behalf of IRSTLM development team)



On Mar 5, 2014, at 1:36 AM, Seth Jarrett wrote:

The first four commands were executed successfully. The last one failed. Here
is the result after entering the following command line:

zzz at zzz-laptop:~/lm$ ~/moses/irstlm/bin/compile-lm --text
news-commentary-v8.fr-en.lm.en.gz news-commentary-v8.fr-en.arpa.en

inpfile: news-commentary-v8.fr-en.arpa.en
loading up to the LM level 1000 (if any)
dub: 1000
Failed to open news-commentary-v8.fr-en.arpa.en!
zzz at zzz-laptop:~/lm$

Where did we make a mistake? I see the xxx.arpa.en listed as input file.
Shouldn't the xxx.arpa.en file be an output file?

Best regards!


I was having the same problem when following the steps in the baseline
instructions but I was able to get it to work by adding yes after --text.

Try this:

~/moses/irstlm/bin/compile-lm --text yes news-commentary-v8.fr-en.lm.en.gz
news-commentary-v8.fr-en.arpa.en
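
The two failures in this thread (a file name parsed as the value of --text, and "Failed to open") can both be caught before running the tool. A minimal sketch, assuming only the `--text yes` form suggested above; the helper names are hypothetical and nothing here is part of IRSTLM.

```python
import gzip
import os


def check_gzip_readable(path):
    """Return True if path exists and yields at least one byte of gzip data."""
    if not os.path.isfile(path):
        return False
    try:
        with gzip.open(path, "rb") as handle:
            return handle.read(1) != b""
    except (OSError, EOFError):
        # Not a gzip stream, or a truncated one.
        return False


def compile_lm_command(irstlm_bin, lm_gz, arpa_out):
    """Build the argument list with an explicit value after --text."""
    return [os.path.join(irstlm_bin, "compile-lm"), "--text", "yes", lm_gz, arpa_out]
```

The returned list can be passed to `subprocess.run` once `check_gzip_readable` confirms that the language model file from the previous step exists.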




Re: [Moses-support] how to make moses decoder output more than one best translation?

2014-01-21 Thread Nicola Bertoldi
Hi Zheng

This is possible through the parameter -n-best-list.

For details, see
http://www.statmt.org/moses/?n=Moses.AdvancedFeatures#ntoc10

Nicola




On Jan 21, 2014, at 5:28 PM, Zheng Yuan wrote:

Hi,

By default, the Moses decoder outputs only the single best translation. 
Is it possible to make the decoder output several translation hypotheses 
(e.g., the best 5)?

Thanks in advance!

Regards,
Zheng
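
Once the decoder has written an n-best file, the hypotheses can be grouped per input sentence. This is a minimal sketch, assuming the usual `id ||| hypothesis ||| feature scores ||| total score` layout described in the Moses documentation; the function names are hypothetical, and any extra trailing fields are ignored.

```python
def parse_nbest_line(line):
    """Split one n-best line into (sentence id, hypothesis, features, score)."""
    fields = [f.strip() for f in line.rstrip("\n").split("|||")]
    if len(fields) < 4:
        raise ValueError("expected at least 4 '|||'-separated fields")
    return int(fields[0]), fields[1], fields[2], float(fields[3])


def group_nbest(lines):
    """Map each sentence id to its (hypothesis, score) pairs in file order."""
    grouped = {}
    for line in lines:
        if not line.strip():
            continue
        sentence_id, hypothesis, _features, score = parse_nbest_line(line)
        grouped.setdefault(sentence_id, []).append((hypothesis, score))
    return grouped
```

Passing an open file object to `group_nbest` works, since it only iterates over lines.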



Re: [Moses-support] language models

2013-10-31 Thread Nicola Bertoldi
Hi James,

Yes, I have already installed RandLM and SRILM on both Mac OS X and 
Red Hat Linux. It was very simple, following their documentation.

And yes, it is possible to compile Moses with any LM library you like by
adding zero, one or more of the following parameters:

--with-irstlm=/path/to/irstlm

--with-randlm=/path/to/randlm

--with-srilm=/path/to/srilm

you can find more details here
http://www.statmt.org/moses/?n=Development.GetStarted
in section Optional packages


at run time, you can select the desired LM type in the configuration file.



Nicola



On Oct 31, 2013, at 5:01 PM, Read, James C wrote:

 Does anybody have experience with installing RandLM and SRILM? Is it possible 
 to compile Moses to support a variety of language model libraries and then 
 configure to select which one to use for each experiment?
 
 thanks,
 James
 


Re: [Moses-support] Placeholders

2013-10-10 Thread Nicola Bertoldi
Hi Hieu

I read the documentation and I see a few issues:

- You mention that you enable the exclusive mode of xml-input;
  this can conflict with other uses of xml-input which instead require the 
  inclusive mode.
  Do you have any comments on that?

- When you use the exclusive mode, you force the translation of the span 
  (@num@ with 100), and other larger spans including @num@ are not allowed.
  Am I right?
  If yes, what is the advantage of having phrase pairs that include other words?

- What is the meaning of -placeholder-factor 1?


Nicola Bertoldi




On Oct 10, 2013, at 1:05 PM, Hieu Hoang wrote:

Hi all

Achim and I have been working on adding support for placeholders into Moses. 
That is, replacing a number, date, or named entity with a symbol eg. @num@, 
-date-, =named-entity=. We think it would be especially useful for commercial 
users of Moses, and for people translating text with lots of numbers, dates etc.

It is now supported in the Moses training and decoding pipeline. See the 
following URL  for more details.
   http://www.statmt.org/moses/?n=Moses.AdvancedFeatures#ntoc60

--
Hieu Hoang
Research Associate
University of Edinburgh
http://www.hoang.co.uk/hieu
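
The general idea of placeholders can be illustrated outside the decoder. This is a rough sketch of masking numbers with @num@ before translation and restoring them afterwards; it is not the Moses implementation, the function names are hypothetical, and it assumes the placeholders keep their source order in the translation.

```python
import re

# Digits with optional "." or "," groups, e.g. 100, 12.5, 1,000
NUMBER = re.compile(r"\d+(?:[.,]\d+)*")


def mask_numbers(sentence, symbol="@num@"):
    """Replace every number with the placeholder and remember the originals."""
    originals = NUMBER.findall(sentence)
    return NUMBER.sub(symbol, sentence), originals


def unmask(translation, originals, symbol="@num@"):
    """Put the remembered numbers back, left to right (order is an assumption)."""
    for value in originals:
        translation = translation.replace(symbol, value, 1)
    return translation
```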



Re: [Moses-support] lattices with EPSILON

2013-10-04 Thread Nicola Bertoldi
I don't see any reason why a lattice should contain an EPSILON edge.

In a confusion network, EPSILONs are needed to allow the translation of inputs of 
different lengths.
The sausage structure of the CN imposes the same number of source words on every path,
and the EPSILONs overcome this constraint.

This is not the case for a lattice, because you can have any number of 
edges/words in a complete source path.


cheers,
Nicola



On Oct 4, 2013, at 2:52 PM, Hieu Hoang wrote:

I'm just looking at the lattices decoding, as implemented in moses.

for confusion networks, it's fair to have EPSILON words (that represent blank 
words). However, I don't see the point of them in lattices.

Does anyone have an opinion? How is it implemented in cdec and Joshua?

--
Hieu Hoang
Research Associate
University of Edinburgh
http://www.hoang.co.uk/hieu
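
The argument in this thread can be made concrete with a toy example. In the sketch below, a confusion network needs an empty-word marker to offer a shorter reading, while a lattice expresses the same two readings with an extra edge. The data structures and the marker name are assumptions for illustration, not the Moses input formats.

```python
from itertools import product

EPSILON = "EPSILON"  # hypothetical marker for the empty word


def confusion_network_paths(columns):
    """Every path picks one entry per column; EPSILON entries are dropped."""
    paths = set()
    for choice in product(*columns):
        paths.add(tuple(word for word in choice if word != EPSILON))
    return paths


def lattice_paths(edges, start, end):
    """edges maps a node to a list of (word, next_node); the graph must be acyclic."""
    if start == end:
        return {()}
    paths = set()
    for word, next_node in edges.get(start, []):
        for tail in lattice_paths(edges, next_node, end):
            paths.add((word,) + tail)
    return paths
```

For example, the network `[["ice"], ["cream", EPSILON]]` and the lattice `{0: [("ice", 1), ("ice", 2)], 1: [("cream", 2)]}` yield the same two source paths, one of length one and one of length two.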



Re: [Moses-support] Error compiling Moses on Mac

2013-09-17 Thread Nicola Bertoldi
It seems that the Boost library libboost_unit_test_framework-mt.dylib is 
missing.

I also use Mac OS X (10.6) with Boost 1.51, compiled without multithreaded 
support (-mt),
and hence in my Boost installation the library is called
libboost_unit_test_framework.dylib
i.e. without the -mt suffix,
and the testing phase works properly.

Which options did you use to compile Boost?

Maybe it is sufficient to rename the library;
alternatively, you can try to recompile that library without mt support.


Nicola



On Sep 16, 2013, at 5:04 PM, f.fancellu wrote:

Hi,
I get the following error when trying to compile Moses:

I ran: ./bjam --with-srilm=/Users/ffancellu/Documents/Moses/srilm

and got the following output:

notice: found boost-build.jam at 
/Users/ffancellu/Documents/Moses/mosesdecoder/share/boost-build/boost-build.jam
notice: loading Boost.Build from 
/Users/ffancellu/Documents/Moses/mosesdecoder/share/boost-build/kernel
notice: Searching /etc /Users/ffancellu 
/Users/ffancellu/Documents/Moses/mosesdecoder/share/boost-build/kernel 
/usr/share/boost-build 
/Users/ffancellu/Documents/Moses/mosesdecoder/share/boost-build/kernel 
/Users/ffancellu/Documents/Moses/mosesdecoder/share/boost-build/util 
/Users/ffancellu/Documents/Moses/mosesdecoder/share/boost-build/build 
/Users/ffancellu/Documents/Moses/mosesdecoder/share/boost-build/tools 
/Users/ffancellu/Documents/Moses/mosesdecoder/share/boost-build/contrib 
/Users/ffancellu/Documents/Moses/mosesdecoder/share/boost-build/. for 
site-config configuration file site-config.jam .
notice: Loading site-config configuration file site-config.jam from 
/Users/ffancellu/Documents/Moses/mosesdecoder/share/boost-build/site-config.jam 
.
notice: Searching /Users/ffancellu 
/Users/ffancellu/Documents/Moses/mosesdecoder/share/boost-build/kernel 
/usr/share/boost-build 
/Users/ffancellu/Documents/Moses/mosesdecoder/share/boost-build/kernel 
/Users/ffancellu/Documents/Moses/mosesdecoder/share/boost-build/util 
/Users/ffancellu/Documents/Moses/mosesdecoder/share/boost-build/build 
/Users/ffancellu/Documents/Moses/mosesdecoder/share/boost-build/tools 
/Users/ffancellu/Documents/Moses/mosesdecoder/share/boost-build/contrib 
/Users/ffancellu/Documents/Moses/mosesdecoder/share/boost-build/. for 
user-config configuration file user-config.jam .
notice: Loading user-config configuration file user-config.jam from 
/Users/ffancellu/Documents/Moses/mosesdecoder/share/boost-build/user-config.jam 
.

file /Users/ffancellu/Documents/Moses/mosesdecoder/previous.sh
#!/bin/sh
./bjam --with-srilm=/Users/ffancellu/Documents/Moses/srilm 
--debug-configuration -d2

bash -c g++ -lSegFault -x c++ - <<<'int main() {}' -o 
/Users/ffancellu/Documents/Moses/mosesdecoder/dummy >/dev/null 2>/dev/null && 
rm /Users/ffancellu/Documents/Moses/mosesdecoder/dummy 2>/dev/null
 1
bash -c g++  -lboost_program_options-1_54 -x c++ - <<<'int main() {}' -o 
/Users/ffancellu/Documents/Moses/mosesdecoder/dummy >/dev/null 2>/dev/null && 
rm /Users/ffancellu/Documents/Moses/mosesdecoder/dummy 2>/dev/null
 1
bash -c g++  -static -lboost_program_options -x c++ - <<<'int main() {}' -o 
/Users/ffancellu/Documents/Moses/mosesdecoder/dummy >/dev/null 2>/dev/null && 
rm /Users/ffancellu/Documents/Moses/mosesdecoder/dummy 2>/dev/null
 1
bash -c g++  -lboost_system-mt  -DBOOST_SYSTEM_DYN_LINK -x c++ - <<<'int 
main() {}' -o /Users/ffancellu/Documents/Moses/mosesdecoder/dummy >/dev/null 
2>/dev/null && rm /Users/ffancellu/Documents/Moses/mosesdecoder/dummy 
2>/dev/null
 0
bash -c g++  -lboost_thread-mt  -DBOOST_THREAD_DYN_DLL -x c++ - <<<'int main() 
{}' -o /Users/ffancellu/Documents/Moses/mosesdecoder/dummy >/dev/null 
2>/dev/null && rm /Users/ffancellu/Documents/Moses/mosesdecoder/dummy 
2>/dev/null
 0
bash -c g++  -lboost_program_options-mt  -DBOOST_PROGRAM_OPTIONS_DYN_LINK -x 
c++ - <<<'int main() {}' -o /Users/ffancellu/Documents/Moses/mosesdecoder/dummy 
>/dev/null 2>/dev/null && rm 
/Users/ffancellu/Documents/Moses/mosesdecoder/dummy 2>/dev/null
 0
bash -c g++  -lboost_unit_test_framework-mt  -DBOOST_TEST_MODULE=CompileTest  
-include boost/test/unit_test.hpp  -DBOOST_TEST_DYN_LINK -x c++ - 
<<<'BOOST_AUTO_TEST_CASE(foo) {}' -o 
/Users/ffancellu/Documents/Moses/mosesdecoder/dummy >/dev/null 2>/dev/null && 
rm /Users/ffancellu/Documents/Moses/mosesdecoder/dummy 2>/dev/null
 0
bash -c g++  -lboost_iostreams-mt  -DBOOST_IOSTREAMS_DYN_LINK -x c++ - <<<'int 
main() {}' -o /Users/ffancellu/Documents/Moses/mosesdecoder/dummy >/dev/null 
2>/dev/null && rm /Users/ffancellu/Documents/Moses/mosesdecoder/dummy 
2>/dev/null
 0
bash -c g++  -lboost_filesystem-mt  -DBOOST_FILE_SYSTEM_DYN_LINK -x c++ - 
<<<'int main() {}' -o /Users/ffancellu/Documents/Moses/mosesdecoder/dummy 
>/dev/null 2>/dev/null && rm 
/Users/ffancellu/Documents/Moses/mosesdecoder/dummy 2>/dev/null
 0
bash -c g++  -static -lz -x c++ - <<<'int main() {}' -o 
/Users/ffancellu/Documents/Moses/mosesdecoder/dummy >/dev/null 2>/dev/null && 
rm /Users/ffancellu/Documents/Moses/mosesdecoder/dummy 2>/dev/null
 1
bash -c g++ -ltcmalloc_minimal -x c++ - <<<'int main() {}' -o 

Re: [Moses-support] Building Moses with statically linked libraries?

2013-08-28 Thread Nicola Bertoldi
Maybe this can help to solve the issue.

I also noticed that the XMLRPC libraries are not statically linked.

This is what I got running ldd first on moses and then on mosesserver:
ldd moses
linux-vdso.so.1 =>  (0x7fff3f3a1000)
librt.so.1 => /lib64/librt.so.1 (0x003620a0)
libSegFault.so => /lib64/libSegFault.so (0x7fc75f667000)
libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x003620e0)
libm.so.6 => /lib64/libm.so.6 (0x00362020)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00362120)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00361fe0)
libc.so.6 => /lib64/libc.so.6 (0x00361f60)
/lib64/ld-linux-x86-64.so.2 (0x00361f20)

ldd mosesserver
linux-vdso.so.1 =>  (0x7fffa59ff000)
librt.so.1 => /lib64/librt.so.1 (0x003620a0)
libSegFault.so => /lib64/libSegFault.so (0x7f70dbc55000)
libxmlrpc_server_abyss++.so.4 => not found
libxmlrpc_server++.so.4 => not found
libxmlrpc_server_abyss.so.3 => not found
libxmlrpc_server.so.3 => not found
libxmlrpc_abyss.so.3 => not found
libpthread.so.0 => /lib64/libpthread.so.0 (0x00361fe0)
libxmlrpc++.so.4 => not found
libxmlrpc.so.3 => not found
libxmlrpc_util.so.3 => not found
libxmlrpc_xmlparse.so.3 => not found
libxmlrpc_xmltok.so.3 => not found
libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x003620e0)
libm.so.6 => /lib64/libm.so.6 (0x00362020)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00362120)
libc.so.6 => /lib64/libc.so.6 (0x00361f60)
/lib64/ld-linux-x86-64.so.2 (0x00361f20)



Nicola
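
Listings like the ones above get long, so it can help to extract only the unresolved entries. A minimal sketch that relies only on the "not found" text ldd prints; the function name is hypothetical.

```python
def missing_shared_libraries(ldd_output):
    """Return the library names that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        if "not found" in line:
            # The library name is the first whitespace-separated token.
            missing.append(line.split()[0])
    return missing
```

The text can come from `subprocess.run(["ldd", path], capture_output=True, text=True).stdout`; for the mosesserver listing above the result would contain only the libxmlrpc entries.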



On Aug 27, 2013, at 12:41 AM, Kenneth Heafield wrote:

 Hi,
 
   Ugh sorry something was weird with bjam.  I've put a kludge in that
 forces static linkage unless link=shared appears on the command line.
 
 Kenneth
 
 On 08/26/13 21:59, Lane Schwartz wrote:
 I'm attempting to compile Moses in such a way that at least the boost
 libraries are statically compiled. I'm fine if other libraries are shared.
 
 My interpretation of this thread (and BUILD-INSTRUCTIONS.txt) is that if
 I compile boost (following instructions in BUILD-INSTRUCTIONS.txt) so
 that it makes available static versions of its libraries, then compile
 moses using link=static, boost should be statically linked with moses.
 Unfortunately, that doesn't appear to be happening for me:
 
 ./bjam link=static -j8 --with-cmph=/tools/moses/cmph-2.0
 --with-srilm=/tools/SRILM/SRILM-1.7.0
 --with-boost=/tools/moses/boost_1_53_0 -q --debug-configuration -d2
 
 The log for the above is attached. If I try running moses after this
 compile (without adding /tools/moses/boost_1_53_0 to LIBRARY_PATH and
 LD_LIBRARY_PATH), I get the following error:
 
 ./bin/moses: error while loading shared libraries:
 libboost_system-mt.so.1.53.0: cannot open shared object file: No such
 file or directory
 
 Running ldd on bin/moses confirms that there is a dynamic boost library
 linking:
 
 linux-vdso.so.1 => (0x7fff8adff000)
 libz.so.1 => /lib64/libz.so.1 (0x00394920)
 librt.so.1 => /lib64/librt.so.1 (0x00394960)
 libbz2.so.1 => /lib64/libbz2.so.1 (0x003955a0)
 libboost_system-mt.so.1.53.0 => not found
 libSegFault.so => /lib64/libSegFault.so (0x7f1671726000)
 libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x00394fa0)
 libm.so.6 => /lib64/libm.so.6 (0x00394860)
 libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00394d60)
 libpthread.so.0 => /lib64/libpthread.so.0 (0x003948e0)
 libc.so.6 => /lib64/libc.so.6 (0x00394820)
 /lib64/ld-linux-x86-64.so.2 (0x003947e0)
 
 
 Looking in /tools/moses/boost_1_53_0/lib I can verify that the static .a
 file exists:
 
 
 libboost_system-mt.a
 
 libboost_system-mt.so
 
 libboost_system-mt.so.1.53.0
 
 
 Any help would be appreciated. I'm attaching the compile log and a
 listing of the boost files installed in /tools/moses/boost_1_53_0/lib.
 
 
 Thanks,
 
 Lane
 
 
 
 
 On Mon, Aug 6, 2012 at 12:39 PM, Kenneth Heafield <mo...@kheafield.com> wrote:
 
Ok, committed.  Here's how the build system now behaves:
 
link=shared: Everything linked dynamically.
 
Default: internal libraries are statically linked.  Boost and zlib
statically linked if possible.  libSegFault dynamically linked.
Dynamically linked executable.
 
--without-libsegfault: Same as default but no libSegFault.  Still a
dynamically linked executable, even if you have static boost and zlib.
It's complicated to do detect then be automatic.
 
--static: No libSegFault.  Print warning messages if you're missing
static libraries, but keep building anyway.  Static executable.
 
Kenneth
 
On 08/06/2012 12:00 PM, Kenneth Heafield wrote:
 D'oh, it's a feature, not a bug.  Add runtime-link=static and
you'll get

Re: [Moses-support] Moses fails to load IRSTLM interpolated LM

2013-05-03 Thread Nicola Bertoldi
It seems that you compiled Moses without linking to IRSTLM.

If you are interested in using IRSTLM,
you must recompile Moses with IRSTLM as an additional package,
using the following parameter:

--with-irstlm=/path/to/irstlm

as described here
http://www.statmt.org/moses/?n=Development.GetStarted

best regards,
Nicola Bertoldi

On May 3, 2013, at 12:47 AM, Pradeep Dasigi wrote:

Hello,

I have been trying to use interpolated LM produced by IRSTLM (-v 5.80.02 ) in 
Moses (-v 1.0).  interpolate-lm of IRSTLM successfully produces an output file 
(lmlist.final) in the following format:

LMINTERPOLATION 2
0.124027 lmfile1
0.875973 lmfile2

However, when I try to use it with moses, with the following lines in the ini 
file:

[lmodel-file]
1 0 5 lmlist.final

I get the following errors:

ERROR:Language model type unknown. Probably not compiled into library
ERROR:no LM created. We probably don't have it compiled

From the earlier discussions in this mailing list on this topic, I see that 
others have successfully done this with older versions of Moses and IRSTLM.  
So, I must be missing something obvious, or something must have changed in the 
newer versions of the packages.  Can someone help me?

Thanks,
Pradeep
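
Independently of the build problem, the interpolation file itself is easy to sanity-check. This is a minimal sketch that reads the layout quoted above (a header with the number of models, then one "weight file" pair per line); the function name is hypothetical, and the assumption that the weights should sum to about 1 is based only on the example in this message.

```python
def parse_lm_interpolation(text):
    """Return [(weight, lm_file), ...] from an LMINTERPOLATION file."""
    rows = [line.split() for line in text.splitlines() if line.strip()]
    if not rows or len(rows[0]) != 2 or rows[0][0] != "LMINTERPOLATION":
        raise ValueError("missing 'LMINTERPOLATION <n>' header")
    expected = int(rows[0][1])
    entries = [(float(fields[0]), fields[1]) for fields in rows[1:]]
    if len(entries) != expected:
        raise ValueError("header announces %d models, found %d" % (expected, len(entries)))
    return entries
```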


Re: [Moses-support] Improving phrase-table

2013-04-23 Thread Nicola Bertoldi
Dear Nelson

I developed a phrase table (based on a cache) which can be automatically fed with 
new entries at run-time.
New entries are inserted by adding an XML-like tag to the input line, 
<dlt cblm="source_phrase|||target_phrase"/>
similar to that used for the local suggestion of new translation options.

The new phrase pairs are then scored according to their insertion age.

This implementation is not yet publicly available, but it will be very soon.

If you have specific needs, we can discuss in private how to enhance my 
implementation to fit them.

best regards,
Nicola


On Apr 23, 2013, at 12:16 AM, Nelson Simao wrote:

 Hello!
 
 
 I'm going to create a plug-in to my translator which with the help of the 
 user it helps to improve the translation quality, through the best 
 translations that my system produces.
 So, I like to know if it's possible to modify the phrase-table?
 
 
 Thank you,
 Nelson.
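
The idea of a cache whose entries are scored by insertion age can be sketched in a few lines. This is only an illustration: the class and method names are hypothetical, and the 1/age score is an arbitrary choice, since the actual scoring function is not described in this thread.

```python
class CachePhraseTable:
    """Toy cache of phrase pairs; more recently inserted pairs score higher."""

    def __init__(self):
        self._age = {}  # (source, target) -> age in insertion steps

    def insert(self, source, target):
        # Every existing entry gets one step older; the new one starts at age 1.
        for key in self._age:
            self._age[key] += 1
        self._age[(source, target)] = 1

    def score(self, source, target):
        age = self._age.get((source, target))
        return 0.0 if age is None else 1.0 / age  # assumed decay function

    def options(self, source):
        """Translation options for source, best score first."""
        found = [(t, 1.0 / a) for (s, t), a in self._age.items() if s == source]
        return sorted(found, key=lambda item: item[1], reverse=True)
```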


Re: [Moses-support] Not able to generate lm.*.gz file while creating moses baseline system

2013-04-02 Thread Nicola Bertoldi
This is clearly an issue related to IRSTLM.

Please send your reply to the IRSTLM mailing list (user-irstlm AT 
list.fbk.eu).

Maybe Hieu is right and you ran out of memory or out of disk space.

How much RAM does your machine have?
Do you have free disk space in the temporary directory?

Please also try to run the build-lm.sh command with the additional 
parameters -k 10 -verbose,
collect the stdout and stderr, and send them to me.

Nicola Bertoldi
On behalf of IRSTLM team




On Apr 2, 2013, at 2:19 PM, Hieu Hoang wrote:

i'm not an expert with irstlm but you should double check you haven't run out 
of disk space.

Also, in
   build-lm.sh
comment out the lines that delete temporary files and directories to see what 
the script has created. This will help you in debugging the problem
  rm $tmpdir/* 2> /dev/null
and
  rmdir $tmpdir 2> /dev/null



On 1 April 2013 16:27, Swapnil Jadhav <saj1...@hotmail.com> wrote:
I am getting stuck at the following step.

export IRSTLM=$HOME/g2p/irstlm; ~/g2p/irstlm/bin/build-lm.sh -i 
file.sb.tr -t ~/g2p/flm/tmp -p -s improved-kneser-ney -o 
file.lm.tr

As .gz file is not getting created.
I have used moses 3-4 times now and I never got stuck at this step.
The only change is previously I used training files with size  10 Mb. And now 
36 Mbs.
Will that affect ???
Because when I am trying my previous files I am successfully getting passed 
this step.
Please help.

Output :

saj@Jadhavs:~/g2p/flm$ ~/g2p/irstlm/bin/add-start-end.sh < 
~/g2p/fcorpus/file.true.tr > 
file.sb.tr

saj@Jadhavs:~/g2p/flm$ ls
file.sb.tr

saj@Jadhavs:~/g2p/flm$ export IRSTLM=$HOME/g2p/irstlm; 
~/g2p/irstlm/bin/build-lm.sh -i file.sb.tr -t ~/g2p/flm/tmp 
-p -s improved-kneser-ney -o file.lm.tr
Temporary directory /home/saj/g2p/flm/tmp does not exist
creating /home/saj/g2p/flm/tmp
Extracting dictionary from training corpus
Splitting dictionary into 3 lists
Extracting n-gram statistics for each word list
Important: dictionary must be ordered according to order of appearance of words 
in data
used to generate n-gram blocks,  so that sub language model blocks results 
ordered too
dict.000
dict.001
dict.002
$bin/ngt -i=$inpfile -n=$order -gooout=y -o=$gzip -c  
$tmpdir/ngram.${sdict}.gz -fd=$tmpdir/$sdict $dictionary 
-iknstat=$tmpdir/ikn.stat.$sdict  $logfile 21
Estimating language models for each word list
dict.000
dict.001
dict.002
$scr/build-sublm.pl $verbose $prune $smoothing cat 
$tmpdir/ikn.stat.dict.* --size $order --ngrams $gunzip -c 
$tmpdir/ngram.${sdict}.gz -sublm $tmpdir/lm.$sdict  $logfile 21
Merging language models into file.lm.tr
Cleaning temporary directory /home/saj/g2p/flm/tmp
Removing temporary directory /home/saj/g2p/flm/tmp

saj@Jadhavs:~/g2p/flm$ ls
file.sb.tr

saj@Jadhavs:~/g2p/flm$ ~/g2p/irstlm/bin/compile-lm --text yes file.lm.tr.gz 
file.arpa.tr
inpfile: file.lm.tr.gz
loading up to the LM level 1000 (if any)
dub: 1000
Failed to open file.lm.tr.gz!


From
Swapnil A Jadhav
MTech CSE-IS
NIT Warangal
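
Both replies in this thread point to disk space as a likely cause, which is quick to check before a run. A small sketch using only the Python standard library; the function name and the safety factor of 10 are arbitrary assumptions, not figures from IRSTLM, and the temporary directory must already exist.

```python
import os
import shutil


def has_room_for_lm(corpus_path, tmp_dir, factor=10):
    """Compare free space in tmp_dir with corpus size times a safety factor.

    Returns (enough, needed_bytes, free_bytes).
    """
    needed = os.path.getsize(corpus_path) * factor
    free = shutil.disk_usage(tmp_dir).free
    return free >= needed, needed, free
```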





--
Hieu Hoang
Research Associate
University of Edinburgh
http://www.hoang.co.uk/hieu





Re: [Moses-support] SRILM vs IRSTLM

2012-11-14 Thread Nicola Bertoldi
Modified ShiftBeta (also known as modified Kneser-Ney) does not use the real 
counts to compute probabilities, but the corrected counts, which are essentially 
the number of different successors of an n-gram.
Hence in this case your bigram "schválení těchto" always occurs before "zpráv", 
and so it behaves like a singleton.
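
The distinction between real and corrected counts can be illustrated with a small Python sketch. It follows the description above (corrected count = number of different successors) and is not taken from the IRSTLM sources; the function name and the shortened toy corpus are assumptions made for the example.

```python
from collections import defaultdict

def successor_counts(sentences, n):
    """Map each n-gram to (raw count, number of distinct following words).

    Only occurrences that are followed by another word are counted.
    """
    raw = defaultdict(int)
    successors = defaultdict(set)
    for sentence in sentences:
        tokens = sentence.split()
        for i in range(len(tokens) - n):
            gram = tuple(tokens[i:i + n])
            raw[gram] += 1
            successors[gram].add(tokens[i + n])
    return {gram: (raw[gram], len(successors[gram])) for gram in raw}

# Shortened version of the two Europarl sentences quoted in this thread.
corpus = [
    "hlasovali jsme proti schválení těchto zpráv , které",
    "hlasovali jsme proti schválení těchto zpráv , které",
]
counts = successor_counts(corpus, 2)
print(counts[("schválení", "těchto")])  # (2, 1)
```

The bigram occurs twice, but it has a single distinct successor, so a pruning step that looks at the corrected count treats it like a singleton.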

Please refer to this paper for more details about this smoothing technique:
Chen, S. F. and Goodman, J. (1999). An empirical study of smoothing techniques 
for language modeling. Computer Speech and Language, 13(4):359–393.

Nicola

On Nov 14, 2012, at 4:50 PM, Philipp Koehn wrote:

 Hi,
 
 I encountered the same problem when using msb and
 pruned singletons on large corpora (Europarl).
 SRILM's ngram complains about "no bow for prefix of ngram".
 
 Here is a Czech example:
 
 grep 'schválení těchto' /home/pkoehn/experiment/wmt12-en-cs/lm/europarl.lm.38
 -2.35639  schválení těchto zpráv  -0.198088
 -0.390525 schválení těchto zpráv ,
 -0.390525 proti schválení těchto zpráv
 
 There should be an entry for the bigram schválení těchto.
 
 I do not see how this could happen - the ngram occurs twice in the corpus:
 
 grep 'schválení těchto zpráv' lm/europarl.truecased.16
 zatímco s dlouhodobými cíli souhlasíme , nemůžeme souhlasit s
 prostředky k jejich dosažení a hlasovali jsme proti schválení těchto
 zpráv , které se neomezují pouze na eurozónu .
 zatímco souhlasíme s dlouhodobými cíli , nemůžeme souhlasit s
 prostředky k jejich dosažení a hlasovali jsme proti schválení těchto
 zpráv , které se neomezují pouze na eurozónu .
 
 I suspect that the current implementation throws out higher order n-grams
 if they occur in _one_context_, not _once_.
 
 -phi
 
 On Thu, Nov 8, 2012 at 3:31 AM, Marcin Junczys-Dowmunt
 junc...@amu.edu.pl wrote:
 Using -lm=msb instead of -lm=sb and testing on several evaluation sets
 seems to help. Sometimes IRSTLM is better, sometimes I get better
 results with SRILM. So on average they seem to be on par now.
 
 Interesting, however, that you say there should be no differences. I
 never manage to get the same BLEU scores on a test set for IRSTLM and
 SRILM. I have to do some reading on this dub issue and see what happens.
 

Re: [Moses-support] SRILM vs IRSTLM

2012-11-08 Thread Nicola Bertoldi
From the FBK community... 

As already mentioned by Ken:

tlm correctly computes the Improved Kneser-Ney method (-lm=msb).

tlm can keep the singletons: set the parameter -ps=no.

As for OOV words, tlm computes the probability of the OOV as if it were a 
class of all possible unknown words.
To get the actual probability of a single OOV token, tlm requires that a 
Dictionary Upper Bound (dub) is set.
The Dictionary Upper Bound is intended to be a rough estimate of the dictionary 
size (a reasonable value could be 10e+7, which is also the default).
Note that using the same dub value is useful, if not mandatory, to properly 
compare different LMs in terms of perplexity.
Moreover, note that the dub value is not stored in the saved LM.
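
To see why the dub value matters when comparing perplexities, here is a minimal Python sketch. It assumes that the probability of the OOV class is shared uniformly among the (dub - dictionary size) words that are not in the dictionary; the message above does not spell out the exact formula, so treat this as one plausible reading. The function name and all numbers are hypothetical.

```python
import math

def single_oov_logprob(class_logprob10, dub, dict_size):
    """log10 probability of one specific OOV word.

    Assumption: the OOV class mass is split evenly over the
    (dub - dict_size) words that are not in the dictionary.
    """
    if dub <= dict_size:
        raise ValueError("dub must exceed the dictionary size")
    return class_logprob10 - math.log10(dub - dict_size)

# Hypothetical numbers: OOV class log10 prob of -1.5, 100,000 known words.
with_large_dub = single_oov_logprob(-1.5, 10_000_000, 100_000)
with_small_dub = single_oov_logprob(-1.5, 1_000_000, 100_000)
print(with_large_dub < with_small_dub)  # True
```

Under this assumption a larger dub makes every single OOV token less probable, so two LMs evaluated with different dub values are not directly comparable.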

In IRSTLM, you set this value with the parameter -dub when you 
compute the perplexity, either with tlm or with compile-lm.
In Moses, you set this parameter with -lmodel-dub.

I remind you that you can use the LM estimated with the IRSTLM toolkit directly in 
Moses by setting the first field of the -lmodel-file parameter to 1,
without transforming it with build-binary.


As for the difference between IRSTLM and SRILM, there should not be any.
Have you noticed a difference in the perplexity as well?
Maybe you can send us a tiny benchmark (data and the commands used) in which you 
observe such a difference,
so that we can debug.



Nicola


On Nov 8, 2012, at 8:22 AM, Marcin Junczys-Dowmunt wrote:

 Hi Pratyush,
 Thanks for the hint. That solved the problem I had with the arpa files 
 when using -lm=msb and KenLM. Unfortunately, this does not seem to 
 improve performance of IRSTLM much when compared to SRILM. So I guess I 
 will have to stick with SRILM for now.
 
 Kenneth, weren't you working on your own tool to produce language models?
 Best,
 Marcin
 
 On 07.11.2012 11:18, Pratyush Banerjee wrote:
 Hi Marcin,
 
 I have used msb with irstlm... but seems to have worked fine for me...
 
 You mentioned faulty arpa files for 5-grams... is it because KenLM 
 complains of missing 4-grams, 3-grams etc ?
 Have you tried using -ps=no option with tlm ?
 
 IRSTLM is known to prune singleton n-grams in order to reduce the 
 size of the LM... (tlm has this on by default..)
 
 If you use this option, usually KenLM does not complain... I have also 
 used such LMs with SRILM for further mixing and it went fine...
 
 I am sure somebody from the IRSTLM community could confirm this...
 
 Hope this resolves the issue...
 
 Thanks and Regards,
 
 Pratyush
 
 
 On Tue, Nov 6, 2012 at 9:26 PM, Marcin Junczys-Dowmunt 
 junc...@amu.edu.pl wrote:
 
On the irstlm page it says:
 
'Modified shift-beta, also known as “improved kneser-ney smoothing”'
 
Unfortunately I cannot use msb because it seems to produce faulty arpa
files for 5-grams. So I am trying only shift-beta, whatever that means.
Maybe that's the main problem?
Also, my data sets are not that small; the plain arpa files currently
exceed 20 GB.
 
Best,
Marcin
 
On 06.11.2012 22:15, Jonathan Clark wrote:
 As far as I know, exact modified Kneser-Ney smoothing (the current
 state of the art) is not supported by IRSTLM. IRSTLM instead
 implements modified shift-beta smoothing, which isn't quite as
 effective -- especially on smaller data sets.
 
 Cheers,
 Jon
 
 
 On Tue, Nov 6, 2012 at 1:08 PM, Marcin Junczys-Dowmunt
 junc...@amu.edu.pl wrote:
 Hi,
 Slightly off-topic, but I am out of ideas. I am trying to figure out
 what set of parameters I have to use with IRSTLM to create LMs that are
 equivalent to language models created with SRILM using the following
 command:
 
 (SRILM:) ngram-count -order 5 -unk -interpolate -kndiscount -text
 input.en -lm lm.en.arpa
 
 Up to now, I am using this chain of commands for IRSTLM:
 
 perl -C -pe 'chomp; $_ = "<s> $_ </s>\n"' < input.en > input.en.sb
 ngt -i=input.en.sb -n=5 -b=yes -o=lm.en.bin
 tlm -tr=lm.en.bin -lm=sb -bo=yes -n=5 -o=lm.en.arpa
 
 I know this is not quite the same, but it comes closest in terms of
 quality and size. The translation results, however, are still
 consistently worse than with SRILM models; differences in BLEU are up to
 1%.
 
 I use KenLM with Moses to binarize the resulting arpa files, so this is
 not a code issue.
 
 Also it seems IRSTLM has a bug with the modified shift beta option. At
 least KenLM complains that not all 4-grams are present although there
 are 5-grams that contain them.
 
 Any ideas?
 Thanks,
 Marcin

Re: [Moses-support] discrepancy between kenlm LM score and moses LM feature

2012-10-25 Thread Nicola Bertoldi
Moses uses log_e
while KenLM uses log_10

log_e(P) = log_e(10) * log_10(P)

-17.715 = log_e(10) * -7.69363
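
The conversion can be checked with a few lines of Python. This is only a sketch of the arithmetic above; the per-word scores are copied from the KenLM output quoted below, and the function name is made up for the example.

```python
import math

def log10_to_ln(log10_score):
    """Convert a base-10 log probability to a natural-log probability."""
    return log10_score * math.log(10)

# Per-word log10 scores from the KenLM output for "I am ." plus </s>.
kenlm_word_scores = [-0.960569, -3.58125, -2.85073, -0.301086]
kenlm_total = sum(kenlm_word_scores)
moses_style = log10_to_ln(kenlm_total)
print(kenlm_total, moses_style)
```

The second number agrees with the Moses lm feature (-17.7152) to within rounding.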


Am I right?


Nicola

On Oct 25, 2012, at 4:09 PM, Adam Teichert wrote:

 Hi all,
 
  After decoding, I get the output "I am ." from moses.  I expected
 the lm feature of the winning hypothesis to match the language model
 score of the output (i.e. running kenlm on "I am .") but it doesn't
 match.
 
  moses decoder says the lm feature should be: -17.7152
  kenlm says:  -7.69363
 
   Any ideas?  (more verbose details below)
 
   Thanks,
 
--Adam
 
 
 moses:
 0 ||| I am .  ||| d: 0 lm: -17.7152 0 w: -3 tm: -13.6791 -13.4943
 -7.13198 -2.7697 2.99969 ||| -17.7152
 
 kenlm:
 I=11 2 -0.960569  am=16 2 -3.58125  .=2 1 -2.85073  </s>=12 2
 -0.301086 Total: -7.69363 OOV: 0
 
 lm.arpa:
 
 
 \data\
 ngram 1=24
 ngram 2=61
 ngram 3=55
 ngram 4=37
 ngram 5=15
 
 \1-grams:
 -1.468523	<unk>
 -1.847186	,	-1.369804
 -1.949801	.	-1.569853
 -1.963158	and	-1.181599
 -2.323228	is	-1.180351
 -2.361128	that	-1.115681
 -2.58365	:	-0.9018961
 -2.668149	it	-0.9380397
 -2.68633	this	-0.947085
 -2.810327	we	-1.037582
 -2.848784	not	-0.6793468
 -2.959159	I	-1.039103
 -3.07579	</s>
 -3.192658	my	-0.771203
 -3.257855	no	-0.6133907
 -99	<s>	-2.155778
 -4.495705	am	-0.2612615
 -3.793312	little	-0.4288335
 -3.485892	me	-0.7101038
 -4.420267	surprise	-0.5936328
 -4.396488	surprised	-0.7797279
 -4.735315	surprises	-0.5008686
 -4.552582	surprising	-0.5104726
 -4.420267	wonder	-0.7249804
 
 \2-grams:
 -0.3010858	. </s>
 -0.7091757	surprises me	-0.2860005
 -0.8920909	surprises .	-0.89
 -0.9196912	surprised that	-0.3720784
 -0.9494263	surprise ,	-0.2865064
 -0.9513271	surprising ,	-0.3125141
 -0.9605687	<s> I	-2.082714
 -1.027536	surprises ,	-0.1418063
 -1.03712	surprising that	-0.234587
 -1.113948	surprising .	-0.818023
 -1.128903	wonder ,	-0.201676
 -1.129378	surprise that	-0.2199759
 -1.144491	<s> we	-1.811983
 -1.176047	surprised ,	-0.3082836
 -1.223108	me ,	-0.4326107
 -3.267176	, .	-0.6048141
 -1.942681	, I	-1.324145
 -1.624891	, and	-1.041633
 -2.088251	, is	-0.7182704
 -2.00548	, it	-1.121896
 -2.659058	, no	-0.426425
 -2.169326	, not	-0.7878548
 -2.013858	, that	-0.741353
 -4.511329	<s> am	-1.611718
 -2.70271	<s> no	-0.6921422
 -1.498533	I am	-0.6396686
 -2.678573	I wonder	-0.4021412
 -1.723108	am I	-0.1898816
 -1.983081	am not	-0.05170299
 -3.118301	and .	-0.6289005
 -4.704317	and surprising	-0.03819112
 -4.235544	and wonder	-0.3394165
 -2.49379	is .	-1.147667
 -2.878376	is it	-0.3021726
 -2.286608	is no	-0.2445829
 -1.610847	is not	-0.7426848
 -4.088284	is surprising	-0.2584699
 -1.43062	it .	-1.05238
 -1.354294	it is	-0.4715224
 -1.872162	little .	-0.7615967
 -1.302207	me .	-0.7159881
 -2.216844	no .	-0.4275606
 -3.09122	no surprise	-0.238984
 -3.29149	no surprises	-0.1407324
 -2.364784	not .	-0.5624467
 -3.183594	not it	-0.0237771
 -3.857746	not me	-0.1115348
 -3.5716	not surprise	-0.333057
 -3.570876	not surprised	-0.2835279
 -3.621675	not surprising	-0.1414568
 -1.247403	surprise .	-0.5275136
 -1.554344	surprise me	-0.1131797
 -1.231814	surprised .	-1.243741
 -1.353927	surprised me	-0.2527946
 -2.455974	that .	-0.8608306
 -1.859256	that is	-0.5548451
 -1.854284	that it	-0.8317314
 -2.064787	this .	-0.8331819
 -2.12582	this is	-0.4543184
 -2.882952	we .	-0.9427112
 -1.525799	wonder .	-1.009658
 
 \3-grams:
 -0.01054467	<s> am I	-0.4033368
 -0.0125701	surprised . </s>
 -0.02177685	wonder . </s>
 -0.0255149	we . </s>
 -0.02761195	is . </s>
 -0.03434359	surprising . </s>
 -0.03933057	little . </s>
 -0.04950372	surprises . </s>
 -0.05429693	and . </s>
 -0.06596278	it . </s>
 -0.06643473	, . </s>
 -0.069778	surprise . </s>
 -0.08091767	me . </s>
 -0.0885259	that . </s>
 -0.0999171	this . </s>
 -0.1734623	no . </s>
 -0.1389831	not . </s>
 -2.214	, I wonder	-0.9099113
 -1.619141	I am not	-0.4131633
 -4.240031	, and .	-0.2288982
 -3.244178	, is .	-0.4621631
 -2.173454	, is no	-0.5102387
 -2.217342	it is .	-0.6322967
 -1.657225	it is not	-0.3576518
 -2.566752	that is .	-0.5047822
 -3.111846	that is it	-0.1018341
 -2.304894	this is .	-0.7973652
 -2.962754	this is it	-0.08542722
 -0.6746925	, it is	-0.9160917
 -1.739163	is it .	-0.1419258
 -1.454397	not it .	0
 -0.8604823	that it is	-0.6332519
 -0.5766525

Re: [Moses-support] Combining cased TM with uncased LM

2012-08-24 Thread Nicola Bertoldi
Hi Nick,

If you intend to translate case-sensitive input text
and your LM combination is either linear or log-linear,
the answer is YES.

You can use the IRSTLM package.
It is possible to load (together with a standard uncased LM) a map 
that transforms the words of an n-gram before querying the LM.
You can simply define a map between cased words and uncased words (for all words 
of the LM dictionary).
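
The idea of such a map can be sketched in Python. The snippet only shows the mapping step; the file format that IRSTLM expects for the map is not described in this message, so nothing here should be read as that format. The function names and the tiny vocabulary are invented for the example.

```python
def build_case_map(cased_vocabulary):
    """Pair every cased word form with the lowercased form the uncased LM knows."""
    return {word: word.lower() for word in cased_vocabulary}

def map_ngram(ngram, case_map):
    """Rewrite an n-gram through the map; words missing from it are lowercased."""
    return [case_map.get(word, word.lower()) for word in ngram]

case_map = build_case_map(["The", "the", "Moses", "IRSTLM"])
print(map_ngram(["The", "Moses", "decoder"], case_map))
# ['the', 'moses', 'decoder']
```

The translation model keeps producing cased words, while the LM is queried with their uncased counterparts.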

I am happy to give you any further clarification on this.
Nicola

On Aug 24, 2012, at 1:44 PM, Nicholas Ruiz wrote:

 Hi everyone,
 
 Is it possible to combine a cased translation model with an uncased 
 language model in Moses?
 
 Thanks,
 Nick Ruiz


Re: [Moses-support] R: Changing scores in memory

2012-08-24 Thread Nicola Bertoldi
Hi Barry, Hi Prashant

could the implementation of a parallel data structure
be a nice project for the forthcoming MT Marathon?

Which characteristics should it have?

I have some ideas about it.

Nicola


On Aug 24, 2012, at 5:25 PM, Barry Haddow wrote:

Hi Prashant

If you want to update the phrase table after it's loaded, then you  want to use 
the in-memory one (PhraseDictionaryMemory) rather than the on-demand one 
(PhraseDictionaryTree) since the entries in the latter are loaded from disk as 
required and changes may be overwritten.

If you update the in-memory table though, you should remember that Moses 
assumes it is thread-safe, and also that Moses prunes it at load time to (a 
default of) 20 entries per source phrase.

I actually think that creating a parallel data structure, or in fact a 
wrapper that implements the same interface as the PhraseDictionaries and 
delegates to one of the real ones, could be a cleaner solution,
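
A rough Python sketch of the wrapper idea follows. The real classes are C++ (PhraseDictionaryMemory and friends); the class and method names below are hypothetical stand-ins, and the sketch ignores the thread-safety and pruning concerns mentioned above.

```python
class InMemoryPhraseTable:
    """Stand-in for a loaded phrase table (hypothetical, not the Moses class)."""

    def __init__(self, entries):
        self._entries = entries  # source phrase -> list of (target, scores)

    def lookup(self, source):
        return self._entries.get(source, [])


class OverridingPhraseTable:
    """Same lookup interface; applies score overrides, otherwise delegates."""

    def __init__(self, base):
        self._base = base
        self._overrides = {}

    def set_scores(self, source, target, scores):
        self._overrides[(source, target)] = scores

    def lookup(self, source):
        return [(target, self._overrides.get((source, target), scores))
                for target, scores in self._base.lookup(source)]


base = InMemoryPhraseTable({"sud": [("south", [0.7]), ("north", [0.1])]})
table = OverridingPhraseTable(base)
table.set_scores("sud", "north", [0.9])
print(table.lookup("sud"))  # [('south', [0.7]), ('north', [0.9])]
```

The underlying table is never modified, so the original scores remain available and the overrides can be dropped at any time.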

cheers - Barry

On 24/08/12 15:03, Prashant Mathur wrote:
Hi All,
I am trying to change the phrase scores after the phrase table and reordering 
model are loaded into memory. Is it possible? If so, which class should I look 
into?
If not, is it possible to maintain a parallel data structure and prioritize it 
over the already loaded scores?

--
Prashant





Re: [Moses-support] [IRSTLM] Segmentation fault when loading a model

2012-08-06 Thread Nicola Bertoldi
Dear Filip.

I am happy to help you, but your message gives me too little information.

By the way, I would like to move this thread to the IRSTLM mailing list
or even to a private thread.

How do you use the dictionary in your program?
Are you using it in incremental mode or not?

Could you please send me the piece of your code related to the dictionary,
as well as the log of your debugging?

best regards,
Nicola Bertoldi
(IRSTLM development team)

On Aug 6, 2012, at 3:52 PM, Filip Petkovski wrote:

Hi,

I am using IRSTLM for making a language model and I got a segmentation fault
when I was trying to load a binary model trained using build-lm.sh and compiled 
using compile-lm.

I tracked the problem down to the dictionary::load(std::istream) method in the 
trunk/src/dictionary.cpp file.

As far as I could tell, there is an issue with initialization of a dictionary 
object and its member fields,
since the segmentation fault occurred when trying to access a member field of 
strstack in strstack::push(const char *)

I compiled my program with g++ -Wall -I$IRSTLM/include program.cpp -o program 
-L$IRSTLM/lib -lirstlm -lz

Best Regards,
Filip Petkovski


Re: [Moses-support] Ems: interpolating LM using IrstLM

2012-08-02 Thread Nicola Bertoldi
Dear Mauro,


could you please send me the command line you use?

Probably there is a mismatch in the versions. I will check immediately.
Which versions of IRSTLM and Moses did you actually install?

Meanwhile could you try to edit the configuration file and remove the keyword

LMINTERPOLATION

Nicola
(IRSTLM developer staff)

PS: I would suggest sending IRSTLM-specific questions to the IRSTLM mailing 
list,
user-irs...@list.fbk.eu

On Aug 2, 2012, at 3:43 PM, Mauro Zanotti wrote:

Thank you Pratyush and thank you all,

I installed new version of Irstlm and Moses, but when I try to interpolate the 
LMs I get the following error:

Reading /opt/tools/moses/scripts/ems/example/data-iw/lm-interpol/lmlist.init...
Wrong input format.

For my lmlist.init i took your example...

LMINTERPOLATION 2
0.439053 /opt/tools/moses/scripts/ems/example/ex9watris/lm/Auto+Euro.lm.2
0.560947 /opt/tools/moses/scripts/ems/example/ex9watris/lm/LMEN.lm.1

Do you know how I can solve the problem?
Thank you in advance
Mauro


On Thu, Aug 2, 2012 at 11:28 AM, Pratyush Banerjee 
pbaner...@computing.dcu.ie wrote:
Hi Mauro,

Alongside the documentation pointed out by Daniel (which is the official IRSTLM 
documentation),  you would need a few more things in order to  interpolate LMs 
using IRSTLM...

The interpolate-lm script creates a config file (let's say interp.wt.final) 
in the following format:

LMINTERPOLATION 2
0.439053 <full path to your LM-1>
0.560947 <full path to your LM-2>

However, IRSTLM does not allow you to create a single interpolated LM as SRILM 
does... Instead you can directly use the interpolated LMs in your Moses.ini by 
passing the final config file directly.


[lmodel-file]
1 0 5 /home/mt/l_models/interp.wt.final

But to have this functionality, you should have a relatively new Moses build 
and IRSTLM version 5.70.04...

For further reference you could look at 
http://comments.gmane.org/gmane.comp.nlp.moses.user/6341



Hope this helps..

Thanks and Regards,

Pratyush

On Thu, Aug 2, 2012 at 9:16 AM, Daniel Schaut 
danielsh...@hotmail.com wrote:
Hi Mauro,

IRSTLM provides a special tool for that. Here you can find more information 
about how to interpolate LMs using IRSTLM
http://sourceforge.net/apps/mediawiki/irstlm/index.php?title=LM_interpolation

Daniel

From: moses-support-boun...@mit.edu 
[mailto:moses-support-boun...@mit.edu] On 
behalf of Philipp Koehn
Sent: 02 August 2012 00:35
To: Mauro Zanotti
Cc: moses-support@mit.edu
Subject: Re: [Moses-support] Ems: interpolating LM using IrstLM

Hi,

yes, the current implementation relies on SRILM.
But maybe someone from IRST can explain how
to interpolate their models.

-phi
On Wed, Aug 1, 2012 at 3:37 PM, Mauro Zanotti 
mau.zano...@gmail.com wrote:
Dear all,

I trained 2 LM in EMS module, how can I interpolate them using irstlm instead 
of srilm? interpolate-lm.perl works only with srilm?

Thank you in advance
Mauro



Re: [Moses-support] train-recaser.perl and new IRSTLM

2012-05-30 Thread Nicola Bertoldi
Hi Tomas,

IRSTLM actually returns 0 if it succeeds.
It returns a value greater than 0 if something fails or if the usage information 
is requested.

Which command is actually executed by train-recaser.perl?
Have you tried to run this command separately in a shell?
What does it return?

The problem could be related to the use of the fixed temporary directory 
/tmp.
Do you have permission to write to it?

cheers,
Nicola


On May 29, 2012, at 12:53 PM, Tomas Hudik wrote:

Hi there,

It seems newer versions of IRSTLM (build-lm.sh) end with exit code 1. But 
if build-lm.sh exits with anything other than 0, train-recaser fails.
This is due to the command:
system($cmd) == 0 || die("Language model training failed with error " . ($? >> 8) . "\n");
which should be changed to
system($cmd) == 256 || die("Language model training failed with error " . ($? >> 8) . "\n");

Or, maybe even better, the script could test for the existence of the output file 
(cased_irstlm.gz).
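
For readers wondering where the 256 comes from: Perl's system() returns the raw wait status, and the exit code sits in its high byte, so exit code 1 shows up as 256. The Python sketch below illustrates that relation and the alternative check on the output file. The helper names are made up, and the interpreter binary merely stands in for cased_irstlm.gz so that the example runs anywhere.

```python
import os
import subprocess
import sys

def exit_code_from_wait_status(status):
    """Extract the exit code from a raw wait status (what Perl stores in $?)."""
    return status >> 8

def run_and_check(cmd, expected_output):
    """Succeed only if the command exits with 0 and the output file exists."""
    result = subprocess.run(cmd)
    return result.returncode == 0 and os.path.exists(expected_output)

print(exit_code_from_wait_status(256))  # 1

# Hypothetical check: sys.executable stands in for the real output file.
print(run_and_check([sys.executable, "-c", "pass"], sys.executable))  # True
```

Checking both the exit code and the output file is stricter than either test alone.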

cheers, Tomas


Re: [Moses-support] irstlm how to use caching feature?

2012-04-20 Thread Nicola Bertoldi
Dear Somayeh,

I am answering the same email you sent to the IRSTLM support list
user-irs...@list.fbk.eu

because it is more appropriate for your problem.

best
Nicola


On Apr 20, 2012, at 11:35 AM, somayeh bakhshaei wrote:

Hello all,

I am trying to use the caching option of IRSTLM,
so I configured it with -enable-caching.

The LM is built.

Then I tried to compile it and convert it to ARPA format:

compile-lm lm --text yes out

but it gives this error:

Reading /Share/local/bakhshaei/ITRC/en-fr/cacheLm/integratedv0.4-cache.gz...
iARPA
loadtxt()
1-grams: reading 299158 entries
line=-5.776930
compile-lm: lmtable.cpp:237: int parseline(std::istream, int, ngram, float, 
float): Assertion `howmany == (Order+ 1) || howmany == (Order + 2)' failed.
Aborted


Is it possible to tell me what is wrong please?


-
Best Regards,
S.Bakhshaei

After All you will come 
And will spread light on the dark desolate world!
O' Kind Father! We will be waiting for your affectionate hands ...



Re: [Moses-support] Segmentation fail problem when using script mert-moses.pl

2012-04-11 Thread Nicola Bertoldi
You are using one factor (with index 0),

but in the configuration file you are loading an LM for the factor with index 1 
and none for the factor with index 0.

The correct line is
1 0 5 /home/loki/Downloads/irstlm-5.70.04/scripts/train.irstlm.gz
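
The mismatch can be spotted mechanically. The Python sketch below reads an lmodel-file line as four fields (LM type, factor index, n-gram order, path), which is how the lines in this thread are laid out; the class and function names are hypothetical and this is not part of Moses.

```python
from typing import NamedTuple

class LmodelEntry(NamedTuple):
    lm_type: int  # 1 selects IRSTLM in the lines quoted in this thread
    factor: int   # index of the factor the LM is applied to
    order: int    # n-gram order
    path: str

def parse_lmodel_line(line):
    """Split one lmodel-file line into its four fields."""
    lm_type, factor, order, path = line.split(maxsplit=3)
    return LmodelEntry(int(lm_type), int(factor), int(order), path)

def factor_is_available(entry, factors):
    """True if the LM's factor index is among the factors actually in use."""
    return entry.factor in factors

path = "/home/loki/Downloads/irstlm-5.70.04/scripts/train.irstlm.gz"
bad = parse_lmodel_line(f"1 1 5 {path}")
good = parse_lmodel_line(f"1 0 5 {path}")
print(factor_is_available(bad, {0}), factor_is_available(good, {0}))
# False True
```

With a single factor (index 0) in use, only the corrected line passes the check.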



Furthermore, you are loading a textual version of the LM, and this takes a long 
time:
Finished loading LanguageModels : [2945.000] seconds

I strongly suggest you compile it into binary format
using the compile-lm command of the IRSTLM toolkit, as follows (with the correct 
path):

compile-lm train.irstlm.gz train.irstlm.blm

See the manual for more details.


The Moses configuration file should be changed accordingly (with the correct 
path):

1 0 5   train.irstlm.blm


You spend some time binarizing the LM once, but then loading it in Moses will 
be very fast.



best
Nicola Bertoldi

On Apr 11, 2012, at 8:58 AM, Loki Cheng wrote:

Hi, everyone, I want to use the script mert-moses.pl 
to tune the parameters in the moses.ini file, but the run eventually fails. 
Here is the log:
-
loki@ubuntu:~/Downloads/moses/scripts/target/scripts-20120222-0301/training$ 
./mert-moses.pl 
/home/loki/Downloads/moses/scripts/target/scripts-20120222-0301/training/development_corpus/input/dev.clean.en
 
/home/loki/Downloads/moses/scripts/target/scripts-20120222-0301/training/development_corpus/references/dev.zh.utf8
 /home/loki/Downloads/moses/moses-cmd/src/moses 
/home/loki/Downloads/moses/scripts/target/scripts-20120222-0301/training/model/moses.ini
 --mertdir /home/loki/Downloads/moses/mert --working-dir 
/home/loki/Downloads/moses/scripts/target/scripts-20120222-0301/training/tunning
 --rootdir $SCRIPTS_ROOTDIR

main::create_extractor_script() called too early to check prototype at 
./mert-moses.pl line 666.
Using SCRIPTS_ROOTDIR: 
/home/loki/Downloads/moses/scripts/target/scripts-20120222-0301
Assuming the tables are already filtered, reusing filtered/moses.ini
Using cached features list: ./features.list
MERT starting values and ranges for random generation:
  d =   0.600 ( 0.00 ..  1.00)
 lm =   0.500 ( 0.00 ..  1.00)
  w =  -1.000 ( 0.00 ..  1.00)
 tm =   0.200 ( 0.00 ..  1.00)
 tm =   0.200 ( 0.00 ..  1.00)
 tm =   0.200 ( 0.00 ..  1.00)
 tm =   0.200 ( 0.00 ..  1.00)
 tm =   0.200 ( 0.00 ..  1.00)
run 1 start at Tue Apr 10 21:15:19 PDT 2012
Parsing --decoder-flags: ||
Saving new config to: ./run1.moses.ini
Saved: ./run1.moses.ini
(1) run decoder to produce n-best lists
params =
Normalizing lambdas: 0.60 0.50 -1.00 0.20 0.20 0.20 
0.20 0.20
DECODER_CFG = -w -0.322581 -lm 0.161290 -d 0.193548 -tm 0.064516 0.064516 
0.064516 0.064516 0.064516
decoder_config = -w -0.322581 -lm 0.161290 -d 0.193548 -tm 0.064516 0.064516 
0.064516 0.064516 0.064516
Executing: /home/loki/Downloads/moses/moses-cmd/src/moses   -config 
filtered/moses.ini -inputtype 0 -w -0.322581 -lm 0.161290 -d 0.193548 -tm 
0.064516 0.064516 0.064516 0.064516 0.064516  -n-best-list run1.best100.out 100 
-input-file 
/home/loki/Downloads/moses/scripts/target/scripts-20120222-0301/training/development_corpus/input/dev.clean.en
  run1.out
Defined parameters (per moses.ini or switch):
config: filtered/moses.ini
distortion-limit: 6
input-factors: 0
input-file: 
/home/loki/Downloads/moses/scripts/target/scripts-20120222-0301/training/development_corpus/input/dev.clean.en
inputtype: 0
lmodel-file: 1 1 5 
/home/loki/Downloads/irstlm-5.70.04/scripts/train.irstlm.gz
mapping: 0 T 0
n-best-list: run1.best100.out 100
ttable-file: 1 0 0 5 
/home/loki/Downloads/moses/scripts/target/scripts-20120222-0301/training/model/phrase-table
ttable-limit: 20
weight-d: 0.193548
weight-l: 0.161290
weight-t: 0.064516 0.064516 0.064516 0.064516 0.064516
weight-w: -0.322581
Loading lexical distortion models...have 0 models
Start loading LanguageModel 
/home/loki/Downloads/irstlm-5.70.04/scripts/train.irstlm.gz : [0.000] seconds
In LanguageModelIRST::Load: nGramOrder = 5
Language Model Type of 
/home/loki/Downloads/irstlm-5.70.04/scripts/train.irstlm.gz is 1
iARPA
loadtxt_ram()
1-grams: reading 443439 entries
done level1
2-grams: reading 8644803 entries
.done level2
3-grams: reading 48970019 entries
.done level3
4-grams: reading 103946151 entries
done level4
5-grams: reading 140766631 entries
done level5
done
OOV code is 443438
OOV code is 443438
IRST: m_unknownId=443438
Finished loading LanguageModels : [2945.000] seconds
Start loading PhraseTable 
/home/loki/Downloads/moses/scripts/target/scripts-20120222-0301/training/model/phrase-table
 : [2945.000] seconds
filePath: 
/home/loki/Downloads/moses/scripts/target/scripts

Re: [Moses-support] Using interpolated IRSTLM models directly in Moses...

2012-04-03 Thread Nicola Bertoldi
Dear Pratyush,

 the format of the LM metafile 
(/home/pbanerjee/smt-work/incremental_LM_Exp/mt/l_models/interp.wt.fina)
is the following


LMINTERPOLATION 2
0.439053 /home/pbanerjee/smt-work/incremental_LM_Exp/mt/l_models/forum.lm
0.560947 /home/pbanerjee/smt-work/incremental_LM_Exp/mt/l_models/tm.lm

Note the keyword LMINTERPOLATION
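
A quick way to check a metafile before handing it to Moses is sketched below in Python. It only encodes what is shown above (a header line with the keyword and the number of LMs, then one "weight path" line per LM); the function name is invented, and IRSTLM itself may accept more than this sketch does.

```python
def parse_lm_interpolation(text):
    """Parse an LMINTERPOLATION metafile into a list of (weight, path) pairs."""
    lines = [line.split() for line in text.splitlines() if line.strip()]
    if not lines or len(lines[0]) != 2 or lines[0][0] != "LMINTERPOLATION":
        raise ValueError("first line must be 'LMINTERPOLATION <count>'")
    count = int(lines[0][1])
    entries = [(float(weight), path) for weight, path in lines[1:]]
    if len(entries) != count:
        raise ValueError(f"expected {count} models, found {len(entries)}")
    return entries

good = """LMINTERPOLATION 2
0.439053 /home/pbanerjee/smt-work/incremental_LM_Exp/mt/l_models/forum.lm
0.560947 /home/pbanerjee/smt-work/incremental_LM_Exp/mt/l_models/tm.lm
"""
entries = parse_lm_interpolation(good)
print(len(entries), round(sum(weight for weight, _ in entries), 6))  # 2 1.0
```

A file whose first line is just "2", as in the original question, is rejected by this check with a ValueError.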

best,
Nicola


On Apr 3, 2012, at 4:44 PM, Pratyush Banerjee wrote:

Hi,

I am trying to use interpolated language models (prepared using IRSTLM - v 
5.50.01) directly in Moses..
I use interpolate-lm to learn weights on a devset but am not sure how to pass 
the configuration file to Moses directly..

I learnt from one of Nicola's mails (in the IRSTLM mailing list) that it could 
directly be passed on to the Moses decoder...
However, when i try the following in my moses.ini file :

[lmodel-file]
1 0 5 /home/pbanerjee/smt-work/incremental_LM_Exp/mt/l_models/interp.wt.final

The language models are not loaded... I get the following lines in my log 
file...

Start loading LanguageModel 
/home/pbanerjee/smt-work/incremental_LM_Exp/mt/l_models/interp.wt.final : 
[1.000] seconds
In LanguageModelIRST::Load: nGramOrder = 5
Loading LM file (no MAP)
2
loadtxt()
done
starting to use OOV words [unk]
OOV code is 0
OOV code is 0
IRST: m_unknownId=0

The content of my interpolation config file is as below...

2
0.439053 /home/pbanerjee/smt-work/incremental_LM_Exp/mt/l_models/forum.lm
0.560947 /home/pbanerjee/smt-work/incremental_LM_Exp/mt/l_models/tm.lm

Could some one please let me know how to pass a interpolated configuration file 
to be directly used by Moses ?

Thanks and Regards,

Pratyush


Re: [Moses-support] xml-markup problem

2012-03-09 Thread Nicola Bertoldi
Dear Philley,

the correct syntax is the following:
<np translation="trg" prob="0.0">src</np>

or 
<np translation="trg" prob="1.0">src</np>

where src is a portion of the input text
and trg is the suggested translation for src.

I guess Moses is confused by the quotation marks.

If you echo the input, you have to use single quotes:

echo '<n translation="north" prob="1.0">sud</n>' | moses -f moses.ini -xml-input inclusive
BEST TRANSLATION: north [1]  [total=-0.712] 0.000, -1.000, 0.000, -10.794, 
0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000
north

echo '<n translation="north" prob="0.0">sud</n>' | moses -f moses.ini -xml-input inclusive
BEST TRANSLATION: south [1]  [total=-0.899] 0.000, -1.000, 0.000, -10.828, 
0.000, 0.000, 0.000, 0.000, 0.000, -0.313, -0.845, -0.249, -0.249, 1.000
south
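
For anyone generating such input lines from a script, a minimal Python sketch follows. It only builds the markup string in the form used in the examples above; the function name xml_suggestion is hypothetical, and escaping special characters with the standard XML entities is an assumption about what the decoder accepts.

```python
from xml.sax.saxutils import escape, quoteattr


def xml_suggestion(src, trg, prob, tag="np"):
    """Wrap src in an XML suggestion carrying translation trg and probability prob."""
    # quoteattr adds the surrounding double quotes and escapes the value.
    return "<%s translation=%s prob=%s>%s</%s>" % (
        tag, quoteattr(trg), quoteattr(str(prob)), escape(src), tag)


line = xml_suggestion("sud", "north", 1.0, tag="n")
print(line)
```

Writing the generated lines to a file and piping that file into the decoder avoids the shell quoting issue altogether.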

best regards,
Nicola Bertoldi


On Mar 9, 2012, at 7:44 PM, Feifan Liu wrote:

 Dear All,
 I have a problem with xml-markup. I used the parameter -xml-input inclusive 
 and tried with markup np translation=  prob=0.0 and np 
 translation=  prob=1.0. 
 I expect that when prob=1.0, the translation will be definitely the 
 markuped one, and when prob=0.0, the translation will not be the markuped 
 one. But the output is the same. 
 
 Anyone can help with explaining this? Or what should I do if I want different 
 output with different probabilities?  Can the input sentence be markuped with 
 different prob values? 
 Waiting for your help!
 Thanks very much.
 -Philley


Re: [Moses-support] xml-markup problem

2012-03-09 Thread Nicola Bertoldi
It is strange... we will debug what you reported.


What happens if you run the following?

cat inputfile | moses -f moses.ini -xml-input inclusive

Do you get an error?

Nicola
On Mar 10, 2012, at 12:59 AM, Feifan Liu wrote:

Dear Nicola,
Thanks for your detailed explanation!
May I ask you another question? When I tried to use -xml-input inclusive and 
-input-file filename simultaneously, it didn't work. I have to echo each 
line of the file and then use -xml-input inclusive, which is much slower.
Are you aware of this issue?
Much appreciate it!
-Philley

On Fri, Mar 9, 2012 at 4:12 PM, Nicola Bertoldi berto...@fbk.eu wrote:
Dear Philley,

let us assume that you use the inclusive option and that you suggest the
translation trg for the source src with probability P, using the following
xml markup:
<np translation="trg" prob="P">src</np>

This is almost equivalent to having an additional phrase pair in the phrase
table with probability P; hence this alternative competes with all other
hypotheses.
In other words, you are not assured that your suggestion will be chosen by the
decoder.
For instance, it could happen that your suggestion does not fit the LM and
gets a very bad LM score.

In order to force the use of your suggestion, you have to use the exclusive
option;
in this case the only alternative for src is your trg.

best regards,
nicola bertoldi


On Mar 9, 2012, at 11:03 PM, Feifan Liu wrote:

Thanks a lot, Barry!
I thought 0.0 will force the decoder not to use the suggested translation. 
But 1.0 will force to use the suggested translation, am I right?
Best,
Philley
On Fri, Mar 9, 2012 at 3:01 PM, Barry Haddow bhad...@staffmail.ed.ac.uk wrote:
Hi Philley

Are there any other translations in the phrase table for 'complained'?

Notice that Moses converts the probabilities to logs, then floors them at -20.
So an xml-option with probability 0 could still be used if there are no other
options.
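
The flooring described here can be pictured with a short Python sketch. It assumes natural logarithms and the floor of -20 mentioned above; the helper name floored_log is hypothetical and is not the decoder's actual function.

```python
import math

LOG_FLOOR = -20.0  # floor value mentioned in this thread


def floored_log(prob, floor=LOG_FLOOR):
    """Natural log of prob, clipped from below at floor."""
    if prob <= 0.0:
        # log(0) is undefined, so a zero probability maps straight to the floor.
        return floor
    return max(math.log(prob), floor)


for p in (1.0, 0.5, 0.0):
    print(p, floored_log(p))
```

With this flooring, prob=0.0 becomes a finite score of -20 rather than minus infinity, which is why such an option can still win when nothing else competes with it.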

Cheers - Barry

Sent from my ZX81


- Reply message -
From: Feifan Liu feifan@gmail.com
Date: Fri, Mar 9, 2012 20:26
Subject: [Moses-support] xml-markup problem
To: Nicola Bertoldi berto...@fbk.eu
Cc: moses-support@mit.edu


Dear Nicola,
Thanks for your reply. But I did use the same syntax as you did.

echo '<np translation="hy" prob="0.0">complained</np>' |
../../tools/mosesdecoder/dist/bin/moses -f moses.ini -xml-input inclusive 2>/dev/null
hy
BEST TRANSLATION: hy [1]  [total=-28.871] 0.000, -1.000, 0.000, -18.542,
-20.000, -20.000, -20.000, -20.000, -20.000

Anything I was missing?
-Philley

On Fri, Mar 9, 2012 at 1:46 PM, Nicola Bertoldi berto...@fbk.eu wrote:

 Dear Philley,

 the correct syntax is the following
 <np translation="trg" prob="0.0">src</np>

 or
 <np translation="trg" prob="1.0">src</np>

 where src is a portion of the input text
 and trg is the suggested translation for src


 I guess moses is confused by the quotation marks.

 If you echo the input,  you have to use single quotes

 echo '<n translation="north" prob="1.0">sud</n>' | moses -f moses.ini -xml-input inclusive
 BEST TRANSLATION: north [1]  [total=-0.712] 0.000, -1.000, 0.000,
 -10.794, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000,
 0.000
 north

 echo '<n translation="north" prob="0.0">sud</n>' | moses -f moses.ini -xml-input inclusive
 BEST TRANSLATION: south [1]  [total=-0.899] 0.000, -1.000, 0.000,
 -10.828, 0.000, 0.000, 0.000, 0.000, 0.000, -0.313, -0.845, -0.249, -0.249,
 1.000
 south

 best regards,
 Nicola Bertoldi




  Dear All,
  I have a problem with xml-markup. I used the parameter -xml-input
 inclusive and tried with markup np translation=  prob=0.0 and np
 translation=  prob=1.0.
  I expect that when prob=1.0, the translation will be definitely the
 markuped one, and when prob=0.0, the translation will not be the markuped
 one. But the output is the same.
 
  Anyone can help with explaining this? Or what should I do if I want
 different output with different probabilities?  Can the input sentence be
 markuped with different prob values?
  Waiting for your help!
  Thanks very much.
  -Philley





The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.








Re: [Moses-support] Regarding speed up of translation using binary phrase table

2012-03-02 Thread Nicola Bertoldi
I would only point out that irstlm is not responsible for the very long
loading time.
In fact, it starts and ends in less than 1 second (it starts at 49 seconds and
ends at the same timestamp).

So probably, as Barry mentions, you have to use the binarized versions of the
phrase and reordering tables.

In the configuration file where you specify the translation table,
the first field is set to 1, meaning that your phrase table is binarized.

Are you sure that you actually created it?
In other words, do you have a set of files like
/home/ssp/smt/etof/model/phrase-table.binphr*
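
The check above can also be scripted. The Python sketch below simply looks for files matching the pattern given above; the helper name binarized_parts is hypothetical and the path is the one from the log in this thread.

```python
import glob


def binarized_parts(prefix):
    """Return the sorted files whose names start with '<prefix>.binphr'."""
    # glob.escape protects any special characters in the prefix itself.
    return sorted(glob.glob(glob.escape(prefix) + ".binphr*"))


prefix = "/home/ssp/smt/etof/model/phrase-table"  # path taken from the log below
parts = binarized_parts(prefix)
if parts:
    print("found %d binarized file(s)" % len(parts))
else:
    print("no %s.binphr* files: the table was probably never binarized" % prefix)
```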

cheers,
Nicola

On Mar 2, 2012, at 6:29 PM, Barry Haddow wrote:

 Hi Shweta
 
 Here's some suggestions:
 - Make sure you binarise the reordering model too
 - Try with kenlm instead of irstlm
 - make sure your binarised files are on a local disk (not nfs!)
 - Remember that you can translate a batch of sentences in one moses run
 
 cheers - Barry
 
 On Friday 02 March 2012 17:15:35 shweta porwal wrote:
 Hi, I'm building an SMT web translation system using the guidelines on the
 Moses home page.
 
 The translations are good, but even after following the memory management
 steps for speeding up translation given in the step-by-step guide, I am
 still facing delays in translation output due to disk access times.
 According to the guide it should not load the entire phrase table every
 time, but I guess I have missed something.
 
 
 the output of echo command is:
 
 echo mardi | /opt/tools/moses/dist/bin/moses -f
 /home/ssp/smt/etof/model/moses-bin.ini
 
 Defined parameters (per moses.ini or switch):
  config: /home/ssp/smt/etof/model/moses-bin.ini
  distortion-file: 0-0 wbe-msd-bidirectional-fe-allff 6
 /home/ssp/smt/etof/model/reordering-table.wbe-msd-bidirectional-fe
  distortion-limit: 6
  input-factors: 0
  lmodel-file: 1 0 3 /home/ssp/smt/etof/lm/europarl-v6.blm.mm
  mapping: 0 T 0
  ttable-file: 1 0 0 5 /home/ssp/smt/etof/model/phrase-table
  ttable-limit: 20
  weight-d: 0.3 0.3 0.3 0.3 0.3 0.3 0.3
  weight-l: 0.5000
  weight-t: 0.20 0.20 0.20 0.20 0.20
  weight-w: -1
 Loading lexical distortion models...have 1 models
 Creating lexical reordering...
 weights: 0.300 0.300 0.300 0.300 0.300 0.300
 Loading table into memory...done.
 Start loading LanguageModel /home/ssp/smt/etof/lm/europarl-v6.blm.mm :
 [49.000] seconds
 In LanguageModelIRST::Load: nGramOrder = 3
 Language Model Type of /home/ssp/smt/etof/lm/europarl-v6.blm.mm is 1
 blmt
 loadbin()
 lmtable::loadbin_dict()
 dict-size(): 41308
 loadbin_level (level 1)
 mapping 41308 1-grams
 tableOffs 494937 tableGaps3417-grams
 done (level1)
 loadbin_level (level 2)
 mapping 484826 2-grams
 tableOffs 1114557 tableGaps445-grams
 done (level2)
 loadbin_level (level 3)
 mapping 297921 3-grams
 tableOffs 8386947 tableGaps2435-grams
 done (level3)
 done
 OOV code is 1499
 IRST: m_unknownId=1499
 Finished loading LanguageModels : [49.000] seconds
 Start loading PhraseTable /home/ssp/smt/etof/model/phrase-table : [49.000]
 seconds
 filePath: /home/ssp/smt/etof/model/phrase-table
 Finished loading phrase tables : [49.000] seconds
 IO from STDOUT/STDIN
 Created input-output object : [49.000] seconds
 Translating line 0  in thread id 3039353712
 Translating: mardi
 
 reading bin ttable
 size of OFF_T 8
 binary phrasefile loaded, default OFF_T: -1
 Collecting options took 0.120 seconds
 Search took 0.120 seconds
 tuesday
 BEST TRANSLATION: tuesday [1]  [total=-13.136] 0.000, -1.000, 0.000,
 -0.547, 0.000, 0.000, 0.000, 0.000, 0.000, -27.202, -0.916, -0.496, -0.865,
 -0.580, 1.000
 reset caches
 Translation took 0.120 seconds
 Finished translating
 reset mmap
 len  = 623037
 sync = 0
 running msync...
 done. Running munmap...
 done
 len  = 7272835
 sync = 0
 running msync...
 done. Running munmap...
 done
 len  = 2087882
 sync = 0
 running msync...
 done. Running munmap...
 done
 len  = 623037
 sync = 0
 running msync...
 done. Running munmap...
 done
 len  = 7272835
 sync = 0
 running msync...
 done. Running munmap...
 done
 len  = 2087882
 sync = 0
 running msync...
 done. Running munmap...
 done
 
 It takes about 2-3 minutes to translate a single word.
 
  Any suggestions on how I could reduce the total translation time? I would
 really appreciate the help.
 
 Thanks.
 
 
 --
 Barry Haddow
 University of Edinburgh
 +44 (0) 131 651 3173
 
 -- 
 The University of Edinburgh is a charitable body, registered in
 Scotland, with registration number SC005336.
 


Re: [Moses-support] ERROR: Can't generate symmetrized alignment file

2012-03-01 Thread Nicola Bertoldi
It seems (see the "Executing" line of step 3)
that the root-dir (where the code is available) is not expanded to its actual
value.
Instead of
-root-dir/training/symal/giza2bal.pl

you should have something like
/path-to-code//training/symal/giza2bal.pl

This can be due to a wrong installation of the scripts.

Nicola

On Mar 1, 2012, at 5:31 AM, Herry Sujaini wrote:


Hi,

When I train the phrase model, I get the following error:

...
line 43000
line 44000
END.
(2.1b) running giza en-fr @ Thu Mar  1 11:12:41 WIT 2012
/home/herry/bin/GIZA++  -CoocurrenceFile ./giza.en-fr/en-fr.cooc -c 
./corpus/en-fr-int-train.snt -m1 5 -m2 0 -m3 3 -m4 3 -model1dumpfrequency 1 
-model4smoothfactor 0.4 -nodumps 1 -nsmooth 4 -o ./giza.en-fr/en-fr 
-onlyaldumps 1 -p0 0.999 -s ./corpus/fr.vcb -t ./corpus/en.vcb
  ./giza.en-fr/en-fr.A3.final.gz seems finished, reusing.
(3) generate word alignment @ Thu Mar  1 11:12:41 WIT 2012
Combining forward and inverted alignment from files:
  ./giza.fr-en/fr-en.A3.final.{bz2,gz}
  ./giza.en-fr/en-fr.A3.final.{bz2,gz}
Executing: mkdir -p ./model
Executing: -root-dir/training/symal/giza2bal.pl -d gzip -cd 
./giza.en-fr/en-fr.A3.final.gz -i gzip -cd ./giza.fr-en/fr-en.A3.final.gz 
|-root-dir/training/symal/symal -alignment=grow -diagonal=yes -final=yes 
-both=yes  ./model/aligned.grow-diag-final-and
sh: Illegal option -r
Exit code: 2
ERROR: Can't generate symmetrized alignment file

What is the reason for its happening? And how can I solve it?

Thank you.
Herry S


Re: [Moses-support] KenLM: The context of every 4-gram should appear as a 3-gram

2012-02-17 Thread Nicola Bertoldi
Dear Sylvain,

I am starting to answer the questions in this thread.

- The most recent release of IRSTLM is 5.70.04 and can be downloaded from
SourceForge.

- The IRSTLM user guide can be found on the SourceForge website:
https://sourceforge.net/apps/mediawiki/irstlm/index.php?title=Main_Page

  We try to keep it as up to date as possible, and your suggestions to
improve it are welcome.


- By default, tlm prunes n-gram singletons of order 3 or higher.
  To disable singleton pruning, use the parameter -PruneSingletons=no (or
its short version -ps=no).

  Note that, for historical reasons, singleton pruning is off by default if you
use build-lm.sh to build an LM.
  To enable it in this case, please use -p.


- As concerns the original problem, it is not really clear to me whether the
4-gram "to support them ." is present or not in the LM built with the IRSTLM
tlm command.
  I am glad to debug this if you could send me the input text you train the
model on.

  In general, the Modified Shift Beta smoothing approach can behave oddly
if the training data are few,
 and it is recommended to use a less sophisticated but more robust smoothing
approach, like ShiftBeta or even Witten-Bell.

- As concerns Ken's question, I have to double-check with the other developers;
I will come back to you very soon.

best,
Nicola

On Feb 16, 2012, at 6:23 PM, Sylvain Raybaud wrote:

Hi

 No, I haven't turned on pruning. I've been looking in IRSTLM manual if
it was on by default but I couldn't find the information (and I couldn't
find an up to date manual either, only for version 5.60.something).

Since it seems to depend on the smoothing method, maybe msb turns it on,
but not sb?

The solution you propose would indeed make me happy :) Actually, I just
need it to run with moses and yield acceptable performance to be happy.
I can even live with -lm=sb, since finding the best LM parameters isn't
the core of my research :)

thanks for your reply!

cheers,

Sylvain

On 16/02/12 17:46, Kenneth Heafield wrote:
Hi,

This is hopefully a stupid question.  Did you turn on pruning?  I don't
see it in the command line: tlm -tr=toy.sent_start_end.en -lm=msb -n=5
-o=toy.en.n5.lm.  Or did IRSTLM make pruning the default in new releases?

KenLM should be accepting pruned models and I take responsibility for
that.  But I am also confused as to how "to support them" did not appear
if pruning was off.

Kenneth

On 02/16/2012 10:16 AM, Kenneth Heafield wrote:
Hi,

Interesting.  The only other person to run into this is David Chiang
who had some custom software to prune/build models.

I have been requiring that property to make right state minimization
work correctly: if it doesn't match "to support them" then the right
state contains at most "support them", rendering "to support them ."
inaccessible.  I could reinsert "to support them" when this happens,
with p(to support them) = b(to support)p(support them) and b(to support
them) = 0.

It's a bit of a pain to do this correctly.  Would you be happy if only
the default probing model supported it, but the trie continued to throw
an error message?

The ARPA standard, to the extent that there is one, does not require
this behavior, so IRSTLM is within their rights to prune them.

Nicola, how does IRSTLM handle these cases at inference time?

Kenneth

On 02/16/2012 07:59 AM, Sylvain Raybaud wrote:
Hi

   LM stuff again!

I've created a language model with IRSTLM (release 5.70.04):
tlm -tr=toy.sent_start_end.en -lm=msb -n=5 -o=toy.en.n5.lm

When I specify type 1 (IRSTLM) in moses.ini it's loading fine. But if I
try to load it with KenLM I get:

The context of every 4-gram should appear as a 3-gram Byte: 471440 File:
/global/markov/raybauds/DATA/TOY/toy.en.n5.lm

Byte 471440 seems to be the '\n' between the following lines:
-1.16894    to support them .   -0.0679314
-0.836008   to deal with hamas

As a matter of fact, "to support them" does not appear as a trigram in
the model. If I remove this 4-gram the same problem arises with another
one, whose 3-gram prefix is also missing. I think that is the problem. If
I change the smoothing method to "sb" instead of "msb" I get a usable
LM. Is this normal behavior? Do you think it's a KenLM or an IRSTLM
related problem?
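
To list every n-gram affected by this condition at once, rather than finding them one by one through the loader, a small Python sketch can scan an ARPA file in text form. It assumes the usual ARPA layout (a "\N-grams:" header per order, then lines holding a log probability, N words, and an optional backoff weight); the function name missing_contexts and the toy model are hypothetical.

```python
def missing_contexts(arpa_text):
    """Return n-grams (n >= 2) whose first n-1 words are not listed as an (n-1)-gram."""
    ngrams = {}  # order -> set of word tuples
    order = 0
    for raw in arpa_text.splitlines():
        line = raw.strip()
        if line.startswith("\\") and line.endswith("-grams:"):
            order = int(line[1:-len("-grams:")])
            ngrams[order] = set()
        elif line == "\\end\\":
            order = 0
        elif order and line:
            fields = line.split()
            # fields[0] is the log probability; the next `order` fields are the words.
            ngrams[order].add(tuple(fields[1:1 + order]))
    missing = []
    for n in sorted(ngrams):
        if n < 2:
            continue
        lower = ngrams.get(n - 1, set())
        for gram in sorted(ngrams[n]):
            if gram[:-1] not in lower:
                missing.append(gram)
    return missing


toy = """\\data\\
ngram 1=3
ngram 2=1
ngram 3=2

\\1-grams:
-1.0\tto\t-0.5
-1.0\tsupport\t-0.5
-1.0\tthem\t-0.5

\\2-grams:
-0.5\tto support\t-0.3

\\3-grams:
-0.2\tto support them
-0.2\tsupport them to

\\end\\
"""
print(missing_contexts(toy))
```

In the toy model the trigram "support them to" is reported because the bigram "support them" is absent, which is the same situation as the 4-gram "to support them ." lacking its trigram.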


cheers,



--
Sylvain Raybaud


Re: [Moses-support] creating LM with IRST toolkit

2011-11-30 Thread Nicola Bertoldi
Hi Hieu

On Dec 1, 2011, at 8:34 AM, Hieu Hoang wrote:

 hi all
 
 can anyone tell me if creating LM with the IRST toolkit is integrated into 
 the EMS yet?
 

I will let someone else answer this point.

 if not, is this the entirety of what has to be run?
   cat $CORPUSFILE | $IRSTLM/bin/add-start-end.sh | gzip -c > temp/monolingual.setagged.gz
   $IRSTLM/bin/build-lm.sh -t stat4 -i "gunzip -c temp/monolingual.setagged.gz" -n 5 -p -o temp/iarpa.gz -k 10
   $IRSTLM/bin/compile-lm temp/iarpa.gz --text yes /dev/stdout | gzip -c > $LMFILE
 

Yes, this is the procedure to train an LM with IRSTLM.
If your corpus is not too big and fits in memory, you
can use the tlm command to estimate the LM and directly
store it in binary format (skipping the compile-lm step).
Please see the IRSTLM manual for details on its usage,
and send further questions directly to the irstlm mailing list:
user-irs...@list.fbk.eu


best
Nicola

 


Re: [Moses-support] mert-moses.pl script

2011-10-31 Thread Nicola Bertoldi
Hi Neda

There is also a parameter of mert-moses.pl, --predictable-seed (see
the help),
which makes MERT deterministic.

In general, the MERT procedure relies on an initial seed seed_t at each
iteration t for the creation of the (20 by default) random starting points
for the optimization; note that seed_i differs from seed_j (if i differs
from j).

If you activate the flag --predictable-seed, seed_i still differs from
seed_j,
BUT the sequence seed_1, seed_2, ... seed_N is always the same; hence
the random starting points considered for the optimization are always
the same, so that the final result of MERT is always the same.
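
The property described above can be illustrated with a toy Python sketch. It is not the mert-moses.pl implementation: it only shows how a fixed master seed yields a reproducible sequence of per-iteration seeds, and therefore reproducible starting points. The function name starting_points, the range of the seeds and the point dimension are all assumptions made for the illustration.

```python
import random


def starting_points(master_seed, iterations, n_points=20, dim=3):
    """Draw n_points random starting points per iteration from a per-iteration seed."""
    master = random.Random(master_seed)
    runs = []
    for _ in range(iterations):
        seed_t = master.randrange(2 ** 31)  # seed for this iteration
        rng = random.Random(seed_t)
        points = [[rng.uniform(-1.0, 1.0) for _ in range(dim)]
                  for _ in range(n_points)]
        runs.append((seed_t, points))
    return runs


first = starting_points(master_seed=1, iterations=3)
second = starting_points(master_seed=1, iterations=3)
print(first == second)  # same master seed, same seeds and points
```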

best regards
Nicola


On Oct 29, 2011, at 6:37 PM, Patrik Lambert wrote:

 
 Hi Neda,
 
 this happens because the seed used in the MERT optimizer depends by 
 default on the moment on which you launch it.
 If you need deterministic MERT runs, you can set the seed by adding this 
 switch to the mert-moses.pl call:
 
 --mertargs="-r $seed"
 
 Patrik
 
 
 On 29/10/2011 18:11, moses-support-requ...@mit.edu wrote:
 Date: Sat, 29 Oct 2011 17:05:11 +0100
 From: Barry Haddow bhad...@staffmail.ed.ac.uk
 Subject: Re: [Moses-support] mert-moses.pl script
 
 Hi Neda
 
 Yes, this is quite normal. The best plan is to do several runs and take the
 average bleu. See this paper for a discussion
 
 http://www.cs.cmu.edu/~jhclark/pubs/significance.pdf
 
 cheers - Barry
 
 On Saturday 29 Oct 2011 10:09:29 Neda NoorMohammadi wrote:
 hello,
 
 In my experiment this is happening:
 
 I am running the mert-moses.pl script on the same development set and
 configuration file under the same conditions, but each execution gives me
 different weights, which leads to a different BLEU (about 1%). (Note that
 these executions are independent of each other and are initialized to the
 predefined weights of moses.ini.)
 
 The features used in the configuration file are: LM: 3gram, Reordering:
 msd-bidirectional-fe, Alignment: grow-diag-final-and, phrase scores.
 
 Is it normal? What is your suggestion?
 In this situation what is my baseline? The best one or the worst?
 
 Neda Noormohammadi
 
 
 --
 


Re: [Moses-support] moses compilation problem

2011-09-27 Thread Nicola Bertoldi
The actual error message is a bit above in the log.
Please send me the full log.

Which version/revision of Moses and IRSTLM are you using?

Nicola


On Sep 26, 2011, at 11:49 AM, itua ijagbone wrote:

 Hi, I want to compile Moses. When I get to ./configure
 -with-irstlm=/home/mystudio/mytranslate/tools/irstlm and run it, I get this
 error
 ...
 LanguageModelIRST.cpp: In function ‘bool Moses::LMCacheCleanup(size_t, 
 size_t)’:
 LanguageModelIRST.cpp:225: warning: comparison between signed and unsigned 
 integer expressions
 make[3]: *** [LanguageModelIRST.lo] Error 1
 make[3]: Leaving directory `/home/mystudio/moses/moses/moses/src'
 make[2]: *** [all] Error 2
 make[2]: Leaving directory `/home/mystudio/moses/moses/moses/src'
 make[1]: *** [all-recursive] Error 1
 make[1]: Leaving directory `/home/mystudio/moses/moses'
 make: *** [all] Error 2
 
 I do not know what to do. I have been at it all day and am getting tired and frustrated.


Re: [Moses-support] Compiling Moses with IRSTLM

2011-09-26 Thread Nicola Bertoldi
Dear Jehan,

IRSTLM is able to load the ARPA-format language model.

In order to force Moses to use the IRSTLM toolkit (instead of SRILM or
others),
change the first field of the language model entry in the configuration
file (1 for IRSTLM or 0 for SRILM).
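
As a reading aid for that entry, here is a minimal Python sketch that splits an lmodel-file line into its fields. Only the two type codes named above are mapped; reading the second and third fields as factor and n-gram order is an assumption consistent with the logs quoted elsewhere in this archive, and the function name parse_lmodel_line is hypothetical.

```python
LM_TYPES = {0: "SRILM", 1: "IRSTLM"}  # only the two codes named in this thread


def parse_lmodel_line(line):
    """Split an lmodel-file entry into type, factor, order and path."""
    lm_type, factor, order, path = line.split(None, 3)
    return {
        "type": LM_TYPES.get(int(lm_type), "other"),
        "factor": int(factor),   # assumed meaning of the second field
        "order": int(order),     # assumed meaning of the third field
        "path": path,
    }


entry = parse_lmodel_line("1 0 3 /home/ssp/smt/etof/lm/europarl-v6.blm.mm")
print(entry["type"], entry["order"], entry["path"])
```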

best regards,
Nicola Bertoldi

On Sep 26, 2011, at 10:32 AM, Jehan Pages wrote:

 Hi,
 
 On Mon, Sep 26, 2011 at 3:48 PM, Nicola Bertoldi berto...@fbk.eu wrote:
 I am going to release (very soon) a new version of Moses including  new LM 
 types
 Stay tuned on IRSTLM webpage
 
 If you need immediately, get the code from the IRSTLM SF repository
 
 you can download revision 452, which properly interfaces with the latest 
 revision of Moses
 
 Thanks for the answer. As right now, I am mainly testing this engine,
 the development version from the repo suits me ok. Anyway Moses
 compiled fine using revision 452 of IRSTLM. So that's great. Thanks
 again!
 
 Also just to be sure, in the getting started page, the sample models
 which are linked are only for SRILM, right? Because I wanted to test
 as explained in the page, and I get:
 
 [...]
 Start loading LanguageModel lm/europarl.srilm.gz : [0.000] seconds
 ERROR:Language model type unknown. Probably not compiled into library
 Segmentation fault
 
 
 Seeing the srilm.gz extension, I guess that won't work with only
 IRSTLM compiled in. That information may be worth being updated into
 the Getting started page. :-)
 I guess I'll have to test directly with more complete data.
 
 Jehan




[Moses-support] conver a phrase table from binary to textual format

2011-09-22 Thread Nicola Bertoldi
Hi all,

do you know if it is possible (and how) to convert a phrase table (or
reordering table)
 from its binary format to the textual format?


Nicola


Re: [Moses-support] build 5gram with irstlm

2011-09-04 Thread Nicola Bertoldi
Use the full path for the input file and for the output file

If this does not work,

please run 
sh -x ./build-lm.sh -i Texte_EN.txt -n 3 -o lm -k 3 --verbose

collect both STDERR and STDOUT
and send all to me together with your input file (if not too big)


best regards,
Nicola Bertoldi


On Sep 4, 2011, at 6:51 PM, Cyrine NASRI wrote:

 Hello, I'm a new user of irstlm.
 I want to build a 3-gram LM with irstlm but I get an error message:
 ./build-lm.sh -i Texte_EN.txt -n 3 -o lm -k 3
 
 Cleaning temporary directory stat
 ./build-lm.sh: 145: Syntax error: Bad fd number
 
 Can you help me please?
 Thanks
 
 Best Regards
 Cyrine


Re: [Moses-support] Who maintain the Moses website?

2011-08-29 Thread Nicola Bertoldi
Dear James

most of the code for word alignment symmetrization comes from FBK.

I found a previous email where you asked:
giza2bal.pl -d path-to-updated-tgt-to-src-ahmm -i
path-to-updated-src-to-tgt-ahmm
| symal -alignment=grow -diagonal=yes -final=yes -both=yes >
new-alignment-file
However, I can not understand what the parameters
path-to-updated-tgt-to-src-ahmm and
path-to-updated-src-to-tgt-ahmm mean. Could anyone please tell me the files
they refer to?
Which types of files do they match in my folder? My folder contains the
following files,




Briefly,
you know that the word alignment provided by (any version of) GIZA is a
directional map (from src to trg or vice versa),
but it is more effective to extract phrase pairs from a symmetrized word
alignment, built starting from the two directional alignments.
Hence, you have to run GIZA in both directions.

The commands you mentioned do exactly this.
giza2bal.pl takes the files with suffix A3.final
generated by the src-to-trg alignment (giza-src-trg) and by the inverse
trg-to-src alignment (giza-trg-src)
and provides an intermediate bi-directional word alignment format;
symal actually computes the symmetrized alignment according to a selected
policy specified through its parameters.
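
To make the idea of combining two directional alignments concrete, here is a small Python sketch. It computes only the intersection and the union of the two link sets, which are the two extremes that heuristics such as grow-diag-final-and work between; it is not symal's algorithm, and the function name symmetrize and the link representation are assumptions.

```python
def symmetrize(src2trg, trg2src):
    """Return (intersection, union) of two directional alignments.

    src2trg holds (src_index, trg_index) links; trg2src holds
    (trg_index, src_index) links and is flipped before combining.
    """
    forward = set(src2trg)
    backward = {(s, t) for (t, s) in trg2src}
    return forward & backward, forward | backward


inter, union = symmetrize([(0, 0), (1, 2)], [(0, 0), (1, 1)])
print(sorted(inter))
print(sorted(union))
```

The intersection keeps only links both directions agree on (high precision), while the union keeps every link proposed by either direction (high recall).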

I hope my explanation is clear,

best regards,
nicola bertoldi

On Aug 29, 2011, at 8:14 AM, 蒋乾 wrote:

Hi all,

There is a command

giza2bal.pl -d path-to-updated-tgt-to-src-ahmm -i
path-to-updated-src-to-tgt-ahmm | symal -alignment=grow -diagonal=yes
-final=yes -both=yes > new-alignment-file

recorded in the Advanced Features section of the website.

Do you know who wrote this content, and from whom I could probably get the
answer?

Best Regards,
James


Re: [Moses-support] Using Input Features (Lattices) Breaks mert-moses

2011-08-29 Thread Nicola Bertoldi
Dear Graham,

have you specified the parameter
weight-i
in the configuration file?

Nicola
On Aug 29, 2011, at 5:35 PM, Graham Neubig wrote:

 Hi,
 
 I am trying to run mert-moses.pl on lattice input (using the latest
 version from SVN) and am getting the following error after the first
 iteration finishes:
 
 ---
 
 The decoder returns the scores in this order: d d d d d d d lm w I tm
 tm tm tm tm
 Mismatched lambdas. Decoder returned d d d d d d d lm w I tm tm tm tm
 tm, we expected d d d d d d d lm w tm tm tm tm tm tm at
 /home/neubig/usr/bin/scripts-20110829-0448/training/mert-moses.pl line
 959
 
 ---
 
 Looking at the code quickly, it looks like the feature for the input
 weight is being registered as a PhraseDictionaryFeature, which is
 causing it to return "tm" instead of the expected "I", which is
 causing the sanity check in mert-moses.pl to fail.
 
 Graham


[Moses-support] MT Marathon: Third Call for Participation

2011-07-27 Thread Nicola Bertoldi
(Apologies if you receive multiple copies)

Please, distribute it among potentially interested colleagues.


CALL FOR PARTICIPATION

The Machine Translation Marathon 2011 is the sixth in a series of events 
promoted by EuroMatrix and EuroMatrixPlus, which are EU research projects on 
Machine Translation.

The Sixth MT Marathon, organised by the HLT Research Unit of Fondazione Bruno 
Kessler (FBK), will take place 5-10 September 2011 in Trento, Italy.

The EuroMatrixPlus consortium invites researchers, developers, students and 
users of machine translation for participation.

Participants can attend the MT Marathon in several ways:

  *   Attend lectures and labs: these range from beginners tutorials to 
showcase talks by leading researchers. Everybody can learn or strengthen their 
knowledge!
  *   Attend technical talks about open-source tools for MT.
  *   Take part in a project team to help develop an open-source tool for MT.

Where: Fondazione Bruno Kessler, IRST, Povo, Trento, Italy
When: September 5-10, 2011
How: On-line Registration
Fee: Attendance is free of charge, but limited.

For more information and online registration please go to
http://mtmarathon2011.fbk.eu/


Best regards,
The 6th MT Marathon organisation committee.



Re: [Moses-support] A second call for solution ERROR: Can't generate symmetrized alignment file

2011-07-20 Thread Nicola Bertoldi
It seems that your parallel corpus contains a line that is 999 words long (at
least on the target side),
so I suspect that something in the cleaning with clean-corpus-n.perl failed.

As you know, if you use 40 as the maximum sentence length you should get
only sentences shorter than (or equal to) 40 words.
Please check this carefully.
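
One way to do this check is sketched below in Python: it counts whitespace-separated tokens on both sides and reports the line numbers that exceed the limit. The function name overlong_pairs is hypothetical, and counting tokens by whitespace is an assumption about how length is measured.

```python
def overlong_pairs(src_lines, trg_lines, max_len=40):
    """Return 1-based line numbers where either side exceeds max_len tokens."""
    bad = []
    for i, (src, trg) in enumerate(zip(src_lines, trg_lines), start=1):
        if len(src.split()) > max_len or len(trg.split()) > max_len:
            bad.append(i)
    return bad


src = ["a short sentence", "another one"]
trg = ["mot court", " ".join(["w"] * 999)]
print(overlong_pairs(src, trg))
```

Running it on the files that are actually passed to the training script (rather than on the originals) shows whether the cleaned corpus is really the one being used.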

If you can not find the solution, please send me an excerpt of the parallel
corpus
(and alignment file) including the portion causing the problem.


Nicola

On Jul 20, 2011, at 9:27 AM, 蒋乾 wrote:

Hello,

When I trained the translation model from English to Vietnamese, I got the
following error:


|tools/moses-scripts/scripts-20110322-0943//training/symal/symal 
-alignment=grow -diagonal=yes -final=yes -both=yes  
jameswork/En_Vi//model/aligned.grow-diag-final-and
symal: computing grow alignment: diagonal (1) final (1)both-uncovered (1)
20636: target len=999 is not less than MAX_WORD-1=999
symal: symal.cpp:83: int getals(std::fstream, int, int*, int, int*): 
Assertion `strlen(w)1000-1' failed.
sh: line 1: 16708 Broken pipe 
tools/moses-scripts/scripts-20110322-0943//training/symal/giza2bal.pl
 -d gzip -cd jameswork/En_Vi//giza.vi-en/vi-en.A3.final.gz -i gzip -cd 
jameswork/En_Vi//giza.en-vi/en-vi.A3.final.gz
 16709 Aborted (core dumped) | 
tools/moses-scripts/scripts-20110322-0943//training/symal/symal 
-alignment=grow -diagonal=yes -final=yes -both=yes  
jameswork/En_Vi//model/aligned.grow-diag-final-and
Exit code: 134
ERROR: Can't generate symmetrized alignment file

I tried running clean-corpus-n.perl on both the parallel corpus and 
commentary.lm with the limits 1 and 40, but the error still occurred!

Has anyone met this problem and solved it successfully? Please share your 
experience.

Thank you.

___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


Re: [Moses-support] Erratic IRSTLM compile failures

2011-05-24 Thread Nicola Bertoldi
The problem was due to missing dependencies and
occurred only with parallel compilation (make -jN).

Revision 409 solves the problem.

best regards,
Nicola Bertoldi

On May 24, 2011, at 4:09 AM, Tom Hoar wrote:


We're updating DoMY with the newest Moses components. Our IRSTLM installation 
script includes this:

cpus=`grep -c ^processor /proc/cpuinfo`
export MACHTYPE=`uname -m`
export LC_ALL=C
export IRSTLM=/usr/src/irstlm
./regenerate-makefiles.sh $log
./configure --prefix=$IRSTLM --enable-caching $log
make -j $cpus $log
make -j $cpus install $log

In testing our installation scripts, we rebuild a clean system from a new copy 
of the original svn source many times. We are using the newest svn rev 406 
although the problem also happened with svn 405.

The problem is, make -j $cpus fails intermittently, about 1 of 3 or 4 times. 
All references to 'ld' and 'libtool' in the configure log report everything is 
okay. The error occurs using make with/without -j. Details below.

Thanks
Tom



Build environment: Ubuntu 10.04 LTS Server, 64-bit
libtool version: ltmain.sh (GNU libtool) 2.2.6b
g++ version: g++ (Ubuntu 4.4.3-4ubuntu5) 4.4.3



Successful make log output:
mv -f .deps/tlm.Tpo .deps/tlm.Po
/bin/bash ../libtool --tag=CXX   --mode=link g++ -isystem/usr/include -W -Wall 
-ffor-scope -D_FILE_OFFSET_BITS=64 -D_LARGE_FILES  -DMYCODESIZE=3-o dict 
dict.o -lirstlm  -lz
libtool: link: ranlib .libs/libirstlm.a
libtool: link: g++ -isystem/usr/include -W -Wall -ffor-scope 
-D_FILE_OFFSET_BITS=64 -D_LARGE_FILES -DMYCODESIZE=3 -o dict dict.o  -lirstlm 
-lz
libtool: link: ( cd .libs  rm -f libirstlm.la  ln -s ../libirstlm.la 
libirstlm.la )
make[2]: Leaving directory `/usr/src/irstlm/src'
make[1]: Leaving directory `/usr/src/irstlm'



Failed make log output:
mv -f .deps/tlm.Tpo .deps/tlm.Po
/bin/bash ../libtool --tag=CXX   --mode=link g++ -isystem/usr/include -W -Wall 
-ffor-scope -D_FILE_OFFSET_BITS=64 -D_LARGE_FILES  -DMYCODESIZE=3-o dict 
dict.o -lirstlm  -lz
libtool: link: ranlib .libs/libirstlm.a
libtool: link: g++ -isystem/usr/include -W -Wall -ffor-scope 
-D_FILE_OFFSET_BITS=64 -D_LARGE_FILES -DMYCODESIZE=3 -o dict dict.o  -lirstlm 
-lz
/usr/bin/ld: cannot find -lirstlm
collect2: ld returned 1 exit status
make[2]: *** [dict] Error 1
make[2]: *** Waiting for unfinished jobs
libtool: link: ( cd .libs  rm -f libirstlm.la  ln -s ../libirstlm.la 
libirstlm.la )
make[2]: Leaving directory `/usr/src/irstlm/src'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/usr/src/irstlm'
make: *** [all] Error 2
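
When the build is repeated many times, it can help to classify the logs automatically. The sketch below is only an illustration: the helper name missing_libraries is made up, and the pattern is written against the exact linker message shown in the failed log above.

```python
import re

# Matches the GNU ld message seen in the failed log above,
# e.g. "/usr/bin/ld: cannot find -lirstlm".
MISSING_LIB = re.compile(r"ld: cannot find -l(\S+)")


def missing_libraries(log_text):
    """Return the library names the linker could not find, in order of appearance."""
    return MISSING_LIB.findall(log_text)
```

An empty result means the log contains no such linker error; it does not prove that the build succeeded for other reasons.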


___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


[Moses-support] Call for Papers for the Open Source Convention held at the Sixth MT Marathon

2011-05-23 Thread Nicola Bertoldi
(Apologies if you receive multiple copies. Please, distribute it among 
potentially interested colleagues.)


CALL FOR PAPERS:


OPEN SOURCE TOOLS FOR MACHINE TRANSLATION

The Machine Translation Marathon 2011 is the sixth in a series of events 
promoted by EuroMatrix and EuroMatrixPlus, which are EU research projects on 
Machine Translation. The MT Marathon will take place 5-10 September 2011 in 
Trento, Italy, organised by the HLT Research Unit of Fondazione Bruno Kessler 
(FBK).
For more information on the MT Marathon go to the official website: 
http://mtmarathon2011.fbk.eu

The MT Marathon is hosting an Open Source Convention to advance the state of 
the art in machine translation. We invite developers of open source tools to 
present their work and submit a paper of up to 10 pages that describes the 
underlying methodology and includes instructions on how to use the tools.

We are looking for stand-alone tools and extensions of existing tools, such as 
the Moses open source system. Accepted papers will be presented during the MT 
Marathon and published in the Prague Bulletin of Mathematical Linguistics 
(http://ufal.mff.cuni.cz/pbml).

Possible Topics:

  *   Training of Machine Translation models
  *   Machine Translation decoders
  *   Tuning of Machine Translation systems
  *   Evaluation of Machine Translation
  *   Visualisation, annotation or debugging tools
  *   Tools for human translators
  *   Interfaces for web-based services or APIs
  *   Extensions of existing tools
  *   Other tools for Machine Translation



This is the fourth time that the MT Marathon will host the Open Source 
Convention. The papers from the last three marathons are available online 
(http://ufal.mff.cuni.cz/pbml-91-100.html).

Papers will be reviewed by two reviewers appointed by the program committee. 
Most of the accepted papers will be printed in PBML in time for the MT 
Marathon; some papers may require substantial revisions and may be postponed to 
subsequent PBML issues.



Important Dates

Abstract submission: July 10, 2011   (1 paragraph, to help us allocate 
reviewers)
Paper submission: July 24, 2011
Notification of acceptance: August 3, 2011
Camera-ready: August 10, 2011
Presentations: September 5-10, 2011 (at the MT Marathon in Trento)

Author Instructions

Please send full non-anonymous submissions in PDF to Philipp Koehn (pkoehn AT 
inf DOT ed DOT ac DOT uk) and the full Xe(La)TeX source for technical 
pre-review to Ondřej Bojar (bojar AT ufal DOT mff DOT cuni DOT cz).

The maximum length for submissions is 10 pages, including references. This 
limit will be strictly enforced. If your paper has been accepted, please send 
your camera-ready version in both PDF and Xe(La)TeX format to Ondřej Bojar.

Submissions will be accepted only in the PBML Xe(La)TeX format 
(http://ufal.mff.cuni.cz/pbml-instructions.html) for short papers (i.e. MS Word 
and other formats or a PDF without source files will not be accepted).


Best regards,
The program committee.


___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


Re: [Moses-support] Moses dies with segmentation fault on first sentence (IRSTLM)

2011-03-31 Thread Nicola Bertoldi
Dear Arda

This combination compiles:
Moses    revision 3878
IRSTLM   5.60.01


By the way,
we have just uploaded a new version, 5.60.02, to the official IRSTLM website
http://hlt.fbk.eu/en/irstlm
The latest Moses revisions should compile with it.


best regards,
Nicola




On Mar 31, 2011, at 3:07 PM, Arda Tezcan wrote:

Hi Kenneth,
Thanks a lot for the tip!
As a result, I tried to compile Moses with 5.60.01; however, I still got a 
"no matching function" error for the file lmmacro.h.
So I tried again with a later revision of Moses (3923) and could compile 
it with this version of IRSTLM.

Regards,
Arda


--

I've had this happen too when running benchmarks.  The latest IRSTLM is
actually 5.60.01: http://hlt.fbk.eu/en/irstlm and appears to resolve
your issue.  The sourceforge page is out of date.

#include kenlm/advertisement



___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


Re: [Moses-support] Reg. Giza++ crash reported by EMS

2011-03-18 Thread Nicola Bertoldi
Hi Sriram (and all Mac OS X users),

are you using a Mac OS X machine?
If so, the error is probably due to the fact that by default the OS has a 
case-insensitive filesystem,
so the A3.final files are wrongly overwritten by the a3.final ones.

The easiest way to solve this issue is the following (found on the web, but 
I've lost the link):


1) edit the GIZA source file GIZA++-v2/model3.cpp and change lines 
321-322 
alignfile = Prefix + ".A3." + number ;
test_alignfile = Prefix + ".tst.A3." + number ;

as follows

alignfile = Prefix + ".UA3." + number ;
test_alignfile = Prefix + ".tst.UA3." + number ;

2) recompile GIZA

3) call the Moses training script train-model.perl adding the following 
parameter
--giza-extension UA3.final

This works for giza-pp-v1.0.2.tar.gz, but similar changes can be made for other 
versions.
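
To find out whether a given working directory is affected at all, a small probe like the following can be used. It is only a sketch: the function name is made up for this example, and the probe simply writes a file called a3.final in a scratch directory and checks whether A3.final is then reported as existing.

```python
import os
import tempfile


def is_case_insensitive(directory="."):
    """Return True if files in `directory` are looked up case-insensitively."""
    # Work inside a throwaway subdirectory so no real file is touched.
    with tempfile.TemporaryDirectory(dir=directory) as scratch:
        lower = os.path.join(scratch, "a3.final")
        upper = os.path.join(scratch, "A3.final")
        with open(lower, "w") as handle:
            handle.write("probe\n")
        # On a case-insensitive filesystem the upper-case name resolves
        # to the file that was just written.
        return os.path.exists(upper)
```

If this returns True for the directory where GIZA++ writes its output, the renaming described above (or a case-sensitive volume) is probably needed.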

best regards,
Nicola Bertoldi



On Mar 17, 2011, at 5:17 PM, Barry Haddow wrote:

 Hi Sriram
 
 GIZA has output an error message, which may mean your alignments are faulty. 
 You should search for 'error' in TRAINING_run-giza.3.STDERR, and remember 
 that it may appear in uppercase in this file. 
 
 If you want to try continuing with the alignments that were produced, then 
 you can force EMS to use them by adding something like
 
 giza-alignment = $working-dir/training/giza.12
 giza-alignment-inverse = $working-dir/training/giza-inverse.12
 
 to the TRAINING section,
 
 best regards - Barry
 
 On Thursday 17 March 2011 15:57, Sriram V wrote:
 Hello,
 
 When I run ems/experiment.perl, giza++ runs well in both the directions and
 produces the corresponding *.A3.final.gz files. However, it is reported
 that those steps have crashed. Subsequently, the following components do
 not run. Any ideas about what could have gone wrong here ?
 
 TRAINING_run-giza.3.STDERR.digest
 
 error
 error
 
 TRAINING_run-giza-inverse.3.STDERR.digest
 
 error
 error
 
 
 Here are the last few lines of the file TRAINING_run-giza.3.STDERR
 
 7
 8
 9
 NTable contains 286060 parameter.
 Executing: rm -f
 ...working-dir//training/giza-inverse.3/de-en.A3.final.gz
 Executing: gzip .../working-dir//training/giza-inverse.3/de-en.A3.final
 
 
 
 Regards,
 Sriram
 
 -- 
 The University of Edinburgh is a charitable body, registered in
 Scotland, with registration number SC005336.
 

___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


Re: [Moses-support] format of phrase table without lexical scores

2010-12-29 Thread Nicola Bertoldi
Dear Cyrine,
Consider these two possibilities:
- set the weights of these two features to 0.0 in the configuration  
file (or on the command line)
- create another phrase table with only the three values you are interested in  
and change the configuration file accordingly
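
The second possibility could be sketched as follows. This is an illustration only: it assumes the usual text phrase-table layout with fields separated by " ||| " and the scores in the third field, and it assumes (as stated in the question below) that the lexical scores are the 2nd and 4th values. The function name drop_scores is made up.

```python
def drop_scores(line, drop=(1, 3)):
    """Remove the scores at the given 0-based positions from one phrase-table line."""
    fields = line.rstrip("\n").split(" ||| ")
    scores = fields[2].split()
    # Keep every score whose position is not listed in `drop`.
    kept = [score for index, score in enumerate(scores) if index not in drop]
    fields[2] = " ".join(kept)
    return " ||| ".join(fields)
```

After rewriting the table this way, the number of scores declared for the table in the configuration file and the number of translation-model weights would have to be reduced from 5 to 3 as well.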

best
Nicola


On Dec 29, 2010, at 7:03 AM, Cyrine NASRI wrote:

 Hello everyone,
 I have a question about the translation table:
 I initially tried using the translation table format with 5 features:  
 probability in both directions, lexical phrase score and word  
 penalty.
 I wonder if I can translate while ignoring both lexical score values (2nd  
 and 4th fields)?
 From what I read in the documentation, I cannot set them to 0.
 Thank you

 Cordially

___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


Re: [Moses-support] allow-unknown-lambdas fails in mert-moses-new.pl

2010-11-27 Thread Nicola Bertoldi

Since revision r3398 (2010-08-10),
mert-moses-new.pl has been renamed to mert-moses.pl,
and the old script is no longer updated or maintained.

So please use mert-moses.pl.
Nevertheless, I do not think this will solve your problem.

best
Nicola


On Nov 27, 2010, at 3:59 AM, Lane Schwartz wrote:

In the 2010-08-13 release, I tried to use mert-moses-new.pl, in  
conjunction with an older decoder binary.


If I run mert-moses-new.pl, I get told, "The decoder also produced  
some 'tm' scores, but we do not know the ranges for them, no way to  
optimize them".


If I specify lambdas, I get told "lambdas specifed for 'tm' 5, but  
none needed".


If I specify the --allow-unknown-lambdas flag, I get told "Unknown  
option: --allow-unknown-lambdas".


What is going on?

Thanks,
Lane

___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


Re: [Moses-support] LanguageModelIRST.cpp errors

2010-11-03 Thread Nicola Bertoldi
Dear Charles,

Just to know:

which compiler are you using? Which operating system?

Nicola


On Nov 3, 2010, at 8:47 AM, Nicola Bertoldi wrote:

 Dear Charles,

 as mentioned by Tom,
 we already discovered this bug which occurs only with some  
 compilers   (probably from version 4.3) and/or some OS

 I have already fixed it, and I am trying to test on several  
 (virtual) machines

 Hopefully by today, I will distribute a new release.

 best regards,
 Nicola Bertoldi


 On Nov 3, 2010, at 5:12 AM, Charles Chiu wrote:

 Nicola,

 This memset error is the same problem we had last week with the new
 compiler. Your new htable.h and lmtable.h files solved the problem  
 for me.

 Tom


 On Tue, 2 Nov 2010 21:12:34 -0700 (PDT), Charles Chiu
 charles5...@yahoo.com wrote:
 Dear Nicola,
 First of all, I would like to thank you for your quick response and the
 information you provided. I am now trying to compile 5.50.01 and encounter
 an error during the make command. I tried to search online for possible
 answers before sending you this email for help. However, I didn't have much
 luck. Would you be kind enough to help me out with this error? Thank you!

 cc...@cchiu-laptop:~/tools/irstlm-5.50.01$ make
 make  all-recursive
 make[1]: Entering directory `/home/cchiu/tools/irstlm-5.50.01'
 Making all in src
 make[2]: Entering directory `/home/cchiu/tools/irstlm-5.50.01/src'
 if /bin/bash ../libtool --tag=CXX --mode=compile g++ -DHAVE_CONFIG_H -I. -I.
 -I..   -DTRACE_ENABLE=1 -W -Wall -ffor-scope -D_FILE_OFFSET_BITS=64
 -D_LARGE_FILES  -DMYCODESIZE=3 -g -O2 -MT lmmacro.lo -MD -MP -MF
 .deps/lmmacro.Tpo -c -o lmmacro.lo lmmacro.cpp; \
 then mv -f .deps/lmmacro.Tpo .deps/lmmacro.Plo; else rm -f
 .deps/lmmacro.Tpo; exit 1; fi
  g++ -DHAVE_CONFIG_H -I. -I. -I.. -DTRACE_ENABLE=1 -W -Wall -ffor-scope
 -D_FILE_OFFSET_BITS=64 -D_LARGE_FILES -DMYCODESIZE=3 -g -O2 -MT lmmacro.lo
 -MD -MP -MF .deps/lmmacro.Tpo -c lmmacro.cpp -o lmmacro.o
 In file included from lmmacro.cpp:31:
 htable.h: In member function 'void htableT::map(std::ostream, int)':
 htable.h:225: error: there are no arguments to 'memset' that depend on a
 template parameter, so a declaration of 'memset' must be available
 htable.h:225: note: (if you use '-fpermissive', G++ will accept your code,
 but allowing the use of an undeclared name is deprecated)
 htable.h:240: error: there are no arguments to 'memset' that depend on a
 template parameter, so a declaration of 'memset' must be available
 make[2]: *** [lmmacro.lo] Error 1
 make[2]: Leaving directory `/home/cchiu/tools/irstlm-5.50.01/src'
 make[1]: *** [all-recursive] Error 1
 make[1]: Leaving directory `/home/cchiu/tools/irstlm-5.50.01'
 make: *** [all] Error 2
 cc...@cchiu-laptop:~/tools/irstlm-5.50.01$




 - Original Message 
 From: Nicola Bertoldi berto...@fbk.eu
 To: Charles Chiu charles5...@yahoo.com
 Cc: moses-support@mit.edu
 Sent: Tue, November 2, 2010 2:53:36 AM
 Subject: Re: [Moses-support] LanguageModelIRST.cpp errors

 Dear Charles,

 You are probably using an old release of IRSTLM.
 Please download the new one (5.50.01) from the FBK website
 http://hlt.fbk.eu/en/irstlm

 as suggested during the compilation of Moses.

 Hope this solves your problem.

 Best regards,
 Nicola Bertoldi



 On Nov 2, 2010, at 6:10 AM, Charles Chiu wrote:

 Hi,
 I was able to bypass the gcc4 makefile issue I had while compiling
 SRILM, and the Moses configuration utility is actually able to finish its
 task. However, I am encountering a new issue here. I was trying to compile
 Moses and got the following error log when compiling with the make -j 2
 command. Could someone help me figure out what is going on with this issue?
 By the way, I really appreciate the reply to my last question. It really
 helps a lot!

 Thanks!

 Error Log:

 In file included from Hypothesis.h:33,
  from SentenceStats.h:30,
  from StaticData.h:43,
  from LanguageModelIRST.cpp:38:
 PhraseDictionaryMemory.h:66: warning: unused parameter ‘outColl’
 LanguageModelIRST.cpp: In member function ‘virtual bool
 Moses::LanguageModelIRST::Load(const std::string, Moses::FactorType, size_t)’:
 LanguageModelIRST.cpp:123: warning: comparison of unsigned expression = 0 is
 always true
 LanguageModelIRST.cpp:126: error: ‘class lmtable’ has no member named
 ‘init_caches’
 LanguageModelIRST.cpp: In member function ‘virtual float
 Moses::LanguageModelIRST::GetValue(const std::vectorconst Moses::Word*,
 std::allocatorconst Moses::Word* , const void**, unsigned int*) const’:
 LanguageModelIRST.cpp:196: warning: comparison of unsigned expression  0 is
 always false
 LanguageModelIRST.cpp:213: error: no matching function for call to
 ‘lmtable::clprob(int [20], size_t, NULL, NULL, char**, unsigned int*)’
 /usr/local/include/lmtable.h:267: note: candidates are: virtual double
 lmtable::clprob(ngram)
 LanguageModelIRST.cpp

Re: [Moses-support] LanguageModelIRST.cpp errors

2010-11-02 Thread Nicola Bertoldi
Dear Charles,

You are probably using an old release of IRSTLM.
Please download the new one (5.50.01) from the FBK website
http://hlt.fbk.eu/en/irstlm

as suggested during the compilation of Moses.

Hope this solves your problem.

Best regards,
Nicola Bertoldi



On Nov 2, 2010, at 6:10 AM, Charles Chiu wrote:

 Hi,
 I was able to bypass the gcc4 makefile issue I had while compiling
 SRILM, and the Moses configuration utility is actually able to finish its
 task. However, I am encountering a new issue here. I was trying to compile
 Moses and got the following error log when compiling with the make -j 2
 command. Could someone help me figure out what is going on with this issue?
 By the way, I really appreciate the reply to my last question. It really
 helps a lot!

 Thanks!

 Error Log:

 In file included from Hypothesis.h:33,
  from SentenceStats.h:30,
  from StaticData.h:43,
  from LanguageModelIRST.cpp:38:
 PhraseDictionaryMemory.h:66: warning: unused parameter ‘outColl’
 LanguageModelIRST.cpp: In member function ‘virtual bool
 Moses::LanguageModelIRST::Load(const std::string,  
 Moses::FactorType, size_t)’:
 LanguageModelIRST.cpp:123: warning: comparison of unsigned  
 expression = 0 is
 always true
 LanguageModelIRST.cpp:126: error: ‘class lmtable’ has no member named
 ‘init_caches’
 LanguageModelIRST.cpp: In member function ‘virtual float
 Moses::LanguageModelIRST::GetValue(const std::vectorconst  
 Moses::Word*,
 std::allocatorconst Moses::Word* , const void**, unsigned int*)  
 const’:
 LanguageModelIRST.cpp:196: warning: comparison of unsigned  
 expression  0 is
 always false
 LanguageModelIRST.cpp:213: error: no matching function for call to
 ‘lmtable::clprob(int [20], size_t, NULL, NULL, char**, unsigned  
 int*)’
 /usr/local/include/lmtable.h:267: note: candidates are: virtual double
 lmtable::clprob(ngram)
 LanguageModelIRST.cpp: In function ‘bool Moses::LMCacheCleanup 
 (size_t, size_t)’:
 LanguageModelIRST.cpp:224: warning: comparison between signed and  
 unsigned
 integer expressions
 make[3]: *** [LanguageModelIRST.lo] Error 1
 make[2]: *** [all] Error 2
 make[1]: *** [all-recursive] Error 1
 make: *** [all] Error 2





___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


Re: [Moses-support] KenLM distributed with Moses

2010-10-26 Thread Nicola Bertoldi
The empty line after each n-gram block is not mandatory in the ARPA  
format
(see http://www.speech.sri.com/projects/srilm/manpages/ngram-format.5.html)
and IRSTLM does not produce it.
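
If the reader you are using insists on those empty lines (as the build_binary error quoted below suggests), one possible workaround is to add them after the fact. The sketch below is only an illustration under that assumption; add_section_blank_lines is a made-up helper, not part of IRSTLM or KenLM, and it treats only lines of the form \N-grams: and \end\ as section markers.

```python
import re

# Section markers of an ARPA file: "\1-grams:", "\2-grams:", ... and "\end\".
SECTION = re.compile(r"^\\(\d+-grams:|end\\)$")


def add_section_blank_lines(lines):
    """Return the lines with an empty line inserted before each section marker
    that is not already preceded by one."""
    out = []
    for line in lines:
        text = line.rstrip("\r\n")
        if SECTION.match(text.strip()) and out and out[-1].strip() != "":
            out.append("")
        out.append(text)
    return out
```

The function works on any iterable of lines, so a large model can be streamed from one file to another without loading it in full; the n-gram entries themselves are left untouched.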


best regards,
Nicola Bertoldi

On Oct 26, 2010, at 9:42 AM, supp...@precisiontranslationtools.com  
supp...@precisiontranslationtools.com wrote:

 Hi Ken,

 I created an iARPA file with IRSTLM using the options -n 3 (2 grams), -b
 (include the <s> sentence boundary) and -d (subdictionary for n-grams).
 Then, I used IRSTLM's compile-lm with --text yes to convert it to ARPA format.

 Finally, I ran build_binary to binarize the ARPA file for KenLM. I got
 the following error:

 $ build_binary arpa.en.lm arpa.en.binary
 Reading lm.en.lm
 5---10---15---20---25---30---35---40---45---50---55---60---65---70 
 ---75---80---85---90---95--100
 terminate called after throwing an instance of  
 'lm::FormatLoadException'
   what():  Expected blank line after 3-grams at byte 22348989 in file
 arpa.en.lm
 Aborted

 What am I missing?

 Thanks,
 Tom


 On Fri, 22 Oct 2010 10:15:21 -0400, Kenneth Heafield  
 mo...@kheafield.com
 wrote:
 KenLM is inference-only.  It cannot create ARPA files.  So you'll need
 to use your favorite toolkit to generate the ARPA.

 On 10/22/10 07:52, supp...@precisiontranslationtools.com wrote:
 Thanks Ken. Nice work.

 Is there a way to train the ARPA formatted LM with KenLM, or do we need to
 train with another tool, like SRILM or convert IRSTLM to full ARPA format?

 Thanks again,
 Tom



 On Mon, 18 Oct 2010 20:31:38 -0400, Kenneth Heafield
 mo...@kheafield.com
 wrote:
 Hi Moses,

Introducing kenlm in Moses trunk.  You no longer need to download a
 separate language model to use Moses; it's distributed with Moses and
 compiled in by default on UNIX.  This is threadsafe language model
 inference code that returns the same probabilities as SRI (up to
 floating point rounding).  It loads ARPA files in 2/3 the time SRI takes
 and uses less memory too.  Using kenlm is simple: in your [lmodel-file]
 section, change the first digit to 8.  For example,

 0 0 2 foo.arpa changes to 8 0 2 foo.arpa

For even faster loading, use the binary format:

 kenlm/build_binary foo.arpa foo.binary

 then simply provide the binary filename in your moses.ini e.g.
 8 0 2 foo.binary; it auto detects binary files using magic bytes at
 the beginning.
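
The change to the [lmodel-file] entry described above could be scripted roughly as follows. This is only a sketch for the four-field entry format shown in this message; the function name use_kenlm is made up, and other Moses versions may use a different configuration syntax.

```python
def use_kenlm(lmodel_line):
    """Rewrite one [lmodel-file] entry so that its first field (the LM type) is 8."""
    # Split into at most four fields so a path containing spaces stays intact.
    fields = lmodel_line.strip().split(None, 3)
    if len(fields) != 4:
        raise ValueError(f"expected 4 fields, got: {lmodel_line!r}")
    fields[0] = "8"
    return " ".join(fields)
```

Only the type field is touched; the factor, the order and the file name are passed through unchanged.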

The code is ready for use and provides correct results.  Inference is
 slower than it should be due to inefficiencies in the Moses-side wrapper
 code (it does a vocab lookup for all 5 words every time).  I'm working
 on it and once this is done I'll post some benchmarks against SRI and
 IRST. The binary format is subject to change, but contains a version
 number so on very rare occasions after, new versions will tell you to
 rebuild your binary files.  Windows is currently not supported (it uses
 mmap) though I welcome contributions using #ifdef and CreateFileMapping.

Have fun and let me know about your experiences with it.

 Ken

___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


Re: [Moses-support] Train language model with factors

2010-07-06 Thread Nicola Bertoldi
The usual way of using LMs with factors is to train a separate LM for  
each factor
and to specify which factor each LM refers to in the configuration file.

Look at the manual to see how to modify the config file.
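
To train a separate LM for one factor, you first need a plain-text corpus containing only that factor. A minimal sketch is given below; it assumes the usual factored corpus layout in which the factors of each token are joined with "|", and the function name extract_factor is made up for this example.

```python
def extract_factor(line, index, separator="|"):
    """Keep only the factor at position `index` from every token of a factored line."""
    return " ".join(token.split(separator)[index] for token in line.split())
```

The resulting single-factor text can then be given to your usual LM toolkit, once per factor.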

best regards,
Nicola


On Jul 6, 2010, at 5:14 AM, yyjj...@163.com wrote:

 hello everyone!
 Recently, I have wanted to use Moses's factored model, but I could not  
 find a method to train a language model with factors. Could  
 anyone help me? Thank you!



___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


Re: [Moses-support] moses with irstlm memory hog

2010-06-09 Thread Nicola Bertoldi
hi Tom

I suppose there is a mismatch between the last working IRSTLM release  
and yours.

Where did you get yours?

Nicola

On Jun 9, 2010, at 6:36 AM, supp...@precisiontranslationtools.com  
wrote:

 Problem solved... I spoke too early.

 Creating the symlink, $SRILM/bin/i686, enabled me to run MERT with small
 (135K pair) test sets with a 5-gram irstlm whereas before they crashed
 immediately. However, it failed when I moved to a training corpus with 1.3
 million pairs. The moses-irstlm system ran for about 1 1/2 days and only
 output 939 of 3842 phrases in run1. During that time, it gradually
 consumed RAM and swap space. It finally failed with the following fatal
 message:

 [113615.909787] Out of Memory : kill process 299 (sh) score 53126 or a child
 [113615.909826] Killed process 2100 (moses)

 I just finished successfully tuning the exact same phrase/reordering
 tables with an srilm model. The srilm model was built with the exact same
 data as the irstlm model that fails. Attachments include:

 1) irstlm.europarl.v5.mini.en-nl.tar.gz -- the successful srilm
 logs/configs
 2) irstlm.europarl.v5.en-nl.tar.gz -- the failed irstlm logs/configs
 3) srilm.europarl.v5.en-nl.tar.gz -- the successful srilm logs/configs
 using the same tables

 Linux 'top' reported that during mert with moses-srilm, the moses instance
 consumed about 3.5 GB of RAM with practically zero swap file usage. This
 held constant through each iteration. The irstlm version, however,
 gradually consumes all RAM and then it consumed the swap file before
 crashing.

 To review the builds:
 1) GIZA++ (SVN rev 8, v 1.0.3)
 2) IRSTLM (SVN rev 38, v 5.40.01)
 3) SRILM (ver 1.5.10)
 4) Moses (SVN rev 3210, dated 4-26-2010)
 5) Ubuntu-server 10.04 LTS 64-bit
 6) 2.6 Ghz Core2-Quad with 4GB RAM

 Have any others had similar problems? There's probably a simple solution,
 but I haven't found it. I configured the .ini file for irstlm to use the
 .mm memory-mapped binarized model. Can someone review the attached log and
 moses.ini files?

 Thanks.

 Tom


 On Wed, 02 Jun 2010 21:48:33 -0700,
 supp...@precisiontranslationtools.com
 wrote:
 Problem solved.

 To review the symptoms, I ran the following two mert-moses-new.pl command
 lines:

 CASE 1:
 nice mert-moses-new.pl \
 /media/models/tables/europarl.v5.mini/en-nl/mert.irstlm.3.en-nl/mert.en \
 /media/models/tables/europarl.v5.mini/en-nl/mert.irstlm.3.en-nl/mert.nl \
 /usr/local/lib/moses-irstlm/moses-cmd/src/moses \
 /media/models/tables/europarl.v5.mini/en-nl/mert.irstlm.3.en-nl/moses0.ini \
 --working-dir /media/models/tables/europarl.v5.mini/en-nl/mert.irstlm.3.en-nl \
 --rootdir /usr/local/lib/moses-irstlm/scripts \
 --mertdir=/usr/local/lib/moses-irstlm/mert \
 --nbest=50 \
 --decoder-flags -v 0

 CASE 2:
 nice mert-moses-new.pl \
 /media/models/tables/europarl.v5.mini/en-nl/mert.irstlm.5.en-nl/mert.en \
 /media/models/tables/europarl.v5.mini/en-nl/mert.irstlm.5.en-nl/mert.nl \
 /usr/local/lib/moses-irstlm/moses-cmd/src/moses \
 /media/models/tables/europarl.v5.mini/en-nl/mert.irstlm.5.en-nl/moses0.ini \
 --working-dir /media/models/tables/europarl.v5.mini/en-nl/mert.irstlm.5.en-nl \
 --rootdir /usr/local/lib/moses-irstlm/scripts \
 --mertdir=/usr/local/lib/moses-irstlm/mert \
 --nbest=50 \
 --decoder-flags -v 0


 Only one line (the lmodel-file) was different in the respective starting
 config files:

 CASE 1:

 /media/models/tables/europarl.v5.mini/en-nl/mert.irstlm.3.en-nl/moses0.ini:
 [ttable-file]
 0 0 0 5 /media/models/tables/europarl.v5.mini/en-nl/model.en-nl/phrase-table.gz
 [lmodel-file]
 1 0 3 /media/models/irstlm/europarl.v5.mini/3-gram.nl.blm.mm
 [distortion-file]
 0-0 wbe-msd-bidirectional-fe-allff 6 /media/models/tables/europarl.v5.mini/en-nl/model.en-nl/reordering-table.wbe-msd-bidirectional-fe.gz

 CASE 2:

 /media/models/tables/europarl.v5.mini/en-nl/mert.irstlm.5.en-nl/moses0.ini:
 [ttable-file]
 0 0 0 5 /media/models/tables/europarl.v5.mini/en-nl/model.en-nl/phrase-table.gz
 [lmodel-file]
 1 0 5 /media/models/irstlm/europarl.v5.mini/5-gram.nl.blm.mm
 [distortion-file]
 0-0 wbe-msd-bidirectional-fe-allff 6 /media/models/tables/europarl.v5.mini/en-nl/model.en-nl/reordering-table.wbe-msd-bidirectional-fe.gz


 In both cases, mert-moses-new.pl filtered the phrase table successfully.
 In CASE 1, the tuning process continued and concluded with a final
 moses.ini file with new weights. In CASE 2, however, mert-moses-new.pl
 created run1.moses.ini. The moses process rapidly (less than 5 minutes)
 consumed all RAM and virtual memory leaving nothing for other processes. It
 never sent output to the run1.out file. The system killed moses and
 mert-moses-new.pl. This occurred from the mert-moses-new.pl script or from
 the command line using the run1.moses.ini file.

 Furthermore, I changed run1.moses.ini to use the binarized phrase and
 reordering tables:
 0 0 0 5
 

Re: [Moses-support] What is the use of the lm parameter in the model training stage?

2010-05-21 Thread Nicola Bertoldi
Christof is right:

the LM is used only to create a formally correct configuration file.

You can simply specify any NON-EMPTY file to complete the training successfully.

Of course, you have to put your real LM in the config file before translating.

Nicola

From: moses-support-boun...@mit.edu [moses-support-boun...@mit.edu] on behalf 
of Christof Pintaske [christof.pinta...@oracle.com]
Sent: Friday, May 21, 2010 5:26 AM
To: moses-support@mit.edu
Subject: Re: [Moses-support] What is the use of the lm parameter in the model 
training stage?

On 5/20/10 8:12 PM, yifeng...@sina.com wrote:

 In Factored Tutorial, the first example is:

 % train-model.perl \
 --corpus factored-corpus/proj-syndicate \
 --root-dir unfactored \
 --f de --e en \
 --lm 0:3:factored-corpus/surface.lm:0

 I think the language model is usually used in the decoding stage in
 SMT. What is the use of the lm parameter which lists a language model
 in the model training stage?

I'm not sure if it's really required, but it's written to the moses.ini,
which you later need in decoding. Otherwise you'd have to patch the
moses.ini manually.
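
For illustration, the --lm value from the example above can be taken apart as sketched below. The reading of the colon-separated fields as factor, order, file name and LM type is my understanding of the training script and should be checked against the manual; the function name parse_lm_spec is made up.

```python
def parse_lm_spec(spec):
    """Split a train-model.perl --lm value into its colon-separated parts.

    Assumed field order: factor, n-gram order, file name, optional LM type.
    A missing type is reported as None rather than guessed.
    """
    parts = spec.split(":")
    if len(parts) not in (3, 4):
        raise ValueError(f"expected 3 or 4 colon-separated fields: {spec!r}")
    return {
        "factor": int(parts[0]),
        "order": int(parts[1]),
        "filename": parts[2],
        "type": int(parts[3]) if len(parts) == 4 else None,
    }
```

These are the values that end up in the generated moses.ini, which is why the file only has to exist and be non-empty at training time.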

just my 2 cents of wisdom
Christof


___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


Re: [Moses-support] IRSTLM error: converting iARPA to ARPA format

2010-04-21 Thread Nicola Bertoldi
Dear Zahurul

the newest release of IRSTLM (5.40.01) should solve your problem which is 
probably related to the size.

Please download from here:
http://hlt.fbk.eu/en/irstlm


There is an official mailing list for IRSTLM, you can join from here
https://list.fbk.eu/sympa/subscribe/user-irstlm

The mail address to submit your question is:
user-irstlm   AT   list  DOT  fbk  DOT  eu


Nicola

On 4/21/10 10:57 AM, Zahurul Islam zai...@gmail.com wrote:

Hi,
I am trying to build a language model from a large amount of text (13GB). In the step of 
converting iARPA format to ARPA format I got the following error:

/tools/irstlm-5.22.01/bin/compile-lm wiki.it.truecase.ilm.gz --text yes 
wiki.it.lm
inpfile: wiki.it.truecase.ilm.gz
dub: 1000
Reading wiki.it.truecase.ilm.gz...
iARPA
loadtxt()
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
/tools/irstlm-5.22.01/bin/compile-lm: line 9: 20328 Aborted 
$dir/$name $@

Any help to identify|solve this problem will be appreciated. Thank you very 
much.

Regards,
Zahurul



___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


Re: [Moses-support] Translation from English to Foreign Language

2010-02-07 Thread Nicola Bertoldi
Dear Laurentia,

As you can read in the Moses manual:
- the parameter -f specifies the source language
- the parameter -e specifies the target language

The names -f and -e do not refer to French and English specifically, but 
to generic languages.

Assuming that your parallel corpus files have the suffixes .fr and .en for French 
and English respectively, 
the right command line to train an English-to-French system is

nohup nice 
tools/moses-scripts/scripts-MMDD-HHMM/training/train-factored-phrase-model.perl
 -scripts-root-dir tools/moses-scripts/scripts-MMDD-HHMM/ -root-dir work 
-corpus work/corpus/news-commentary.lowercased  -f en -e fr 
-alignment grow-diag-final-and -reordering msd-bidirectional-fe -lm 
0:3:/home/jschroe1/demo/work/lm/news-commentary.lm >& work/training.out &


Your modified command does not actually change anything except the source 
language, which you set to id. Is id the suffix of your source-language corpus file?
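
To make the role of the suffixes concrete, here is a minimal Python sketch. The function names are hypothetical; the only assumption is that the training script looks for the corpus stem followed by a dot and the value of -f or -e.

```python
import os


def expected_corpus_files(corpus_stem, f, e):
    """Return the (source, target) file names implied by -corpus, -f and -e."""
    return f"{corpus_stem}.{f}", f"{corpus_stem}.{e}"


def missing_corpus_files(corpus_stem, f, e):
    """List the expected corpus files that do not exist on disk."""
    return [p for p in expected_corpus_files(corpus_stem, f, e)
            if not os.path.isfile(p)]


src, tgt = expected_corpus_files("work/corpus/news-commentary.lowercased", "en", "fr")
print(src)  # source side, read as the input language
print(tgt)  # target side, the language you translate into
```

If the second function reports a missing file for your -f value, the suffix on the command line does not match the files in the corpus directory.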


best regards,
Nicola

On 2/7/10 6:58 AM, Laurentia Dwintani pankponk_ho...@yahoo.com wrote:

Is it possible to translate with moses from English to any Foreign language?

because I tried using moses, and at Sanity Check Trained Model step, the 
result of best translation always UNK (English-Foreign), but if I translate 
from Foreign-English, it's work fine

I modified Train Phrase Model into:
nohup nice 
tools/moses-scripts/scripts-MMDD-HHMM/training/train-factored-phrase-model.perl
 -scripts-root-dir tools/moses-scripts/scripts-MMDD-HHMM/ -root-dir work 
-corpus work/corpus/news-commentary.lowercased -e en -f id -alignment 
grow-diag-final-and -reordering msd-bidirectional-fe -lm 
0:3:/home/jschroe1/demo/work/lm/news-commentary.lm  work/training.out 


the original was:
nohup nice 
tools/moses-scripts/scripts-MMDD-HHMM/training/train-factored-phrase-model.perl
 -scripts-root-dir tools/moses-scripts/scripts-MMDD-HHMM/ -root-dir work 
-corpus work/corpus/news-commentary.lowercased -f fr -e en -alignment 
grow-diag-final-and -reordering msd-bidirectional-fe -lm 
0:3:/home/jschroe1/demo/work/lm/news-commentary.lm  work/training.out 


I've build language model for my foreign language, not English-language-model. 
Is there anything I missed here?

Laurent






___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


Re: [Moses-support] Urgent Help on building phrase table

2009-11-10 Thread Nicola Bertoldi
in the sample from
http://www.statmt.org/moses/download/sample-models.tgz
the phrase table contains only one feature score

while when you train your own phrase table using the Moses training script
you produce a PT with 5 feature scores, because this is the Moses default.

if you want to use such a PT, you have to change your configuration file 
accordingly:
- the third field of the ttable-file parameter should be 5 (scores)
- specify 5 TM weights

something like the following

regards,
nicola


# translation tables: source-factors, target-factors, number of scores, file
 [ttable-file]
 0 0 5 phrase-table

 # translation model weights
 [weight-t]
 1
 1
 1
 1
 1
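
The consistency rule behind the error message (number of weights must equal the number of scores declared in ttable-file) can be checked with a short Python sketch like the one below. This is only an illustration with hypothetical function names, not part of Moses; it assumes the section layout shown above.

```python
def read_sections(text):
    """Collect the non-comment lines of each [section] of a moses.ini-like text."""
    sections, current = {}, None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return sections


def tm_weight_counts(text):
    """Return (scores declared in [ttable-file], weights given in [weight-t])."""
    sections = read_sections(text)
    # The third field of each ttable-file line is the number of scores.
    expected = sum(int(l.split()[2]) for l in sections.get("ttable-file", []))
    given = sum(len(l.split()) for l in sections.get("weight-t", []))
    return expected, given


config = """
[ttable-file]
0 0 5 phrase-table

[weight-t]
1
1
1
1
1
"""
expected, given = tm_weight_counts(config)
print("scores declared:", expected, "weights given:", given)
```

When the two numbers differ, the decoder and the configuration disagree about the size of the score vector.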


On 11/10/09 11:09 AM, Kranthi Achanta krant...@infotechsw.com wrote:

Hello All,

We downloaded the sample phrase table and language model from the 
Moses website: http://www.statmt.org/moses/?n=Moses.Tutorial  and tried passing 
this phrase table to the decoder that we built in windows machine. We have 
given the input sentence 'das ist ein klein haus' to the decoder and it worked 
well giving the output as 'this is a small house'.
Problem: Now, we tried generating a similar phrase table using the above 
attached input files '' and it generated the phrase table as attached ''. We 
see a vast difference between the 2 phrase tables and the probabilities are 
also different from the same phrase table. And when we ran the decoder using 
this phrase table, we got the following Error: 'ERROR:Size of scoreVector != 
number (5!=1) of score components on line 1'.

Thanks & Regards,
Kranthi.A



From: moses-support-boun...@mit.edu [mailto:moses-support-boun...@mit.edu] On 
Behalf Of Kranthi Achanta
Sent: Tuesday, November 10, 2009 9:21 AM
To: moses-support@mit.edu
Subject: [Moses-support] Urgent Help on building phrase table

Hello All,
   I am a beginner trying to use moses for our translation engine and 
initially build a phrase table for French to English language conversion in 
windows machine using cygwin. I could build the decoder exe and phrase table 
for some sample corpus, but when I try executing the decoder for a small French 
sentence using the generated phrase table, I always get 'ERROR:Size of 
scoreVector != number (5!=1) of score components on line 1' which I do not 
understand. I tried building phrase tables atleast 5 times on different corpus 
each time but still I get the same error. Please find the attached error log 
and assist me what went wrong and let me know if you need to know any other 
details further.

Thanks & Regards,
Kranthi.A




___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


Re: [Moses-support] Get the input - output word mapping

2009-11-02 Thread Nicola Bertoldi
Waleed,
if you are interested in a word-to-word alignment (instead of a phrase-to-phrase 
alignment)
you could take a look at this feature of Moses
http://www.statmt.org/moses/?n=Moses.AdvancedFeatures#ntoc6

Note that you have to re-train your phrase tables so that they include the 
word-to-word alignments;
see also
http://www.statmt.org/moses/?n=FactoredTraining.ScorePhrases
and
http://www.statmt.org/moses/?n=Moses.AdvancedFeatures#BinaryPhraseTable

regards,
Nicola

On 11/1/09 4:33 PM, Felipe Sánchez Martínez fsanc...@dlsi.ua.es wrote:

Hello,

To obtain the correspondence between source phrases and target phrases
run the decoder with -report-segmentation. Also note that in
phrase-based SMT each target phrase is the translation of a source
phrase; there are no target phrases aligned with null.

Regards
--
Felipe


On Sun, 01-11-2009 at 17:12 +0200, Waleed Oransa wrote:
 Hello all,

 I would like to know if moses can output the mapping between input
 words and output words. In other way, how can I know the source word
 (or null) for each word in the translated sentence. I appreciate your
 prompt response. Thank you!

 Waleed
 ___
 Moses-support mailing list
 Moses-support@mit.edu
 http://mailman.mit.edu/mailman/listinfo/moses-support
--
Felipe Sánchez Martínez fsanc...@dlsi.ua.es
Departamento de Lenguajes y Sistemas Informáticos
Universidad de Alicante, E-03071 Alicante (Spain)
Tel.: +34 965 903 400, ext: 2966 Fax: +34 965 909 326
http://www.dlsi.ua.es/~fsanchez

___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support





Re: [Moses-support] Cn decoding problems

2009-05-12 Thread Nicola Bertoldi
I am answering him privately.

Nicola


On 5/12/09 3:26 PM, Fethi Bougares fethi.bouga...@imag.fr wrote:

Hello,

 I am working on Arabic-to-English statistical machine translation.
 As you know, Arabic is a morphologically rich language,
 so I use confusion networks (as input type) to combine
 the many possible morphological analyses of the Arabic text.

 CN decoding is not well documented on the Moses web site, so I need some
explanation.
 My questions are:

  - How does Moses decode a confusion network?
  - What is the meaning of the word posterior parameters added to the configuration
file 'moses.ini'?
  - How does Moses use the confusion network scores, and how does it choose
between paths inside the CN?
  - What is the effect of CN tuning?
  - Finally, I tried changing the scores, giving zero to a path, but when Moses
decodes my CN it still takes this path (even though it has an edge with a zero
score).
  Can you explain that, please?
  Thanks.
  Bougares. LIG Laboratory (Grenoble, France)
___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support





Re: [Moses-support] make binary model

2009-05-05 Thread Nicola Bertoldi
No,

-nscores 5 is a parameter of processPhraseTable
and refers to the number of scores (features)
associated with each phrase pair in the phrase table.

The standard setting of Moses training provides 5 scores.
Check the textual format of the phrase table
to know how many scores are in it.
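
A quick way to do this check is sketched below in Python. It assumes that the fields of a textual phrase-table line are separated by "|||" and that the scores sit in the third field; the function name and the example line are hypothetical, so adjust the field index if your table has a different layout.

```python
def count_scores(line, score_field=2):
    """Count the whitespace-separated scores in one phrase-table line."""
    fields = [field.strip() for field in line.split("|||")]
    return len(fields[score_field].split())


# Hypothetical example line with five scores.
example = "das haus ||| the house ||| 0.8 0.5 0.7 0.4 2.718"
print(count_scores(example))
```

The number it reports for your own table is the value to pass to -nscores.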


best regards,
NB



On 5/5/09 8:39 AM, m...@bezeqint.net m...@bezeqint.net wrote:

Hello, I need help please.
Not long ago I ran an fr-en training based on the Europarl
corpus, and I found out that there is an option to convert the
output of the model to a binary version.
I used the following command to convert the phrase table:
cat phrase-table | sort | mosesdecoder/misc/processPhraseTable
   -ttable 0 0 - -nscores 5 -out phrase-table

I would like to know what -nscores 5 does.
Does -nscores 5 refer to the number of n-grams used in the
language model?
thanks
___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support





Re: [Moses-support] Multiple phrase tables

2009-02-06 Thread Nicola Bertoldi
Hi Lane


this is what Philip answered some time ago about your question.

best
Nicola

Hi,

there are two choices, when a phrase pair is used
(a) it is scored by both phrase tables, and needs to exist in both  
phrase tables
(b) it is taken from either of the two phrase tables (and scored only
with those scores)

Here is how it's done:

(1) define both tables in the section [ttable-file]

[ttable-file]
0 0 5 /my-dir/table1
0 0 5 /my-dir/table2

(2) set the appropriate number of weights in [weight-t]
--- in our example that would be 10 weights

(3) specify the use of the tables in [mapping]

(a) scoring with both tables:

[mapping]
T 0
T 1

(b) scoring with either table:

[mapping]
0 T 0
1 T 1

Note: the number before T defines a decoding path,
so here are two different decoding paths specified.
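
To see the difference between the two [mapping] variants, here is a small Python sketch that groups the steps by decoding path. It is only a reading aid with a hypothetical function name; the assumption that a line without a leading number belongs to path 0 is mine.

```python
from collections import defaultdict


def decoding_paths(mapping_lines):
    """Group [mapping] lines into {path index: [(step type, table index), ...]}."""
    paths = defaultdict(list)
    for line in mapping_lines:
        tokens = line.split()
        if len(tokens) == 2:
            # "T 0": no path index given, assumed to mean path 0.
            path, step, table = 0, tokens[0], tokens[1]
        elif len(tokens) == 3:
            # "1 T 1": explicit path index first.
            path, step, table = int(tokens[0]), tokens[1], tokens[2]
        else:
            raise ValueError("unexpected mapping line: %r" % line)
        paths[path].append((step, int(table)))
    return dict(paths)


print(decoding_paths(["T 0", "T 1"]))      # variant (a): one path, two steps
print(decoding_paths(["0 T 0", "1 T 1"]))  # variant (b): two paths, one step each
```

In variant (a) both tables are steps of the same path, so a phrase pair has to pass through both; in variant (b) each table forms its own path.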

-phi



 Dear experts,
 In my experiments, I need to use more than one phrase table.
 I have read in some previous posts that Moses can use them.
 I do not really understand what happens when the decoder finds the same
 phrase pair in both tables.
 How does it combine the conditional probabilities? Does it add one more
 element to the log-linear model?
 Thanks a lot
 Marco





On Feb 5, 2009, at 10:33 PM, Lane Schwartz wrote:

 Is there a way to specify more than one phrase table for moses to  
 load?

 Thanks,
 Lane

 ___
 Moses-support mailing list
 Moses-support@mit.edu
 http://mailman.mit.edu/mailman/listinfo/moses-support



Re: [Moses-support] MULTI-BLEU Score Interpretation

2008-11-26 Thread Nicola Bertoldi
The script implements the BLEU score as described in the original 
paper by Papineni et al.:

Papineni, Kishore, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2001. 
Bleu: a method for automatic evaluation of machine translation. 
Research Report RC22176, IBM Research Division, Thomas J. Watson 
Research Center.

The four figures 91.3/65.9/50.9/39.5 are the precisions of 1-grams, 
2-grams, 3-grams, and 4-grams, respectively,

and the global score (BLEU) is computed as the geometric mean of 
these precisions multiplied by the brevity penalty (BP),

which is calculated from the length ratio (printed as "ration" in the output).

Look at the paper for more details.

The prefix multi means that the script computes BLEU score with  
multiple references.
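
As a worked check of the figures for Pair 1, the computation can be sketched in Python as follows. This follows the definition in the paper (geometric mean of the n-gram precisions times the brevity penalty) and is not the Perl script itself; the function names are mine, and small differences from the reported score come from the rounded precisions.

```python
import math


def brevity_penalty(ratio):
    """BP from the length ratio (hypothesis length / reference length)."""
    return 1.0 if ratio >= 1.0 else math.exp(1.0 - 1.0 / ratio)


def bleu(precisions, bp):
    """Geometric mean of the n-gram precisions, scaled by the brevity penalty."""
    if min(precisions) <= 0:
        return 0.0
    log_mean = sum(math.log(p) for p in precisions) / len(precisions)
    return bp * math.exp(log_mean)


# Pair 1 from the question: BLEU = 46.94, BP=0.796, ratio=0.814
print(round(brevity_penalty(0.814), 3))
print(round(bleu([91.3, 65.9, 50.9, 39.5], 0.796), 2))
```

Both values come out very close to the BP and BLEU that the script printed for Pair 1.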


best regards
Nicola


On Nov 25, 2008, at 11:15 PM, Alok Kothari wrote:


How to Interpret Multi-Bleu score ??

This is Mine for a certain pair of languages
 Pair 1 -  BLEU = 46.94, 91.3/65.9/50.9/39.5 (BP=0.796, ration=0.814)
 Pair 2- BLEU = 0.47, 28.2/3.3/0.5/0.3 (BP=0.236, ration=0.409)
For the Perl-Script Given Here

http://www.statmt.org/wmt06/shared-task/multi-bleu.perlATT1.txt


___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


Re: [Moses-support] Can't run MERT on factored model

2008-11-13 Thread Nicola Bertoldi
The new MERT implementation is independent of the decoder.
You can use it without updating Moses.

Unfortunately, the new implementation has not been tested on multiple 
factors yet,
but hopefully it works fine.

Nicola


On Nov 13, 2008, at 1:09 PM, Miguel José Hernández Vidal wrote:

 Hi Jun,

 Is it possible to use only the new mert implementation without  
 having to
 update the whole system? Could I use the new MER training scripts as
 they come in the package?

 Regards

 jun li wrote:
 Hi,
 suggest you download moses from
 http://sourceforge.net/project/showfiles.php?group_id=171520
 That's 2008-07-11 version.
 I encountered the same problem: the decoder dies in the MERT process
 when using the latest version checked out from svn.


 On Thu, Nov 13, 2008 at 6:49 PM, Miguel José Hernández Vidal
 [EMAIL PROTECTED] wrote:

 Dear Mailing,

 I've trained my English to Spanish system as Amit did
 (http://www.mail-archive.com/moses-support@mit.edu/ 
 msg00599.html). I got
 the input error too:

 [ERROR] Malformed input at
  Expected input to have words composed of 2 factor(s) (form FAC1| 
 FAC2|...)
  but instead received input with 0 factor(s).
 Aborted (core dumped)

 My phrase table has 1 factor in the input (surface) and 2 in the  
 output
 (surface|pos).  I've modified the config file that
 'filter-model-given-input.pl' builds with:

 [input-factors]
 0
 1

 or more logically

 [output-factors]
 0
 1

 ,but I get the same error. Why did the decoder die on the MERT  
 process?
 I had  run  the decoder before tuning and it worked.


 Regards,

 Miguel
 ___
 Moses-support mailing list
 Moses-support@mit.edu
 http://mailman.mit.edu/mailman/listinfo/moses-support







 ___
 Moses-support mailing list
 Moses-support@mit.edu
 http://mailman.mit.edu/mailman/listinfo/moses-support




Re: [Moses-support] Significance of BLEU using Multi-bleu

2008-09-18 Thread Nicola Bertoldi
Dear Vineet,
You have probably already checked it,
but my impression is that there is a mismatch in the order of the references.

best
Nicola



On Sep 18, 2008, at 9:59 AM, Adam Lopez wrote:


 I would just like to know if there is a significant difference
 when scoring translations using multi-bleu.

 With multi-bleu i got the following scores for testing on 2000
 sentences

 BLEU = 34.62, 63.4/38.8/27.8/21.3 (BP=0.996, ratio=0.996,
 hyp_len=16587,
 ref_len=16660)

 and the following for 5082 sentences

 BLEU = 3.82, 11.1/4.0/2.6/1.9 (1, 1.017,44536,43809)

 The only change i made was increased the corpus size from 6053 to
 8948.

 First, a caveat: In general, BLEU scores are only comparable when they
 are computed using the same reference set.  It's possible to get
 fairly divergent BLEU scores using an identical system on two
 different data sets from the same domain.

 That said, I've never seen differences anywhere near that large, so
 you should double-check your experimental setup.  For
 instance, the second set of numbers are n-gram precisions (for
 increasing orders of n).  In your example, the unigram precision went
 from 63.4 to 11.1, a sure sign of problems.

 To answer some of your other questions:

 Another question is that what does other parameters except the first
 which is
 the BLEU score mean ?

 They are n-gram precisions, brevity penalty, length ratio, hypothesis
 length, and reference length.  For explanation, see the paper:
 http://aclweb.org/anthology-new/P/P02/P02-1040.pdf

 Also, is multi-bleu in par with mteval?

 bleu-1.04.pl (IBM BLEU), mteval-11x.pl (NIST BLEU), and multi-bleu.pl
 (Moses BLEU) all report slightly different scores.

 Can i consider a BLEU of 34.62
 to be correct.

 There is no such thing as correct with BLEU.  Just make sure you use
 the same evaluation script for every output in your experiment.


 Cheers
 Adam


 -- 
 The University of Edinburgh is a charitable body, registered in
 Scotland, with registration number SC005336.

 ___
 Moses-support mailing list
 Moses-support@mit.edu
 http://mailman.mit.edu/mailman/listinfo/moses-support



Re: [Moses-support] mert-moses with NIST score optimization

2008-05-08 Thread Nicola Bertoldi
Dear Jason,
actually no one has made this enhancement to cmert,
at least to my knowledge.

I also thought of modifying the MERT part of Moses
to make it more flexible and efficient,
but unfortunately I never had time.

This is a possible project to work on during
the Second MT Marathon (next week in Berlin),
but it is not certain.

So I suggest waiting for an official reply from Phil
to avoid duplicating effort.

If this optimization of the mert scripts is not taken up
at the 2nd MT Marathon,
I will be very glad to collaborate with you
both in planning the modifications and in writing the code.

best,
Nicola



From: [EMAIL PROTECTED] [EMAIL PROTECTED] On Behalf Of Jason Katz-Brown [EMAIL 
PROTECTED]
Sent: Thursday, May 08, 2008 5:34 PM
To: moses-support@MIT.EDU
Subject: [Moses-support] mert-moses with NIST score optimization

(I sent this mail once before, but it bounced because one of MIT's
mailman servers filled a disk. So I hope this is not a duplicate.)

Hi all,

Has anybody enhanced mert-moses and cmert to allow one to tune for
maximum NIST score, or to maximize some other metric? If not, I am
interested in adding this functionality.

Also, could somebody give me an idea of the maximum number of feature
weights that is feasible to train with the current mert setup?

Thanks,
Jason
___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support



Re: [Moses-support] R: Running Moses In Parallel

2008-04-21 Thread Nicola Bertoldi
Dear Vineet
one more thing about parallelization:

with moses-parallel.pl you can produce in parallel
the translations of the many input sentences
and the corresponding n-best lists, as explained by Marcello.
At the moment you can NOT generate search graphs
and word graphs in parallel, but I am working on this.

best,
Nicola




From: [EMAIL PROTECTED] [EMAIL PROTECTED] On Behalf Of Marcello Federico [EMAIL 
PROTECTED]
Sent: Sunday, April 20, 2008 11:47 AM
To: Vineet Kashyap; moses-support@mit.edu
Subject: [Moses-support] R:  Running Moses In Parallel

Hi Vineet,

Things are much simpler than you can imagine.

1,2. Parallel minimum error training and parallel test scripts basically
split the  test set into batches that are  translated on different
machines with the same run time code.   Hence, there is no
parallelism at the level of single sentences.
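
The batching idea can be sketched in a few lines of Python. This is not what moses-parallel.pl does internally; the function names are hypothetical and the per-sentence translate function is a placeholder for a decoder run on one machine.

```python
def split_into_batches(sentences, jobs):
    """Split the test set into at most `jobs` contiguous batches."""
    if not sentences:
        return []
    size = -(-len(sentences) // jobs)  # ceiling division
    return [sentences[i:i + size] for i in range(0, len(sentences), size)]


def translate_in_batches(sentences, jobs, translate):
    """Translate each batch independently, then restore the original order."""
    results = []
    for batch in split_into_batches(sentences, jobs):
        # Each batch could go to a different machine; here they run in turn.
        results.extend(translate(s) for s in batch)
    return results


batches = split_into_batches(["s1", "s2", "s3", "s4", "s5", "s6", "s7"], 3)
print(batches)
print(translate_in_batches(["a", "b", "c"], 2, str.upper))
```

Because batches are contiguous and concatenated in order, the output lines stay aligned with the input lines, while each sentence is still decoded by a single process.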

7. Parallel word-alignment training uses two distinct processes to train the 
source-to-target
and target-to-source alignments.

3,4 It would be interesting to study some parallel  implementation of the
search algorithm, but I  would suggest to start with some simpler model
to check how much you can gain.

5,6 yes, you just need a sample of human made translations, the larger the 
better.
The amount of data actually depends on the difficulty of the task, namely 
distance of
languages and vocabulary size.  Limited domain tasks, like traveling expressions
(see the IWSLT tasks http://www.slc.atr.jp/IWSLT2008)  can be approached with
parallel corpora of 40K sentence pairs. Translation of  Europarl or Chinese
news require working with several millions of sentence pairs.


Best,
Marcello









From: [EMAIL PROTECTED] [EMAIL PROTECTED] On Behalf Of Vineet Kashyap [EMAIL 
PROTECTED]
Sent: Saturday, April 19, 2008 4:29 PM
To: moses-support@mit.edu
Subject: [Moses-support] Running Moses In Parallel

Hello users

I am new to moses and will be using it for my research.

Out of curiosity i needed answers to the following questions:

1. Which parts are made parallel when moses is run on 'n' processors?
2. What happens to the input sentence and what does each processor do
   in terms of computation? searching, assigning probability weights ?
3. Can we modify moses-parallel.pl to improve parallelization?
4. Can we use MPICH for parallel implementation?
5. How big the parallel corpora should be to get accurate results? how many
   sentences/words?
6. Parallel Corpora consists of the text in source language along with
   the translation in the target language. Is that all you need ?
7. Also, while training large data the --parallel option can be used.
   Again can we use mpich and which parts are made parallel?

I know these are lot of questions.But it would be highly appreciated
if some one takes the time to answer these.

Thanks in advance.

Regards

Vineet

___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support



Re: [Moses-support] Moses - IRSTLM compile error

2008-04-07 Thread Nicola Bertoldi
I already found the problem.
I will fix by tomorrow morning

Best
Nicola


From: [EMAIL PROTECTED] [EMAIL PROTECTED] On Behalf Of Severin Hacker [EMAIL 
PROTECTED]
Sent: Monday, April 07, 2008 1:42 AM
To: moses-support@mit.edu
Subject: [Moses-support] Moses - IRSTLM  compile error

Hi,

I try to compile Moses with IRSTLM on Ubuntu 7.10 and I get the following error:

/home/severin/irstlm/include/lmtable.h: In member function 'void 
lmtable::set_dictionary_upperbound(int)':
/home/severin/irstlm/include/lmtable.h:216: error: 'log' was not declared in 
this scope
/home/severin/irstlm/include/lmtable.h: At global scope:
/home/severin/irstlm/include/lmtable.h:245: warning: unused parameter 'lmfile'
/home/severin/irstlm/include/lmtable.h:246: warning: unused parameter 'lmfile'
/home/severin/irstlm/include/lmtable.h:246: warning: unused parameter 'buffMb'
make[2]: *** [LanguageModelIRST.o] Error 1


It looks like it can't find the logarithm function. What shall I do?


Best regards,
Severin Hacker

___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support



Re: [Moses-support] [Fwd: Run mert-moses.pl with confusion network]

2008-02-28 Thread Nicola Bertoldi
moses-parallel.pl and mert-moses.pl were changed.
Now they work well with lattice inputs, too.

Notice that you do NOT need to specify
-decoder-flags -inputtype 2
the parameter
--inputtype 2
of mert-moses.pl is passed to the decoder automatically.


best,
Nicola



From: [EMAIL PROTECTED] [EMAIL PROTECTED] On Behalf Of ThuyLinh Nguyen [EMAIL 
PROTECTED]
Sent: Wednesday, February 27, 2008 5:16 PM
To: moses-support@mit.edu
Subject: [Moses-support] [Fwd: Run mert-moses.pl with confusion network]

Hello,
just another mistake, mert-moses.pl can't find the phrasetable in binary
format but if run translation without mert, it works
here is the error:
perl mert-moses.pl ../../sstmorph/dev.ar.lattice ../../dev.en.process
../../../moses-cmd/src/moses ./moses.ini --decoder-flags -inputtype 2
--inputtype 2 --rootdir
/nfs/islpc3_13/linh/Programs/mosesdecoder/scripts --no-filter-phrase-table
After default: -l mem_free=0.5G -hard
Using SCRIPTS_ROOTDIR: /nfs/islpc3_13/linh/Programs/mosesdecoder/scripts
checking weight-count for ttable-file
moses.ini:15:File does not exist or empty:
/SMT/Workplace/Linh/IWSLT_0802/gold_morph_space/model/phrase-table.bin
checking weight-count for lmodel-file
SYNC distortionExit 1

but if I run without MERT, it works
head -2 ../../sstmorph/dev.ar.lattice | ../../../moses-cmd/src/moses -f
./moses.ini -inputtype 2

Thanks
Linh

Chris Dyer wrote:
 I'll update that- the inputtype should be 2 for lattices...
 Chris

 On Wed, Feb 27, 2008 at 4:39 AM, ThuyLinh Nguyen [EMAIL PROTECTED] wrote:

  Hi Chris,
  Thanks for clarification, so the lattice format is different with confusion
 network format
  but in moses binary, there are only two options for  inputtype: -inputtype:
 text (0) or confusion network (1)

  It does n't recognize the lattice format input.
  This is an example of lattice translation error:

  echo ((('A',1.0,1),),(('B',1.0,1),),) | moses -f moses.ini -inputtype 1
  Defined parameters (per moses.ini or switch):
  config: moses.ini
  distortion-limit: 6
  input-factors: 0
  inputtype: 1
  lmodel-file: 0 0 3 /SMT/Workplace/Linh/IWSLT_0802/train.en.srilm
  mapping: 0 T 0
  ttable-file: 0 0 5
 /SMT/Workplace/Linh/IWSLT_0802/gold_morph_space/model/phrase-table.bin
  ttable-limit: 20 0
  weight-d: 0.6
  weight-l: 0.5000
  weight-t: 0.2 0.2 0.2 0.2 0.2
  weight-w: -1
  Loading lexical distortion models...
  have 0 models
  Start loading LanguageModel /SMT/Workplace/Linh/IWSLT_0802/train.en.srilm :
 [0.000] seconds
  Finished loading LanguageModels : [0.000] seconds
  Start loading PhraseTable
 /SMT/Workplace/Linh/IWSLT_0802/gold_morph_space/model/phrase-table.bin :
 [0.000] seconds
  using binary phrase tables for idx 0
  reading bin ttable
  size of OFF_T 8
  binary phrasefile loaded, default OFF_T: -1
  Finished loading phrase tables : [0.000] seconds
  IO from STDOUT/STDIN
  Created input-output object : [0.000] seconds
  read confusion net with format 0
  End. : [0.000] seconds
  confusion net statistics:
   created:   1
   destroyed: 1
   succ. read:0
   columns:   0
   words: 0
   avg. word/column:  nan
   avg. cols/sent:nan


  Let me know if I made mistake somewhere.

  Thanks
  Linh





  Chris Dyer wrote:

  I am still confused about the lattice format,
  In your examples:

  1 ((('A',1.0,1),),(('B',1.0,1),),)
  2 ((('A',1.0,1),('Z',1.0,2),),(('B',1.0,1),),(('C',1.0,1),),)

  Can I interpret it as:
  from node 0 to node 1 there are 2 lattices: (('A',1.0,1),) and
  (('B',1.0,1),)

  Each entire lattice is encoded on a single line. In line 1, there are
 two arcs from node 0 to node 1, 'A' and 'B'. The 1.0 is the cost of
 the arc and the 1 is the length of the arc (measured in nodes). In
 line two, node 0 has two arcs, arc 'A' that goes to node 1 and arc 'Z'
 that goes to node 2. Node 1 has a single arc, 'B', that goes to node
 2. Node 2 has a single arc 'C' that goes to 3.
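
The second example can be unpacked with a short Python sketch. It relies on the observation that the lattice line happens to be valid Python tuple syntax, so ast.literal_eval can read it; this is a reading aid with a hypothetical function name, not the way the decoder parses its input.

```python
import ast


def lattice_arcs(plf_line):
    """Return (from_node, to_node, label, cost) for every arc in one lattice line."""
    columns = ast.literal_eval(plf_line)
    arcs = []
    for node, column in enumerate(columns):
        for label, cost, length in column:
            # The third value is the arc length, measured in nodes.
            arcs.append((node, node + length, label, cost))
    return arcs


line2 = "((('A',1.0,1),('Z',1.0,2),),(('B',1.0,1),),(('C',1.0,1),),)"
for arc in lattice_arcs(line2):
    print(arc)
```

For the second example this lists 'A' from node 0 to 1, 'Z' from node 0 to 2, 'B' from node 1 to 2 and 'C' from node 2 to 3, matching the description above.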




  And also what are the meaning of number 1.0 and 1, 2 there? where can I put
 the lattice probabilities?
  Is it possible to add an empty lattice (so that the decoder skip a word)?

  Currently, moses only lets you specify a single cost for an arc, and
 it is actually treated as a probability (the decoder sees it as
 -log(p) -- you can change this in WordLattice.cpp if you want to deal
 with more conventional costs, but the rest of the inputs to the
 decoder are given as probabilities so I wanted to be consistent). If
 you want a null transition, set the arc label to '*eps*' and the
 decoder will treat this as a null.

 --Chris




  Linh




  Chris Dyer wrote:

  Also, if you are using general lattices (as opposed to regular
  confusion networks) as input, you should update to the latest version
  of the decoder from Subversion, since I checked in a fairly crucial
  bug fix yesterday.
  
  Chris
  
  On Wed, Feb 20, 2008 at 4:37 PM, Chris Dyer 

Re: [Moses-support] [Fwd: Run mert-moses.pl with confusion network]

2008-02-28 Thread Nicola Bertoldi
Dear Christian,

As the binary phrase table (PT) is generated from the textual one,
we assumed that the latter exists,
so the check was done only on the textual PT.

If I needed to save space I deleted the textual PT (and not the binaries)
and recreated an almost empty PT with the same name
(containing only one line like EMPTY PHRASE TABLE)


Your solution is definitely smarter.
I will update the repository.

thanks for the suggestion

best,
Nicola




From: Christian Hardmeier [EMAIL PROTECTED]
Sent: Thursday, February 28, 2008 11:20 AM
To: Nicola Bertoldi
Subject: Re: [Moses-support] [Fwd: Run mert-moses.pl with confusion network]

Hi Nicola,

I think one of the problems mentioned below is still there:
mert-moses.pl will complain about finding no phrasetable if the
phrasetable is in binary format. Changing line 1084 to

if (! -s "$fn" && ! -s "$fn.gz" && ! -s "$fn.binphr.tgtdata") {

(or something similar) would fix that, I think.
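
The logic of that check, spelled out as a Python sketch for clarity: the phrase table counts as present if the plain file, its gzipped version, or one of the binary files exists and is non-empty. The function name is hypothetical, and the suffix list simply mirrors the three cases in the line above.

```python
import os


def has_phrase_table(fn, suffixes=("", ".gz", ".binphr.tgtdata")):
    """True if fn, or fn plus one of the suffixes, is an existing non-empty file."""
    for suffix in suffixes:
        path = fn + suffix
        if os.path.isfile(path) and os.path.getsize(path) > 0:
            return True
    return False


print(has_phrase_table("model/phrase-table.bin"))
```

With this rule a model directory that only holds the binary files would no longer be reported as "does not exist or empty".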

Best,
Christian

On Thu, 28 Feb 2008, Nicola Bertoldi wrote:

 moses-parallel.pl and mert-moses.pl were changed .
 Now they works well with lattice inputs, too.

 Notice that you do NOT need to specify
 -decoder-flags -inputtype 2
 the parameter
 --inputtype 2
 of mert-moses.pl is passed to the decoder automatically.


 best,
 Nicola


 
 From: [EMAIL PROTECTED] [EMAIL PROTECTED] On Behalf Of ThuyLinh Nguyen [EMAIL 
 PROTECTED]
 Sent: Wednesday, February 27, 2008 5:16 PM
 To: moses-support@mit.edu
 Subject: [Moses-support] [Fwd: Run mert-moses.pl with confusion network]

 Hello,
 just another mistake, mert-moses.pl can't find the phrasetable in binary
 format but if run translation without mert, it works
 here is the error:
 perl mert-moses.pl ../../sstmorph/dev.ar.lattice ../../dev.en.process
 ../../../moses-cmd/src/moses ./moses.ini --decoder-flags -inputtype 2
 --inputtype 2 --rootdir
 /nfs/islpc3_13/linh/Programs/mosesdecoder/scripts --no-filter-phrase-table
 After default: -l mem_free=0.5G -hard
 Using SCRIPTS_ROOTDIR: /nfs/islpc3_13/linh/Programs/mosesdecoder/scripts
 checking weight-count for ttable-file
 moses.ini:15:File does not exist or empty:
 /SMT/Workplace/Linh/IWSLT_0802/gold_morph_space/model/phrase-table.bin
 checking weight-count for lmodel-file
 SYNC distortionExit 1

 but if I run without mer, it works
 head -2 ../../sstmorph/dev.ar.lattice | ../../../moses-cmd/src/moses -f
 ./moses.ini -inputtype 2

 Thanks
 Linh

 Chris Dyer wrote:
  I'll update that- the inputtype should be 2 for lattices...
  Chris
 
  On Wed, Feb 27, 2008 at 4:39 AM, ThuyLinh Nguyen [EMAIL PROTECTED] wrote:
 
   Hi Chris,
   Thanks for clarification, so the lattice format is different with 
  confusion
  network format
   but in moses binary, there are only two options for  inputtype: 
  -inputtype:
  text (0) or confusion network (1)
 
   It does n't recognize the lattice format input.
   This is an example of lattice translation error:
 
   echo ((('A',1.0,1),),(('B',1.0,1),),) | moses -f moses.ini -inputtype 1
   Defined parameters (per moses.ini or switch):
   config: moses.ini
   distortion-limit: 6
   input-factors: 0
   inputtype: 1
   lmodel-file: 0 0 3 /SMT/Workplace/Linh/IWSLT_0802/train.en.srilm
   mapping: 0 T 0
   ttable-file: 0 0 5
  /SMT/Workplace/Linh/IWSLT_0802/gold_morph_space/model/phrase-table.bin
   ttable-limit: 20 0
   weight-d: 0.6
   weight-l: 0.5000
   weight-t: 0.2 0.2 0.2 0.2 0.2
   weight-w: -1
   Loading lexical distortion models...
   have 0 models
   Start loading LanguageModel /SMT/Workplace/Linh/IWSLT_0802/train.en.srilm : [0.000] seconds
   Finished loading LanguageModels : [0.000] seconds
   Start loading PhraseTable /SMT/Workplace/Linh/IWSLT_0802/gold_morph_space/model/phrase-table.bin : [0.000] seconds
   using binary phrase tables for idx 0
   reading bin ttable
   size of OFF_T 8
   binary phrasefile loaded, default OFF_T: -1
   Finished loading phrase tables : [0.000] seconds
   IO from STDOUT/STDIN
   Created input-output object : [0.000] seconds
   read confusion net with format 0
   End. : [0.000] seconds
   confusion net statistics:
   created:           1
   destroyed:         1
   succ. read:        0
   columns:           0
   words:             0
   avg. word/column:  nan
   avg. cols/sent:    nan
 
 
   Let me know if I made a mistake somewhere.
 
   Thanks
   Linh
 
 
 
 
 
   Chris Dyer wrote:
 
   I am still confused about the lattice format,
   In your examples:
 
   1 ((('A',1.0,1),),(('B',1.0,1),),)
   2 ((('A',1.0,1),('Z',1.0,2),),(('B',1.0,1),),(('C',1.0,1),),)
 
   Can I interpret it as:
   from node 0 to node 1 there are 2 lattices: (('A',1.0,1),) and
   (('B',1.0,1),)
 
   Each entire lattice is encoded on a single line. In line 1, there are
  two arcs from node 0 to node 1, 'A' and 'B'. The 1.0 is the cost of
  the arc and the 1 is the length of the arc (measured
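
The bracketed lattice lines quoted above happen to be valid Python literals, so they can be inspected with a few lines of Python. The following is only a sketch and not part of Moses: the function name `plf_arcs` is made up for illustration, and it assumes that each top-level element lists the arcs leaving one node and that the third field of an arc is its length in nodes, i.e. the distance to its target node.

```python
import ast

def plf_arcs(line):
    """Return (from_node, to_node, word, cost) for every arc in one lattice line."""
    # The lattice line is a nested tuple literal, so literal_eval can parse it safely.
    lattice = ast.literal_eval(line.strip())
    arcs = []
    for node, outgoing in enumerate(lattice):
        for word, cost, length in outgoing:
            # Assumption: `length` is the distance, in nodes, to the arc's target.
            arcs.append((node, node + length, word, cost))
    return arcs

# Second example line from the message above.
example = "((('A',1.0,1),('Z',1.0,2),),(('B',1.0,1),),(('C',1.0,1),),)"
for arc in plf_arcs(example):
    print(arc)
```

Under those assumptions, a file with one lattice per line can be processed by calling `plf_arcs` on each line in turn.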

Re: [Moses-support] [Fwd: Run mert-moses.pl with confusion network]

2008-02-21 Thread Nicola Bertoldi
Hi Linh,
MERT works fine with confusion networks.
I have never used a lattice as input for Moses.

You can transform lattices into confusion networks
using the lattice-tool software distributed with the SRILM toolkit.

Note that the confusion network formats
used by lattice-tool and by Moses are different.

best,
Nicola


From: [EMAIL PROTECTED] [EMAIL PROTECTED] On Behalf Of Hieu Hoang [EMAIL PROTECTED]
Sent: Wednesday, February 20, 2008 10:30 PM
To: ThuyLinh Nguyen; moses-support@mit.edu
Subject: Re: [Moses-support] [Fwd: Run mert-moses.pl with confusion network]

Hello Linh,

I'm not sure whether anyone has answered your question, and I'm probably not
the best person to answer questions on lattice/confusion network input. To my
knowledge, MERT should run fine with these input types.

Perhaps you can find an example of the lattice input format in the regression
tests:
 
http://mosesdecoder.svn.sourceforge.net/viewvc/mosesdecoder/trunk/regression-testing/tests/


ThuyLinh Nguyen [EMAIL PROTECTED] wrote:


 Original Message 
Subject: Run mert-moses.pl with confusion network
Date: Sat, 16 Feb 2008 21:33:44 -0500
From: ThuyLinh Nguyen
To: moses-support@mit.edu



Hello,
I want to run MERT on a development set that is the output of another
translation job, so the development input is a set of lattices. Is there any
way to run MERT with lattice input, and if so, how can I represent the
lattices of multiple sentences?
Thank you
Linh


___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support




Hieu Hoang
http://www.hoang.co.uk/hieu




Re: [Moses-support] probem in cofiguration moses with srilm

2008-02-19 Thread Nicola Bertoldi
Dear Jespa
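the correct call to the configure script is the following (give --with-srilm
the SRILM root directory itself, without -I and without the include subdirectory):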
./configure --with-srilm=/root/jespa/project/language_model/srilm

Nicola
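
Before re-running configure, the value passed to --with-srilm can be sanity-checked with a short script. This is a sketch only: the helper name `srilm_root_problems` is hypothetical, and it assumes that configure looks for Ngram.h in the include subdirectory of the given root, which is what the "checking for Ngram.h" lines in the log below suggest.

```python
import os

def srilm_root_problems(root):
    """Return a list of reasons why `root` does not look like a SRILM root."""
    problems = []
    if root.startswith("-I"):
        problems.append("value starts with -I; pass a directory, not a compiler flag")
    # Assumption: the header is expected at <root>/include/Ngram.h.
    header = os.path.join(root, "include", "Ngram.h")
    if not os.path.isfile(header):
        problems.append("missing " + header)
    return problems

if __name__ == "__main__":
    for problem in srilm_root_problems("/root/jespa/project/language_model/srilm"):
        print(problem)
```

An empty list means the directory at least contains the header that the configure check asks for; it does not prove that SRILM itself was built correctly.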



From: [EMAIL PROTECTED] [EMAIL PROTECTED] On Behalf Of Jespa   [EMAIL PROTECTED]
Sent: Tuesday, February 19, 2008 9:02 AM
To: moses-support@mit.edu
Subject: [Moses-support] probem in cofiguration moses with srilm

Sir,
I am Jespa.
I have installed SRILM, GIZA++, MKCLS and Moses properly,
but when I configure Moses with SRILM I get the error below.
Please tell me the reason.
[EMAIL PROTECTED] moses]# ./configure --with-srilm=-I/root/jespa/project/language_model/srilm/include
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for a thread-safe mkdir -p... /bin/mkdir -p
checking for gawk... gawk
checking whether make sets $(MAKE)... yes
checking for g++... g++
checking for C++ compiler default output file name... a.out
checking whether the C++ compiler works... yes
checking whether we are cross compiling... no
checking for suffix of executables...
checking for suffix of object files... o
checking whether we are using the GNU C++ compiler... yes
checking whether g++ accepts -g... yes
checking for style of include used by make... GNU
checking dependency style of g++... gcc3
checking for ranlib... ranlib
checking how to run the C++ preprocessor... g++ -E
checking for grep that handles long lines and -e... /bin/grep
checking for egrep... /bin/grep -E
checking for ANSI C header files... yes
checking for sys/types.h... yes
checking for sys/stat.h... yes
checking for stdlib.h... yes
checking for string.h... yes
checking for memory.h... yes
checking for strings.h... yes
checking for inttypes.h... yes
checking for stdint.h... yes
checking for unistd.h... yes
checking Ngram.h usability... no
checking Ngram.h presence... no
checking for Ngram.h... no
configure: error: Cannot find SRILM!
[EMAIL PROTECTED] moses]# make -j 4
/bin/sh ./config.status --recheck
running /bin/sh ./configure --with-srilm=/home/vk/Dumps/Statistical-MT/programming/language_model/srilm --no-create --no-recursion
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for a thread-safe mkdir -p... /bin/mkdir -p
checking for gawk... gawk
checking whether make sets $(MAKE)... yes
checking for g++... g++
checking for C++ compiler default output file name... a.out
checking whether the C++ compiler works... yes
checking whether we are cross compiling... no
checking for suffix of executables...
checking for suffix of object files... o
checking whether we are using the GNU C++ compiler... yes
checking whether g++ accepts -g... yes
checking for style of include used by make... GNU
checking dependency style of g++... gcc3
checking for ranlib... ranlib
checking how to run the C++ preprocessor... g++ -E
checking for grep that handles long lines and -e... /bin/grep
checking for egrep... /bin/grep -E
checking for ANSI C header files... yes
checking for sys/types.h... yes
checking for sys/stat.h... yes
checking for stdlib.h... yes
checking for string.h... yes
checking for memory.h... yes
checking for strings.h... yes
checking for inttypes.h... yes
checking for stdint.h... yes
checking for unistd.h... yes
checking Ngram.h usability... no
checking Ngram.h presence... no
checking for Ngram.h... no
configure: error: Cannot find SRILM!
make: *** [config.status] Error 1
[EMAIL PROTECTED] moses]#




Re: [Moses-support] MOSES

2008-02-14 Thread Nicola Bertoldi
Dear Deca,

Are you sure that the IRSTLM library is saved in
/home/ced/moses/irstlm/lib/i486-pc-linux-gnu?
Could you please send me the log file you get when you compile IRSTLM?

best
Nicola
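
One way to answer the question about where the library ended up is to search the IRSTLM installation for the file the linker needs. The script below is a sketch: the helper name `find_library` is made up for illustration, and it relies only on the general rule that `-lirstlm` makes the linker look for libirstlm.a or libirstlm.so in its library search directories.

```python
import os

def find_library(root, name="irstlm"):
    """Return the directories under `root` that contain lib<name>.a or lib<name>.so."""
    wanted = {"lib%s.a" % name, "lib%s.so" % name}
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if wanted.intersection(filenames):
            hits.append(dirpath)
    return sorted(hits)

if __name__ == "__main__":
    # Installation path taken from the message below.
    for directory in find_library("/home/ced/moses/irstlm"):
        print(directory)
```

If nothing is printed, the library was not built or was installed elsewhere; if a directory other than the expected one is printed, the linker is probably being pointed at the wrong place.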


From: [EMAIL PROTECTED] [EMAIL PROTECTED] On Behalf Of Sandy [EMAIL PROTECTED]
Sent: Wednesday, February 13, 2008 6:42 AM
To: moses-support@mit.edu
Subject: [Moses-support] MOSES

Hi,
I am trying to build Moses with the IRST toolkit, but I am having problems
compiling Moses with it. I would be very grateful for your help with this
problem.
Details:
Installed Directory: /home/ced/moses/moses
Platform :Debian
IRST installed with no errors
Moses configured with IRST path
ERROR IN MAKE : /usr/bin/ld: cannot find -lirstlm



[EMAIL PROTECTED]:~/moses/moses$ make
cd . && /bin/sh /home/ced/moses/moses/missing --run autoheader
rm -f stamp-h1
touch config.h.in
cd . && /bin/sh ./config.status config.h
config.status: creating config.h
config.status: config.h is unchanged
make  all-recursive
make[1]: Entering directory `/home/ced/moses/moses'
Making all in moses/src
make[2]: Entering directory `/home/ced/moses/moses/moses/src'
if g++ -DHAVE_CONFIG_H -I. -I. -I../..  -W -Wall -ffor-scope 
-D_FILE_OFFSET_BITS=64 -D_LARGE_FILES  -I/home/ced/moses/irstlm/include  -g -O2 
-MT ConfusionNet.o -MD -MP -MF .deps/ConfusionNet.Tpo -c -o ConfusionNet.o 
ConfusionNet.cpp; \
then mv -f .deps/ConfusionNet.Tpo .deps/ConfusionNet.Po; else rm -f 
.deps/ConfusionNet.Tpo; exit 1; fi
ConfusionNet.cpp:190: warning: ignoring #pragma warning
ConfusionNet.cpp:195: warning: ignoring #pragma warning
ConfusionNet.cpp:185: warning: unused parameter ‘factorsToPrint’
if g++ -DHAVE_CONFIG_H -I. -I. -I../..  -W -Wall -ffor-scope 
-D_FILE_OFFSET_BITS=64 -D_LARGE_FILES  -I/home/ced/moses/irstlm/include  -g -O2 
-MT DecodeStep.o -MD -MP -MF .deps/DecodeStep.Tpo -c -o DecodeStep.o 
DecodeStep.cpp; \
then mv -f .deps/DecodeStep.Tpo .deps/DecodeStep.Po; else rm -f 
.deps/DecodeStep.Tpo; exit 1; fi
if g++ -DHAVE_CONFIG_H -I. -I. -I../..  -W -Wall -ffor-scope 
-D_FILE_OFFSET_BITS=64 -D_LARGE_FILES  -I/home/ced/moses/irstlm/include  -g -O2 
-MT DecodeStepGeneration.o -MD -MP -MF .deps/DecodeStepGeneration.Tpo -c -o 
DecodeStepGeneration.o DecodeStepGeneration.cpp; \
then mv -f .deps/DecodeStepGeneration.Tpo .deps/DecodeStepGeneration.Po; 
else rm -f .deps/DecodeStepGeneration.Tpo; exit 1; fi
DecodeStepGeneration.cpp:82: warning: unused parameter ‘toc’
DecodeStepGeneration.cpp:82: warning: unused parameter ‘adhereTableLimit’
if g++ -DHAVE_CONFIG_H -I. -I. -I../..  -W -Wall -ffor-scope 
-D_FILE_OFFSET_BITS=64 -D_LARGE_FILES  -I/home/ced/moses/irstlm/include  -g -O2 
-MT DecodeStepTranslation.o -MD -MP -MF .deps/DecodeStepTranslation.Tpo -c -o 
DecodeStepTranslation.o DecodeStepTranslation.cpp; \
then mv -f .deps/DecodeStepTranslation.Tpo .deps/DecodeStepTranslation.Po; 
else rm -f .deps/DecodeStepTranslation.Tpo; exit 1; fi
if g++ -DHAVE_CONFIG_H -I. -I. -I../..  -W -Wall -ffor-scope 
-D_FILE_OFFSET_BITS=64 -D_LARGE_FILES  -I/home/ced/moses/irstlm/include  -g -O2 
-MT Dictionary.o -MD -MP -MF .deps/Dictionary.Tpo -c -o Dictionary.o 
Dictionary.cpp; \
then mv -f .deps/Dictionary.Tpo .deps/Dictionary.Po; else rm -f 
.deps/Dictionary.Tpo; exit 1; fi
if g++ -DHAVE_CONFIG_H -I. -I. -I../..  -W -Wall -ffor-scope 
-D_FILE_OFFSET_BITS=64 -D_LARGE_FILES  -I/home/ced/moses/irstlm/include  -g -O2 
-MT DummyScoreProducers.o -MD -MP -MF .deps/DummyScoreProducers.Tpo -c -o 
DummyScoreProducers.o DummyScoreProducers.cpp; \
then mv -f .deps/DummyScoreProducers.Tpo .deps/DummyScoreProducers.Po; else 
rm -f .deps/DummyScoreProducers.Tpo; exit 1; fi
if g++ -DHAVE_CONFIG_H -I. -I. -I../..  -W -Wall -ffor-scope 
-D_FILE_OFFSET_BITS=64 -D_LARGE_FILES  -I/home/ced/moses/irstlm/include  -g -O2 
-MT Factor.o -MD -MP -MF .deps/Factor.Tpo -c -o Factor.o Factor.cpp; \
then mv -f .deps/Factor.Tpo .deps/Factor.Po; else rm -f .deps/Factor.Tpo; 
exit 1; fi
if g++ -DHAVE_CONFIG_H -I. -I. -I../..  -W -Wall -ffor-scope 
-D_FILE_OFFSET_BITS=64 -D_LARGE_FILES  -I/home/ced/moses/irstlm/include  -g -O2 
-MT FactorCollection.o -MD -MP -MF .deps/FactorCollection.Tpo -c -o 
FactorCollection.o FactorCollection.cpp; \
then mv -f .deps/FactorCollection.Tpo .deps/FactorCollection.Po; else rm -f 
.deps/FactorCollection.Tpo; exit 1; fi
if g++ -DHAVE_CONFIG_H -I. -I. -I../..  -W -Wall -ffor-scope 
-D_FILE_OFFSET_BITS=64 -D_LARGE_FILES  -I/home/ced/moses/irstlm/include  -g -O2 
-MT FactorTypeSet.o -MD -MP -MF .deps/FactorTypeSet.Tpo -c -o FactorTypeSet.o 
FactorTypeSet.cpp; \
then mv -f .deps/FactorTypeSet.Tpo .deps/FactorTypeSet.Po; else rm -f 
.deps/FactorTypeSet.Tpo; exit 1; fi
if g++ -DHAVE_CONFIG_H -I. -I. -I../..  -W -Wall -ffor-scope 
-D_FILE_OFFSET_BITS=64 -D_LARGE_FILES  -I/home/ced/moses/irstlm/include  -g -O2 
-MT GenerationDictionary.o -MD -MP -MF .deps/GenerationDictionary.Tpo -c -o 
GenerationDictionary.o GenerationDictionary.cpp; \
then mv -f