Re: [Moses-support] IBM Models

2008-04-14 Thread Adam Lopez
That is a great paper for comparison of the IBM Models when used for word alignment. For a description of the models themselves there are none better than the original: http://aclweb.org/anthology-new/J/J93/J93-2003.pdf Cheers Adam On 14 Apr 2008, at 07:53, Germán Sanchis Trilles wrote:

Re: [Moses-support] [statmt] alignment in moses

2008-07-10 Thread Adam Lopez
This is normal. The percentage of unaligned words varies by alignment method; for grow-diag-final-and it should be fairly small. Adam On 10 Jul 2008, at 18:03, Chongde Shi wrote: Hi, I've just noticed that in training process, alignment using grow-diag-final-and seems to leave some
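To check how many words a given alignment leaves unaligned, a rough sketch like the following may help. It assumes alignment links written as space-separated `i-j` pairs of 0-based source and target positions; the function name and the example sentence pair are made up for illustration.

```python
def unaligned_fraction(src_len, tgt_len, alignment):
    """Return the fractions of source and target words that have no link.

    `alignment` is a string of "i-j" pairs (assumed format: 0-based
    source index, hyphen, 0-based target index).
    """
    links = [tuple(int(x) for x in pair.split("-")) for pair in alignment.split()]
    aligned_src = {i for i, _ in links}
    aligned_tgt = {j for _, j in links}
    return (1 - len(aligned_src) / src_len, 1 - len(aligned_tgt) / tgt_len)


# Hypothetical sentence pair: the source word "ja" (position 2) has no link.
src = "das ist ja ein kleines Haus".split()
tgt = "this is a small house".split()
alignment = "0-0 1-1 3-2 4-3 5-4"

src_frac, tgt_frac = unaligned_fraction(len(src), len(tgt), alignment)
print(f"unaligned source: {src_frac:.1%}, unaligned target: {tgt_frac:.1%}")
```

Averaging these fractions over a whole corpus gives the percentage discussed above, which can then be compared across symmetrization heuristics.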

Re: [Moses-support] LM vs TM

2008-07-17 Thread Adam Lopez
Yes, that's correct. At minimum, you want to train the language model on the target language half of your parallel corpus, but you can also add data from other non-parallel sources to improve performance. For an extreme example see: http://aclweb.org/anthology-new/D/D07/D07-1090.pdf

Re: [Moses-support] New MERT

2008-07-24 Thread Adam Lopez
A couple of other possibilities for you to check: - MERT is non-deterministic and there are many local optima. If you re-run the same implementation on the same data, you will get a different BLEU. Usually the difference is small, but on a few occasions I have observed large differences

Re: [Moses-support] Trying to debug reduced performance with new Moses

2008-08-02 Thread Adam Lopez
I'm trying to replicate an experiment done with an older version of Moses, against the latest version. Everything goes identically up to tuning, where the newer version starts with a lower BLEU and runs only 8 as opposed to 19 iterations of MERT, resulting in a lower evaluation score. Hi

Re: [Moses-support] decoding: reordering only

2008-08-04 Thread Adam Lopez
You could use XML markup to specify the phrases you want to use. http://www.statmt.org/moses/?n=Moses.AdvancedFeatures#ntoc8 Adam On 4 Aug 2008, at 16:23, Sanne Korzec wrote: Hi mailing, Is there a way to force the moses or pharaoh decoder, to use a certain set of phrases. I want the

Re: [Moses-support] What is lexical weighting lex(f|e) in the phrase tables?

2008-08-06 Thread Adam Lopez
It's described on p. 5 of this paper: http://aclweb.org/anthology-new/N/N03/N03-1017.pdf Adam On 6 Aug 2008, at 08:48, Miles Osborne wrote: if i remember, that is a word alignment score Miles 2008/8/6 Jason Katz-Brown [EMAIL PROTECTED] Hi, I am trying to take an inventory of all the
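For readers who want a concrete picture, here is a minimal sketch of lexical weighting as I understand the definition in the cited paper: for each source word, average the word translation probabilities over the target words it is aligned to (or use NULL if it is unaligned), then multiply across source words. The function name, the probability table `w`, and the toy phrase pair are assumptions for illustration, not taken from Moses.

```python
def lexical_weight(f_words, e_words, alignment, w):
    """Compute lex(f|e, a) for one phrase pair.

    alignment: iterable of (i, j) links, source position i to target position j.
    w: dict mapping (source_word, target_word) to a word translation probability.
    """
    links_by_src = {}
    for i, j in alignment:
        links_by_src.setdefault(i, []).append(j)

    score = 1.0
    for i, f in enumerate(f_words):
        # Unaligned source words are scored against the special NULL token.
        targets = [e_words[j] for j in links_by_src.get(i, [])] or ["NULL"]
        score *= sum(w[(f, e)] for e in targets) / len(targets)
    return score


# Toy example with invented probabilities.
w = {("la", "the"): 0.5, ("maison", "house"): 0.8}
weight = lexical_weight(["la", "maison"], ["the", "house"], [(0, 0), (1, 1)], w)
print(weight)
```

Swapping the roles of the two sides, with a table for w(e|f), gives the weight in the other direction.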

Re: [Moses-support] Lexical weighting

2008-09-16 Thread Adam Lopez
It's described on p. 5 of this paper: http://aclweb.org/anthology-new/N/N03/N03-1017.pdf Adam On 16 Sep 2008, at 13:21, Michael Zuckerman wrote: Hi, I would like to ask about the lexical weighting which appears in the phrase table. I was told that field number 2 in the phrase table is

Re: [Moses-support] Significance of BLEU using Multi-bleu

2008-09-18 Thread Adam Lopez
I would just like to know if there is a significant difference when scoring translations using multi-bleu. With multi-bleu i got the following scores for testing on 2000 sentences BLEU = 34.62, 63.4/38.8/27.8/21.3 (BP=0.996, ratio=0.996, hyp_len=16587, ref_len=16660) and the
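The numbers in a multi-bleu output line can be sanity-checked by hand. The sketch below recomputes the brevity penalty and the overall score from the figures quoted above, using the standard BLEU definition (geometric mean of the four n-gram precisions times the brevity penalty). Because the printed precisions are rounded to one decimal, the recomputed score only approximately matches the reported 34.62.

```python
import math

# Values copied from the multi-bleu output quoted in the message.
precisions = [63.4, 38.8, 27.8, 21.3]  # 1- to 4-gram precisions, in percent
hyp_len, ref_len = 16587, 16660


def brevity_penalty(hyp_len, ref_len):
    """Standard BLEU brevity penalty: 1 if the hypothesis is long enough."""
    if hyp_len >= ref_len:
        return 1.0
    return math.exp(1 - ref_len / hyp_len)


def bleu(precisions, hyp_len, ref_len):
    """BLEU in percent from rounded n-gram precisions and corpus lengths."""
    log_avg = sum(math.log(p / 100) for p in precisions) / len(precisions)
    return 100 * brevity_penalty(hyp_len, ref_len) * math.exp(log_avg)


bp = brevity_penalty(hyp_len, ref_len)
score = bleu(precisions, hyp_len, ref_len)
print(f"BP={bp:.3f} BLEU={score:.2f}")
```

This only reproduces the score itself; it says nothing about whether a difference between two systems is significant, which needs a resampling test over the test sentences.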

Re: [Moses-support] basic knowledge on SMT

2008-09-25 Thread Adam Lopez
On 25 Sep 2008, at 17:19, musa ghurab wrote: Hi all i have 4 questions, I'm not clear enough about them 1.I'm using the same corpus for the following steps: 1.Build Language Model 2.Train Model 3.Tuning 4.Run System on Development Test Set 5.Evaluation

Re: [Moses-support] zlib compression and binarized tables

2009-02-20 Thread Adam Lopez
...@gmail.com tah...@gmail.com wrote: Thanks Adam. Following up, it looks like PhraseDictionaryTreeAdaptor.cpp is responsible for reading the binarized phrase tables and language model. Is this correct? Tom On Fri, Feb 20, 2009 at 5:49 PM, Adam Lopez alo...@inf.ed.ac.uk wrote: Binarized tables

Re: [Moses-support] Fetching older versions of moses

2009-03-12 Thread Adam Lopez
You can also browse the revision history at this up-to-date mirror site: http://github.com/alopez/moses/commits/master/ Clicking on any of the revisions will take you to a page that will (among other things) let you download a tarball of that revision. No knowledge of version numbers is needed.

Re: [Moses-support] Is it a moses buG?

2009-04-13 Thread Adam Lopez
Looks like you need to configure --with-irstlm, not --with-srilm. On Mon, Apr 13, 2009 at 10:08 AM, laxmi khatiwada lkhatiw...@yahoo.com wrote: I have already installed irstlm. When I tried to install moses ./regenerate-makefiles.sh is ok ./configure --with-srilm=/home/laxmi/irstlm is also

Re: [Moses-support] factored memory leak

2009-10-14 Thread Adam Lopez
You may want to try changing the persistent cache options in Moses. By default Moses caches translation tables used for each sentence (to reduce disk IO and thus speed decoding), but if you have lots of translation/generation steps and big and/or complicated translation tables I suppose this could

Re: [Moses-support] Looking for non-CLI tool for aligning parallel text

2009-10-28 Thread Adam Lopez
? I am not bashing their authors, I am only surprised there weren't any authors of better programs... Catalin Braescu On Tue, Oct 27, 2009 at 9:57 PM, Adam Lopez alo...@inf.ed.ac.uk wrote: There are several of these around.  Note that I have not used any of them. http://www.cs.utah.edu

Re: [Moses-support] Looking for non-CLI tool for aligning parallel text

2009-10-28 Thread Adam Lopez
their authors, I am only surprised there weren't any authors of better programs... Catalin Braescu On Tue, Oct 27, 2009 at 9:57 PM, Adam Lopez alo...@inf.ed.ac.uk wrote: There are several of these around.  Note that I have not used any of them. http://www.cs.utah.edu/~hal/HandAlign/ http

Re: [Moses-support] modelling reordering in word alignment

2009-10-31 Thread Adam Lopez
This is a great list, but I would add Och & Ney (CL 2003), which, in addition to synthesizing the papers below, contains substantial discussion and comprehensive experimental results on the benefits of modeling reordering. http://aclweb.org/anthology-new/J/J03/J03-1002.pdf On Sat, Oct 31, 2009 at

Re: [Moses-support] About giza++ options when running moses

2009-12-10 Thread Adam Lopez
Yes, that's right. Model 6 is described in this journal article. http://aclweb.org/anthology-new/J/J03/J03-1002.pdf It also explains some of the other parameters and reasonable sequences of model iterations. Adam 2009/12/11 李贤华 08lixian...@gmail.com: hi all, About Giza++ options, I

Re: [Moses-support] Moses liscencing terms when used in a commercial product

2010-02-01 Thread Adam Lopez
According to the page on the FBK website, IRSTLM is LGPL. http://hlt.fbk.eu/en/irstlm On Mon, Feb 1, 2010 at 2:02 PM, Miles Osborne mi...@inf.ed.ac.uk wrote: and remember that randLM is GPL;  i suspect IrstLM is also GPL http://sourceforge.net/projects/randlm/

Re: [Moses-support] question about recombination when trying to output phrase lattices

2010-02-05 Thread Adam Lopez
Hi Kevin -- the answer, which you have already guessed, is 1. This is a pretty common optimization, see e.g. Zhifei Li's description in Section 4.3 of this paper: http://aclweb.org/anthology-new/W/W08/W08-0402.pdf Cheers Adam On Fri, Feb 5, 2010 at 12:47 PM, Kevin Gimpel kgim...@cs.cmu.edu

Re: [Moses-support] 2.718 in the phrase-table

2010-02-22 Thread Adam Lopez
Anecdotally, this feature also isn't especially important, see e.g.: http://www.mt-archive.info/AMTA-2006-Lopez.pdf On Mon, Feb 22, 2010 at 12:57 PM, Carlos Henriquez carlosalberto.henriq...@yahoo.com wrote: The last weight from the phrase-table corresponds to the phrase penalty as explained

Re: [Moses-support] different bleu scores from nist and moses scripts

2010-03-19 Thread Adam Lopez
IIRC, the principal difference is the calculation of the brevity penalty, but there also seem to be some slight differences in tokenization between the scripts. On Fri, Mar 19, 2010 at 9:32 AM, Mark Fishel fis...@ut.ee wrote: Dear list, I am getting different BLEU scores from the NIST mteval

Re: [Moses-support] develop step scores

2010-10-24 Thread Adam Lopez
In that case they are estimated dev set scores from the optimizer and not actual dev set scores (which can only be obtained after running the decoder). Adam On Sun, Oct 24, 2010 at 1:14 PM, Philipp Koehn pko...@inf.ed.ac.uk wrote: Hi, Yes, they are the ones after the parameter optimization before

[Moses-support] Fwd: PBSMT estimation

2010-10-26 Thread Adam Lopez
On Tue, Oct 26, 2010 at 9:43 AM, sa...@kortec.nl wrote: Hi, I have a question about PBSMT estimation. If I understand it correctly this is done in the following manner: - first IBM alignments in both directions - then an alignment heuristic such as grow-diag-final - from this we create all

[Moses-support] Fwd: Use of qsub array in moses-parallel.pl

2010-12-16 Thread Adam Lopez
Seems like a job for (ugh) autotools: it should be possible to add a configure flag so that you have a one-time, conditional modification of the script so that it either uses the feature or doesn't. On Thu, Dec 16, 2010 at 10:00 AM, Lane Schwartz dowob...@gmail.com wrote: Chris, That makes

Re: [Moses-support] Moses: 2 questions

2011-03-27 Thread Adam Lopez
It is easy to convert a STSG to a weakly equivalent SCFG, though. That's how many people deal with STSG in MT systems, and depending on what you want to do, it may be sufficient. On Sun, Mar 27, 2011 at 11:38 AM, Hieu Hoang hieuho...@gmail.com wrote: If you're creating a german-to-english system

Re: [Moses-support] Nondeterminism during decoding: same config, different n-best lists

2011-04-26 Thread Adam Lopez
Participants in this discussion from a few weeks ago will probably be interested in this upcoming ACL 2011 paper: http://www.cs.cmu.edu/~jhclark/pubs/significance.pdf Cheers Adam On Fri, Mar 25, 2011 at 8:49 AM, Lane Schwartz dowob...@gmail.com wrote: We know that there is nondeterminism during

Re: [Moses-support] is mert taking too much time here?

2011-05-30 Thread Adam Lopez
From the error it appears that you are running MERT on hundreds of thousands of sentences. It generally only needs ~1000 sentences. 2011/5/30 [Intra] Mariusz Hawryłkiewicz mari...@in-tra.com.pl: Hello Moses team! I just wanted to ask you if it’s normal for mert to run over 10 days on a

Re: [Moses-support] is mert taking too much time here?

2011-05-30 Thread Adam Lopez
using resources).  it would be interesting for someone with the time to run MERT with a drastically large tuning set. Miles On 30 May 2011 10:01, Adam Lopez alo...@inf.ed.ac.uk wrote: From the error it appears that you are running MERT on hundreds of thousands of sentences.  It generally only

Re: [Moses-support] A question on the calculation of lexical weighting

2011-06-01 Thread Adam Lopez
You can also simply throw away the observed alignments and compute the optimal alignment. But anecdotally, I haven't observed much difference between variants of lexical weighting. Adam On Wed, Jun 1, 2011 at 9:34 AM, ch c...@rax.ch wrote:  Indeed the default phrase scorer in the moses training

Re: [Moses-support] does mert usually enhance BLEU on a test set?

2011-07-07 Thread Adam Lopez
Another possibility is that the noise in the development set is simply that it has longer or shorter translations than the test set. The attached plot shows several variants of BLEU against many different systems obtained simply by varying the weight of the length feature, holding all others

Re: [Moses-support] does mert usually enhance BLEU on a test set?

2011-07-07 Thread Adam Lopez
Oops, linked to the wrong Jon Clark paper (but you should certainly read both of them). http://aclweb.org/anthology-new/P/P11/P11-2031.pdf On Thu, Jul 7, 2011 at 8:21 AM, Adam Lopez alo...@inf.ed.ac.uk wrote: Another possibility is that the noise in the development set is simply that it has

Re: [Moses-support] Distortion limit (-dl) and size of the extended search graph output (-osgx)

2011-07-12 Thread Adam Lopez
The size of the complete search space grows when you raise the distortion limit, but Moses is showing you the *pruned* search space. The pruned graph keeps a fixed number of hypotheses for each coverage cardinality, hence the number of states in the search graph should be roughly the same

Re: [Moses-support] phrase penalty (always exp(1) = 2.718) problem

2011-07-24 Thread Adam Lopez
These aren't really penalties; they are simply features that fire for every word of the output, and every phrase used to produce a translation, respectively. The weight of these features is normally set using minimum error rate training, in which case manual intervention is not required. Adam On
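To make the point concrete, here is a small sketch of the two features as plain counts in a log-linear model. A phrase-table entry of 2.718 is exp(1), so in log space it contributes exactly 1 per phrase used. The feature names, weights, and segmentation below are invented for illustration, and sign conventions differ between implementations.

```python
import math


def penalty_features(phrases):
    """Count-style features for one translation, given its target phrases."""
    return {
        "phrase_penalty": len(phrases),                      # fires once per phrase
        "word_penalty": sum(len(p.split()) for p in phrases),  # fires once per word
    }


def weighted_contribution(features, weights):
    """Contribution of these features to the log-linear model score."""
    return sum(weights[name] * value for name, value in features.items())


# Each phrase carries the constant exp(1); its log is the per-phrase feature value.
per_phrase_log_value = math.log(math.e)

# Hypothetical segmentation of an output sentence into two phrases.
features = penalty_features(["this is", "a small house"])
# Hypothetical weights; in practice these come out of tuning.
contribution = weighted_contribution(features, {"phrase_penalty": 0.2, "word_penalty": -0.3})
print(features, contribution)
```

A positive weight on the phrase count favours translations built from many short phrases and a negative one favours fewer, longer phrases; the word count plays the same role for output length, which is why tuning sets these weights rather than hand adjustment.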

[Moses-support] Ugandan languages MT

2014-06-30 Thread Adam Lopez
Hi -- Asking on behalf of a colleague: does anyone know of MT systems and/or parallel datasets for the languages of Uganda? (Swahili, Luganda, Soga, Karomojong, Alur, etc.) -Adam ___ Moses-support mailing list Moses-support@mit.edu

Re: [Moses-support] Major bug found in Moses

2015-06-20 Thread Adam Lopez
Can and should we make a wider effort to facilitate the reproduction of systems by disseminating settings or configuration files? This dissemination is partially done by system description papers, but they cannot cover all settings [this would make for a very boring paper]. I put some effort

Re: [Moses-support] MERT's Powell Search

2015-12-14 Thread Adam Lopez
> > On line 6 does the "score" in "compute line l: parameter value → score" > refer to (i) the MT evaluation metric score (e.g. BLEU) between the > translation and the reference sentence or (ii) nbest list weighted overall > score as we see in the last column of a moses generated nbest list (e.g.