Re: [Moses-support] Unique k-best lists in moses?

2015-03-12 Thread Chris Dyer
in Moses would involve more effort to implement for phrase-based than for syntax-based. Phil On 11 Mar 2015, at 20:03, Chris Dyer cd...@cs.cmu.edu wrote: Hi all, can anyone tell me if moses can be configured to generate unique k-best lists using the Huang et al. (2006) dynamic programming

[Moses-support] Unique k-best lists in moses?

2015-03-11 Thread Chris Dyer
Hi all, can anyone tell me if moses can be configured to generate unique k-best lists using the Huang et al. (2006) dynamic programming algorithm (Section 5.2 of http://www.cis.upenn.edu/~lhuang3/amta06-sdtedl.pdf)? Thanks-- Chris ___ Moses-support

Re: [Moses-support] Ugandan languages MT

2014-06-30 Thread Chris Dyer
We've put together a small corpus (15k sentences training, 2k each of dev/test) of Swahili-English which you can get here: http://demo.clab.cs.cmu.edu/cdyer/gv.sw-en.tar.gz It's roughly equivalent to the data used for the experiments reported in this paper:

Re: [Moses-support] lattices with EPSILON

2013-10-04 Thread Chris Dyer
It's useful to have epsilons since they simplify the creation of lattices in some cases. Yes, you can convert them to a deterministic equivalent, but that involves implementing FSA determinization (or using a tool like https://pypi.python.org/pypi/pyfst), which may not be convenient. Btw, I've

Re: [Moses-support] constrained decoding

2013-09-18 Thread Chris Dyer
http://i0.kym-cdn.com/photos/images/original/000/002/299/ze_house_.jpg On Wed, Sep 18, 2013 at 1:15 PM, Lane Schwartz dowob...@gmail.com wrote: Thanks! What happens if the reference isn't reachable? On Wed, Sep 18, 2013 at 4:50 AM, Hieu Hoang hieu.ho...@ed.ac.uk wrote: For anyone who's

Re: [Moses-support] Tuning and decoding of lattices in the new Moses.

2013-09-06 Thread Chris Dyer
Yes, there definitely should be a few checks in various places. I've got a list of recommendations to make lattice decoding a bit easier to get started with. We'll discuss this next week. -C On Fri, Sep 6, 2013 at 5:47 PM, Hieu Hoang hieuho...@gmail.com wrote: Good to know. I don't think it's

Re: [Moses-support] about using word lattices

2013-06-17 Thread Chris Dyer
demo, The id of node of 1 and 2 should be exchanged. But anyway, it doesn't matter right? I will just try feeding the input to moses without checking the format. Many thanks Best, Wei On Thu, Jun 6, 2013 at 5:19 PM, Chris Dyer cd...@cs.cmu.edu wrote: I think you've converted the lattice

Re: [Moses-support] Effect of tuning data size

2013-04-22 Thread Chris Dyer
The JHU summer workshop final report had some experiments on this: http://www.learningace.com/doc/3098660/be148017730f3f3a7b45d656276b482a/jhu-summer-workshop-final-report (See Fig. 6.7 and surrounding) In general: 1) MERT works on so few features that you don't need much dev data to learn them

Re: [Moses-support] mgiza: hillclimb different sums (-nan)

2013-04-15 Thread Chris Dyer
Giza? Who needs Giza? Consider this: http://www.ark.cs.cmu.edu/cdyer/fast_valign.pdf On Mon, Apr 15, 2013 at 7:58 AM, Hieu Hoang hieuho...@gmail.com wrote: You're unlikely to get a response to difficult questions on giza++/mgiza these days. IMO, the knowledge about them is being slowly lost as

Re: [Moses-support] statistical significance tests

2013-01-24 Thread Chris Dyer
If you're interested in statistical significance testing, you really ought to read the Clark et al. (2011) paper (http://www.cs.cmu.edu/~jhclark/pubs/significance.pdf). We showed that the Koehn technique and related methods can indicate significance for reasons that have little to do with the

Re: [Moses-support] BLEU/TER Scorer in c++

2012-09-13 Thread Chris Dyer
cdec has one in https://github.com/redpony/cdec/tree/master/mteval It depends only on the utils subfolder, so it's a bit easier to build than the whole decoder. The bleu implementation is in ns.cc, and there is also an implementation of TER in ns_ter.cc. -Chris On Thu, Sep 13, 2012 at 4:15 PM,

Re: [Moses-support] GIZA++ Question: Monotone Alignment possible?

2012-08-27 Thread Chris Dyer
, I would set this to 0 if i2real i1 (or perhaps i2real i1real ?). That may be all you need to do. -Chris On Mon, Aug 27, 2012 at 5:38 PM, Dario Ernst mo...@kanojo.de wrote: Hi Chris, On 08/24/2012 04:50 PM, Chris Dyer wrote: Although forcing monotone alignments sounds like a fairly large

Re: [Moses-support] GIZA++ Question: Monotone Alignment possible?

2012-08-24 Thread Chris Dyer
(in this case, a reverse alignment), the posterior distribution will also be 0. For this reason, if you initialize carefully, you'll be fine. On 08/24/2012 04:10 AM, Chris Dyer wrote: I think adding this would have tremendous value. Yay! I'm not alone ;P. Especially in conjunction with PISA

Re: [Moses-support] GIZA++ Question: Monotone Alignment possible?

2012-08-23 Thread Chris Dyer
It should be possible to adapt Giza's HMM implementation to produce monotone alignments. These are the changes that would be necessary (and which should be fairly easy, if you can figure out the code): 1) alignment distribution initialization. by default Giza initializes the HMM transition
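
The initialization idea in step 1 can be sketched as follows. This is not GIZA++ code: the function name `monotone_transitions`, the 0-based position indexing, and the choice to allow staying at the same position are all assumptions made for illustration. The point is only that backward jumps start at probability 0, and EM never moves a parameter away from 0.

```python
def monotone_transitions(n):
    """Build an n x n transition table p(cur | prev) for a monotone HMM aligner.

    Backward jumps (cur < prev) get probability 0; the remaining mass is
    spread uniformly over the positions prev..n-1 (an illustrative choice).
    """
    table = []
    for prev in range(n):
        allowed = n - prev  # number of positions that are not behind prev
        row = [0.0 if cur < prev else 1.0 / allowed for cur in range(n)]
        table.append(row)
    return table


if __name__ == "__main__":
    for row in monotone_transitions(4):
        print(["%.3f" % p for p in row])
```

Because the expected counts for a zero-probability transition are also zero, a table initialized this way stays monotone through every EM iteration.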

Re: [Moses-support] mkcls

2012-06-08 Thread Chris Dyer
Hi Marcin, mkcls is best understood as implementing the Brown et al (1992) clustering model (i.e., a bigram HMM with some extra hard constraints), although it uses a different algorithm for parameter learning than the algorithm proposed by Brown. Its performance has been analyzed and compared to

[Moses-support] Recommended citation for moses MERT?

2011-04-23 Thread Chris Dyer
Hi all, is there a particular citation that should be used for the moses MERT implementation? I seem to recall that there might have been a paper when the old c-mert got reimplemented, but I can't seem to find it. -C ___ Moses-support mailing list

Re: [Moses-support] Recommended citation for moses MERT?

2011-04-23 Thread Chris Dyer
}, url = {prague-mert.pdf} } On Sat, Apr 23, 2011 at 4:12 PM, Lane Schwartz dowob...@gmail.com wrote: Check the Prague Bulletin. I thought that's where they published the description. On Sat, Apr 23, 2011 at 2:35 PM, Chris Dyer cd...@cs.cmu.edu wrote: Hi all, is there a particular citation

Re: [Moses-support] Implementation of Lattice MERT

2011-03-25 Thread Chris Dyer
cdec (https://github.com/redpony/cdec) includes an implementation, called vest. But someone needs to write code that will cause moses to export its search lattices in the right format (which is a funny crappy json-based encoding). On Fri, Mar 25, 2011 at 2:59 PM, Lane Schwartz dowob...@gmail.com

Re: [Moses-support] producing the minimal number of LM-OOVs

2011-03-21 Thread Chris Dyer
On Mon, Mar 21, 2011 at 3:19 AM, Alex Fraser alexfra...@gmail.com wrote: 2) there seems to be some evidence that some translations in the phrase table are so bad that leaving some words untranslated is better than using what's in the phrase table. I can see an argument that says that

Re: [Moses-support] producing the minimal number of LM-OOVs

2011-03-21 Thread Chris Dyer
I allow pass through of all words, with a penalty that is also learned by MERT. Interesting stuff. Do you have results published on this? This was easiest to implement when I wrote cdec, and the results seemed good enough, so I never did a proper comparison. I will describe the newer innovation

Re: [Moses-support] producing the minimal number of LM-OOVs

2011-03-19 Thread Chris Dyer
I've started using an OOV feature (fires for each LM-OOV) together with an open-vocabulary LM, and found that this improves the BLEU score. Typically, the weight learned on the OOV feature (by MERT) is quite a bit more negative than the default amount estimated during LM training, but it is still

Re: [Moses-support] the insides of lattice decoding

2011-01-25 Thread Chris Dyer
Hi Sylvain, I've gone ahead and added the relevant function to WordLattice.h/cpp that should make it a bit easier to construct lattices programmatically. You'll need to encode them in the data type defined in PCNTools.h, which is basically a programmatic representation of the PLF format described

Re: [Moses-support] ' character in word lattices

2010-12-18 Thread Chris Dyer
You can escape it with a backslash: ((('\'',1,1),),) On Sat, Dec 18, 2010 at 7:47 AM, Mehmet Tatlıcıoğlu tatlicio...@gmail.com wrote: Hi, How can I put ' character as a label on an edge in word lattices? eg. if the label is test, then the lattice component is the form of ((('test', 1.0, 1),
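
A small sketch of the escaping described above, for anyone generating PLF from a script. The helper names `plf_label` and `single_edge_plf` are made up for this example. PLF looks like nested Python tuples, so `ast.literal_eval` is used here only as a convenient sanity check; it is not the parser Moses uses, and treating backslashes themselves as needing escapes is an assumption.

```python
import ast


def plf_label(word):
    """Quote a word as a PLF edge label, escaping backslashes and single quotes."""
    escaped = word.replace("\\", "\\\\").replace("'", "\\'")
    return "'" + escaped + "'"


def single_edge_plf(word, score=1.0):
    """Build a one-node lattice whose only edge carries `word` and advances one node."""
    return "(((%s,%s,1),),)" % (plf_label(word), score)


if __name__ == "__main__":
    line = single_edge_plf("'")
    print(line)
    # Round-trip check: the label read back is the bare apostrophe.
    print(ast.literal_eval(line)[0][0][0] == "'")
```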

Re: [Moses-support] Use of qsub array in moses-parallel.pl

2010-12-16 Thread Chris Dyer
Would it be possible to have some kind of flag that turns this on or off? For a variety of reasons I've been working with the same software in a bunch of different environments that are similar (but just different enough) that I found it useful to make the parts that deal with the cluster sort of

Re: [Moses-support] Lower scores with Word Lattice

2010-11-16 Thread Chris Dyer
I had a query with regard to use of lattice input in moses. There is a little difference in the translations generated when I run moses using the 'normal' input format and when I run it with 'lattice input' format. The translations weren't radically different - only a few phrases were

Re: [Moses-support] compound spiltting for German

2010-11-16 Thread Chris Dyer
I have some software that will generate splits from German language compounds: https://github.com/redpony/cdec/tree/master/compound-split/ It can produce either lattices of high probability splits or just a 1-best split. The model used is a conditional random field trained on a small amount

Re: [Moses-support] Proposal to replace vertical bar as factor delimeter

2010-11-15 Thread Chris Dyer
--factorDelimiter=| There is such a flag. I implemented this about 4 years ago, but AFAIK I'm the only one who ever uses (and I still use it). -C etc. Miles On 15 November 2010 21:30, Hieu Hoang hieuho...@gmail.com wrote: That's a good idea. In the decoder, there's 4 places that has to

Re: [Moses-support] Word lattice representation for Moses (PLF)

2010-10-18 Thread Chris Dyer
Hi Mehmet, The following lattice will do what you are asking for, I think: ((('x',0.5,1),('xy',0.5,2)),(('yz',1,2),),(('z',1,1),),) The trick is to use the last element of the tuples to indicate what node the edge ends up in. The first two nodes have single edges leaving, but the edges don't
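
To make the role of the last tuple element concrete, here is a short sketch that lists the edges of the lattice above. It assumes the last element is the number of nodes the edge jumps forward from its start node, and it leans on the fact that PLF happens to be valid Python literal syntax; `plf_edges` is a name invented for this example, not a Moses function.

```python
import ast

PLF = "((('x',0.5,1),('xy',0.5,2)),(('yz',1,2),),(('z',1,1),),)"


def plf_edges(plf):
    """Return (start_node, end_node, label, score) for every edge in a PLF string."""
    lattice = ast.literal_eval(plf)
    edges = []
    for start, outgoing in enumerate(lattice):
        for label, score, offset in outgoing:
            # The third field is relative: end node = start node + offset.
            edges.append((start, start + offset, label, score))
    return edges


if __name__ == "__main__":
    for edge in plf_edges(PLF):
        print(edge)
```

Under that reading, the lattice has two paths from node 0 to node 3: `x yz` (through node 1) and `xy z` (through node 2).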

Re: [Moses-support] Decoding lattice with moses_chart?

2010-08-24 Thread Chris Dyer
I don't know if moses's chart decoder supports lattices, but two other chart decoders, Joshua and cdec, do. On Tue, Aug 24, 2010 at 8:27 PM, Hwidong Na le...@postech.ac.kr wrote: Hi all, I want to decode an input lattice with moses_chart. When I switch the decoder from moses to moses_chart,

[Moses-support] Lattice (PLF) verifier

2010-08-15 Thread Chris Dyer
Hi moses users, This message is only of interest if you use Moses's word lattice translation features. Since Moses is hardly graceful when it encounters malformed lattice input, I've added a simple binary (moses-cmd/src/checkplf) that you can use to verify that your lattice inputs are both

Re: [Moses-support] SLF to PLF - tests - moses crashing

2010-08-09 Thread Chris Dyer
Yes, that will do it. On Mon, Aug 9, 2010 at 4:13 PM, Sylvain Raybaud sylvain.rayb...@loria.fr wrote: On Monday 09 August 2010 20:43:53 Chris Dyer wrote: I'm sorry I haven't had an opportunity to look into this yet (hopefully later this evening). But, one thing that you need to do is make

Re: [Moses-support] SLF to PLF: how to remove links

2010-08-06 Thread Chris Dyer
Moses interprets the string *EPS* as an epsilon transition in the lattice, which means it can take the transition (and use any associated features), but the translation model will ignore the transition. -Chris On Fri, Aug 6, 2010 at 11:02 AM, Sylvain Raybaud sylvain.rayb...@loria.fr wrote: There

Re: [Moses-support] converting lattice from HTK to PLF

2010-08-02 Thread Chris Dyer
If you do put together a script to convert to PLF (using the Moses documentation), this would be a valuable contribution to the moses code base. I'll be happy to answer questions about the lattice format as they come up, although I'm starting a new job this week and may be delayed in responding

Re: [Moses-support] Assertion weightAll.size() = weightAllOffset + numScoreComponent failed

2010-05-16 Thread Chris Dyer
That doesn't happen too often...Can you send along your moses.ini file? On Sun, May 16, 2010 at 7:17 PM, David Edelstein dedelst...@ucdavis.edu wrote: Hello, I have trained, tuned, and prepared an eval set, trying to get Moses to decode Arabic to English. However, trying with either filtered

Re: [Moses-support] Assertion weightAll.size() = weightAllOffset + numScoreComponent failed

2010-05-16 Thread Chris Dyer
Jie's assessment is correct, your moses.ini is missing a value under the [weight-t] section. There should be 5 values there, but instead there are 4. I'm not familiar with the MERT implementation that is bundled with moses these days, so I can't really tell you where to look, but it should
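
A quick way to catch this kind of mismatch before decoding is to count the values under a section. The sketch below assumes the older moses.ini layout, with a `[section]` header followed by one value per line; the sample file contents and the helper name `section_values` are invented for illustration.

```python
def section_values(ini_text, section):
    """Collect the non-empty, non-comment lines listed under [section]."""
    values = []
    in_section = False
    for raw in ini_text.splitlines():
        line = raw.strip()
        if line.startswith("[") and line.endswith("]"):
            in_section = (line == "[" + section + "]")
            continue
        if in_section and line and not line.startswith("#"):
            values.append(line)
    return values


# Made-up example: only four translation-model weights are listed.
SAMPLE_INI = """
[weight-d]
0.6

[weight-t]
0.2
0.2
0.2
0.2

[weight-w]
-1
"""

if __name__ == "__main__":
    expected = 5  # the count the message says should be present
    found = len(section_values(SAMPLE_INI, "weight-t"))
    if found != expected:
        print("[weight-t] has %d values, expected %d" % (found, expected))
```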

Re: [Moses-support] Adding sentence-level flag features

2010-03-25 Thread Chris Dyer
Moses uses features to discriminate between alternative translations of individual sentences, so if the value is constant for all possible translations (for example, because it is a function of the input), the model won't be able to take advantage of it. It sounds like you might be proposing
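
The argument can be checked with a toy linear model. Everything here is hypothetical (the candidate translations, feature values, and the function name `best`); the sketch only shows that a feature whose value is the same for every candidate shifts all scores equally and so cannot change which candidate wins.

```python
def best(candidates, weights, sentence_feature=0.0, sentence_weight=0.0):
    """Return the highest-scoring candidate under a linear model.

    `sentence_feature` depends only on the input, so it contributes the
    same amount to every candidate's score.
    """
    def score(features):
        model = sum(w * f for w, f in zip(weights, features))
        return model + sentence_weight * sentence_feature

    return max(candidates, key=lambda name: score(candidates[name]))


# Made-up candidates with two per-translation feature values each.
CANDIDATES = {
    "translation a": [-2.0, -1.5],
    "translation b": [-1.0, -3.0],
    "translation c": [-2.5, -0.5],
}

if __name__ == "__main__":
    weights = [0.7, 0.3]
    for w in (0.0, 1.0, -50.0):
        print(w, best(CANDIDATES, weights, sentence_feature=1.0, sentence_weight=w))
```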

Re: [Moses-support] Adding sentence-level flag features

2010-03-25 Thread Chris Dyer
any suggestions you have. Suzy On 26/03/2010, at 11:32 AM, Chris Dyer wrote: Moses uses features to discriminate between alternative translations of individual sentences, so if the value is constant for all possible translations (for example, because it is a function of the input

Re: [Moses-support] segmentation fault with lattice decoding

2010-03-14 Thread Chris Dyer
Oh right, I had completely forgotten about this. With non-lattice input, there is some logic that looks for phrases only up to a certain max phrase size. However, this does not work with lattices and must be disabled. I usually set the max phrase size to be 10 or something like that, which

Re: [Moses-support] search graph to word lattice

2010-03-04 Thread Chris Dyer
produce who is the bill which is not necessarily an option ... Thanks a lot for clarifying this to me! Jörg Chris Dyer wrote: As long as you're just splitting, keeping the weights consistent isn't too hard- just keep all the weight in one segment

Re: [Moses-support] segmentation fault with lattice decoding

2010-03-04 Thread Chris Dyer
or directory. in /usr/lib/gcc/x86_64-linux-gnu/4.1.2/../../../../include/c++/4.1.2/ext/new_allocator.h Indeed, the header file does not exist on my system. Do I need to install some additional packages and re-compile Moses in a certain way to get rid of this error? Jörg Chris Dyer

Re: [Moses-support] segmentation fault with lattice decoding

2010-03-04 Thread Chris Dyer
on the sentence that gave you the problems to see if you can reproduce the problem. regards Barry On Thursday 04 March 2010 15:24, Chris Dyer wrote: I'm not certain what's causing this.  From the part of the stack trace you're showing, it looks like it's probably when translations options

Re: [Moses-support] search graph to word lattice

2010-03-01 Thread Chris Dyer
to each part of the split ? Maybe it has not any real impact in the end, or has it ? Loïc 2010/3/1 Chris Dyer redp...@umd.edu I guess word-graph doesn't split phrases either (I was just guessing).  It appears to be in SLF format, which is used by a number of tools (like HTK and the SRI tools

Re: [Moses-support] moses_chart: tuning with mert-moses-new.pl doesn't change the moses.ini

2010-02-02 Thread Chris Dyer
For French-English translation, hierarchical models will probably not do any better than phrase-based models. The rule of thumb seems to be that language pairs with large-scale reordering fare better with hierarchical models, but language pairs with only relatively local reordering are better with

Re: [Moses-support] word lattice + multiple translation tables optimization problem

2010-01-22 Thread Chris Dyer
I think the issue here has to do with the MERT configuration. When you use lattice input, the weights on the lattice edges are included in the translation model as a feature that MERT can optimize. I'm not familiar with the MERT optimizers that are included with Moses, but it sounds like it is

Re: [Moses-support] Stack smashing detected, died with signal 6

2010-01-21 Thread Chris Dyer
Stack smashing- that's a new one for Giza! Are you using the version of giza++ from google code (http://code.google.com/p/giza-pp/)? Some older versions had a few uninitialized variables that could conceivably cause crashes on some architectures. 2010/1/21 Guillem Massó Sanabre

Re: [Moses-support] Format of phrase reordering file extract.o.gz

2009-11-11 Thread Chris Dyer
Hi John- The first label is the orientation of the phrase pair with respect to its left context (on the source side), and the second is the orientation with respect to its right context. That's why you have to have swap other or other swap, since a phrase can only be inverted on one side. Hope

Re: [Moses-support] The flag -early-discarding-threshold in moses

2009-11-09 Thread Chris Dyer
This functionality is broken in the tip of the trunk. There was a project last January to change the way hypothesis scoring was done to be more flexible, and that broke this. It needs to be fixed. One alternative is to roll back to the version of the code that was at the tip of the trunk in

Re: [Moses-support] modelling reordering in word alignment

2009-11-04 Thread Chris Dyer
results on the benefits of modeling reordering. http://aclweb.org/anthology-new/J/J03/J03-1002.pdf On Sat, Oct 31, 2009 at 7:56 PM, Chris Dyer redp...@umd.edu wrote: Modeling reordering is usually helpful, even during alignment.  This is especially true for lexical translation models (where words

Re: [Moses-support] Giza++ segv

2009-09-14 Thread Chris Dyer
Is it possible that you have a sentence of length zero? size2 is, I believe, one of the dimensions of the trellis, which in one direction is the source sentence length and in the other is the target sentence length. On Mon, Sep 14, 2009 at 1:28 AM, John Kolen johnfko...@gmail.com wrote: I'm

Re: [Moses-support] Looking for text corpora

2009-09-06 Thread Chris Dyer
This was recently announced on the corpora list: http://www.uncorpora.org/ -Chris On Sun, Sep 6, 2009 at 1:36 PM, Catalin Braescu cata...@braescu.com wrote: Thanks, Miles! From your link I got http://www.statmt.org/europarl/ Any other such goodies? Catalin -- Omlulu.com On Sun, Sep

Re: [Moses-support] Word lattice distortion cost

2009-07-20 Thread Chris Dyer
This is probably a problem with the regression test. The two conditions ought to be identical, as you expect. However, keep in mind that the distortion model is incredibly weak, and the heuristic distance definition used in lattice decoding is also just an approximation, so an off-by-one error

Re: [Moses-support] Fwd: alignment problem

2009-06-18 Thread Chris Dyer
The alignment models are going to struggle quite a bit when the source to target length ratio is so skewed. I would recommend finding a way to retokenize/resegment the source and/or target language so as to induce a more even ratio. If this isn't possible, you may need to look into custom

Re: [Moses-support] MERT - optimal weights out of specified ranges?

2009-06-04 Thread Chris Dyer
Hi Thang, The ranges that are specified for moses are just suggestions for random starting points for the MERT algorithm. However, it may (and often does) find weights that end up outside of these ranges. -Chris On Thu, Jun 4, 2009 at 2:32 AM, Thang Luong Minh luong.m.th...@gmail.com wrote:

Re: [Moses-support] giza-pp in train-factored-phrase-model.perl script

2009-05-14 Thread Chris Dyer
These warnings aren't too serious. The alignments should be fine. Chris On Thu, May 14, 2009 at 8:56 PM, Tom Hoar tah...@gmail.com wrote: During the train-factored-phrase-model.perl script of my character-by-character alignment, giza-- reported the errors below. Will this trained model be

Re: [Moses-support] GIZA++ Configuration for HMM Alignment

2009-04-18 Thread Chris Dyer
There's actually an option for this in train-factored-phrase-model.perl. Just specify --hmm and it will automatically set the Giza++ options appropriately. -Chris 2009/4/18 Manoj C (మనోజ్ చిన్నకోట్ల) manoj.chinnako...@gmail.com: Hi All, I am training a standard translation model using moses

Re: [Moses-support] Error when run moses with lattices format as input

2009-04-16 Thread Chris Dyer
You need to add a -weight-i flag to the command line which specifies how much weighting to apply to the arc feature. e.g.: moses ... -weight-i 0.5 -Chris On Thu, Apr 16, 2009 at 9:58 AM, Nguyen Manh Hung manhh...@cl.ics.tut.ac.jp wrote: Hi, I'm using Moses to decode with lattices format as

Re: [Moses-support] Error when run moses with lattices format as input

2009-04-16 Thread Chris Dyer
Can you send me a stack trace for where the SEGV is happening? Once the phrase table has been binarized, there's no need to have any special temporary space. On Tue, Apr 28, 2009 at 10:46 AM, Nguyen Manh Hung manhh...@cl.ics.tut.ac.jp wrote: Chris Dyer wrote: You need to add a -weight-i

Re: [Moses-support] Error when run moses with lattices format as input

2009-04-16 Thread Chris Dyer
On Thu, 2009-04-16 at 11:34 -0400, Chris Dyer wrote: Can you send me a stack trace for where the SEGV is happening? Once the phrase table has been binarized, there's no need to have any special temporary space. On Tue, Apr 28, 2009 at 10:46 AM, Nguyen Manh Hung manhh...@cl.ics.tut.ac.jp

[Moses-support] New release of Giza++

2009-03-20 Thread Chris Dyer
Hi all- There's a new release of GIZA++ available from http://code.google.com/p/giza-pp/ . The changes address build issues on a variety of platforms and compilers, including: - better adherence to c++ header naming conventions (fixes build problems on gcc 3.4) - autodetection of MacOSX

Re: [Moses-support] Error in running moses with randlm

2009-03-04 Thread Chris Dyer
Yeah, sorry about this- I broke moses, at least for certain compilers. I'll fix it shortly. -Chris On Wed, Mar 4, 2009 at 12:17 PM, Miles Osborne mi...@inf.ed.ac.uk wrote: ok, it seems that the most recent version of Moses had a bad commit and broke the language model interface.  so, this is

Re: [Moses-support] Parallelising Giza++ for supercomputers

2009-02-20 Thread Chris Dyer
Another architecture to consider is storing/distributing the ttable from a single central repository. Most of the ttable is full of crap, and for each sentence, you know exactly what parameters will be required in advance of running your E step. However, by not distributing stuff that you don't

Re: [Moses-support] Beam thresholding

2009-02-06 Thread Chris Dyer
One way to do it is to just set a really high number for the threshold. The maximum ceiling used by moses for a feature value is 100, and then pick the largest total sum that your feature weights can have, double it (since you may have negative feature values), and set that... There may be an

Re: [Moses-support] 'proper' conditioning in phrase extract

2008-11-21 Thread Chris Dyer
Hi Ondrej, See below. And one additional question: when extracting phrases, phrase-extract actually extracts all phrases that *are not incompatible* with the alignment. I'm thinking about a different method: just phrases that *are 'strictly' compatible*, which means I would extract: a=A

Re: [Moses-support] How to solve? make GIZA++ TTables.cc:39 error: too few templ

2008-10-17 Thread Chris Dyer
Hi, There's a known bug with certain versions of g++ that GIZA++ hits. If feasible, you might switch to a different version (4.0 works for sure). If not, send your machine architecture/OS type and perhaps someone will have a binary they can provide. Thanks, Chris On Fri, Oct 17, 2008 at 1:43

Re: [Moses-support] giza model 1 uniform initialization

2008-08-10 Thread Chris Dyer
normalizeTable I use a more straightforward method (below) which yields in different output, could someone elaborate? Mycode: Probability(f|e) = Count(e,f) / count (e) The code that normalizes the counts to probabilities is in TTable::normalizeTable. But, for the initialization, it
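
For reference, the relative-frequency normalization quoted above can be sketched as below. This is not the code in `TTable::normalizeTable`; it is a plain reading of Probability(f|e) = Count(e,f) / count(e), taking count(e) to be the sum of Count(e,f) over all f, which is an assumption. The word pairs and counts are invented.

```python
from collections import defaultdict


def normalize(counts):
    """Turn {(e, f): count} into {(e, f): P(f | e)} by dividing by count(e)."""
    totals = defaultdict(float)
    for (e, _f), c in counts.items():
        totals[e] += c
    return {
        (e, f): c / totals[e]
        for (e, f), c in counts.items()
        if totals[e] > 0  # skip source words with no mass at all
    }


if __name__ == "__main__":
    counts = {
        ("house", "haus"): 3.0,
        ("house", "gebaeude"): 1.0,
        ("the", "das"): 2.0,
    }
    for pair, prob in sorted(normalize(counts).items()):
        print(pair, prob)
```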

Re: [Moses-support] Non-deterministic GIZA?

2008-07-16 Thread Chris Dyer
There's been a recent release of GIZA (July 8) that fixes some potential sources of non-determinism, specifically relating to how distortion models (model 2 or the HMM) get initialized. When did you download it from http://code.google.com/p/giza-pp/ ? --Chris On Wed, Jul 16, 2008 at 6:35 PM,

Re: [Moses-support] Cannot make GIZA++ executable

2008-07-07 Thread Chris Dyer
Some versions of the g++ compiler (3.4.x I think) have a bug that prevents this code from compiling properly. Working around the bug is not straightforward, so trying a different version of the compiler is probably your best bet. Chris On Mon, Jul 7, 2008 at 10:00 PM, Vineet Kashyap [EMAIL

Re: [Moses-support] confusion network vs. 1-best

2008-05-12 Thread Chris Dyer
Hi Hu- What tool are you using to generate the confusion networks? If you use SRILM's lattice-tool, you should make sure to set the LM weight and AM weights to something appropriate. Chris On Sun, May 11, 2008 at 11:33 PM, Hu Xiaoguang [EMAIL PROTECTED] wrote: hi, all I did some experiments

Re: [Moses-support] Giza HMM errors - NAN

2008-03-25 Thread Chris Dyer
:02 AM, John D. Burger [EMAIL PROTECTED] wrote: Chris Dyer wrote: I haven't looked into what's causing the particular problem on this corpus, but another known problem with the GIZA HMM model is that it doesn't do a fairly standard kind of normalization in the forward-backward training

Re: [Moses-support] lowercasing/recasing

2008-03-05 Thread Chris Dyer
There have been some advocates of preserving case information as you describe, although I've only seen them discussed in the context of small-coverage systems, such as in the IWSLT task. See, for example, the system description of the Carnegie Mellon Univ system from 2006's IWSLT entry:

Re: [Moses-support] lowercasing/recasing

2008-03-05 Thread Chris Dyer
Faced with improper input, would it not make more sense to try and fix it in the source language before translation, rather than distorting the translation with the induced errors, then trying to fix the translation ? That could be an interesting experiment. The transformation one

Re: [Moses-support] Giza HMM errors - NAN

2008-02-28 Thread Chris Dyer
I haven't looked into what's causing the particular problem on this corpus, but another known problem with the GIZA HMM model is that it doesn't do a fairly standard kind of normalization in the forward-backward training, which causes underflow errors in some sentences (especially quite long
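
The kind of normalization referred to here is usually done by rescaling the forward probabilities at every time step and accumulating the logs of the scale factors. The sketch below shows that technique on a generic discrete HMM; it is not GIZA's alignment HMM, and the function name and toy parameters are assumptions for illustration.

```python
import math


def forward_log_likelihood(init, trans, emit, obs):
    """Scaled forward pass for a discrete HMM.

    init[s]     : probability of starting in state s
    trans[r][s] : probability of moving from state r to state s
    emit[s][o]  : probability of state s emitting symbol o
    Returns log P(obs) without ever forming the tiny unscaled product.
    """
    n = len(init)
    alpha = [init[s] * emit[s][obs[0]] for s in range(n)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [
                sum(alpha[r] * trans[r][s] for r in range(n)) * emit[s][obs[t]]
                for s in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # the observation is impossible under the model
        alpha = [a / scale for a in alpha]  # renormalize so alpha sums to 1
        log_lik += math.log(scale)
    return log_lik


if __name__ == "__main__":
    init = [0.5, 0.5]
    trans = [[0.9, 0.1], [0.1, 0.9]]
    emit = [[0.5, 0.5], [0.5, 0.5]]
    obs = [0, 1] * 2500  # 5000 symbols: the raw likelihood underflows to 0.0
    print(forward_log_likelihood(init, trans, emit, obs))
```

Without the rescaling step, the forward values for a sequence this long underflow to zero in double precision, which is one common route to NaNs once they are used as divisors.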

Re: [Moses-support] [Fwd: Run mert-moses.pl with confusion network]

2008-02-27 Thread Chris Dyer
/column: nan avg. cols/sent:nan Let me know if I made mistake somewhere. Thanks Linh Chris Dyer wrote: I am still confused about the lattice format, In your examples: 1 ((('A',1.0,1),),(('B',1.0,1),),) 2 ((('A',1.0,1),('Z',1.0,2),),(('B',1.0,1),),(('C',1.0,1

Re: [Moses-support] [Fwd: Run mert-moses.pl with confusion network]

2008-02-20 Thread Chris Dyer
The lattice format isn't documented yet on the webpage, but you can see some examples of it in the lattice-distortion test directory Hieu mentions. It should be fairly straightforward to decipher. Since this format encodes a single lattice/CN per line of text, it can be used easily with MER

Re: [Moses-support] Minor GIZA bug

2008-02-15 Thread Chris Dyer
Hi Qin- Thanks for letting me know about this problem. I'll submit your recommended fix. I'm not completely familiar with the GIZA implementation of the HMM model, but this seems reasonable enough. Chris On Fri, Feb 15, 2008 at 4:40 PM, Qin Gao [EMAIL PROTECTED] wrote: Hi All, I found a

Re: [Moses-support] GIZA++ nan errors

2008-01-29 Thread Chris Dyer
What phase of GIZA training is this occurring in? GIZA runs several iterations in several stages, Model 1, HMM, Model 3, Model 4, etc. And what is the very first sign of trouble you see? Are there any errors before this? This problem generally means that you're trying to model something that