Re: [Moses-support] RandLM make Error

2014-11-20 Thread Miles Osborne
LDHT is not really supported, but looking at your error message it seems that you need to install Google Sparse Hash. On Wed Nov 19 2014 at 12:47:27 PM Hieu Hoang hieu.ho...@ed.ac.uk wrote: There is a script within the randlm project that compiles just the library needed to integrate the

Re: [Moses-support] embeddings

2014-07-02 Thread Miles Osborne
I would model them as feature functions over phrases. You might imagine that you can exploit vector similarity to do smoothing. Good luck Miles

Re: [Moses-support] Moses-support Digest, Vol 91, Issue 52

2014-05-30 Thread Miles Osborne
this perl snippet: $line =~ tr/\040-\176/ /c; On 30 May 2014 12:17, moses-support-requ...@mit.edu wrote: Send Moses-support mailing list submissions to moses-support@mit.edu To subscribe or unsubscribe via the World Wide Web, visit

Re: [Moses-support] Moses-support Digest, Vol 91, Issue 52

2014-05-30 Thread Miles Osborne
gonzález to gonz lez On 30 May 2014 17:22, Miles Osborne mi...@inf.ed.ac.uk wrote: this perl snippet: $line =~ tr/\040-\176/ /c; On 30 May 2014 12:17, moses-support-requ...@mit.edu wrote: Send Moses-support mailing list submissions to moses-support@mit.edu
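For readers who do not use Perl, here is a rough Python sketch of what the tr/\040-\176/ /c snippet does: every character outside printable ASCII (octal 040 to 176) becomes a space. The function name and the optional replacement argument are illustrative assumptions, not part of the original message; the sketch works on decoded characters, so a byte-oriented run over UTF-8 input could yield two spaces for one accented letter.

```python
import re

# Anything outside printable ASCII (space through '~', octal 040-176).
# Note that this also matches tabs and newlines, as the Perl tr/// does.
_NON_PRINTABLE = re.compile(r"[^\x20-\x7e]")


def replace_non_printable(line: str, replacement: str = " ") -> str:
    """Replace each character outside printable ASCII with `replacement`."""
    return _NON_PRINTABLE.sub(replacement, line)


print(replace_non_printable("gonz\u00e1lez"))       # gonz lez
print(replace_non_printable("gonz\u00e1lez", "?"))  # gonz?lez
```

Passing "?" as the replacement gives the question-mark variant mentioned later in the thread.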

Re: [Moses-support] Moses-support Digest, Vol 91, Issue 52

2014-05-30 Thread Miles Osborne
friday nlp malaise On 30 May 2014 17:51, Miles Osborne mi...@inf.ed.ac.uk wrote: it is trivial to change it to say a ? mark. but I'm not sure what you want as output now. the original request was for removing non-printable characters, which the Perl does, Miles On 30 May 2014 12:43

Re: [Moses-support] Perplexity KenLM

2014-05-16 Thread Miles Osborne
you can get kenlm to report perplexity as follows: bin/query foo.arpa text | tail -n 1 note that you need to be careful with OOVs if you are comparing models that do not all use the same vocabulary. (SRILM is broken in this respect in that an OOV will give you a probability of one) Miles --
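To see why OOV handling matters when comparing models, here is a minimal sketch that computes perplexity from per-token log10 probabilities, with and without the OOV tokens. The function name, the input lists and the numbers are hypothetical; this is not KenLM's code, only the standard perplexity formula.

```python
def perplexity(log10_probs, oov_flags, include_oov=True):
    """Perplexity = 10 ** (-average log10 probability per scored token)."""
    kept = [lp for lp, is_oov in zip(log10_probs, oov_flags)
            if include_oov or not is_oov]
    if not kept:
        raise ValueError("no tokens left to score")
    return 10 ** (-sum(kept) / len(kept))


# Hypothetical scores for a three-token text whose last token is an OOV.
log10_probs = [-1.0, -2.0, -5.0]
oov_flags = [False, False, True]

print(perplexity(log10_probs, oov_flags, include_oov=True))
print(perplexity(log10_probs, oov_flags, include_oov=False))
```

A model that silently gives OOVs a probability of one contributes a log probability of zero for them, which lowers the reported perplexity and makes the model look better than it is.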

Re: [Moses-support] about testing on part of training dataset

2013-12-21 Thread Miles Osborne
SMT systems such as Moses do not guarantee that they can reproduce the training set. For example, phrases might be pruned due to frequencies being too low, not all words might be aligned, the decoder might discard the true translation during search, etc. This doesn't really have much to do with

Re: [Moses-support] incremental training

2013-10-30 Thread Miles Osborne
Incremental training in Moses is based upon work we did a few years back: http://homepages.inf.ed.ac.uk/miles/papers/naacl10b.pdf Table 3 shows that there is essentially no quality difference between incremental training and standard GIZA++ training. incremental (re) training is a lot faster.

Re: [Moses-support] compile error with LDHT in randlm

2013-09-25 Thread Miles Osborne
If I recall the decoder was modified to allow batching of LM requests. Miles On 25 September 2013 10:22, Hieu Hoang hieuho...@gmail.com wrote: I'm not sure how to compile LDHT but when i compiled randlm from svn, i had to change 2 minor things to get it to compile on my mac: 1.

Re: [Moses-support] compile error with LDHT in randlm

2013-09-25 Thread Miles Osborne
this functionality. Cheers, Lane On Wed, Sep 25, 2013 at 10:24 AM, Miles Osborne mi...@inf.ed.ac.uk wrote: If I recall the decoder was modified to allow batching of LM requests. Miles On 25 September 2013 10:22, Hieu Hoang hieuho...@gmail.com wrote: I'm not sure how to compile LDHT but when i compiled

Re: [Moses-support] mosese decoder android and ios porting

2012-11-27 Thread Miles Osborne
For a long time now I've wanted to see Moses on a small device. Apart from all of the extra functionality that isn't needed, one would also need to work on shrinking the phrase table and perhaps also the search graph. KenLM / RandLM already deal with making the language model smaller. An

Re: [Moses-support] Including new features in moses decoding

2012-07-25 Thread Miles Osborne
this is a fairly typical result for MERT. i notice you are using MIRA, which is claimed to be more reliable. see http://www.aclweb.org/anthology/N/N09/N09-1025.pdf note that getting MIRA to work takes a lot of tweaking, so read the fine print carefully Miles On 25 July 2012 17:24, Cristina

Re: [Moses-support] Including new features in moses decoding

2012-07-25 Thread Miles Osborne
the way weights are estimated, translation changes when I add new features with zero weight (not in development but in test). They shouldn't contribute to score the final translation, right? Cristina On Wed, 25 Jul 2012, Miles Osborne wrote: this is a fairly typical result for MERT. i notice

Re: [Moses-support] Including new features in moses decoding

2012-07-25 Thread Miles Osborne
then something is wrong Miles On 25 July 2012 19:42, Cristina cristi...@lsi.upc.edu wrote: mmm... but the others were optimised altogether, without the new ones I'm giving a weight zero... On Wed, 25 Jul 2012, Miles Osborne wrote: if you have non-zero feature values at training time

Re: [Moses-support] Fwd: a question about moses

2012-05-01 Thread Miles Osborne
The standard way to do this is pretend that each word pair in a dictionary is a little sentence. Append this to the usual parallel corpus and train with Giza Miles On May 1, 2012 5:53 PM, Abby Levenberg leven...@gmail.com wrote: Hi, I assume the answer is no but wanted to be sure. Thanks,

Re: [Moses-support] Higher BLEU/METEOR score than usual for EN-DE

2012-04-26 Thread Miles Osborne
Very short sentences will give you high scores. Also multiple references will boost them Miles On Apr 26, 2012 8:13 PM, John D Burger j...@mitre.org wrote: I =think= I recall that pairwise BLEU scores for human translators are usually around 0.50, so anything much better than that is indeed

Re: [Moses-support] Evaluation

2012-04-20 Thread Miles Osborne
no it works as I just verified. On 20 April 2012 11:29, sara hamza sarahamz...@gmail.com wrote: Good morning everyone, can anyone tell me please where I can get the mteval-v11b.pl used in evaluation? I found this URL in some documentation: ftp://

Re: [Moses-support] Incremental training

2012-02-21 Thread Miles Osborne
incremental training for Giza is distinct from incremental training for the language model. we have worked on both --see Abby Levenberg's PhD http://homepages.inf.ed.ac.uk/miles/phd-projects/levenberg.pdf the short answer is yes, but I don't think the incremental LM code has migrated from

Re: [Moses-support] Remote LM protocol?

2012-02-14 Thread Miles Osborne
Oliver is in the process of finishing it. Miles On Feb 14, 2012 3:45 PM, Lane Schwartz dowob...@gmail.com wrote: Miles, Just ran across this email and thought I'd follow up. How is this coming along? :) Cheers, Lane On Thu, Nov 17, 2011 at 11:31 AM, Miles Osborne mi...@inf.ed.ac.uk

Re: [Moses-support] Remote LM protocol?

2012-02-14 Thread Miles Osborne
:33 AM, Miles Osborne mi...@inf.ed.ac.uk wrote: Oliver is in the process of finishing it. Miles On Feb 14, 2012 3:45 PM, Lane Schwartz dowob...@gmail.com wrote: Miles, Just ran across this email and thought I'd follow up. How is this coming along? :) Cheers, Lane

[Moses-support] New multi-parallel corpus available (Indic Languages)

2012-01-24 Thread Miles Osborne
source segments was translated redundantly by four different Turkers. Note that we have translated paragraphs, so the data should be of interest to researchers looking at discourse as well as machine translation. http://homepages.inf.ed.ac.uk/miles/babel.html Miles Osborne (Edinburgh) Chris Callison

Re: [Moses-support] Filtering LMs

2011-11-24 Thread Miles Osborne
this can be done, but it tends to not save much space. also it does not help deal with OOVs, which the language model can still score even though they are not in the parallel set. if you are worried about saving space then you should either look at KenLM or RandLM Miles On 24 November 2011

Re: [Moses-support] Randomisation by MGIZA and tuning result is worse than no tuning

2011-11-22 Thread Miles Osborne
--in general, Machine Translation training is non-convex.  this means that there are multiple solutions and each time you run a full training job, you will get different results.  in particular, you will see different results when running Giza++ (any flavour) and MERT. Is there no way to

Re: [Moses-support] Various questions about training and tuning

2011-11-18 Thread Miles Osborne
re: not tuning on training data, in principle this shouldn't matter (especially if the tuning set is large and/or representative of the task). in reality, Moses will assign far too much weight to these examples, to the detriment of the others. (it will drastically overfit). this is why the

Re: [Moses-support] Remote LM protocol?

2011-11-17 Thread Miles Osborne
we have been working on making distributed LMs efficient. stay tuned Miles On 17 November 2011 13:53, Hieu Hoang hieuho...@gmail.com wrote: hi peter i think christian federmann worked on the remote LM :

Re: [Moses-support] Multi-run mert to average non-deterministic results

2011-11-07 Thread Miles Osborne
Question: do you think it's better to run mert-moses.pl more times with smaller sets, or fewer times with larger sets? you should run tuning with larger sets, multiple times. no amount of rerunning tuning on a small set will tell you anything Miles On 7 November 2011 13:45, Tom Hoar

Re: [Moses-support] Multi-threading / Boost lib / compile error for threaded Moses

2011-09-22 Thread Miles Osborne
that doesn't work, as all of the locking code etc would still be invoked. you really want something like --threads 0 which should bypass everything and truly run in single threaded mode Miles On 22 September 2011 10:26, Kenneth Heafield mo...@kheafield.com wrote: -threads 1 ? On 09/22/11

Re: [Moses-support] Multi-threading / Boost lib / compile error for threaded Moses

2011-09-22 Thread Miles Osborne
compile-time does a better job at meeting a goal that I don't buy into. On 09/22/11 10:31, Miles Osborne wrote: that doesn't work, as all of the locking code etc would still be invoked. you really want something like --threads 0 which should bypass everything and truly run in single threaded

Re: [Moses-support] Multi-threading / Boost lib / compile error for threaded Moses

2011-09-22 Thread Miles Osborne
On 22 September 2011 11:28, Kenneth Heafield mo...@kheafield.com wrote: But I don't see a use case for it.  I can run gdb just fine on a multithreaded program that happens to be running one thread.  And the stderr output will be in order. On 09/22/11 11:21, Miles Osborne wrote: should someone

Re: [Moses-support] Phrase probabilities

2011-09-20 Thread Miles Osborne
exactly, the only correct way to get real probabilities out would be to compute the normalising constant and renormalise the dot products for each phrase pair. remember that this is best thought of as a set of scores, weighted such that the relative proportions of each model are balanced Miles
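A minimal sketch of the renormalisation described above, assuming each candidate translation of one source phrase has a list of log feature values and the model weights are known. The candidate phrases, feature values and weights below are invented for illustration; the normalising constant is simply the sum of the exponentiated weighted scores over all candidates.

```python
import math


def renormalise(candidates, weights):
    """Turn weighted dot products into probabilities over the candidates.

    candidates: dict mapping target phrase -> list of log feature values
    weights:    list of model weights, one per feature
    """
    scores = {target: sum(w * f for w, f in zip(weights, feats))
              for target, feats in candidates.items()}
    top = max(scores.values())  # subtract the max for numerical stability
    z = sum(math.exp(s - top) for s in scores.values())  # normalising constant
    return {target: math.exp(s - top) / z for target, s in scores.items()}


# Hypothetical phrase-table scores for one source phrase.
candidates = {
    "house": [math.log(0.6), math.log(0.5)],
    "home": [math.log(0.3), math.log(0.4)],
}
probs = renormalise(candidates, weights=[0.2, 0.3])
print(probs)
```

Without this step the weighted scores only rank the candidates; they do not sum to one.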

Re: [Moses-support] Phrase probabilities

2011-09-20 Thread Miles Osborne
question is: What is that metric best indicative of? -- Taylor Rose Machine Translation Intern Language Intelligence IRC: Handle: trose     Server: freenode On Tue, 2011-09-20 at 16:14 +0100, Miles Osborne wrote: exactly,  the only correct way to get real probabilities out would be to compute

Re: [Moses-support] build 5 gram with SRILM and moses

2011-09-06 Thread Miles Osborne
yes On 6 September 2011 17:28, Cyrine NASRI cyrine.na...@gmail.com wrote: Hi all, Is it possible to use a 5-gram language model built by SRILM with Moses? Thanks Best Cyrine -- *Cyrine Ph.D. Student in Computer Science* ___ Moses-support

Re: [Moses-support] KenLM build-binary trie problems (SRILM ARPA file)

2011-08-16 Thread Miles Osborne
for the SRILM, you use the -unk flag; RandLM does this by default if I recall Miles On 16 August 2011 06:28, Tom Hoar tah...@precisiontranslationtools.com wrote: Ken, Does the online moses documentation refer to how to ensure the language model has <unk> in the vocabulary? I've never seen it.

Re: [Moses-support] Improvements to MERT

2011-08-12 Thread Miles Osborne
good to see the variance reduction. why not repeat this with more features? you should see a greater effect this way. an easy way to do this is to just add more language models. Miles On 11 August 2011 19:53, Philipp Koehn pko...@inf.ed.ac.uk wrote: Hi, I added a number of improvements to

Re: [Moses-support] Running Giza++ on subsets of data

2011-06-15 Thread Miles Osborne
that isn't the expected answer here. i think the OP wants some kind of incremental (re) training. firstly: it is not really possible to guarantee that performance is not degraded when running from subsets up to the full set (compared with just running it on the full set). secondly, you may

Re: [Moses-support] Running Giza++ on subsets of data

2011-06-15 Thread Miles Osborne
it is this: Abby Levenberg, Chris Callison-Burch and Miles Osborne. Stream-based Translation Models for Statistical Machine Translation. NAACL, Los Angeles, USA, 2010. http://homepages.inf.ed.ac.uk/miles/papers/naacl10b.pdf Miles On 15

Re: [Moses-support] How to change phrase representation

2011-06-13 Thread Miles Osborne
the simplest approach would be to use another character to join words together. the tokeniser thinks you have hyphenated words, which is probably what you don't want. Miles On 13 June 2011 18:39, Anna c annac...@hotmail.com wrote: Hi, I've tried what you suggested, but I'm not sure if I'm

Re: [Moses-support] experiment.perl with IRSTLM only (no SRILM installed)

2011-05-27 Thread Miles Osborne
is this after running with SRILM? if so, then look for the script which creates the LM and delete it. that should force it to be re-created, using IRSTLM Miles On 27 May 2011 09:16, Greg Wilson gre...@gmail.com wrote: Hi, first let me thank the people who are making Moses available, your work

Re: [Moses-support] Can't compile latest Moses with irstlm and srilm

2011-05-21 Thread Miles Osborne
It looks like you are using 64 bit versions eg srilm. Make sure everything is 32 bit Miles On 21 May 2011 13:45, Bartosz Grabski bartosz.grab...@gmail.com wrote: Hello, I'm using quite fresh Ubuntu 11.04 (on a 32bit machine). I downloaded and compiled latest srilm and irstlm (not without some

Re: [Moses-support] How much Ram for Europarl?

2011-04-18 Thread Miles Osborne
naturally, the parallel data could be down-sampled (eg use 1/2 of it). you probably won't see a significant degradation in translation quality and the whole training process will use less RAM and will be quicker. Miles On 18 April 2011 15:05, Tom Hoar tah...@precisiontranslationtools.com wrote:
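A small sketch of down-sampling, assuming the parallel corpus is held as two equal-length lists of sentences. The key point is that the same line indices must be kept on both sides so the pairs stay aligned. The function name, the fixed seed and the toy data are illustrative assumptions.

```python
import random


def downsample_parallel(src_lines, tgt_lines, fraction=0.5, seed=0):
    """Keep a random `fraction` of sentence pairs, preserving alignment and order."""
    if len(src_lines) != len(tgt_lines):
        raise ValueError("source and target must have the same number of lines")
    rng = random.Random(seed)  # fixed seed so the sample is repeatable
    n_keep = int(len(src_lines) * fraction)
    keep = sorted(rng.sample(range(len(src_lines)), n_keep))
    return [src_lines[i] for i in keep], [tgt_lines[i] for i in keep]


# Toy corpus: sentence i on the source side pairs with sentence i on the target side.
src = [f"source sentence {i}" for i in range(10)]
tgt = [f"target sentence {i}" for i in range(10)]
half_src, half_tgt = downsample_parallel(src, tgt)
print(len(half_src), len(half_tgt))  # 5 5
```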

Re: [Moses-support] Nondeterminism during decoding: same config, different n-best lists

2011-03-25 Thread Miles Osborne
There is work published on making mert more stable (on the train so can't easily dig it up) Miles sent using Android On 25 Mar 2011 12:49, Lane Schwartz dowob...@gmail.com wrote: We know that there is nondeterminism during optimization, yet virtually all papers report results based on a single

Re: [Moses-support] Nondeterminism during decoding: same config, different n-best lists

2011-03-25 Thread Miles Osborne
/mtmarathon2010/ProjectFinalPresentation/MERT/StabilizingMert.pdf On Friday 25 March 2011 13:02, Miles Osborne wrote: There is work published on making mert more s...

Re: [Moses-support] running moses on a cluster with sge

2011-02-02 Thread Miles Osborne
to add to Barry's excellent answer, we are currently working on a client-server language model. this will mean that a cluster of machines can be used, with a shared resource. it should also work with multicore but in the short-term, you are probably better off with multicore Miles On 2

Re: [Moses-support] skip tuning in ems

2011-01-31 Thread Miles Osborne
supply a weights file, eg weight-config = /home/miles/nist09/run9.moses.ini add this to the TUNING section. Miles On 31 January 2011 21:22, John Morgan johnjosephmor...@gmail.com wrote: -- Regards, John J Morgan Hello, I'd like to run an experiment with the ems without tuning.  Is it

Re: [Moses-support] skip tuning in ems

2011-01-31 Thread Miles Osborne
? Is there a way to use pass-unless, ignore-unless, or template-if for this? Thanks, John On 1/31/11, Miles Osborne mi...@inf.ed.ac.uk wrote: supply a weights file, eg weight-config = /home/miles/nist09/run9.moses.ini add this to the TUNING section. Miles On 31 January 2011 21:22, John Morgan

Re: [Moses-support] Train moses incrementally

2011-01-16 Thread Miles Osborne
Not yet Miles sent using Android On 15 Jan 2011 10:00, Sébastien Druon s.dr...@ml-technologies.com wrote: Thanks! Do you approximately know in what time frame? Regards, Sebastien On Wed, 2011-01-12 at 09:44 +, Miles Osborne wrote: sorry, the code is not publically availab

Re: [Moses-support] Train moses incrementally

2011-01-12 Thread Miles Osborne
Sebastien On 12 Jan 2011 09:21, Miles Osborne mi...@inf.ed.ac.uk wrote: yes.  we have done this for both Giza++ and for the language model: Stream-based Translation Models for Statistical Machine Translation, Abby Levenberg, Chris Callison-Burch and Miles Osborne, NAACL 2010 Stream-based

Re: [Moses-support] SRILM problem

2010-11-26 Thread Miles Osborne
in general you should send SRILM requests to their mailing list and not to this one. but i can tell you straight away that the ngram server is behaving correctly. it waits for requests ... Miles On 26 November 2010 11:28, Korzec, Sanne sanne.kor...@wur.nl wrote: Hi, I have compiled SRILM on

Re: [Moses-support] Proposal to replace vertical bar as factor delimeter

2010-11-15 Thread Miles Osborne
i second this. but can I make another suggestion. make the default be *non* factored input. i reckon that most people using Moses don't actually use factors (hands-up if you do). this means, plain input, with absolutely no meta chars in them. and if you are going to use meta-chars, why not

Re: [Moses-support] bag of words language model

2010-10-25 Thread Miles Osborne
i implemented this years ago (the idea then was to see if for free-word-order languages, phrases could be generalised). at the time it didn't seem that there was a more efficient way to do it than just generate permutations and score them. and if you think about it, this is essentially the
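The "generate permutations and score them" approach can be sketched as follows. The bigram table and the unseen-bigram penalty are toy assumptions standing in for a real language model; the sketch only illustrates the brute-force search, whose cost grows factorially with the number of words.

```python
from itertools import permutations

# Toy bigram log scores; any bigram not listed gets a flat penalty (assumption).
BIGRAM_LOGPROB = {("the", "red"): -1.0, ("red", "house"): -1.0}
UNSEEN = -10.0


def score(order):
    """Sum of bigram log scores over adjacent word pairs."""
    return sum(BIGRAM_LOGPROB.get(pair, UNSEEN) for pair in zip(order, order[1:]))


def best_ordering(words):
    """Try every ordering of the bag of words and keep the highest-scoring one."""
    return max(permutations(words), key=score)


print(best_ordering(["house", "the", "red"]))  # ('the', 'red', 'house')
```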

Re: [Moses-support] train-truecaser.perl proposed tweak

2010-10-25 Thread Miles Osborne
this sounds risky to me. it would be better to allow the user to specify the behaviour; for your suggestions, you would add an extra flag which would enable this. the default would be for truecasing to operate as it used to. Miles On 25 October 2010 17:37, Ben Gottesman

Re: [Moses-support] about Morph tagging

2010-10-20 Thread Miles Osborne
ah, my apologies --I didn't realise you also wanted morphological information. in that case, you will need something like Fran's suggestion Miles On 20 October 2010 11:12, Francis Tyers fty...@prompsit.com wrote: You could use the morphological analysers from the Apertium project.

Re: [Moses-support] mteval-v11b

2010-10-17 Thread Miles Osborne
note also that NIST changed to IBM BLEU recently which has a different treatment of multiple references. (mteval 13 uses IBM BLEU if i recall) generally the BLEU scores will be a little lower than before, but MERT performance should be more robust Miles On 17 October 2010 09:57, liu chang

Re: [Moses-support] max-phrase-length vs. number of scores

2010-10-06 Thread Miles Osborne
the phrase length refers to the number of words in a phrase and the number of scores to the number of feature function, per phrase. they have nothing to do with each other On 6 October 2010 11:31, supp...@precisiontranslationtools.com wrote: I found this message below, which mentions the
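The distinction can be seen on a single phrase-table entry, assuming the usual "source ||| target ||| scores" layout. The entry below is made up, and real tables may carry further fields after the scores, which this sketch ignores.

```python
def describe_entry(line):
    """Report phrase lengths (in words) and the number of feature scores."""
    fields = [field.strip() for field in line.split("|||")]
    source, target, scores = fields[0], fields[1], fields[2]
    return {
        "source_len": len(source.split()),   # limited by max-phrase-length
        "target_len": len(target.split()),
        "num_scores": len(scores.split()),   # one value per feature function
    }


# Hypothetical entry: a two-word phrase pair carrying five scores.
print(describe_entry("das haus ||| the house ||| 0.5 0.4 0.6 0.3 2.718"))
# {'source_len': 2, 'target_len': 2, 'num_scores': 5}
```

A longer phrase still carries the same number of scores, and adding a feature function changes the score count without touching the phrase length.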

Re: [Moses-support] giza++ best alignment

2010-10-03 Thread Miles Osborne
clearly changing the configuration will change the alignment results. i suggest that before mailing the list again, you read this article: A Systematic Comparison of Various Statistical Alignment Models, Franz Josef Och and Hermann Ney http://acl.ldc.upenn.edu/J/J03/J03-1002.pdf Miles

Re: [Moses-support] Problem with tuning

2010-09-27 Thread Miles Osborne
looking at your output: [ERROR] Malformed input at Expected input to have words composed of 1 factor(s) (form FAC1|FAC2|...) but instead received input with 0 factor(s). sh: line 1: 5114 Aborted make sure you have no bar (|) characters in the data Miles On 27 September 2010 14:45, Souhir

Re: [Moses-support] wrong alignment

2010-09-24 Thread Miles Osborne
it is probably more helpful to give the number of sentences you used for language model training (and other details, eg ngram order). but at first glance that looks like a tiny amount of language model data --i would expect to see something closer to 2GB or so, depending upon representation

Re: [Moses-support] qsub and EMS again

2010-09-03 Thread Miles Osborne
don't really understand how the setup works. Thanks again, Suzy On 2/09/10 8:26 PM, Miles Osborne wrote: a better setup would be to have a loop which did the following: --for a given version number and step, check for STDERR, STDOUT and DONE --if they are all found, exit --otherwise sleep
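The polling loop quoted above could look roughly like this. The file-naming scheme (a shared prefix with .STDERR, .STDOUT and .DONE suffixes), the polling interval and the max_polls cut-off are all assumptions made for illustration and are not taken from experiment.perl.

```python
import os
import time


def wait_for_step(step_dir, prefix, poll_seconds=30.0, max_polls=None):
    """Wait until the STDERR, STDOUT and DONE files for a step all exist.

    Returns True once all three are found, or False if `max_polls` checks
    have been made without success (None means wait indefinitely).
    """
    wanted = [os.path.join(step_dir, f"{prefix}.{suffix}")
              for suffix in ("STDERR", "STDOUT", "DONE")]
    polls = 0
    while True:
        if all(os.path.exists(path) for path in wanted):
            return True
        polls += 1
        if max_polls is not None and polls >= max_polls:
            return False
        time.sleep(poll_seconds)


# Example call (hypothetical directory and step name):
# wait_for_step("steps/1", "TUNING_tune.1", poll_seconds=60)
```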

Re: [Moses-support] mert-moses.pl working-dir tmp

2010-09-01 Thread Miles Osborne
this is after a crash I presume? if so, then you should delete the step which creates the first config file. this will force it to be recreated, using the current version. below is a small perl script I use (for an older version of experiment.perl, but it should work for you too). this was

Re: [Moses-support] problem with tokenizer.perl

2010-06-27 Thread Miles Osborne
see here: http://jeremy.zawodny.com/blog/archives/010546.html for a discussion of utf8 v UTF8 ... now off to see England triumphant against Germany Miles On 27 June 2010 13:23, Miles Osborne mi...@inf.ed.ac.uk wrote: on the subject of UTF8, i think the Moses tokeniser may be using

Re: [Moses-support] moses may 10

2010-05-11 Thread Miles Osborne
On 11 May 2010 17:33, Christian Hardmeier c...@rax.ch wrote: For my purposes, even a hard-coded assumption of 1, along with a more transparent error message if the model isn't found, would do. Does anybody actually decode with in-memory phrase tables in real life? (well, I suppose some people

Re: [Moses-support] A few MOSES questions (Arabic, missing scripts, Moses error)

2010-05-07 Thread Miles Osborne
MADA can create tokens that are bar characters (ie | ) you need to rename them to something like BAR. Moses treats these as factor delimiters, hence the message you are seeing (i've been using MADA+TOKAN for Arabic, using the D2 setting) Miles On 7 May 2010 23:26, David Edelstein
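A minimal sketch of the renaming step, assuming plain unfactored text with one sentence per line. The placeholder BAR follows the suggestion above; the function name is an illustrative assumption.

```python
def escape_bars(line, placeholder="BAR"):
    """Replace every '|' so Moses does not read it as a factor delimiter."""
    return line.replace("|", placeholder)


print(escape_bars("a | b"))  # a BAR b
```

The same replacement has to be applied consistently to training, tuning and test data, otherwise the placeholder token will be unknown to the model.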

Re: [Moses-support] different tune set diferent tuned parameters !

2010-05-02 Thread Miles Osborne
there is a large amount of randomness involved with parameter tuning. each time you run it (using the same language resources) you might get different parameters, also, the parameters are not scaled. this means that one run might give you these values: 10 20 30 and the next run might give you
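One way to read the remark about scaling, offered here as an assumption because the message is cut off: in a linear model, multiplying all weights by the same positive constant does not change which hypothesis scores highest, so weight vectors should be normalised before being compared across runs. The hypotheses and feature values below are invented.

```python
def normalise(weights):
    """Scale weights so their absolute values sum to one."""
    total = sum(abs(w) for w in weights)
    return [w / total for w in weights]


def best_hypothesis(hypotheses, weights):
    """Return the hypothesis with the highest weighted feature sum."""
    return max(hypotheses,
               key=lambda h: sum(w * f for w, f in zip(weights, hypotheses[h])))


# Hypothetical feature vectors (log scores) for two candidate translations.
hypotheses = {"a": [-1.0, -2.0, -0.5], "b": [-0.5, -2.5, -1.0]}

print(best_hypothesis(hypotheses, [10, 20, 30]))  # a
print(best_hypothesis(hypotheses, [1, 2, 3]))     # a
print(normalise([10, 20, 30]))
print(normalise([1, 2, 3]))
```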

Re: [Moses-support] IRSTLM error: converting iARPA to ARPA format

2010-04-21 Thread Miles Osborne
this means you have run out of memory. you can either: --get more memory --use less data --use a lower-order LM --use RandLM, which can easily handle this amount of data (i am currently building LMs using more than 30 billion words with it for example) Miles On 21 April 2010 09:57, Zahurul

Re: [Moses-support] Moses-support Digest, Vol 41, Issue 36

2010-03-28 Thread Miles Osborne
a quick question. will this break compatibility with existing training runs? also, adding new features --even if they are not used-- can impact upon MERT and may slow things down / make things worse. have you verified (using multiple runs) that this new feature doesn't make things worse than

Re: [Moses-support] Dictonary use during training

2010-02-23 Thread Miles Osborne
re: adding dictionary entries, this is certainly a hack. but the standard trick is to pretend that the dictionary actually consists of tiny parallel sentences. you therefore just append each word-entry as a new sentence pair. don't bother with that -d option. Miles On 23 February 2010 18:34,
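The trick of treating dictionary entries as tiny parallel sentences can be sketched as below, assuming the corpus is held as two aligned lists and the dictionary as (source, target) pairs. The names and the toy data are illustrative; in practice the result would be written back to the two corpus files before running word alignment.

```python
def append_dictionary(src_corpus, tgt_corpus, dictionary):
    """Append each dictionary entry as a one-line sentence pair."""
    src, tgt = list(src_corpus), list(tgt_corpus)
    for source_word, target_word in dictionary:
        src.append(source_word)
        tgt.append(target_word)
    return src, tgt


# Toy data: one real sentence pair plus two dictionary entries.
src, tgt = append_dictionary(
    ["das ist ein haus"], ["this is a house"],
    [("haus", "house"), ("hund", "dog")],
)
print(src)  # ['das ist ein haus', 'haus', 'hund']
print(tgt)  # ['this is a house', 'house', 'dog']
```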

Re: [Moses-support] skipping incompatible liboolm.a

2010-02-22 Thread Miles Osborne
this is a standard error. you need to build SRILM using 64-bit support (i686-m64) Miles On 22 February 2010 11:40, Marce van Velden marcevanvelde...@gmail.com wrote: Hi, I get the following error when trying to compile moses on an intel64 pc. What could cause the liboolm.a to be incompatible?

Re: [Moses-support] Build Moses for translating English to Chinese.

2010-02-11 Thread Miles Osborne
How words are tokenised / segmented etc is crucial when using small amounts of data. For the vast numbers of people using Moses (people not training-up on millions of sentence pairs) this is the kind of thing that needs to be done correctly. It would be a service to extend the Moses tokeniser to

Re: [Moses-support] moses for haitian relief

2010-01-27 Thread Miles Osborne
it looks to me like you have not correctly compiled / installed the srilm. Miles 2010/1/27 christopher taylor christopher.paul.tay...@gmail.com: hello everyone! i'm currently trying to build an instance of moses to support crisiscommons.org's machine translation project (i'm currently the

Re: [Moses-support] Moses on the iPhone

2010-01-12 Thread Miles Osborne
you should also look at RandLM, as it will enable you to run a language model in small space. that aside, i would look hard at pruning the various tables (eg phrase tables, reordering, language models) so you can keep just the core that you need. this will make for faster loading etc. note also that

Re: [Moses-support] different servers + different time - differentresult?

2010-01-11 Thread Miles Osborne
2010-01-11 From: Miles Osborne Sent: 2010-01-11 16:12:38 To: 李贤华 Cc: moses-support Subject: Re: [Moses-support] different servers + different time - differentresult? Giza++ and MERT both can produce different results, even when using the same

Re: [Moses-support] The results of your email commands

2009-12-23 Thread Miles Osborne
randlm is already in a binary format, so there is no extra conversion. loading randomised models faster is not something that we have really looked at. Miles 2009/12/23 Arda Tezcan arda...@yahoo.com: Hi, I would really appreciate it if you could help me with the following question I have: I

Re: [Moses-support] moses threads compilation problem (with RandLM)

2009-12-17 Thread Miles Osborne
Making RandLM thread-safe is something I've been thinking about. There are a number of bug fixes which need dealing with too, so perhaps at some point I'll push out a new release. Miles 2009/12/17 Alexander Fraser fra...@ims.uni-stuttgart.de: Hi Barry and Philipp, Philipp is correct,

Re: [Moses-support] Looking for non-CLI tool for aligning parallel text

2009-10-28 Thread Miles Osborne
/anthology-new/W/W02/W02-1018.pdf On Tue, Oct 27, 2009 at 10:51 PM, Catalin Braescu cata...@braescu.com wrote: Then I wonder how can aligning be done automatically for phrases? And what's the accuracy of such process? Catalin Braescu On Wed, Oct 28, 2009 at 12:36 AM, Miles Osborne mi

Re: [Moses-support] Looking for non-CLI tool for aligning parallel text

2009-10-27 Thread Miles Osborne
data to do well at Chinese-English than with Spanish-English. Miles 2009/10/27 Catalin Braescu cata...@braescu.com: Then I wonder how can aligning be done automatically for phrases? And what's the accuracy of such process? Catalin Braescu On Wed, Oct 28, 2009 at 12:36 AM, Miles Osborne mi

Re: [Moses-support] How many and/or which language model(s) to use?

2009-10-22 Thread Miles Osborne
you can't supply language models for both directions: you need to supply them for the target and not the source Miles 2009/10/22 Ivan Uemlianin i.uemlia...@bangor.ac.uk: Dear All I am using Moses with irstlm.  The language pair I am developing is English and Welsh.  I have built language

Re: [Moses-support] Looking for text corpora

2009-09-06 Thread Miles Osborne
the only other source of lots of parallel data (I know about) is the LDC: http://www.ldc.upenn.edu/ but this is not free ... Miles 2009/9/6 Catalin Braescu cata...@braescu.com: Thanks, Miles! From your link I got http://www.statmt.org/europarl/ Any other such goodies? Catalin --

Re: [Moses-support] EM Model 1 question

2009-07-27 Thread Miles Osborne
the good thing about probabilities is that they should sum to one (but you can get numerical errors giving you slightly more / less ...) Miles 2009/7/27 James Read j.rea...@sms.ed.ac.uk Ok. Thanks. I think I understand this now. I also think I have found the bug in the code which was causing

Re: [Moses-support] How to create Two-way translator and accelerate.

2009-05-05 Thread Miles Osborne
and don't forget to look at RandLM; this can save you a lot of memory for your language model (a lot more than IRSTLM) plug over! Miles 2009/5/5 Marcin Miłkowski milek...@o2.pl: Jan Helak wrote: I have one last question. The final version will be built with approx. 50 MB of Polish text and 50 MB

Re: [Moses-support] How to create Two-way translator and accelerate.

2009-05-04 Thread Miles Osborne
actually, i think Jan wants a speedup, not a space saving. your best bet is to reduce the size of the beam: http://www.statmt.org/moses/?n=Moses.Tutorial#ntoc6 Miles 2009/5/4 Francis Tyers fty...@prompsit.com: On Mon, 04-05-2009 at 14:54 +0200, Jan Helak wrote: Hello everyone :) I try

Re: [Moses-support] How to create Two-way translator and accelerate.

2009-05-04 Thread Miles Osborne
Miłkowski milek...@o2.pl: Miles Osborne wrote: filtering etc might give you a speed-up (eg a constant one --less stuff to load) but if filtering is safe w.r.t. the source data, then you shouldn't see much here. (pruning the table should make it faster since there will be fewer options to consider

Re: [Moses-support] How to create Two-way translator and accelerate.

2009-05-04 Thread Miles Osborne
also see fewer page faults and the like with a smaller model and that will help matters. but in general, the beam size is the most direct way to make it faster. Miles 2009/5/4 Francis Tyers fty...@prompsit.com: On Mon, 04-05-2009 at 14:08 +0100, Miles Osborne wrote: actually, i think Jan

Re: [Moses-support] Results quality when using moses with randlm

2009-04-16 Thread Miles Osborne
there are many factors here. firstly, the randomised LM makes errors as a function of the false positive rate and the values (quantisation) level. roughly, the smaller these parameters are, the smaller your LM will be, but there may be a performance drop. secondly, the default count-based

Re: [Moses-support] Error when run moses with lattices format as input

2009-04-16 Thread Miles Osborne
in general, when you compile a c or c++ program, you add the switch -g to the options (usually in a Makefile). this will tell the compiler to add stuff to the program so that it works with gdb. you then do: gdb moses and you will see a prompt. you then run moses within that prompt, but

Re: [Moses-support] Fetching older versions of moses

2009-03-12 Thread Miles Osborne
assuming the current version hasn't been fixed to deal with the LM problem affecting older versions of gcc: --check-out the code using SVN as usual, ie svn co https://svn.sourceforge.net/svnroot/mosesdecoder/trunk mosesdecoder then look at the SVN logs: svn log | less find some version

Re: [Moses-support] How is the final LM score obtained?

2009-03-05 Thread Miles Osborne
a couple of points: --you are asking ngram for perplexity scores, but Moses uses log probs --Moses will append <s> and </s> pseudo-words to the start and end of a sentence; this will change the probabilities Miles 2009/3/5 Carlos Henriquez carlo...@gps.tsc.upc.es: Hi all. I'm making some

Re: [Moses-support] word alignment symmetrisation heuristics

2009-03-04 Thread Miles Osborne
one thing to remember is that the link between AER and BLEU is not obvious; in my view at least AER-like scores should be treated with skepticism and the real merit of an alignment approach should be the corresponding translation performance (BLEU etc). can you provide associated BLEU scores for
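For reference, a small Python sketch of the alignment error rate as it is usually defined, with sure links S, possible links P (S a subset of P) and hypothesised links A: AER = 1 - (|A∩S| + |A∩P|) / (|A| + |S|). The function name and the toy alignment are illustrative assumptions, not data from this thread.

```python
def alignment_error_rate(hypothesis, sure, possible):
    """AER for sets of (source_index, target_index) alignment links."""
    # Sure links count as possible links by definition.
    possible = possible | sure
    if not hypothesis and not sure:
        raise ValueError("hypothesis and sure sets are both empty")
    matched = len(hypothesis & sure) + len(hypothesis & possible)
    return 1.0 - matched / (len(hypothesis) + len(sure))


# Toy example with made-up links.
sure = {(0, 0), (1, 1)}
possible = {(2, 1)}
hypothesis = {(0, 0), (2, 1), (2, 2)}

aer = alignment_error_rate(hypothesis, sure, possible)
print(aer)
```

Note that a link outside P only ever counts against precision, which is one way to see why a sparser, high-precision alignment can score well on AER.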

Re: [Moses-support] word alignment symmetrisation heuristics

2009-03-04 Thread Miles Osborne
anyway. this is, I guess, because it's better on recall. AER seems to strongly prefer precision. jorg On Wed, 4 Mar 2009 13:46:36 + Miles Osborne mi...@inf.ed.ac.uk wrote: one thing to remember is that the link between AER and BLEU is not obvious; in my view at least AER-like scores

Re: [Moses-support] Moses on a mac

2009-03-03 Thread Miles Osborne
there is a related bug with randlm which i'm looking at now. whilst i'm doing this, can you verify that it is some mac-specific problem and not say something due to the gcc version you are using? Miles 2009/3/4 Kemal Oflazer k...@cs.cmu.edu: Dear All I just installed moses on a large mac

Re: [Moses-support] Error in running moses with randlm

2009-02-24 Thread Miles Osborne
ok, i'll try to work it out. can you: --mail me your moses.ini file --mail me the commands you ran to create your language model --tell me exactly how much language model data you used and what it is; if it is europarl then that should be ok Miles 2009/2/24 Michael Zuckerman

Re: [Moses-support] Error in RandLM

2009-02-19 Thread Miles Osborne
to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Written by Torbjorn Granlund and Richard M. Stallman. Miles 2009/2/19 Miles Osborne mi...@inf.ed.ac.uk: what happens when you run this? Miles 2009/2/19 Michael Zuckerman michael90...@gmail.com: Hi, I am

Re: [Moses-support] Error in RandLM

2009-02-19 Thread Miles Osborne
that might be it. but i seem to have it working here, using a non-gzipped version of Europarl. in any case, Michael: tell us if it works when the corpus is gzipped Miles 2009/2/19 Barry Haddow bhad...@inf.ed.ac.uk: Hi I've seen this error before. The short answer is that you need to use a

Re: [Moses-support] RandLM compressor cat bug.

2008-12-14 Thread Miles Osborne
ah, ok. i think David hit it on the head: randlm is currently in the very first release and to my knowledge hasn't been extensively tested under various setups. we'll gather together these problems and add them into the next release. Miles 2008/12/7 Radek Bartoň xbart...@stud.fit.vutbr.cz:

Re: [Moses-support] RandLM compressor cat bug.

2008-12-06 Thread Miles Osborne
which version of unix are you using? Miles 2008/11/28 Radek Bartoň [EMAIL PROTECTED]: Hello. Since there is no RandLM mailing list (at least I haven't found one) I'm posting here. When creating language model with cat compressor, buildlm fails (on my system) with error: cat: invalid

Re: [Moses-support] translation result change from time to time

2008-11-21 Thread Miles Osborne
it could be due to things like the way ties are broken, floating-point errors and the like Miles 2008/11/21 Hieu Hoang [EMAIL PROTECTED]: that would be worrying. are you sure all parameters are the same ? loading the models and memory shouldn't affect the results. there may rarely be
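A minimal Python sketch of both effects mentioned above. It is not Moses code; the function names and scores are hypothetical and only show how summation order and tie-breaking can change an outcome.

```python
def total_score(feature_scores):
    """Sum scores left to right; floating-point addition is not associative."""
    total = 0.0
    for score in feature_scores:
        total += score
    return total


def best_hypothesis(hypotheses):
    """Return the label of the top-scoring (label, score) pair.

    max() keeps the first maximal item, so ties go to whichever comes first.
    """
    return max(hypotheses, key=lambda h: h[1])[0]


scores = [0.1, 0.2, 0.3]
forward = total_score(scores)
backward = total_score(list(reversed(scores)))
# The same numbers summed in a different order differ in the last bits.
print(forward == backward, forward, backward)

# Two hypotheses with identical scores: the winner depends on their order.
print(best_hypothesis([("a", 1.0), ("b", 1.0)]))
print(best_hypothesis([("b", 1.0), ("a", 1.0)]))
```

If two candidate translations score within such a rounding difference of each other, a change in evaluation order could plausibly be enough to swap which one is output.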

[Moses-support] Announcement: RandLM

2008-11-03 Thread Miles Osborne
about this, then look at our ACL and EMNLP papers: David Talbot and Miles Osborne. Smoothed Bloom filter language models: Tera-Scale LMs on the Cheap. EMNLP, Prague, Czech Republic 2007. http://www.iccs.informatics.ed.ac.uk/~osborne/papers/emnlp07.pdf David Talbot and Miles Osborne. Randomised

Re: [Moses-support] Significance of BLEU using Multi-bleu

2008-09-18 Thread Miles Osborne
firstly, do MERT and make sure that everything has reasonable parameters! this is how to think about testing. you are trying to estimate the error of your model (which you trained-up in the usual way). when estimating this error, the *training set* is the test set. so, the more `training'

[Moses-support] Fwd: Fwd: Moses: Prepare Data, Build Language Model and Train Model

2008-08-14 Thread Miles Osborne
(my message bounced as it was too long ... here is a truncated version) Miles -- Forwarded message -- From: Miles Osborne [EMAIL PROTECTED] Date: 2008/8/14 Subject: Re: [Moses-support] Fwd: Moses: Prepare Data, Build Language Model and Train Model To: Llio Humphreys [EMAIL

Re: [Moses-support] Fwd: Moses: Prepare Data, Build Language Model and Train Model

2008-08-14 Thread Miles Osborne
building language models (using for example ngram-count) is computationally expensive. from what you tell the list, it seems that you don't have enough physical memory to run it properly. you have a number of options: --specify a lower order model (eg 4 rather than 5, or even 3); depending

Re: [Moses-support] Moses: Prepare Data, Build Language Model and Train Model

2008-08-13 Thread Miles Osborne
an ugly hack is to simply create a soft link to the i686-m64 directory (as i recently did on a new 64 bit machine) Miles 2008/8/13 Sara Stymne [EMAIL PROTECTED] Hi! When we installed SRILM and Moses on our 64-bit Ubuntu machine we had some troubles with getting the machine type right. What
