Re: [Moses-support] reg. moses installation

2021-04-22 Thread Kenneth Heafield
Your training corpus is empty. cat ~/corpus/news-commentary-v8.fr-en.true.en On 4/22/21 9:50 PM, Namrata Hadimani wrote: > Hi All, > > I am trying to perform Language Model Training using the below command  > > ~/mosesdecoder/bin/lmplz -o 3 <~/corpus/news-commentary-v8.fr-en.true.en > >
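A quick way to confirm this diagnosis before rerunning lmplz is to count the non-blank lines in the corpus file. The following is only a minimal sketch in Python; the helper name count_nonblank_lines is made up for illustration and is not part of Moses or KenLM.

```python
import os


def count_nonblank_lines(path):
    """Return the number of lines in `path` that contain non-whitespace text."""
    count = 0
    # expanduser lets callers pass paths such as ~/corpus/file.en
    with open(os.path.expanduser(path), encoding="utf-8", errors="replace") as handle:
        for line in handle:
            if line.strip():
                count += 1
    return count
```

A result of 0 for the file passed to lmplz would mean there is nothing to train on, which matches the reply above.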

Re: [Moses-support] Failed to get the language model by KenLM

2020-08-25 Thread Kenneth Heafield
It appears you are trying to run on a machine with very different libraries from the machine you compiled on. Don't do that. Compile on the same machine. On 8/8/20 12:09 PM, Chen, Y. wrote: > Dear Hieu,  > > Thank you for your help! I solved this problem and built the language > model. But

Re: [Moses-support] Segmentation fault in processPhraseTableMin

2020-06-17 Thread Kenneth Heafield
The CMPH software updated formats at some point but nobody changed Moses. Use a vintage CMPH or have fun hacking... On 6/17/20 10:08 PM, ser...@prompsit.com wrote: > The subject of my previous message is wrong. Actually the problem is > with queryPhraseTableMin as the content of the message

[Moses-support] Translation Efficiency Shared Task at WNGT 2020

2020-02-23 Thread Kenneth Heafield
The WNGT 2020 Efficiency Shared Task https://sites.google.com/view/wngt20/efficiency-task invites submissions of efficient machine translation systems. Participants build a WMT19 English-German system (or start from pre-built ones) and optimize for quality, speed, RAM, model size, or any

[Moses-support] Truecaser and < >

2020-02-20 Thread Kenneth Heafield
Dear Moses, I noticed some odd behavior in the truecaser whereby it tokenizes < and > at the end of a word. Is this intended? Maybe the answer is I should have run the tokenizer first so it would be &lt; and &gt; and therefore this is undefined. Input: a a< a> foo< Output: a a < a >

[Moses-support] Join the machine translation group in Edinburgh

2020-02-13 Thread Kenneth Heafield
Alexandra Birch and I are hiring five researchers in machine translation. Applicants can be pre-PhD or post-PhD. Apply before 17:00 GMT on 16 March 2020: https://www.vacancies.ed.ac.uk/pls/corehrrecruit/erq_jobspec_version_4.jobspec?p_id=051331 The researchers will work on EU/EPSRC research

[Moses-support] Faculty position in Natural Language Processing at the University of Edinburgh

2019-12-18 Thread Kenneth Heafield
FACULTY POSITIONS AT THE UNIVERSITY OF EDINBURGH Lecturer/Senior Lecturer/Reader in Natural Language Processing Lecturer/Senior Lecturer/Reader in Computational Social Science Applications are invited for two faculty positions in Natural Language Processing and Computational Social Science in

[Moses-support] PhD Studentships at the University of Edinburgh

2019-10-14 Thread Kenneth Heafield
PhD studentships in machine translation, computational linguistics, speech technology, and cognitive science Institute for Language, Cognition and Computation School of Informatics University of Edinburgh The Institute for Language, Cognition and Computation (ILCC) at the University of Edinburgh

Re: [Moses-support] Run Moses with GPU

2019-05-04 Thread Kenneth Heafield
Hi Arezoo, You can find GPU-based translation systems here: https://marian-nmt.github.io/ https://github.com/EdinburghNLP/nematus The quality will probably also be better. Be warned you need GPU RAM. Kenneth On 5/4/19 10:16 AM, Arezoo Arjomand wrote: > Hi > >

[Moses-support] 3-Year Postdocs in Machine Translation at the University of Edinburgh

2019-04-07 Thread Kenneth Heafield
for the current call is 30 April but another position will soon be open and due in early May. See the longer ad: https://neural.mt/jobs/ Kenneth Heafield Lecturer, University of Edinburgh ___ Moses-support mailing list Moses-support@mit.edu http

Re: [Moses-support] 5-gram discount out of range for adjusted count 2

2018-12-03 Thread Kenneth Heafield
f the following datasets from OPUS: > > * GNOME > * OpenSubtitles 2018 > * Tanzil > * Tatoeba > * Ubuntu > > Thanks, > James > > On Mon, 3 Dec 2018 at 11:58, Kenneth Heafield <mailto:mo...@kheafield.com>> wrote: >

Re: [Moses-support] 5-gram discount out of range for adjusted count 2

2018-12-03 Thread Kenneth Heafield
Hi,     If I had to guess, you have a lot of duplicated text?  Kenneth On 12/3/18 11:23 AM, James Baker wrote: > Morning, > > I've been trying to train a language model using the following command: > >     /opt/model-builder/mosesdecoder/bin/lmplz -o 5 -S 80% -T /tmp < > lm_data.en > model.lm >
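If duplicated text is indeed the cause (that is a guess in the reply above, not a confirmed diagnosis), one thing to try is removing repeated lines before training. A minimal Python sketch, assuming one sentence per line; the function name deduplicate_lines is hypothetical:

```python
def deduplicate_lines(lines):
    """Yield each distinct line once, in order of first appearance."""
    seen = set()
    for line in lines:
        # Compare without the trailing newline so a final unterminated
        # line still matches earlier copies of the same sentence.
        key = line.rstrip("\n")
        if key not in seen:
            seen.add(key)
            yield line
```

This keeps every distinct line in memory, so for a very large corpus an external tool such as sort -u is likely more practical.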

Re: [Moses-support] Binary file using KenLM

2018-06-24 Thread Kenneth Heafield
It's in your bin directory. bin/build_binary On 06/24/2018 01:33 PM, Kamal Deep Garg wrote: > Dear Sir > > i am using mose4. i created arpa file using KENLM. > > i want to convert it to binary format using this command. > > kenlm/build_binary filename.arpa filename.binary > > but i am able

Re: [Moses-support] Free cloud service to train NMT

2018-06-08 Thread Kenneth Heafield
https://www.microsoft.com/en-us/research/academic-program/data-science-award/ https://cloud.google.com/edu/?options=research-credits On 06/08/2018 05:06 PM, Hieu Hoang wrote: > try this >   https://developer.nvidia.com/academic_gpu_seeding > or search the web > > Hieu Hoang > > On 8 June 2018

Re: [Moses-support] Dual Licensing or relicensing Moses

2018-05-29 Thread Kenneth Heafield
Hi,     Just to clarify that employees of the University of Edinburgh would technically go to the university while PhD students keep the code they write.  Our IP people won't mind if we authors choose to allow another license.  Kenneth On 05/29/2018 11:03 AM, Lane Schwartz wrote: > The source

Re: [Moses-support] Dual Licensing or relicensing Moses

2018-04-10 Thread Kenneth Heafield
Looks like 19 people when the nonbreaking_prefixes is included and multiple e-mail addresses for the same person are collapsed. git log tokenizer.perl ../share/nonbreaking_prefixes/* |grep Author |sort -u Some of whom have invalid e-mail addresses, but can probably be tracked down. Kenneth On
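The collapsing of multiple e-mail addresses per person can also be approximated mechanically. The sketch below is an assumption-laden illustration in Python, not the method used for the count above: it reads "Author: Name <email>" lines as printed by git log and groups them by lower-cased name, which will miss people who commit under differently spelled names. The name collapse_authors is hypothetical.

```python
import re

# Matches lines such as "Author: Jane Doe <jane@example.org>"
AUTHOR_LINE = re.compile(r"^Author:\s*(?P<name>.*?)\s*<(?P<email>[^>]*)>\s*$")


def collapse_authors(log_lines):
    """Map each normalised author name to the set of e-mail addresses seen for it."""
    authors = {}
    for line in log_lines:
        match = AUTHOR_LINE.match(line)
        if match is None:
            continue  # commit hashes, dates, messages, etc.
        # Lower-case and squeeze whitespace so trivial variants merge.
        name = " ".join(match.group("name").lower().split())
        authors.setdefault(name, set()).add(match.group("email").lower())
    return authors
```

The number of keys in the returned dictionary is then a rough estimate of the number of distinct people.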

Re: [Moses-support] detecting if a translation is machine or human translation

2018-04-06 Thread Kenneth Heafield
Google watermarked their translation output: https://research.google.com/pubs/archive/37162.pdf Would be good to check if they're still doing this with neural systems. On 04/06/2018 09:14 AM, Mathias Müller wrote: > Hi Ryan > > My two cents: > > First of all, a way of detecting

Re: [Moses-support] Error installing moses

2018-03-17 Thread Kenneth Heafield
Moses doesn't use NPLM option, so there's no point in compiling with --with-nplm .  For what it's worth, it's meant to compile against this fork https://github.com/kpu/nplm and NPLM has since changed.  On 03/17/2018 07:52 PM, krishna chaitanya gudipati wrote: > Hi, > I am getting some error

Re: [Moses-support] Moses 4 binaries for macOS

2018-01-12 Thread Kenneth Heafield
Hi Tom. lmplz doesn't need libxmlrpc_xmltok. Looks like a case of over-aggressive dependencies, resulting in a binary that needs a library it doesn't use. One could install xmlrpc-c (a third-party library used by Moses server) in the same path. Or I guess substitute lmplz from

Re: [Moses-support] thread number during tuning and decoding has to be the same ?

2017-12-12 Thread Kenneth Heafield
Hi Manli, Just edit the configuration to change the number of threads as you like. Kenneth On 12/12/2017 08:16 PM, Manli Zhu wrote: > Hello, > > I set the thread number to 12 during tuning process bc my server has > 12 cpus. So the moses.ini has a line indicating thread = 12, which >

Re: [Moses-support] Most likely next word using KenLM

2017-11-06 Thread Kenneth Heafield
Hi Daniel, The data structures are keyed on the word being predicted, which is inefficient at predicting every possible continuation. A forward trie is much better at implementing these sorts of queries. I was designing for random query speed. You can eliminate backoff

Re: [Moses-support] Training Language Model on POS

2017-10-28 Thread Kenneth Heafield
Hi, You convert the words to part of speech using an external tagger (lmplz does not include POS detection). Then you'll probably need to run lmplz --discount_fallback because the vocabulary is small. Kenneth On 10/28/2017 02:06 AM, Aileen Joan Vicente wrote: > Hi! I am learning

[Moses-support] Funding to visit Edinburgh

2017-10-20 Thread Kenneth Heafield
Dear Moses, There is funding to visit Edinburgh for a minimum of 6 months. I may be able to get it for the right person. If you are interested, let me (not the list) know by 24 October (yes, this is very late notice!). Eligibility restrictions (theirs, not mine): * Visitors should

Re: [Moses-support] Error while giving lattice with epsilon as input to moses.

2017-09-07 Thread Kenneth Heafield
It seems nobody implemented epsilons. You're welcome to implement them. On 09/07/2017 09:40 PM, Sanket Gandhare wrote: > I am trying to give input to moses as word lattice having epsilons as > well, '*EPS*'. but it is giving this result : > > terminate called after throwing an instance of

Re: [Moses-support] Request for help w/ "The build failed."

2017-05-30 Thread Kenneth Heafield
x on...wax off... > > All the Best, > Chaz > -------- > On Mon, 5/29/17, Kenneth Heafield <mo...@kheafield.com> wrote: > > Subject: Re: [Moses-support] Request for help w/ "The build failed." > To: moses-support@mit.edu, "Hieu

Re: [Moses-support] Request for help w/ "The build failed."

2017-05-29 Thread Kenneth Heafield
A symlink for CreateProbingPT2 has nothing to do with KenLM. The symlink already exists and the build system is trying to make it again (this also means not windows). I suppose we should be using ln -sf. Try deleting CreateProbingPT2 then rebuilding. Kenneth On May 29, 2017 1:04:25 AM

Re: [Moses-support] Segfault from moses while tuning

2017-05-17 Thread Kenneth Heafield
Can we have a better error message than "Segmentation fault" when that happens? On 05/17/2017 01:26 PM, Hieu Hoang wrote: > ah yes, I think the phrase-table was created in the version when [ and ] > weren't reserved characters but now they are. So you have to use the > executables in that

Re: [Moses-support] Discount fallback in KenLM on super small file

2017-04-24 Thread Kenneth Heafield
Yes. Formally, the condition is in range, not just computable. On April 24, 2017 4:18:41 AM GMT+01:00, liling tan wrote: >Dear Moses community, > >Is it correct that when using --discount_fallback, if discount is >computable from Kneser-Ney, the fallback will not be

Re: [Moses-support] Getting problem while creating language model

2017-04-17 Thread Kenneth Heafield
IRSTLM has its own mailing list: https://list.fbk.eu/sympa/info/user-irstlm . It appears IRSTLM is trying to create a temporary directory in the current working directory. Try switching to a directory where you have write permission before running. Advertisement:

Re: [Moses-support] Rebuilding moses binary only

2017-03-30 Thread Kenneth Heafield
cd moses-cmd; bjam moses. The binary will be hidden in some long bjam path that depends on your environment; it is not installed into bin, though. On 03/30/2017 10:41 PM, Nikolay Bogoychev wrote: > I've been asking this same question since late 2013..? > > On Thu, Mar 30, 2017 at 10:30 PM, Marcin

Re: [Moses-support] lmplz crashed on joint_order

2017-03-29 Thread Kenneth Heafield
How embarrassing. Can you try on head from github.com/kpu/kenlm ? If that fails, I can take this off list. Kenneth On March 29, 2017 3:39:20 PM GMT+01:00, Dingyuan Wang wrote: >Dear list, > >lmplz crashed on my machine recently. Command is > >lmplz -o 4 -S 70% --text

Re: [Moses-support] feature function referring headers in phrase-extract?

2017-02-23 Thread Kenneth Heafield
unit test for the moment? > > Shuoyang > > > > >> On Feb 22, 2017, at 1:35 PM, Kenneth Heafield <mo...@kheafield.com >> <mailto:mo...@kheafield.com>> wrote: >> >> The main moses target already includes moses/*.cpp (with some exceptions >> that you

Re: [Moses-support] GIZA++

2017-02-23 Thread Kenneth Heafield
Hi, giza++ now lives at https://github.com/moses-smt/giza-pp . Can you point us to the place in the documentation where this outdated information appeared? The manual www.statmt.org/moses/manual/manual.pdf does have broken footnotes, but the wget command appears to be correct.

Re: [Moses-support] feature function referring headers in phrase-extract?

2017-02-22 Thread Kenneth Heafield
uggested later when it's working > > Hieu Hoang > http://moses-smt.org/ > > On 22 February 2017 at 17:10, Kenneth Heafield <mo...@kheafield.com > <mailto:mo...@kheafield.com>> wrote: > > Hi, > > phrase-extract depends on moses c

Re: [Moses-support] feature function referring headers in phrase-extract?

2017-02-22 Thread Kenneth Heafield
Hi, phrase-extract depends on moses c.f. phrase-extract/Jamfile:7. alias deps : $(most-deps:B).o ..//z ..//boost_iostreams ..//boost_filesystem ../moses//moses ../moses//ThreadPool ../moses//Util ../util//kenutil ; So rather than copy, move it to moses. More cleanly, you could extract

[Moses-support] 3-5 year postdoc/fellowship in fast neural machine translation

2016-11-22 Thread Kenneth Heafield
Dear Moses, The Alan Turing Institute, a joint venture of five universities, including the University of Edinburgh, is recruiting research fellows (~postdocs): https://www.turing.ac.uk/opportunities/ . These last 3-5 years. The position is in London or possibly Edinburgh depending on

Re: [Moses-support] Character ngrams using KenLM

2016-11-09 Thread Kenneth Heafield
No. Tokenizer and LM are separate tools. You can of course replace space with a token like or something. On November 9, 2016 6:04:07 AM GMT+00:00, Nat Gillin wrote: >Dear Moses community, > >Other than manually replacing space with an unused character and adding
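The preprocessing described here, done outside the LM toolkit, might look like the following Python sketch. The placeholder "<space>" and the function name to_character_tokens are assumptions chosen for illustration; any symbol that never occurs in the corpus would serve.

```python
SPACE_TOKEN = "<space>"  # hypothetical placeholder; pick any symbol absent from the corpus


def to_character_tokens(sentence, space_token=SPACE_TOKEN):
    """Turn a sentence into space-separated characters, marking word boundaries."""
    pieces = []
    for index, word in enumerate(sentence.split()):
        if index > 0:
            pieces.append(space_token)  # stands in for the original space
        pieces.extend(word)  # one token per character
    return " ".join(pieces)
```

Each output line can then be fed to the language model trainer as if the characters were words.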

Re: [Moses-support] Another compile question

2016-10-19 Thread Kenneth Heafield
Use the home directory strategy from https://kheafield.com/code/kenlm/dependencies/ On 10/19/2016 01:36 PM, Mike Ladwig wrote: > I seem to have run into the zlib "invalid distance" bug on Red Hat > enterprise linux 7. Is there a way to get the moses bjam build system to > ignore the system zlib

[Moses-support] Research Associate (postdoc) in Machine Translation at the University of Edinburgh

2016-09-26 Thread Kenneth Heafield
, the university now pays for application fees. Happy applying, Kenneth Heafield Lecturer (Assistant Professor in en-US), University of Edinburgh P.S. The system does show me your applications until 13 October, but feel free to contact me.

Re: [Moses-support] help me

2016-09-15 Thread Kenneth Heafield
https://github.com/moses-smt/mosesdecoder/archive/master.zip On 09/15/16 11:20, Selva Nalladurai wrote: > Hello guys, > > Please provide me with the link, where i can download the > complete moses toolkit > > > >Regards, >

Re: [Moses-support] cube pruning question

2016-08-17 Thread Kenneth Heafield
Moses: pass -cube-pruning-lazy-scoring and it will call the LM as items come out of the queue. Default is before they go into the queue. mtplz is both and everything in between. Initially they go into the queue with no LM, then items get incremental updates as they surface. A completely

Re: [Moses-support] Moses 3.0 cannot start with configuration file and models of moses 2.0

2016-08-17 Thread Kenneth Heafield
Hi, Ok, master now accepts "false" for 0 again. And I've made the error message more helpful. Kenneth On 08/17/2016 09:31 PM, Eleftherios Avramidis wrote: > Hi, > > I am looking again on this. The error occurs when the moses.ini file contains > this setting: > > KENLM lazyken=false

Re: [Moses-support] Error while compiling moses with boost using bjam

2016-08-04 Thread Kenneth Heafield
lto:ta...@erxindia.in>> wrote: > > Hi Kenneth, > > Thanks for letting me know. I will try and get back if there are > any other problems. > > On Wed, Aug 3, 2016 at 3:31 PM, Kenneth Heafield > <mo...@kheaf

Re: [Moses-support] Error while compiling moses with boost using bjam

2016-08-03 Thread Kenneth Heafield
You need to install zlib first, including any development version of zlib. Further, I suspect your Boost installation is incomplete if you didn't install zlib first. https://kheafield.com/code/kenlm/dependencies/ Kenneth On 08/03/2016 10:51 AM, Tarun Guntuka wrote: > Hi Experts, > > I could

Re: [Moses-support] problem with compiling moses on ubuntu 14.04

2016-07-04 Thread Kenneth Heafield
Are you using both IRSTLM and SRILM? I doubt it. The error appears to be due to IRSTLM version mismatches; simplest option is to remove --with-irstlm. Kenneth On 07/04/2016 11:33 AM, samane shahmohamadi wrote: > hi all > I got error while running this command > > ./bjam

Re: [Moses-support] Language model interpolation without SRILM

2016-06-30 Thread Kenneth Heafield
> > Thanks again, > Mathias > > On Tue, Jun 28, 2016 at 6:08 PM, Kenneth Heafield <mo...@kheafield.com > <mailto:mo...@kheafield.com>> wrote: > > Log-linear interpolation is in KenLM in the lm/interpolate directory. > You'll want to get KenLM from gi

Re: [Moses-support] Language model interpolation without SRILM

2016-06-28 Thread Kenneth Heafield
Oh also, use a small -S argument to the interpolate program because it doesn't quite budget memory properly yet. On 06/28/2016 05:08 PM, Kenneth Heafield wrote: > Log-linear interpolation is in KenLM in the lm/interpolate directory. > You'll want to get KenLM from github.com/kpu/kenlm and c

Re: [Moses-support] Language model interpolation without SRILM

2016-06-28 Thread Kenneth Heafield
Log-linear interpolation is in KenLM in the lm/interpolate directory. You'll want to get KenLM from github.com/kpu/kenlm and compile with Eigen. Tuning log-linear weights is super slow, but applying them is reasonably fast. In total the tuning + applying weights time is comparable to SRILM.

Re: [Moses-support] Extract list of n-grams from Trie Language Model that contains a certain word

2016-06-04 Thread Kenneth Heafield
Kidd wrote: > > Thanks, that’s given me a good starting point. The next problem is > that the dump_trie program expects a vocab file which isn’t provided. > Any idea how I could create one? > > > > Thanks again, > > Graeme > > > > *From:*Kenneth Heafield

Re: [Moses-support] Extract list of n-grams from Trie Language Model that contains a certain word

2016-06-04 Thread Kenneth Heafield
The trie file you have contains conditional probabilities and backoffs but not counts. If you're OK with that, check out/modify the dump_trie program in the bounded-noquant branch of github.com/kpu/kenlm . It can stream but you will need to do ulimit -v with something above 6 TB even though

Re: [Moses-support] "Feature name SRILM is not registered."

2016-05-29 Thread Kenneth Heafield
It's KENLM, not KenLM according to Hieu, who did name it after all. Kenneth On 05/29/2016 10:19 PM, Anna Garbar wrote: > Hi Sašo, > > Thanks for your reply. Before recompiling moses with srilm, I also tried > to changed SRILM to KenLM im the moses.ini (under feature functions), > but received

Re: [Moses-support] "Feature name SRILM is not registered."

2016-05-29 Thread Kenneth Heafield
Website edited, thanks for the excuse. Kenneth On 05/29/2016 10:10 PM, Sašo Kuntaric wrote: > Hi Anna, > > You are probably using KenLM as it's the default language model making > tool. The factored tutorial however has the parameter for using SRILM. > In the "lm

Re: [Moses-support] Factor pointer shared across sentences?

2016-05-26 Thread Kenneth Heafield
Yes. See FactorCollection. On 05/26/2016 10:45 PM, Shuoyang Ding wrote: > Hi all, > > I'm thinking about implementing some cache-based methods to speed up > feature score evaluation. Hence it'll be interesting to know whether the > factors are shared across sentences, or put it another way, if

Re: [Moses-support] kenlm multithreading

2016-04-28 Thread Kenneth Heafield
When replying, please edit your Subject line so it is more specific > than "Re: Contents of Moses-support digest..." > > > Today's Topics: > >1. Call for Participation: IEEE DIPDMWC2016 Moscow, Russia >

Re: [Moses-support] kenlm multithreading

2016-04-27 Thread Kenneth Heafield
Yes, it uses threads when it wants to. There is no option to turn threads off (and no code path that would do so). One has limited control using block size and counts. Ideally it would be more parallel. Kenneth On 04/27/2016 03:25 PM, koormoosh wrote: > Hello, > > Out of curiosity, does

Re: [Moses-support] terminate called recursively terminate called recursively

2016-04-21 Thread Kenneth Heafield
Looks like an exception triggering destructors that throw an exception. If you can compile with debug then get a backtrace, hopefully that will tell us where somebody is throwing an exception from a destructor. On 04/21/2016 02:12 PM, Jorg Tiedemann wrote: > Hi, > > > I have this rather large

[Moses-support] PhD or MSc+PhD in Machine Translation at the University of Edinburgh

2016-04-21 Thread Kenneth Heafield
. I am also happy to hear from potential postdocs or visitors. Kenneth Heafield

Re: [Moses-support] KenLM scoring of long target phrases

2016-04-19 Thread Kenneth Heafield
Hi, Any words beyond N-1 have full context and are included in the phrase's score. So it's hypothesis + target phrase + adjustments. And the routine you cite is computing adjustments. Kenneth On 04/19/16 10:50, Evgeny Matusov wrote: > > Hi, > > > my colleagues and I noticed the following
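To make the N-1 boundary concrete, here is a small Python sketch of which words in a target phrase already have full n-gram context inside the phrase and which still depend on the hypothesis to their left. It only illustrates the statement above; it is not the Moses or KenLM scoring code, and the name split_by_context is hypothetical.

```python
def split_by_context(phrase_words, order):
    """Split a phrase into (words lacking full in-phrase context, words that have it).

    With an n-gram model of the given order, a word needs order - 1
    preceding words for full context, so only the first order - 1 words
    of the phrase can be affected by what comes before the phrase.
    """
    boundary = max(order - 1, 0)
    return phrase_words[:boundary], phrase_words[boundary:]
```

For a trigram model and a six-word phrase, the first two words would need adjustment once the preceding hypothesis is known, while the last four can be scored once with the phrase.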

Re: [Moses-support] loading time for large LMs

2016-04-12 Thread Kenneth Heafield
>>>>> rather than the pt or lexicalized reordering model etc? >>>>> >>>>> If there's a way to make the model files available for download or to >>>>> give >>>>> me access your machine, i might be able to debug it >>>>>

Re: [Moses-support] loading time for large LMs

2016-04-12 Thread Kenneth Heafield
o.uk/hieu >>> On 12 Apr 2016 08:41, "Jorg Tiedemann" <tiede...@gmail.com> wrote: >>> >>>> >>>> Unfortunately, load=read didn’t help. It’s been loading for 7 hours >>> now >>>> and no sign to start decoding. >>

Re: [Moses-support] loading time for large LMs

2016-04-10 Thread Kenneth Heafield
hing but I didn’t have the > impression that this changed a lot. Does it really help and how much > would you usually gain? Thanks again! > > > Jörg > > >> On 10 Apr 2016, at 12:55, Kenneth Heafield <mo...@kheafield.com >> <mailto:mo...@kheafield.com>> wr

Re: [Moses-support] loading time for large LMs

2016-04-10 Thread Kenneth Heafield
Hi, I'm assuming you have enough RAM to fit everything. The kernel seems to preferentially evict mmapped pages as memory usage approaches full (it doesn't have to be full). To work around this, use load=read in your moses.ini line for the models. REMOVE any "lazyken" argument which

Re: [Moses-support] Filtering Binarized LM

2016-04-06 Thread Kenneth Heafield
Probing format models can't be filtered because they only retain hashes of ngrams. Trie format models can be filtered and dumped, but only with the very hacky and undocumented dump_trie program in the bounded-noquant branch. Hasn't been a priority to make it release quality; volunteers? Kenneth

Re: [Moses-support] why we should recompile moses for Phrase table compression

2016-04-05 Thread Kenneth Heafield
The compact phrase table uses CMPH. Compiling the first time using --with-cmph is sufficient. On 04/05/2016 11:03 AM, Hegde, Sujay wrote: > Dear Moses Admin/Phillip, > > > > As per http://www.statmt.org/moses/?n=Advanced.RuleTables, > > > > Download the CMPH library from

Re: [Moses-support] compile.sh with --static

2016-04-05 Thread Kenneth Heafield
The default falls back to shared as you note. It also links the implicit libraries like glibc dynamically. --static forces everything to be static, including turning off libSegFault if necessary, and failing if anything else isn't available statically. The build process falls back to shared

Re: [Moses-support] IRSTLM: Trash sentences getting more probability scores than proper grammatical sentences

2016-03-23 Thread Kenneth Heafield
kangaroo is less probable than snake. Which more than explains the difference you observed. Film at 11. That p() is pretty high. What happened when you used lmplz to build the model? Kenneth On 03/23/2016 09:28 AM, Bhat Irshad wrote: > I build a language model using IRSTLM on 20 million

Re: [Moses-support] KenLM loading error

2016-03-22 Thread Kenneth Heafield
Hi, I don't see a problem with this in principle. The error means that the calling code provided an out-of-range word id. Can I have a backtrace after compiling with debug? Kenneth On 03/21/2016 11:34 PM, Lane Schwartz wrote: > Hi, > > I have a tiny LM that is giving me some

Re: [Moses-support] Compilation problem

2016-03-19 Thread Kenneth Heafield
8 --debug-configuration -d2 |gzip >build.log.gz > > I've attached the new build.log as well. > > On Tue, Mar 15, 2016 at 3:38 PM, Kenneth Heafield <mo...@kheafield.com > <mailto:mo...@kheafield.com>> wrote: > > Smells like boost was compiled with a diff

Re: [Moses-support] failed updating 1 target...

2016-03-16 Thread Kenneth Heafield
The build log you attached isn't consistent with the error you're reporting. Smells like you have a broken half-installed bjam on your system, in which case you need to run ./bjam not bjam. On 03/16/2016 10:42 AM, Zhanwang Chen wrote: > Dear all, > > I am trying to install Moses according to

Re: [Moses-support] Compilation problem

2016-03-15 Thread Kenneth Heafield
Smells like boost was compiled with a different version of gcc than the one you're using to compile Moses, which can occasionally cause problems. On 03/15/2016 09:46 AM, Pratik Mehta wrote: > Hello, > I tried to compile Moses with the following command: > ./bjam -j4 > > The process ended with

Re: [Moses-support] Training backward LM?

2016-03-11 Thread Kenneth Heafield
There were failing unit tests. Paging Lane Schwartz. On 03/11/2016 03:56 PM, Hieu Hoang wrote: > I remember there is compilation issues with it. I guess at some point > someone must have gotten tired of looking after it and took it out of > the build. > > On 10/03/2016 23:36, Michael Denkowski

Re: [Moses-support] Is memory mapping lazy?

2016-02-19 Thread Kenneth Heafield
On 02/19/2016 11:38 PM, Kenneth Heafield wrote: > Hi, > > The default is mmap with MAP_POPULATE (see man mmap). As to whether > GPFS implements MAP_POPULATE correctly, I defer to the former IBM > employee. > > KenLM implements the following options via config.load

Re: [Moses-support] Is memory mapping lazy?

2016-02-19 Thread Kenneth Heafield
Hi, The default is mmap with MAP_POPULATE (see man mmap). As to whether GPFS implements MAP_POPULATE correctly, I defer to the former IBM employee. KenLM implements the following options via config.load_method: typedef enum { // mmap with no prepopulate LAZY, // On linux,

Re: [Moses-support] Using lmplz instead of SRI's ngram-count while training transliteration model

2016-02-18 Thread Kenneth Heafield
Hi, There are a few differences, most of which I'd expect you're fine with. - The discounts are different but you're using --discount_fallback so you know that. - Unknown word handling is different. If you want SRI's (IMHO broken) behavior, pass --interpolate_unigrams 0 (though if your

Re: [Moses-support] Problem with processPhraseTableMin

2016-02-02 Thread Kenneth Heafield
That typically causes a bus error. Why is there an overly huge malloc? On 02/02/2016 03:53 PM, Marcin Junczys-Dowmunt wrote: > I think it fills up your temporary folder, try "-T ." to specify the > local folder for temporary files. > > On 02.02.2016 16:21, Jeremy Gwinnup wrote: >> Hi, >> >>

Re: [Moses-support] Moses-support post from jasneet.sabhar...@sfu.ca requires approval

2016-01-29 Thread Kenneth Heafield
> Kneser-Ney wasn’t able to cope up with the counts being generated for >> coarse language models. So, I’ll train my LM using SRILM with ngram >> order 8 and WB smoothing and use KenLM with order 8 in Moses. >> >> Best, >> Jasneet >>> On Jan 23, 2016, at 3:38

Re: [Moses-support] Moses-support post from jasneet.sabhar...@sfu.ca requires approval

2016-01-23 Thread Kenneth Heafield
Hi, You can compile with --max-kenlm-order=8 or change the setting in the Eclipse files. The ARPA file format is interchangeable. You can build an ARPA using SRILM and Witten-Bell (though Bob Moore once called me out at a conference for suggesting that as an alternative to

Re: [Moses-support] Query for moses

2016-01-17 Thread Kenneth Heafield
If I had to guess, you're running out of virtual address space on 32-bit. Try -S 1G. On 01/17/2016 10:37 AM, rmogla wrote: > Hi, > I am a new user of moses and using it for the first time. I have > installed moses and giza++ on a 32 bit machine with ubuntu 15.04, but > while doing language model

Re: [Moses-support] Skip OOV when computing Language Model score

2016-01-15 Thread Kenneth Heafield
ns​ > with some OOV-token-identifier such as before sending for > translation. > > > /Best Regards,/ > Ergun > > Ergun Biçici > DFKI Projektbüro Berlin > > > On Fri, Jan 15, 2016 at 12:22 AM, Kenneth Heafield > <mo...@kheafield.c

Re: [Moses-support] Error on lmplz

2016-01-13 Thread Kenneth Heafield
rams with adjusted count 3; > Is this small or artificial data? > Try deduplicating the input. To override this error for e.g. a > class-based model, rerun with --discount_fallback > Aborted (core dumped) > > > > On Tue, Jan 12, 2016 at 5:40 PM, Kenneth Heafield <

Re: [Moses-support] Error on lmplz

2016-01-12 Thread Kenneth Heafield
16 2:107979354931 > tcmalloc: large alloc 107979358208 bytes == 0x192b4b6000 @ > lmplz: ./util/fixed_array.hh:104: T& > util::FixedArray::operator[](std::size_t) [with T = > lm::NGramStream; std::size_t = long > unsigned int]: Assertion `i < size()' failed.

Re: [Moses-support] decoder question

2015-12-04 Thread Kenneth Heafield
Indeed, you should split sentences into separate lines. Here's the script: https://github.com/moses-smt/mosesdecoder/blob/master/scripts/ems/support/split-sentences.perl Note that the script assumes you have placed tags in the text to force sentence boundaries. It will not assume that

Re: [Moses-support] moses.ini

2015-11-26 Thread Kenneth Heafield
You can use one toolkit to train a different one to query. They'll both work. Though I have a bias towards saying you should keep KENLM in your moses.ini. Kenneth On 11/26/2015 06:38 PM, Ouafa Benterki wrote: > hello, > > my question is regarding moses.ini, if we uses IRSTLM should we >

Re: [Moses-support] change the jamfile for integrating LM

2015-11-14 Thread Kenneth Heafield
bjam is a silly language that requires spaces before semicolons. $(with-rnnlm) ; On 11/14/15 10:18, Vu Thuong Huyen wrote: > obj RNNLMWrapper.o : RNNLMWrapper.cpp ..//headers : > $(with-rnnlm); >

Re: [Moses-support] Moses on SGE clarification

2015-10-29 Thread Kenneth Heafield
So we're clear, it runs correctly on the local machine but not when you run it through SGE? In that case, I suspect it's library version differences. On 10/29/2015 03:09 PM, Vincent Nguyen wrote: > > I get this error : > > moses@sgenode1:/netshr/working-en-fr$ /netshr/mosesdecoder/bin/lmplz >

Re: [Moses-support] Moses on SAMBA filesystem

2015-10-29 Thread Kenneth Heafield
2 nodes) > > I think you should ne able to replicate without having to handle sge or > nodes. Just on 1 machine. > > > Le 29/10/2015 20:59, Kenneth Heafield a écrit : >> Yes. >> >> Also this is all very odd. What file system is /netshr ? >> >> O

Re: [Moses-support] SRILM Error command not found

2015-10-20 Thread Kenneth Heafield
Sounds like a documentation bug. Where in the tutorial does it say to use SRILM? On 10/20/2015 04:20 PM, Anysta Nysta wrote: > Hye, > I desperately need help to solve the following errors. I run the srilm > 1.4.6 on Cygwin and already install all the packages required for Moses. > When I run

Re: [Moses-support] how to copy output of moses.ini into text output file...

2015-10-15 Thread Kenneth Heafield
Hi, You can implement this (and much much more) by reading: - http://www.dest-unreach.org/socat/ - man bash In the UNIX philosophy, it's Moses's responsibility to be awesome at going from stdin to stdout, some other tool's responsibility to do things with stdin and stdout, and your

Re: [Moses-support] Moses vocabulary code

2015-10-10 Thread Kenneth Heafield
Agreed about the cuteness of const Factor *. Let's say you're reading space-delimited file input. std::string line("Foo Bar Baz Quux ."); One can make a StringPiece(line.data(), 3) that looks and for most purposes acts like std::string("Foo") but requires zero memory allocation. It's not null

Re: [Moses-support] Moses vocabulary code

2015-10-09 Thread Kenneth Heafield
The Moses common vocabulary is moses/FactorCollection.h. Common practice in core Moses code is to pass around a const Factor * (which can be resolved to a StringPiece or a consecutive ID). If a feature/phrase table has its own ids because e.g. it's baked into the binary file, then there's a

Re: [Moses-support] Faster decoding with multiple moses instances

2015-10-08 Thread Kenneth Heafield
There's a ton of object/malloc churn in creating Moses::TargetPhrase objects, most of which are thrown away. If PhraseDictionaryMemory (which creates and keeps the objects) scales better than CompactPT, that's the first thing I'd optimize. On 10/08/2015 08:30 PM, Marcin Junczys-Dowmunt wrote: >

Re: [Moses-support] Faster decoding with multiple moses instances

2015-10-08 Thread Kenneth Heafield
lu.m_clock = clock(); return std::make_pair(lu.m_tpv, lu.m_bitsLeft); } else return std::make_pair(TargetPhraseVectorPtr(), 0); } On 10/08/2015 08:39 PM, Marcin Junczys-Dowmunt wrote: > How is probing-pt avoiding the same problem then? > > On 08.10.2015 at 21:36, Ken

Re: [Moses-support] KenLM poison

2015-10-05 Thread Kenneth Heafield
Hi, I'm still betting it's out of disk space writing the ARPA. Multithreaded exception handling is annoying. This is there to prevent deadlock. Kenneth On 10/05/2015 01:52 PM, 徐同学 wrote: > Dear all, > > I’m building the baseline system, and some error occurred during the > last step

Re: [Moses-support] Faster decoding with multiple moses instances

2015-10-05 Thread Kenneth Heafield
https://github.com/kpu/usage This injects code into shared executables that makes them print usage statistics on termination to stderr. grep stderr, collate. Kenneth On 10/05/2015 04:05 PM, Michael Denkowski wrote: > Hi Philipp, > > Unfortunately I don't have a precise measurement. If anyone

Re: [Moses-support] Error on lmplz

2015-09-30 Thread Kenneth Heafield
That's bad. Would you mind sending me privately a minimal example of the data that reproduces the problem? Kenneth On 09/30/2015 04:29 PM, Alex Martinez wrote: > Hello, > today I've pulled moses code and recompiled and some experiments (EMS) > that were already working are failing on the LM

Re: [Moses-support] Error

2015-09-11 Thread Kenneth Heafield
Hi, None of the words is in the vocabulary. Which suggests you did not properly train the model. However that part was not visible in the screenshot. Kenneth On 09/11/15 14:20, fatma elzahraa Eltaher wrote: > Dear All, > > I try to test Language model with two sentences ( I can do that ,

Re: [Moses-support] Illegal division by zero in mteval-v13a.pl

2015-08-31 Thread Kenneth Heafield
You have a zero-length line. BLEU isn't well defined for this case. Though it would be nicer if NIST's script provided a better error message/had an option to skip empty lines. On 08/31/2015 02:33 PM, Dingyuan Wang wrote: > Dear all, > > When using EMS, step EVALUATION:test:nist-bleu(-c)
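A zero-length line like the one described above can be located before scoring. The following is a minimal sketch, not part of mteval-v13a.pl or EMS; the function names are hypothetical, and it assumes plain-text files with one segment per line.

```python
def find_empty_lines(lines):
    """Return 1-based numbers of lines that are empty or whitespace-only."""
    return [number for number, line in enumerate(lines, start=1)
            if not line.strip()]


def find_empty_lines_in_file(path, encoding="utf-8"):
    """Scan a text file (one segment per line) for empty lines.

    The path and encoding are supplied by the caller; nothing here is
    specific to Moses or to the NIST scoring script.
    """
    with open(path, encoding=encoding) as handle:
        return find_empty_lines(handle)
```

Running find_empty_lines_in_file on the system output and on each reference file shows which segments would need to be removed or repaired in all files together, so that the files stay line-aligned.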

Re: [Moses-support] Cannot install moses with manual boost installation

2015-08-30 Thread Kenneth Heafield
It sounds like you haven't set environment variables or used --with-boost=/path/where/you/installed/boost Here's directions on installing packages, including Boost, in home directories etc: https://kheafield.com/code/kenlm/dependencies On 08/30/2015 09:36 PM, Shyam Upadhyay wrote: Hi, I am

Re: [Moses-support] Memory efficient MT

2015-08-26 Thread Kenneth Heafield
Hi, How much of http://www.statmt.org/moses/?n=Moses.Optimize have you used? Be sure to read the last line of the page too. Kenneth On 08/26/2015 04:55 PM, Maxim Khalilov wrote: *Hi all,* *I am trying to use an MT engine trained on the 800,000 line EuroParlament corpus in an

Re: [Moses-support] OSM in EMS error

2015-08-16 Thread Kenneth Heafield
If I had to guess, you ran out of disk space. Can you find the stderr of lmplz? Kenneth On 08/16/2015 11:11 AM, Vincent Nguyen wrote: the build-osm crashes in EMS with following error any clue ? 23396000 23397000 23398000 23399000 2340Converting Bilingual Sentence Pair into
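To test the disk-space guess, one can check the free space on the file system that holds the working directory before rerunning the step. This is only a sketch using Python's standard library; the helper names are hypothetical, and the number of bytes needed is an estimate the user has to supply, since lmplz does not report it through this code.

```python
import shutil


def free_bytes(path="."):
    """Return the free space, in bytes, on the file system containing path."""
    return shutil.disk_usage(path).free


def has_room(path, needed_bytes):
    """Return True if at least needed_bytes are free at path.

    needed_bytes is the caller's own estimate (an assumption), for
    example a multiple of the size of the training corpus.
    """
    return free_bytes(path) >= needed_bytes
```

Checking both the output directory and the temporary directory is worthwhile, since they may sit on different file systems.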
