Your training corpus is empty.
cat ~/corpus/news-commentary-v8.fr-en.true.en
On 4/22/21 9:50 PM, Namrata Hadimani wrote:
> Hi All,
>
> I am trying to perform Language Model Training using the below command
>
> ~/mosesdecoder/bin/lmplz -o 3 <~/corpus/news-commentary-v8.fr-en.true.en >
>
It appears you are trying to run on a machine with very different
libraries from the machine you compiled on. Don't do that. Compile on
the same machine.
On 8/8/20 12:09 PM, Chen, Y. wrote:
> Dear Hieu,
>
> Thank you for your help! I solved this problem and built the language
> model. But
The CMPH software updated formats at some point but nobody changed
Moses. Use a vintage CMPH or have fun hacking...
On 6/17/20 10:08 PM, ser...@prompsit.com wrote:
> The subject of my previous message is wrong. Actually the problem is
> with queryPhraseTableMin as the content of the message
The WNGT 2020 Efficiency Shared Task
https://sites.google.com/view/wngt20/efficiency-task
invites submissions of efficient machine translation systems.
Participants build a WMT19 English-German system (or start from
pre-built ones) and optimize for quality, speed, RAM, model size, or any
Dear Moses,
I noticed some odd behavior in the truecaser whereby it tokenizes < and
> at the end of a word. Is this intended?
Maybe the answer is I should have run the tokenizer first so it would
be &lt; and &gt; and therefore this is undefined.
Input:
a
a<
a>
foo<
Output:
a
a <
a >
Alexandra Birch and I are hiring five researchers in machine
translation. Applicants can be pre-PhD or post-PhD. Apply before 17:00
GMT on 16 March 2020:
https://www.vacancies.ed.ac.uk/pls/corehrrecruit/erq_jobspec_version_4.jobspec?p_id=051331
The researchers will work on EU/EPSRC research
FACULTY POSITIONS AT THE UNIVERSITY OF EDINBURGH
Lecturer/Senior Lecturer/Reader in Natural Language Processing
Lecturer/Senior Lecturer/Reader in Computational Social Science
Applications are invited for two faculty positions in Natural Language
Processing and Computational Social Science in
PhD studentships in machine translation, computational linguistics,
speech technology, and cognitive science
Institute for Language, Cognition and Computation
School of Informatics
University of Edinburgh
The Institute for Language, Cognition and Computation (ILCC) at the
University of Edinburgh
Hi Arezoo,
You can find GPU-based translation systems here:
https://marian-nmt.github.io/
https://github.com/EdinburghNLP/nematus
The quality will probably also be better. Be warned you need GPU RAM.
Kenneth
On 5/4/19 10:16 AM, Arezoo Arjomand wrote:
> Hi
>
>
for the current call is 30 April but another position will soon
be open and due in early May.
See the longer ad: https://neural.mt/jobs/
Kenneth Heafield
Lecturer, University of Edinburgh
___
Moses-support mailing list
Moses-support@mit.edu
http
f the following datasets from OPUS:
>
> * GNOME
> * OpenSubtitles 2018
> * Tanzil
> * Tatoeba
> * Ubuntu
>
> Thanks,
> James
>
> On Mon, 3 Dec 2018 at 11:58, Kenneth Heafield <mailto:mo...@kheafield.com>> wrote:
>
Hi,
If I had to guess, you have a lot of duplicated text?
Kenneth
On 12/3/18 11:23 AM, James Baker wrote:
> Morning,
>
> I've been trying to train a language model using the following command:
>
> /opt/model-builder/mosesdecoder/bin/lmplz -o 5 -S 80% -T /tmp <
> lm_data.en > model.lm
>
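If duplicated text is the cause, one way to check is to deduplicate the corpus before running lmplz. A minimal sketch, assuming the whole corpus fits in memory for awk; the helper name and the demo sentences are made up for illustration:

```shell
# Hypothetical helper: drop repeated lines, keeping the first
# occurrence of each line and preserving the original order.
dedup_lines() {
    awk '!seen[$0]++' "$1"
}

# Tiny made-up corpus with a duplicated sentence.
demo=$(mktemp)
printf 'the cat sat\nthe cat sat\na dog ran\nthe cat sat\n' > "$demo"
dedup_lines "$demo"
```

The deduplicated output can then be fed to the same lmplz command on stdin. If line order does not matter, sort -u does the same job without holding every line in memory.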
It's in your bin directory.
bin/build_binary
On 06/24/2018 01:33 PM, Kamal Deep Garg wrote:
> Dear Sir
>
> i am using mose4. i created arpa file using KENLM.
>
> i want to convert it to binary format using this command.
>
> kenlm/build_binary filename.arpa filename.binary
>
> but i am able
https://www.microsoft.com/en-us/research/academic-program/data-science-award/
https://cloud.google.com/edu/?options=research-credits
On 06/08/2018 05:06 PM, Hieu Hoang wrote:
> try this
> https://developer.nvidia.com/academic_gpu_seeding
> or search the web
>
> Hieu Hoang
>
> On 8 June 2018
Hi,
Just to clarify: code written by employees of the University of Edinburgh
would technically go to the university, while PhD students keep the code
they write. Our IP people won't mind if we authors choose to allow another
license.
Kenneth
On 05/29/2018 11:03 AM, Lane Schwartz wrote:
> The source
Looks like 19 people when the nonbreaking_prefixes is included and
multiple e-mail addresses for the same person are collapsed.
git log tokenizer.perl ../share/nonbreaking_prefixes/* |grep Author |sort -u
Some of whom have invalid e-mail addresses, but can probably be tracked
down.
Kenneth
On
Google watermarked their translation output:
https://research.google.com/pubs/archive/37162.pdf
Would be good to check if they're still doing this with neural systems.
On 04/06/2018 09:14 AM, Mathias Müller wrote:
> Hi Ryan
>
> My two cents:
>
> First of all, a way of detecting
Moses doesn't use NPLM option, so there's no point in compiling with
--with-nplm . For what it's worth, it's meant to compile against this
fork https://github.com/kpu/nplm and NPLM has since changed.
On 03/17/2018 07:52 PM, krishna chaitanya gudipati wrote:
> Hi,
> I am getting some error
Hi Tom.
lmplz doesn't need libxmlrpc_xmltok. Looks like a case of
over-aggressive dependencies, resulting in a binary that needs a library
it doesn't use.
One could install xmlrpc-c (a third-party library used by Moses server)
in the same path. Or I guess substitute lmplz from
Hi Manli,
Just edit the configuration to change the number of threads as you
like.
Kenneth
On 12/12/2017 08:16 PM, Manli Zhu wrote:
> Hello,
>
> I set the thread number to 12 during the tuning process because my server has
> 12 cpus. So the moses.ini has a line indicating thread = 12, which
>
Hi Daniel,
The data structures are keyed on the word being predicted, which is
inefficient at predicting every possible continuation. A forward trie
is much better at implementing these sorts of queries. I was designing
for random query speed.
You can eliminate backoff
Hi,
You convert the words to part of speech using an external tagger (lmplz
does not include POS detection). Then you'll probably need to run lmplz
--discount_fallback because the vocabulary is small.
Kenneth
On 10/28/2017 02:06 AM, Aileen Joan Vicente wrote:
> Hi! I am learning
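As a rough sketch of the conversion step: assuming the external tagger writes tokens as word_TAG (that output format is an assumption, and taggers differ), the words can be stripped so that only the tags remain. The helper name and sample sentence are invented:

```shell
# Hypothetical helper: keep only the text after the last underscore
# in every whitespace-separated token, e.g. cat_NN becomes NN.
words_to_tags() {
    awk '{ for (i = 1; i <= NF; i++) sub(/.*_/, "", $i); print }' "$1"
}

# Made-up tagger output for illustration.
tagged=$(mktemp)
printf 'the_DT cat_NN sat_VBD\n' > "$tagged"
words_to_tags "$tagged"
```

The resulting tag-only file is what would be passed to lmplz with --discount_fallback.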
Dear Moses,
There is funding to visit Edinburgh for a minimum of 6 months. I
may be able to get it for the right person. If you are interested, let
me (not the list) know by 24 October (yes, this is very late notice!).
Eligibility restrictions (theirs, not mine):
*
Visitors should
It seems nobody implemented epsilons. You're welcome to implement them.
On 09/07/2017 09:40 PM, Sanket Gandhare wrote:
> I am trying to give input to moses as word lattice having epsilons as
> well, '*EPS*'. but it is giving this result :
>
> terminate called after throwing an instance of
x on...wax off...
>
> All the Best,
> Chaz
> --------
> On Mon, 5/29/17, Kenneth Heafield <mo...@kheafield.com> wrote:
>
> Subject: Re: [Moses-support] Request for help w/ "The build failed."
> To: moses-support@mit.edu, "Hieu
A symlink for CreateProbingPT2 has nothing to do with KenLM. The symlink
already exists and the build system is trying to make it again (this also means
not windows). I suppose we should be using ln -sf.
Try deleting CreateProbingPT2 then rebuilding.
Kenneth
On May 29, 2017 1:04:25 AM
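To illustrate why ln -sf would help: a plain ln -s fails when the link already exists, while ln -sf replaces it. This is a standalone sketch in a scratch directory with placeholder file names, not the actual build rule:

```shell
workdir=$(mktemp -d)
touch "$workdir/target"

# First run: the link does not exist yet, so this succeeds.
ln -s target "$workdir/link"

# Second run without -f: fails because the link is already there.
if ln -s target "$workdir/link" 2>/dev/null; then
    second_plain=ok
else
    second_plain=failed
fi

# Second run with -f: the existing link is replaced, so this succeeds.
ln -sf target "$workdir/link" && second_forced=ok

echo "plain: $second_plain, forced: $second_forced"
```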
Can we have a better error message than "Segmentation fault" when that
happens?
On 05/17/2017 01:26 PM, Hieu Hoang wrote:
> ah yes, I think the phrase-table was created in the version when [ and ]
> weren't reserved characters but now they are. So you have to use the
> executables in that
Yes. Formally, the condition is in range, not just computable.
On April 24, 2017 4:18:41 AM GMT+01:00, liling tan wrote:
>Dear Moses community,
>
>Is it correct that when using --discount_fallback, if discount is
>computable from Kneser-Ney, the fallback will not be
IRSTLM has its own mailing list:
https://list.fbk.eu/sympa/info/user-irstlm .
It appears IRSTLM is trying to create a temporary directory in the
current working directory. Try switching to a directory where you have
write permission before running.
Advertisement:
cd moses-cmd
bjam moses
It will be hidden in some long bjam path that depends on your
environment, not installed into bin though.
On 03/30/2017 10:41 PM, Nikolay Bogoychev wrote:
> I've been asking this same question since late 2013..?
>
> On Thu, Mar 30, 2017 at 10:30 PM, Marcin
How embarrassing. Can you try on head from github.com/kpu/kenlm ? If that
fails, I can take this off list.
Kenneth
On March 29, 2017 3:39:20 PM GMT+01:00, Dingyuan Wang
wrote:
>Dear list,
>
>lmplz crashed on my machine recently. Command is
>
>lmplz -o 4 -S 70% --text
unit test for the moment?
>
> Shuoyang
>
>
>
>
>> On Feb 22, 2017, at 1:35 PM, Kenneth Heafield <mo...@kheafield.com
>> <mailto:mo...@kheafield.com>> wrote:
>>
>> The main moses target already includes moses/*.cpp (with some exceptions
>> that you
Hi,
giza++ now lives at https://github.com/moses-smt/giza-pp .
Can you point us to the place in the documentation where this outdated
information appeared? The manual www.statmt.org/moses/manual/manual.pdf
does have broken footnotes, but the wget command appears to be correct.
uggested later when it's working
>
> Hieu Hoang
> http://moses-smt.org/
>
> On 22 February 2017 at 17:10, Kenneth Heafield <mo...@kheafield.com
> <mailto:mo...@kheafield.com>> wrote:
>
> Hi,
>
> phrase-extract depends on moses c
Hi,
phrase-extract depends on moses c.f. phrase-extract/Jamfile:7.
alias deps : $(most-deps:B).o ..//z ..//boost_iostreams
..//boost_filesystem ../moses//moses ../moses//ThreadPool ../moses//Util
../util//kenutil ;
So rather than copy, move it to moses. More cleanly, you could extract
Dear Moses,
The Alan Turing Institute, a joint venture of five universities,
including the University of Edinburgh, is recruiting research fellows
(~postdocs): https://www.turing.ac.uk/opportunities/ . These last 3-5
years. The position is in London or possibly Edinburgh depending on
No. Tokenizer and LM are separate tools. You can of course replace space with
a token like or something.
On November 9, 2016 6:04:07 AM GMT+00:00, Nat Gillin
wrote:
>Dear Moses community,
>
>Other than manually replacing space with an unused character and adding
Use the home directory strategy from
https://kheafield.com/code/kenlm/dependencies/
On 10/19/2016 01:36 PM, Mike Ladwig wrote:
> I seem to have run into the zlib "invalid distance" bug on Red Hat
> enterprise linux 7. Is there a way to get the moses bjam build system to
> ignore the system zlib
, the university now pays for application fees.
Happy applying,
Kenneth Heafield
Lecturer (Assistant Professor in en-US), University of Edinburgh
P.S. The system does show me your applications until 13 October, but
feel free to contact me.
https://github.com/moses-smt/mosesdecoder/archive/master.zip
On 09/15/16 11:20, Selva Nalladurai wrote:
> Hello guys,
>
> Please provide me with the link, where i can download the
> complete moses toolkit
>
>
>
>Regards,
>
Moses: pass -cube-pruning-lazy-scoring and it will call the LM as items
come out of the queue. Default is before they go into the queue.
mtplz is both and everything in between. Initially they go into the
queue with no LM, then items get incremental updates as they surface. A
completely
Hi,
Ok, master now accepts "false" for 0 again. And I've made the error
message more helpful.
Kenneth
On 08/17/2016 09:31 PM, Eleftherios Avramidis wrote:
> Hi,
>
> I am looking again on this. The error occurs when the moses.ini file contains
> this setting:
>
> KENLM lazyken=false
lto:ta...@erxindia.in>> wrote:
>
> Hi Kenneth,
>
> Thanks for letting me know. I will try and get back if there are
> any other problems.
>
> On Wed, Aug 3, 2016 at 3:31 PM, Kenneth Heafield
> <mo...@kheaf
You need to install zlib first, including any development version of
zlib. Further, I suspect your Boost installation is incomplete if you
didn't install zlib first.
https://kheafield.com/code/kenlm/dependencies/
Kenneth
On 08/03/2016 10:51 AM, Tarun Guntuka wrote:
> Hi Experts,
>
> I could
Are you using both IRSTLM and SRILM? I doubt it. The error appears to
be due to IRSTLM version mismatches; simplest option is to remove
--with-irstlm.
Kenneth
On 07/04/2016 11:33 AM, samane shahmohamadi wrote:
> hi all
> I got error while running this command
>
> ./bjam
>
> Thanks again,
> Mathias
>
> On Tue, Jun 28, 2016 at 6:08 PM, Kenneth Heafield <mo...@kheafield.com
> <mailto:mo...@kheafield.com>> wrote:
>
> Log-linear interpolation is in KenLM in the lm/interpolate directory.
> You'll want to get KenLM from gi
Oh also, use a small -S argument to the interpolate program because it
doesn't quite budget memory properly yet.
On 06/28/2016 05:08 PM, Kenneth Heafield wrote:
> Log-linear interpolation is in KenLM in the lm/interpolate directory.
> You'll want to get KenLM from github.com/kpu/kenlm and c
Log-linear interpolation is in KenLM in the lm/interpolate directory.
You'll want to get KenLM from github.com/kpu/kenlm and compile with Eigen.
Tuning log-linear weights is super slow, but applying them is reasonably
fast. In total the tuning + applying weights time is comparable to SRILM.
Kidd wrote:
>
> Thanks, that’s given me a good starting point. The next problem is
> that the dump_trie program expects a vocab file which isn’t provided.
> Any idea how I could create one?
>
>
>
> Thanks again,
>
> Graeme
>
>
>
> *From:*Kenneth Heafield
The trie file you have contains conditional probabilities and backoffs but not
counts. If you're OK with that, check out/modify the dump_trie program in the
bounded-noquant branch of github.com/kpu/kenlm . It can stream but you will
need to do ulimit -v with something above 6 TB even though
It's KENLM, not KenLM according to Hieu, who did name it after all.
Kenneth
On 05/29/2016 10:19 PM, Anna Garbar wrote:
> Hi Sašo,
>
> Thanks for your reply. Before recompiling moses with srilm, I also tried
> to change SRILM to KenLM in the moses.ini (under feature functions),
> but received
Website edited, thanks for the excuse.
Kenneth
On 05/29/2016 10:10 PM, Sašo Kuntaric wrote:
> Hi Anna,
>
> You are probably using KenLM as it's the default language model making
> tool. The factored tutorial however has the parameter for using SRILM.
> In the "lm
Yes. See FactorCollection.
On 05/26/2016 10:45 PM, Shuoyang Ding wrote:
> Hi all,
>
> I'm thinking about implementing some cache-based methods to speed up
> feature score evaluation. Hence it'll be interesting to know whether the
> factors are shared across sentences, or put it another way, if
When replying, please edit your Subject line so it is more specific
> than "Re: Contents of Moses-support digest..."
>
>
> Today's Topics:
>
>1. Call for Participation: IEEE DIPDMWC2016 Moscow, Russia
>
Yes, it uses threads when it wants to. There is no option to turn
threads off (and no code path that would do so). One has limited
control using block size and counts. Ideally it would be more parallel.
Kenneth
On 04/27/2016 03:25 PM, koormoosh wrote:
> Hello,
>
> Out of curiosity, does
Looks like an exception triggering destructors that throw an exception.
If you can compile with debug then get a backtrace, hopefully that will
tell us where somebody is throwing an exception from a destructor.
On 04/21/2016 02:12 PM, Jorg Tiedemann wrote:
> Hi,
>
>
> I have this rather large
.
I am also happy to hear from potential postdocs or visitors.
Kenneth Heafield
___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
Hi,
Any words beyond N-1 have full context and are included in the
phrase's score. So it's hypothesis + target phrase + adjustments. And
the routine you cite is computing adjustments.
Kenneth
On 04/19/16 10:50, Evgeny Matusov wrote:
>
> Hi,
>
>
> my colleagues and I noticed the following
>>>>> rather than the pt or lexicalized reordering model etc?
>>>>>
>>>>> If there's a way to make the model files available for download or to
>>>>> give
>>>>> me access your machine, i might be able to debug it
>>>>>
o.uk/hieu
>>> On 12 Apr 2016 08:41, "Jorg Tiedemann" <tiede...@gmail.com> wrote:
>>>
>>>>
>>>> Unfortunately, load=read didn’t help. It’s been loading for 7 hours
>>> now
>>>> and no sign to start decoding.
>>
hing but I didn’t have the
> impression that this changed a lot. Does it really help and how much
> would you usually gain? Thanks again!
>
>
> Jörg
>
>
>> On 10 Apr 2016, at 12:55, Kenneth Heafield <mo...@kheafield.com
>> <mailto:mo...@kheafield.com>> wr
Hi,
I'm assuming you have enough RAM to fit everything. The kernel seems
to preferentially evict mmapped pages as memory usage approaches full
(it doesn't have to be full). To work around this, use
load=read
in your moses.ini line for the models. REMOVE any "lazyken" argument
which
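The edit can be scripted. A minimal sketch, assuming the model is declared on a line starting with "KENLM " and that the line does not already carry a load= argument; the sample feature line and the helper name are illustrative, not taken from a real configuration:

```shell
# Hypothetical helper: on KENLM lines, drop any lazyken=... argument
# and append load=read. Other lines pass through unchanged.
use_load_read() {
    sed -E '/^KENLM /{
        s/ lazyken=[^ ]*//
        s/$/ load=read/
    }' "$1"
}

# Made-up moses.ini fragment for illustration.
ini=$(mktemp)
printf 'KENLM lazyken=0 name=LM0 path=/tmp/lm.bin order=5\nDistortion\n' > "$ini"
use_load_read "$ini"
```

The function only prints the edited configuration; redirect it to a new file and point moses at that file once it looks right.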
Probing format models can't be filtered because they only retain hashes
of ngrams.
Trie format models can be filtered and dumped, but only with the very
hacky and undocumented dump_trie program in the bounded-noquant branch.
Hasn't been a priority to make it release quality; volunteers?
Kenneth
The compact phrase table uses CMPH. Compiling the first time using
--with-cmph is sufficient.
On 04/05/2016 11:03 AM, Hegde, Sujay wrote:
> Dear Moses Admin/Phillip,
>
>
>
> As per http://www.statmt.org/moses/?n=Advanced.RuleTables,
>
>
>
> Download the CMPH library from
The default falls back to shared as you note. It also links the
implicit libraries like glibc dynamically.
--static forces everything to be static, including turning off
libSegFault if necessary, and failing if anything else isn't available
statically.
The build process falls back to shared
kangaroo is less probable than snake. Which more than explains the
difference you observed. Film at 11.
That p() is pretty high. What happened when you used lmplz to
build the model?
Kenneth
On 03/23/2016 09:28 AM, Bhat Irshad wrote:
> I build a language model using IRSTLM on 20 million
Hi,
I don't see a problem with this in principle. The error means that the
calling code provided an out-of-range word id. Can I have a backtrace
after compiling with debug?
Kenneth
On 03/21/2016 11:34 PM, Lane Schwartz wrote:
> Hi,
>
> I have a tiny LM that is giving me some
8 --debug-configuration -d2 |gzip >build.log.gz
>
> I've attached the new build.log as well.
>
> On Tue, Mar 15, 2016 at 3:38 PM, Kenneth Heafield <mo...@kheafield.com
> <mailto:mo...@kheafield.com>> wrote:
>
> Smells like boost was compiled with a diff
The build log you attached isn't consistent with the error you're
reporting. Smells like you have a broken half-installed bjam on your
system, in which case you need to run ./bjam not bjam.
On 03/16/2016 10:42 AM, Zhanwang Chen wrote:
> Dear all,
>
> I am trying to install Moses according to
Smells like boost was compiled with a different version of gcc than the
one you're using to compile Moses, which can occasionally cause problems.
On 03/15/2016 09:46 AM, Pratik Mehta wrote:
> Hello,
> I tried to compile Moses with the following command:
> ./bjam -j4
>
> The process ended with
There were failing unit tests. Paging Lane Schwartz.
On 03/11/2016 03:56 PM, Hieu Hoang wrote:
> I remember there is compilation issues with it. I guess at some point
> someone must have gotten tired of looking after it and took it out of
> the build.
>
> On 10/03/2016 23:36, Michael Denkowski
On 02/19/2016 11:38 PM, Kenneth Heafield wrote:
> Hi,
>
> The default is mmap with MAP_POPULATE (see man mmap). As to whether
> GPFS implements MAP_POPULATE correctly, I defer to the former IBM
> employee.
>
> KenLM implements the following options via config.load
Hi,
The default is mmap with MAP_POPULATE (see man mmap). As to whether
GPFS implements MAP_POPULATE correctly, I defer to the former IBM
employee.
KenLM implements the following options via config.load_method:
typedef enum {
// mmap with no prepopulate
LAZY,
// On linux,
Hi,
There are a few differences, most of which I'd expect you're fine with.
- The discounts are different but you're using --discount_fallback so
you know that.
- Unknown word handling is different. If you want SRI's (IMHO broken)
behavior, pass --interpolate_unigrams 0 (though if your
That typically causes a bus error. Why is there an overly huge malloc?
On 02/02/2016 03:53 PM, Marcin Junczys-Dowmunt wrote:
> I think it fills up your temporary folder, try "-T ." to specify the
> local folder for temporary files.
>
> On 02.02.2016 16:21, Jeremy Gwinnup wrote:
>> Hi,
>>
>>
>> Kneser-Ney wasn’t able to cope with the counts being generated for
>> coarse language models. So, I’ll train my LM using SRILM with ngram
>> order 8 and WB smoothing and use KenLM with order 8 in Moses.
>>
>> Best,
>> Jasneet
>>> On Jan 23, 2016, at 3:38
Hi,
You can compile with --max-kenlm-order=8 or change the setting in the
Eclipse files.
The ARPA file format is interchangeable. You can build an ARPA using
SRILM and Witten-Bell (though Bob Moore once called me out at a
conference for suggesting that as an alternative to
If I had to guess, you're running out of virtual address space on
32-bit. Try -S 1G.
On 01/17/2016 10:37 AM, rmogla wrote:
> Hi,
> I am a new user of moses and using it for the first time. I have
> installed moses and giza++ on a 32 bit machine with ubuntu 15.04 , but
> while doing language model
ns
> with some OOV-token-identifier such as before sending for
> translation.
>
>
> /Best Regards,/
> Ergun
>
> Ergun Biçici
> DFKI Projektbüro Berlin
>
>
> On Fri, Jan 15, 2016 at 12:22 AM, Kenneth Heafield
> <mo...@kheafield.c
rams with adjusted count 3;
> Is this small or artificial data?
> Try deduplicating the input. To override this error for e.g. a
> class-based model, rerun with --discount_fallback
> Aborted (core dumped)
>
>
>
> On Tue, Jan 12, 2016 at 5:40 PM, Kenneth Heafield <
16 2:107979354931
> tcmalloc: large alloc 107979358208 bytes == 0x192b4b6000 @
> lmplz: ./util/fixed_array.hh:104: T&
> util::FixedArray::operator[](std::size_t) [with T =
> lm::NGramStream; std::size_t = long
> unsigned int]: Assertion `i < size()' failed.
Indeed, you should split sentences into separate lines. Here's the script:
https://github.com/moses-smt/mosesdecoder/blob/master/scripts/ems/support/split-sentences.perl
Note that the script assumes you have placed <P> tags in the text to
force sentence boundaries. It will not assume that
You can use one toolkit to train and a different one to query. They'll both
work. Though I have a bias towards saying you should keep KENLM in your
moses.ini.
Kenneth
On 11/26/2015 06:38 PM, Ouafa Benterki wrote:
> hello,
>
> my question is regarding moses.ini, if we use IRSTLM should we
>
bjam is a silly language that requires spaces before semicolons.
$(with-rnnlm) ;
On 11/14/15 10:18, Vu Thuong Huyen wrote:
> obj RNNLMWrapper.o : RNNLMWrapper.cpp ..//headers :
> $(with-rnnlm);
>
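A quick way to spot this in a Jamfile is to search for semicolons that are not preceded by whitespace. This is only a heuristic sketch (it will also flag semicolons inside quoted strings), and the helper name and sample rules are invented:

```shell
# Hypothetical helper: print lines (with line numbers) where a
# semicolon directly follows a non-space character.
semicolons_missing_space() {
    grep -n '[^[:space:]];' "$1"
}

# Made-up Jamfile fragment: line 1 is broken, line 2 is fine.
jam=$(mktemp)
printf '%s\n' 'obj A.o : A.cpp : $(with-rnnlm);' \
              'obj B.o : B.cpp : $(with-rnnlm) ;' > "$jam"
semicolons_missing_space "$jam"
```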
So we're clear, it runs correctly on the local machine but not when you
run it through SGE? In that case, I suspect it's library version
differences.
On 10/29/2015 03:09 PM, Vincent Nguyen wrote:
>
> I get this error :
>
> moses@sgenode1:/netshr/working-en-fr$ /netshr/mosesdecoder/bin/lmplz
>
2 nodes)
>
> I think you should be able to replicate without having to handle sge or
> nodes. Just on 1 machine.
>
>
> Le 29/10/2015 20:59, Kenneth Heafield a écrit :
>> Yes.
>>
>> Also this is all very odd. What file system is /netshr ?
>>
>> O
Sounds like a documentation bug. Where in the tutorial does it say to
use SRILM?
On 10/20/2015 04:20 PM, Anysta Nysta wrote:
> Hye,
> I desperately need help to solve the following errors. I run the srilm
> 1.4.6 on Cygwin and already install all the packages required for Moses.
> When I run
Hi,
You can implement this (and much much more) by reading:
- http://www.dest-unreach.org/socat/
- man bash
In the UNIX philosophy, it's Moses's responsibility to be awesome at
going from stdin to stdout, some other tool's responsibility to do
things with stdin and stdout, and your
Agreed about the cuteness of const Factor *.
Let's say you're reading space-delimited file input.
std::string line("Foo Bar Baz Quux .");
One can make a StringPiece(line.data(), 3) that looks and for most
purposes acts like std::string("Foo") but requires zero memory
allocation. It's not null
The Moses common vocabulary is moses/FactorCollection.h. Common
practice in core Moses code is to pass around a const Factor * (which
can be resolved to a StringPiece or a consecutive ID).
If a feature/phrase table has its own ids because e.g. it's baked into
the binary file, then there's a
There's a ton of object/malloc churn in creating Moses::TargetPhrase
objects, most of which are thrown away. If PhraseDictionaryMemory
(which creates and keeps the objects) scales better than CompactPT,
that's the first thing I'd optimize.
On 10/08/2015 08:30 PM, Marcin Junczys-Dowmunt wrote:
>
lu.m_clock = clock();
return std::make_pair(lu.m_tpv, lu.m_bitsLeft);
} else
return std::make_pair(TargetPhraseVectorPtr(), 0);
}
On 10/08/2015 08:39 PM, Marcin Junczys-Dowmunt wrote:
> How is probing-pt avoiding the same problem then?
>
> W dniu 08.10.2015 o 21:36, Ken
Hi,
I'm still betting it's out of disk space writing the ARPA.
Multithreaded exception handling is annoying. This is there to prevent
deadlock.
Kenneth
On 10/05/2015 01:52 PM, 徐同学 wrote:
> Dear all,
>
> I’m building the baseline system, and some error occurred during the
> last step
https://github.com/kpu/usage
This injects code into shared executables that makes them print usage
statistics on termination to stderr. grep stderr, collate.
Kenneth
On 10/05/2015 04:05 PM, Michael Denkowski wrote:
> Hi Philipp,
>
> Unfortunately I don't have a precise measurement. If anyone
That's bad. Would you mind sending me privately a minimal example of
the data that reproduces the problem?
Kenneth
On 09/30/2015 04:29 PM, Alex Martinez wrote:
> Hello,
> today I've pulled moses code and recompiled and some experiments (EMS)
> that were already working are failing on the LM
Hi,
None of the words is in the vocabulary. Which suggests you did not
properly train the model. However that part was not visible in the
screenshot.
Kenneth
On 09/11/15 14:20, fatma elzahraa Eltaher wrote:
> Dear All,
>
> I try to test Language model with two sentences ( I can do that ,
You have a zero-length line. BLEU isn't well defined for this case.
Though it would be nicer if NIST's script provided a better error
message/had an option to skip empty lines.
On 08/31/2015 02:33 PM, Dingyuan Wang wrote:
> Dear all,
>
> When using EMS, step EVALUATION:test:nist-bleu(-c)
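Until the scoring script reports this more clearly, the offending lines can be located before evaluation. A small sketch that lists the numbers of zero-length lines in a file; the helper name and demo file are made up, and lines containing only whitespace are not flagged:

```shell
# Hypothetical helper: print the line number of every empty line.
empty_line_numbers() {
    grep -n '^$' "$1" | cut -d: -f1
}

# Made-up system output with two empty lines (lines 2 and 4).
out=$(mktemp)
printf 'a sentence\n\nanother sentence\n\n' > "$out"
empty_line_numbers "$out"
```

Run it on both the system output and the reference, since the line numbers have to stay aligned between the two files.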
It sounds like you haven't set environment variables or used
--with-boost=/path/where/you/installed/boost
Here's directions on installing packages, including Boost, in home
directories etc:
https://kheafield.com/code/kenlm/dependencies
On 08/30/2015 09:36 PM, Shyam Upadhyay wrote:
Hi,
I am
Hi,
How much of http://www.statmt.org/moses/?n=Moses.Optimize have you
used? Be sure to read the last line of the page too.
Kenneth
On 08/26/2015 04:55 PM, Maxim Khalilov wrote:
*Hi all,*
*I am trying to use an MT engine trained on the 800,000 line
EuroParlament corpus in an
If I had to guess, you ran out of disk space. Can you find the stderr
of lmplz?
Kenneth
On 08/16/2015 11:11 AM, Vincent Nguyen wrote:
the build-osm crashes in EMS with following error
any clue ?
23396000 23397000 23398000 23399000 2340Converting Bilingual
Sentence Pair into