Hi Moses,
Introducing kenlm in Moses trunk. You no longer need to download a
separate language model to use Moses; it's distributed with Moses and
compiled in by default on UNIX. This is threadsafe language model
inference code that returns the same probabilities as SRI (up to
floating p
do we need to
> train with another tool, like SRILM or convert IRSTLM to full ARPA format?
>
> Thanks again,
> Tom
>
>
>
> On Mon, 18 Oct 2010 20:31:38 -0400, Kenneth Heafield
> wrote:
>> Hi Moses,
>>
>> Introducing kenlm in Moses trunk. You no
owing an instance of 'lm::FormatLoadException'
>> what(): Expected blank line after 3-grams at byte 22348989 in file
>> arpa.en.lm
>> Aborted
>>
>> What am I missing?
>>
>> Thanks,
>> Tom
>>
>>
>> On Fri,
n-grams) and
> the error disappeared.
>
> It's pretty fast now. I look forward to testing the optimized code.
>
> Tom
>
>
>
> On Tue, 26 Oct 2010 10:18:17 -0400, Kenneth Heafield
> wrote:
>> I've fixed this in revision 3657 and tested that it wo
Revision 3671 introduces an updated version of kenlm. Queries are
faster now (no more string vocab lookups, state is kept so backoffs cost
less). The binary format has changed as a result; please rebuild your
binary files. Timing is forthcoming.
Kenneth
On 10/18/10 20:31, Kenneth Heafield
Hi Felipe,
Please run $recent_moses_build/kenlm/query langmodel.lm Hello all,
>
> My question is about SRILM and IRSTLM, it is not directly related to
> Moses, but I did not know where to ask.
>
> I am scoring individual sentences with a 5-gram language model and I get
> different sco
sys 0.00047
> rss 316656 kB
> Total time including destruction:
> user 18.0001
> sys 0.00051
> rss 1312 kB
>
> It seems that it is adding the end-of-sentence token, but not that of
> the begin of sentence.
>
> Score (-55.599) is different from SRILM (
50 minutes
> BLEU Score: 0.2514
>
>
>
>
> On Wed, 27 Oct 2010 14:15:39 -0400, Kenneth Heafield
> wrote:
>> Revision 3671 introduces an updated version of kenlm. Queries are
>> faster now (no more string vocab lookups, state is kep
That documentation was specific to kenlm's query tool. kenlm does the
same thing as SRI with respect to sentence boundary tokens. As to what
that is, I'm deferring to Edinburgh.
Kenneth
On 10/29/10 10:28, John Burger wrote:
> Kenneth Heafield wrote:
>
>> kenlm's q
esigned to score internal <s> and </s> tokens so you'll get
weird results if they're duplicated. . .
Kenneth
On 10/29/10 10:37, Kenneth Heafield wrote:
> That documentation was specific to kenlm's query tool. kenlm does the
> same thing as SRI with respect to sentence boundary tok
Dear Moses,
Can I interest you in an ARPA language model filter?
http://kheafield.com/code/mt/filter.html . It enforces phrase and
sentence-level constraints, not just vocabulary. You might have to
modify your perl scripts.
Kenneth
> On 19 October 2010 01:31, Kenneth Heafield wrote:
>
> Hi Moses
Try KenLM. Run ./configure (no argument), change your moses.ini so the
first digit is 8, and get the same results with less time, memory, and
compilation headache.
If you still want to use SRI with moses:
Is your machine actually 64-bit but SRI annoyingly decided to compile
32-bit? If so, modif
Interesting. Do you have the file
/home/deeps/mosesdecoder/kenlm/libkenlm.a? Does this happen after:
make clean
./regenerate-makefiles.sh
./configure --with-irstlm=/usr/local/lib
make
Try using single-threaded make so we can tell if this is a
parallelization issue.
What Linux distribution are
, , and (but your tokenizer might split these anyway).
On 11/24/10 11:16, Philipp Koehn wrote:
> Hi,
>
> this would probably be good to spell out in the documentation.
>
> The short answer is:
>
> * if you use the default setup, only the bar '|' is a special character
>
> * if you use XML input
You're missing PhraseDictionaryMyImpl::~PhraseDictionaryMyImpl() {} in
your cc file.
On 11/26/10 12:58, Fabienne Braune wrote:
> Hi,
>
> I have implemented a new type of phrase dictionary in order to write my
> own GetChartRuleCollection(...) method. I get the error-message
> "/.../mosesdecoder/
Ooh a big-endian user. Guess I'll have to write those routines. For
now you can comment out the offending #error but don't use kenlm's trie
implementation (the default probing hash table is fine). It looks like
the switchable endianness on PPC is a choice made by the operating
system and you're
reads, Linux does little-endian on Itanium, and
running Moses on my MIPS-based wireless router doesn't sound like a good
idea. A shame we threw out the PA-RISC machines.
Kenneth
On 11/27/10 10:15, Kenneth Heafield wrote:
> Ooh a big-endian user. Guess I'll have to write those routines.
Dear Moses,
Of SRI and IRST, the fastest is SRI's default. KenLM's trie
implementation uses 16% less CPU. The smallest [without quantization]
is IRST with lazy loading. KenLM's trie implementation uses 42% less
memory. Simultaneously. Full benchmarks at
http://kheafield.com/code/kenlm
make -j sets the number of processes to compile Moses. It impacts the
speed with which Moses compiles. It has no impact on the binary
produced and, therefore, no impact on training or decoding time. There
is no maximum but, as some files depend on others, only so many files
can be compiled simul
kenlm doesn't build ARPA files; you will need SRILM/IRSTLM to build one.
So for example,
1. Compile Moses. I put this before the install SRI step to emphasize
that Moses does not need to be linked to SRI.
2. Install SRILM
3. Run SRILM's ngram program to generate an ARPA file
4. Pass --lm 0:5:foo
Hey IRST, why are you generating positive log probabilities?
I'll have to fix the error message to print the number 4 instead of
ASCII value 4.
On 01/06/11 04:13, supp...@precisiontranslationtools.com wrote:
> I've been using IRSTLM's build-lm.sh to build an LM. Then converted from
> iARPA to ARP
That code is inside SRILM. You might get an answer by posting to
srilm-u...@speech.sri.com .
Or use kenlm. . .
On 01/06/11 08:02, John Morgan wrote:
> Hello,
>
> I'm trying to build systems with multiple LMs as features in the ems.
> I have 7 subcorpora, c1,c2,...c7.
> I use [LM:c1], [LM:c2]
I've just checked in revision 3796 which fixes this problem, including
the OnDiskWrapper issue for bonus kicks.
Tested with: ./configure, ./configure --without-kenlm, ./configure
--enable-shared, and ./configure --enable-shared --without-kenlm .
I would have just added -lkenlm to LIBS in configur
I've checked in an updated kenlm as revision 3847. This involves a
binary format change, so you'll need to rebuild from your ARPA files,
sorry.
- There's an important correctness fix. Some models contain n-grams
like "foo bar baz quux" without their n-grams e.g. "bar baz quux" and
"baz quux" bec
If you created a clean checkout with revision [3849,3859) then you might
have gotten an error from ./configure about not finding KEN-LM in
lm/model.hh . I've fixed this in revision 3859. Existing checkouts
updated to these revisions were fine.
Sorry,
Kenneth "can we please stop using autotools?
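For readers unfamiliar with the interval notation used above, [3849,3859) is half-open: it includes revision 3849 and excludes 3859. A minimal Python sketch of that membership test follows; `in_half_open` is a hypothetical helper written for illustration, not part of Moses or kenlm.

```python
def in_half_open(revision: int, start: int, end: int) -> bool:
    """True when start <= revision < end, i.e. revision lies in [start, end)."""
    return start <= revision < end


# Revision numbers taken from the message above.
print(in_half_open(3849, 3849, 3859))  # first affected revision
print(in_half_open(3858, 3849, 3859))  # last affected revision
print(in_half_open(3859, 3849, 3859))  # the fixing revision is outside the range
```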
The first error you report (body != 0) means malloc returned 0. That's
an out of memory condition (or a bug in SRI asking for 0 memory). Are
you compiling 32-bit or running with any other hard limit on RAM?
Don't know what your second error is.
Try kenlm. It uses less memory and has more i
***
>> [/home/staff/joerg/projects/UUMT/wmt11/data/training-monolingual/news.shuffled.low.de.kenlm]
>> Segmentation fault
>> make: *** Deleting file
>> `/home/staff/joerg/projects/UUMT/wmt11/data/training-monolingual/news.shuffled.low.de.kenlm'
>>
>> Thanks
Hi,
What revision of Moses are you using? Does this still happen after you
run svn up and recompile Moses?
Kenneth
On 02/07/11 10:53, Kārlis Goba wrote:
> Hi,
>
> My preferred way to build large LMs has been IRSTLM as it can handle large
> corpora nicely by splitting the task. The pro
Just to get the word out more, trie is broken before 3847 for common
pruning strategies as announced in "[Moses-support] kenlm updated in
3847". Admittedly the subject could have yelled more, but it's also
easy to miss posts.
On 02/07/11 11:11, Kārlis Goba wrote:
> Thanks, Kenneth,
>
> This was
What architecture are you on? 64-bit x86? I'm assuming you compiled
64-bit.
Could you send me either the ARPA or a tarball of the temporary building
directory snapshotted by hitting ctrl+c while the second progress bar
is running? I've sent you off-list instructions on how to transfer a file
t
Weird. It's already checked that contextFactor is non-empty. This
could be a bad or NULL Word * object or factor set incorrectly.
Are you using factors? What are your LM lines from moses.ini?
On 02/10/11 04:39, Christian Rishøj Jensen wrote:
>
> I am seeing a segmentation fault in KenLM this
Please update to revision 3877 or above. I've checked in a fix that's
probably it.
Sorry,
Kenneth
On 02/10/11 01:07, Kenneth Heafield wrote:
> What architecture are you on? 64-bit x86? I'm assuming you compiled
> 64-bit.
>
> Could you send me either the ARPA or
Does this work if you substitute IRST or SRI? I'm using essentially the
same calls they are to get vocab IDs here.
On 02/10/11 04:39, Christian Rishøj Jensen wrote:
>
> I am seeing a segmentation fault in KenLM this morning:
>
> reading bin ttable
> size of OFF_T 8
> binary phrasefile loaded, d
odels, sorry.
Kenneth
On 02/10/11 20:54, Kenneth Heafield wrote:
> Please update to revision 3877 or above. I've checked in a fix that's
> probably it.
>
> Sorry,
>
> Kenneth
>
> On 02/10/11 01:07, Kenneth Heafield wrote:
>> What architecture are you on?
I don't really know how to use EMS, so hopefully the mailing list can
answer this question.
Original Message
Subject: Build LM using IRSTLM
Date: Sun, 13 Feb 2011 22:05:13 +0330
From: amin farajian
To: mo...@kheafield.com
Hello Dear Heafield,
I'm trying to bui
, unsigned int*) const:
>> Assertion `(*contextFactor[count-1])[factorType] != __null' failed.
>>
>> I am not quite sure what is causing this.
>> Could it be related to the use of binarized phrase tables?
>>
>>
>>
>> On Feb 10, 2011, at 4:00 PM, Ken
Hiya Moses,
There are a fair number of exceptions thrown that are not intended to
be caught e.g. Sentence.cpp: 107:
if (!ProcessAndStripXMLTags(line, xmlOptionsList,
m_reorderingConstraint, xmlWalls )) {
const string msg("Unable to parse XML in line: " + line);
TRACE
owever, they note that "new projects" could
benefit. Also, I'm responsible for getting pointer container on the
list of approved Boost libraries.
>
>
> cheers
> Barry
>
>
> On Wednesday 23 Feb 2011 20:30:27 Kenneth Heafield wrote:
>> Hiya Moses,
>
On 02/23/11 17:02, Barry Haddow wrote:
>
>> There's a question of location: for my purposes this should be linked
>> into kenlm/build_binary, kenlm/query, moses-cmd/src/moses, etc. I see
>> the mert implementation and lmserver also throw exceptions, so it should
>> probably be linked in there as
I thought Ondrej Bojar had changed the regression tests to use KenLM but
perhaps this was only a partial change.
On 03/12/11 09:01, Alexander Fraser wrote:
> 2) The regression tests fail with no external LMs because of some
> problem. This is also not true, the regression tests require you to
> co
I think you'd be better off implementing your own
StatefulFeatureFunction, bypassing LanguageModel.{h,cpp} which mostly
handles n-grams crossing phrase boundaries, and calling the
LanguageModelImplementation as the backend. You'll probably want larger
beams too.
Kenneth
On 03/18/11 13:38, Dennis
I believe the right answer to this is adding an OOV count feature to
Moses. In fact, I've gone through and made all the language models
return a struct indicating if the word just scored was OOV. However,
this needs to make it into the phrases and ultimately the features.
Also, there's the fun of
ack to a very low floor.
> So it may be that Alex's desired feature is just a bug, which can
> be reproduced with kenlm by not training with "-unk", hence
> also falling back to the floor probability (if that is what kenlm
> is doing).
>
> -phi
>
> On Sat,
With a closed vocabulary LM, SRILM returns -inf on OOV and moses floors
this to LOWEST_SCORE which is -100.0. If you want identical behavior
from KenLM,
kenlm/build_binary -u -100.0 foo.arpa foo.binary
Unless you passed -vocab to SRILM (and most people don't), <unk> never
appears except as a unigram.
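The flooring described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: `floor_score` is a hypothetical helper, not a Moses or KenLM function, and the only value taken from the message is LOWEST_SCORE = -100.0.

```python
import math

# Value quoted in the message above: Moses floors scores to this.
LOWEST_SCORE = -100.0


def floor_score(log_prob: float) -> float:
    """Hypothetical helper: clamp a log probability to LOWEST_SCORE."""
    return max(log_prob, LOWEST_SCORE)


# A closed-vocabulary model that returns -inf for an OOV word ends up at the floor.
oov = floor_score(-math.inf)
# An ordinary in-vocabulary score passes through unchanged.
known = floor_score(-2.5)
print(oov, known)
```

Under this reading, passing -u -100.0 to build_binary would give the unknown word the same score directly, so no clamping is needed at query time.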
atever; this is what I thought the error message
> was referring to. Yes, that is what is causing the problem.
>
> Cheers, Alex
>
>
> On Sat, Mar 19, 2011 at 6:25 PM, Kenneth Heafield wrote:
>> The original behavior was to refuse to load any model without <unk>.
>>
> get a further improvement.
>
> Cheers, Alex
>
>
> On Sat, Mar 19, 2011 at 7:18 PM, Kenneth Heafield wrote:
>> With a closed vocabulary LM, SRILM returns -inf on OOV and moses floors
>> this to LOWEST_SCORE which is -100.0. If you want identical behavior
>>
op-unknown).
> All translations will have them.
>
> Otherwise, all words in the translation model should be known.
>
> So, what is the choice here?
>
> -phi
>
> On Sat, Mar 19, 2011 at 7:19 PM, Kenneth Heafield wrote:
>> I believe -vocab takes a file containing the
many cases. I believe this behavior is better than the
situation with SRI where no backoff penalty is charged, and therefore
you may encounter different results when using KenLM on any language
model without <unk>.
Kenneth
On 03/21/11 09:56, Kenneth Heafield wrote:
> So, assuming the parallel data is
Many distributions randomize shared library addresses each time you run
an executable in order to make buffer overflow attacks harder. There's
plenty of things that will make addresses returned by malloc/mmap vary
without threading.
Kenneth
On 03/24/11 10:15, Lane Schwartz wrote:
> On Thu, Mar 2
I haven't tested kenlm on Cygwin, but it could work. Can you run tests?
1) Install Boost. Cygwin's package manager should provide it.
2) Run kenlm tests.
wget http://kheafield.com/code/kenlm.tar.gz
tar xzf kenlm.tar.gz
cd kenlm
./test.sh
On 03/25/11 06:44, Sudip Datta wrote:
> I've used gcc i
I've had this happen too when running benchmarks. The latest IRSTLM is
actually 5.60.01: http://hlt.fbk.eu/en/irstlm and appears to resolve
your issue. The sourceforge page is out of date.
#include
On 03/30/11 10:10, Arda Tezcan wrote:
> Hi Everyone,
> After working with SRILM for a while, I j
+0100, Nicola Bertoldi
>> wrote:
>>> Indeed this should not happen
>>>
>>> Tom, could you please upload the following data in our ftp area?
>>>
>>> - textual training data (if possible)
>>> - LM in iARPA format
>>> - LM in binary f
Barry is correct.
Also kenlm doesn't care what the third field is. I just read it from
the ARPA file. Using a model with lower order than it was trained for
is incorrect under most smoothing methods.
On 04/30/11 16:39, Barry Haddow wrote:
> Hi Alexandre
>
> The format of the language model spe
Hi,
There's http://statmt.org/wmt09/scripts.tgz but these are only for
select European languages.
The post you refer to suggests MADA+TOKAN for Arabic:
http://www1.ccls.columbia.edu/~cadim/MADA.html .
Kenneth
On 05/14/11 11:03, ahmed sabry rizk wrote:
> Hi,
> I am trying to toke
> Tom
>
>
>
>> -Original Message-
>> *From*: Kenneth Heafield
>> *To*: moses-support@mit.edu
>> *Subject*: Re: [Moses-support] KenLM build_binary exception
Hmmm. . . looks like it's crashing on lm/lm_exception.cc and
lm/config.cc which are mine. But the compiler should throw you an error
instead of taking infinite memory. See if I can reproduce.
On 05/19/11 22:58, supp...@precisiontranslationtools.com wrote:
> I'm updating to the newest moses trun
ebuild.
>
> @Others: regarding configure's WARNING: unrecognized options:
> --with-boost-thread, is this option still required?
>
> Tom
>
>
>
> On Thu, 19 May 2011 23:05:53 -0400, Kenneth Heafield
> wrote:
>> Hmmm. . . looks like it's crashing on lm
e because ltmain.sh won't be
in the repository.
On 05/19/11 23:47, Tom Hoar wrote:
> I'm glad you can replicate the problem. Easier to fix that way.
>
> On Thu, 19 May 2011 23:42:55 -0400, Kenneth Heafield
> wrote:
>> Apparently this is a libtool issue, not one wit
head config.log
Not aware of e.g. a runtime query.
On 05/23/11 12:48, Barry Haddow wrote:
> On Monday 23 May 2011 17:39, Tom Hoar wrote:
>> Is there a way to query the moses binary to report what configure
>> options were used? i.e. such as which --with-[xxxlm]=
>
>
> No.
>
> Do you want to kn
Edit kenlm/lm/max_order.hh and recompile.
The reason is to minimize the size of the State object held by each
hypothesis while avoiding dynamic memory allocation.
On 05/23/11 15:39, Tom Hoar wrote:
> I use KenLM's build_binary for language models. There are no problems
> order values up to 6 gram
irements/allocation for the State object? I.e. if I
> compile with kMaxOrder = 12, and use Kenlm for a model with order = 6,
> is more memory required/allocated and if so, how much? Or, does the
> additional allocation only occur when the model has a higher order?
>
> Tom
>
>
Moses outputs translations to stdout and advisory messages to stderr.
This is the correct behavior.
I think you're referring to Java's rudimentary process IO handling.
http://stackoverflow.com/questions/60302/starting-a-process-with-inherited-stdin-stdout-stderr-in-java-6
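As an illustration of keeping the two streams apart when launching a process from another program, here is a minimal Python sketch. The child process is a stand-in written for this example, not Moses, and the strings it prints are made up; the point is only that stdout and stderr arrive on separate pipes and both are drained.

```python
import subprocess
import sys

# Stand-in child process (NOT Moses): writes one line to each stream.
child_code = (
    "import sys\n"
    "print('translated sentence')\n"
    "print('advisory message', file=sys.stderr)\n"
)

result = subprocess.run(
    [sys.executable, "-c", child_code],
    capture_output=True,  # collect stdout and stderr on separate pipes
    text=True,            # decode bytes to str
    check=True,           # raise if the child exits with a non-zero status
)

translations = result.stdout.splitlines()
diagnostics = result.stderr.splitlines()
print(translations)
print(diagnostics)
```

A wrapper that reads only one of the two pipes can stall once the other pipe's buffer fills, which is the usual symptom behind questions like the one linked above.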
On 06/03/11 05:50, nakul
Try using MGIZA: http://geek.kyloo.net/software/doku.php/mgiza:overview
On 06/15/11 04:51, Prasanth K wrote:
> Hello All,
>
> I am conducting a series of experiments to build translation systems
> using Moses in which the corpus of the current experiment is a subset of
> the corpora used in the p
Hi,
KenLM was accepted to WMT 2011 as a research paper :-). That means my
July 1 camera-ready deadline is the same as many WMT participants, some
of whom have asked me how to cite. To resolve this race condition,
here's a BibTeX:
@InProceedings{kenlm,
author = {Kenneth Hea
I don't change the binary file format without updating the version
number so old versions won't load. The recent versions shouldn't impact
that.
Sounds like a case for gdb.
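The idea of a version number that makes old files refuse to load can be sketched as follows. This is not kenlm's actual binary layout: the magic bytes, field sizes, and version values are all hypothetical and chosen only to show the check.

```python
import struct

# Hypothetical layout: 4-byte magic, then a little-endian uint32 version.
MAGIC = b"DEMO"
CURRENT_VERSION = 2
HEADER = struct.Struct("<4sI")


def write_header(version: int = CURRENT_VERSION) -> bytes:
    """Serialize a header for a file of the given format version."""
    return HEADER.pack(MAGIC, version)


def check_header(blob: bytes) -> None:
    """Raise ValueError unless blob starts with a header of the current version."""
    magic, version = HEADER.unpack_from(blob)
    if magic != MAGIC:
        raise ValueError("not a model file")
    if version != CURRENT_VERSION:
        raise ValueError(
            f"file is version {version}, expected {CURRENT_VERSION}; rebuild it"
        )


check_header(write_header())       # a current file loads
try:
    check_header(write_header(1))  # a file from an older format is refused
except ValueError as err:
    print(err)
```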
On 06/26/11 08:22, Hieu Hoang wrote:
> i believe there's been changes to the binary phrase table (to the
> support word ali
kenlm now supports quantization. To use it, svn up then run
build_binary with -q:
kenlm/build_binary -q 8 trie foo.arpa foo.out
for 8 bits. You can choose from 2 to 25 bits, inclusive. Currently,
probability and backoff are quantized separately (in this case using 8
bits each). By default, -q
Folks,
Don't use revisions [4037,4040) with the trie model. I accidentally
changed the file format and you'll get segfaults on existing binary
files. Also, the binary files it builds are corrupt. This doesn't solve
Tom Hoar's problem because his segfault came before revision 4037.
Kenne
ns to the file format compatible with rev's
> 4036 and before?
>
> Tom
>
>
>
> On Mon, 27 Jun 2011 17:29:58 -0400, Kenneth Heafield
> wrote:
>> Folks,
>>
>> Don't use revisions [4037,4040) with the trie model. I accidentally
>> chan
Since we're playing optimize Moses memory usage, what's your language
model?
On 06/29/11 14:32, Dennis Mehay wrote:
> Hi Phil,
>
> Thanks for the tips. I already tried reducing the max span for the
> re-ordering grammar (to 35, which is ~5 words more than the average span
> of the training sente
http://kheafield.com/code/scoring.tar.gz
On 07/12/11 11:56, Lane Schwartz wrote:
> Does anyone have a good script for taking plain-text versions of
> source, reference, and hypothesis files and wrapping them in XML for
> use by metric tools like TERp and the NIST scripts that require XML?
>
> I'm
ough the Moses abstraction layers to
>>> retrieve a pointer to a lm::Model from kenlm, but the
>>> Moses::LanguageModelKen header is not part of the public headers of
>>> Moses ; that's why I tried to use only Moses interface.
>>>
>>> (I did I did not m
> running the decoder, I wanted to use the already loaded LM.
> > >
> > > I first tried to dig my way through the Moses abstraction layers to
> > > retrieve a pointer to a lm::Model from kenlm, but the
> > > Moses::LanguageModelKen header is not
On 07/13/11 15:53, Philipp Koehn wrote:
> Hi,
>
> But you're asking for a third piece of information. If you query for
> "foo bar baz" and I can tell you that it will never extend to "* foo bar
> baz" for any word * (due to pruning or filtering), then you need only
> remember "foo
Hi Moses,
If trie uses too much memory, svn up to revision >= 4074 then pass "-a
#bits" to build_binary. It will minimize memory usage subject to the
maximum number of bits you specify (so e.g. pass bits 40 to minimize
memory usage). Compressing in this manner is lossless, but takes
addi
I use the following:
errno, strerror_r, open, close, mmap, munmap, ftruncate, fstat (for file
size), lseek, read, and write
Apparently the Windows equivalent to mmap is CreateFileMapping. If
there's a windows user out there who wants native calls and is willing
to help #ifdef, contact me. I pro
> If any of the IRSTLM/KenLM/$foo-LM –using folks on here have
> instructions or experience with compiling their particular tool under
> Cygwin, lemme know, and I’ll either include it or point to it. I
> guarantee dozens of extra downloads!
Sure, here's how you compile KenLM and link it into Moses
ications in every other wrapper ?"
>>
>> How do you, Moses developers, feel about this ?
>> Is it acceptable / outrageously stupid if I set the value to -1 in the other
>> wrappers,
>> maybe with a TODO, and properly document it in the super class ?
>>
tenance cost for us (me and the peop... Well, you know).
>
> ----- Original Message -
>> From: "Hieu Hoang"
>> To: "Kenneth Heafield"
>> Cc: moses-support@mit.edu
>> Sent: Friday 22 July 2011 04:50:14
>> Subject: Re: [Moses-support] Using Moses langu
Hi,
Which ASCII character sequence represents newline in your file? Try
converting to UNIX newlines. Also can you send me the output of
zcat /home/moses/languagemodels/model.es.lm.gz |head -n 10 |gzip >send.gz
(I'm asking you to rezip so that your mail client doesn't convert the
enter
FYI we resolved the problem off-list. KenLM does not load IRST's iARPA
format. You must first run IRST's compile-lm to generate an ARPA. I
might add an error message specific to this case.
On 07/27/11 09:27, Lee Ball (Applied Language) wrote:
> Hi guys,
>
> I just tried using KenLM out of inte
Hi,
There was a problem with this; thought it was fixed but maybe it came
back. Which revision are you running? Does it still happen if you run
single-threaded?
Kenneth
On 07/29/11 09:39, Alex Fraser wrote:
> Hi Folks,
>
> Tom Hoar previously mentioned that he had a problem with KenLM
Sorry I am slow to respond. This is my first thing to look at, but I am
traveling a lot through the 14th.
Alex Fraser wrote:
Hi Kenneth --
Latest revision, 4096. Single threaded also crashes.
Cheers, Alex
On Fri, Jul 29, 2011 at 6:00 PM, Kenneth Heafield wrote:
> Hi,
>
>
rebuilding with build_binary that ships
with Moses.
- Ran threaded and not threaded.
Can you send me your very small SRILM model? Does it have <unk>?
Kenneth
On 08/04/11 11:42, Kenneth Heafield wrote:
> Sorry I am slow to respond. This is my first thing to look at, but I
> am traveling a lot
Ok I have reproduced the problem. It only happens when the ARPA file is
missing <unk> and is probably an off-by-one on vocabulary size. I'll
have a fix soon.
Kenneth
On 08/15/11 19:20, Kenneth Heafield wrote:
> Hi,
>
> Back from vacation and sorry but I'm having trouble
sed on the counts
given in the ARPA file. When <unk> is missing from the ARPA file, I now
pad the vocabulary to the size it expects for the corrected count.
Sorry it took so long!
Kenneth
On 08/15/11 22:12, Kenneth Heafield wrote:
> Ok I have reproduced the problem. It only happens when the A
Do you have in your input or phrase table target side?
On 08/18/11 15:04, Sriram venkatapathy wrote:
>
> Hello,
>
> For a particular translation experiment, I get the following error in
> Moses decoder, and then the decoder aborts.
>
> moses: LanguageModel.cpp:115: void
> Moses::LanguageModel::C
be finally merged into the trunk ?
> (not the useless changes to PhraseDictionaryTree)
>
> Thanks, (And sorry for my low reactivity, I hope you remember me!)
>
> Marc
>
> - Original Message -
>> From: "Hieu Hoang"
>> To: "Marc LEGENDRE"
>
Valgrind ; but hey, don't we all strive for perfection
> ? :-)
>
> I don't need this, I guess I should have removed it from my branch if I
> wanted to merge.
> It's done.
>
> - Original Message -
>> From: "Kenneth Heafield"
>> To:
You're in trunk as of 4160.
On 08/24/11 11:33, Marc LEGENDRE wrote:
> Absolutely no problem about the name thing, thank you for asking.
>
> Marc
>
> - Original Message -----
>> From: "Kenneth Heafield"
>> To: moses-support@mit.edu
>> Sent: Mer
Or just run kenlm/build_binary lm.arpa and it will spit out a memory
usage estimate (covering the LM only).
On 08/26/11 09:24, Hieu Hoang wrote:
> barry's right.
>
> Binarize the phrase table and the LM with irstlm or kenlm. Then just
> look at the file sizes & add a few 100mb and that's your me
Hi,
Edit your moses.ini and find [lmodel-file]. Change the first number to
8.
[lmodel-file]
8 0 5 /path/to/model.arpa
Or you can try to link against SRI, use more memory, and take longer. . .
Kenneth
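For anyone scripting the moses.ini change described above, here is a minimal Python sketch. It assumes each [lmodel-file] entry is a whitespace-separated line whose first field selects the language model implementation, as in the example line above; `use_kenlm` and the sample configuration text are made up for illustration.

```python
def use_kenlm(ini_text: str, kenlm_type: str = "8") -> str:
    """Set the first field of every [lmodel-file] entry to kenlm_type."""
    out = []
    in_lm_section = False
    for line in ini_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("["):
            # Section headers switch the rewrite on or off.
            in_lm_section = stripped == "[lmodel-file]"
        elif in_lm_section and stripped and not stripped.startswith("#"):
            # Replace only the first field; keep the rest of the entry as-is.
            fields = stripped.split(None, 1)
            fields[0] = kenlm_type
            line = " ".join(fields)
        out.append(line)
    return "\n".join(out)


# Made-up configuration fragment for illustration.
example = """[lmodel-file]
0 0 5 /path/to/model.arpa

[ttable-limit]
20"""
print(use_kenlm(example))
```

Lines outside [lmodel-file] are left untouched, so the function can be run over a whole configuration file.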
On 09/06/11 19:43, Cyrine NASRI wrote:
> Hi, thank you for your reply
> I built a 5gr
So what exactly is the issue? Progress can be monitored with stdout.
If stderr is queued, then you won't get sub-sentential progress anyway.
I'd rather stderr tell me what it's doing so if/when there's a segfault,
I have a place to start.
Kenneth
On 09/14/11 13:32, Phil Williams wrote:
> Yes
I took a look at the existing FactorCollection code and it made me cry,
so I rewrote it for revision 4242 including a better locking strategy.
On 09/20/11 12:10, Marcin Junczys-Dowmunt wrote:
> Hi Barry,
> very high lock contention. Deadlock is the wrong word. With 48 threads
> 'top' shows me ro
Dear Moses,
Trunk revision 4247 incorporates KenLM changes from MT Marathon
(team: Hieu Hoang, Tetsuo Kiso, Marcello Federico, and myself) to
minimize left language model state for chart decoding. This resulted in
a binary file format change.
Previously, if you used e.g. a 5-gram langua
My fault. Sorry. Fixed.
On 09/22/11 09:41, Hieu Hoang wrote:
> hiya
>
> There's currently a compile error in trunk when multi-threading is
> enabled. However, I think the root cause of the problem is that
> there's currently too many compile flags so developers can't test the
> different combin
-threads 1 ?
On 09/22/11 10:06, Tom Hoar wrote:
>
> Re: the survey. I suggest if multi-threading is always enabled, there
> should be a command-line option that allows users to disable
> multi-threading for debugging.
>
> Tom
>
>
>
> On Thu, 22 Sep 2011 09
ally want something like
>
> --threads 0
>
> which should bypass everything and truly run in single threaded mode
>
> Miles
>
> On 22 September 2011 10:26, Kenneth Heafield wrote:
>> -threads 1 ?
>>
>> On 09/22/11 10:06, Tom Hoar wrote:
>>
>> Re: th
But I don't see a use case for it. I can run gdb just fine on a
multithreaded program that happens to be running one thread. And the
stderr output will be in order.
On 09/22/11 11:21, Miles Osborne wrote:
> should someone want to debug with no threading, then there would need
> to be a mess of
>>>> Hi
>>>>
>>>> Here's my thoughts:
>>>>
>>>> - there should be single and multi-thread compile paths so single-thread
>>>> users don't pay the lock penalty. Maybe a -threads 0 works, but then you
>>>> have to check