Hi
The backtrace would be more informative if you ran a debug build
(add variant=debug to the bjam command). Sometimes this makes the bug
go away, or makes new bugs appear, but if not, it will give more
information. You can run with core files enabled (ulimit -c unlimited)
to save having to run Moses inside gdb.
If the bug is random, but not thread related, then it could well be
memory corruption. Running Moses in valgrind can help track this down
(again, using a debug build is better). Note that the suffix arrays
crash valgrind (last time I checked), so don't build them in.
Cheers - Barry
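The core-file and Valgrind suggestions above can be scripted. What follows is only a rough Python sketch, not part of Moses: the helper names and the example command line are made up for illustration, and Python is used simply as a convenient wrapper.

```python
import resource
import subprocess


def enable_core_dumps():
    """Raise the soft core-file limit to the hard limit.

    Roughly what `ulimit -c unlimited` does, except that it never asks
    for more than the hard limit allows, so it cannot fail on that account.
    """
    _soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))


def run_with_core(cmd):
    """Run cmd with core dumps enabled in the child; return its exit status.

    subprocess reports death by signal as a negative number,
    e.g. -11 when the child was killed by SIGSEGV.
    """
    proc = subprocess.run(cmd, preexec_fn=enable_core_dumps)
    return proc.returncode


def valgrind_command(cmd):
    """Prefix cmd so that it runs under Valgrind (memcheck is the default tool)."""
    return ["valgrind"] + list(cmd)


# Hypothetical usage; the binary path and config name are assumptions:
#   status = run_with_core(["bin/moses", "-f", "moses.ini"])
#   status = run_with_core(valgrind_command(["bin/moses", "-f", "moses.ini"]))
```

If the run dies and a core file is written, it can be opened afterwards with gdb by passing the binary and the core file, which gives the same backtrace as a live gdb session.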
On 13/04/16 11:25, Ales Tamchyna wrote:
Hi,
Let me add some more information to this: when running Moses in gdb, I get the
following backtrace:
#0 0x00000000006e3ba4 in Moses::PhraseDecoder::CreateTargetPhraseCollection(Moses::Phrase const&, bool, bool) ()
#1 0x00000000005cd2a7 in Moses::PhraseDictionaryCompact::GetTargetPhraseCollectionNonCacheLEGACY(Moses::Phrase const&) const ()
#2 0x000000000048efe4 in Moses::PhraseDictionary::GetTargetPhraseCollectionLEGACY(Moses::Phrase const&) const ()
#3 0x000000000048e6a0 in Moses::PhraseDictionary::GetTargetPhraseCollectionBatch(std::vector<Moses::InputPath*, std::allocator<Moses::InputPath*> > const&) const ()
#4 0x0000000000560948 in Moses::TranslationOptionCollection::GetTargetPhraseCollectionBatch() ()
#5 0x0000000000551a39 in Moses::TranslationOptionCollectionText::CreateTranslationOptions() ()
#6 0x00000000004bddfc in Moses::Manager::Decode() ()
#7 0x0000000000433bd4 in Moses::TranslationTask::Run() ()
#8 0x0000000000496088 in Moses::ThreadPool::Execute() ()
#9 0x00000000007cbdba in thread_proxy ()
#10 0x00007fffc210c182 in start_thread (arg=0x7ffc23a5d700) at pthread_create.c:312
#11 0x00007fffc1e3947d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
This suggests the problem is somewhere in loading phrase translations from the
compact phrase table.
I’m not sure why the LEGACY functions are called but I’m assuming that these
are “future” legacy methods and that they are in fact still used by phrase
dictionary implementations (?).
Best,
Ales
From: Ondrej Bojar
Sent: Wednesday, 13 April 2016 12:19
To:[email protected]
Cc: Roman Sudarikov; Ales Tamchyna
Subject: Random segfaults with alternative decoding paths
Hi,
we're experiencing random segfaults when we use two phrase tables in
alternative decoding paths. The exact commit of moses we use is
6a06e7776a58b09e4ed5b1cf11eb64fbdd6b02a2, from April 1.
We do have test runs on the exact same 200 input sentences, exact same
moses.ini, on the very same machine, where one of the runs succeeds and the
other dies after 45 sentences.
Would anyone have any idea what we should be chasing?
- it doesn't seem to be thread-related (segfault experienced with -threads 1 as
well as -threads 8)
- not related to nbest-list construction (we first had this problem in mert
tuning so we isolated this)
- not related to more LMs (we first had several LMs in the setup, we get the
crash with just one as well)
- not related to -search, the bug is there with -search set to 0, 1 or 4
- seems related to data or data size: when we trained the first ttable on just
a very small corpus, we did not get the segfault (yet)
- not related to translation options caching, the bug is there even with
-no-cache
- not related to the specification of output-factors; left unspecified or set to
0<CR>1, the bug is there
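Since the crash is intermittent on identical input, a small harness that repeats the same run and records how each one ended can make the comparisons in the list above more systematic. This is only a Python sketch under assumptions: the function name, the binary path and the input file name are hypothetical, and only the -threads values come from the report above.

```python
import signal
import subprocess


def repeat_runs(cmd, runs, stdin_path=None):
    """Run cmd `runs` times on the same input.

    Returns a list of (run_index, returncode, signal_name) tuples;
    signal_name is None unless the process was killed by a signal.
    """
    results = []
    for i in range(1, runs + 1):
        # Reopen the input for every run so each one starts from the top.
        stdin = open(stdin_path, "rb") if stdin_path else subprocess.DEVNULL
        try:
            proc = subprocess.run(cmd, stdin=stdin,
                                  stdout=subprocess.DEVNULL,
                                  stderr=subprocess.DEVNULL)
        finally:
            if stdin_path:
                stdin.close()
        sig = None
        if proc.returncode < 0:
            # Negative return codes mean "killed by signal -returncode".
            sig = signal.Signals(-proc.returncode).name
        results.append((i, proc.returncode, sig))
    return results


# Hypothetical usage; paths and file names are assumptions:
#   base = ["bin/moses", "-f", "moses.ini"]
#   for threads in ("1", "8"):
#       print(threads, repeat_runs(base + ["-threads", threads], 10, "input.txt"))
```

Counting how many of, say, ten runs end in SIGSEGV per configuration gives a crash rate to compare, rather than a single pass or fail that may be down to chance.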
Here is the moses.ini:
[input-factors]
0
[mapping]
0 T 0
1 T 1
[distortion-limit]
6
[feature]
Distortion
KENLM lazyken=0 name=LM0 factor=0 path=lm.1.trie.lm order=4
PhraseDictionaryCompact name=TranslationModel0 num-features=4 path=phrase-table.0-0,1.1.1 input-factor=0 output-factor=0,1 table-limit=100
PhraseDictionaryCompact name=TranslationModel1 num-features=4 path=phrase-table.0-0,1.2.1 input-factor=0 output-factor=0,1 table-limit=100
PhrasePenalty
UnknownWordPenalty
WordPenalty
[weight]
Distortion0= 0.3
LM0= 0.5
PhrasePenalty0= 0.2
TranslationModel0= 0.2 0.2 0.2 0.2
TranslationModel1= 0.2 0.2 0.2 0.2
UnknownWordPenalty0= 1
WordPenalty0= -1
The large setup that shows these crashes uses these large files:
-rw-r--r-- 1 bojar ufal 584M Apr 13 09:19 lm.1.trie.lm
-rw-r--r-- 1 bojar ufal 1.1G Apr 13 09:24 phrase-table.0-0,1.1.1.minphr
-rw-r--r-- 1 bojar ufal 5.7M Apr 13 09:24 phrase-table.0-0,1.2.1.minphr
Thanks,
Ondrej.
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.