I'm using the NPLM fork in the Moses git repository: 
https://github.com/moses-smt/nplm. I ran the trainNeuralNetwork binary 
two or three times on the same data successfully. While validating 
results across several runs, I hit the stack backtrace error below from 
the trainNeuralNetwork binary, with zero changes to the command-line 
configuration. The run successfully completed the configured 10 epochs 
and then crashed with the following logs.

Have I configured something wrong?
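In case it helps with debugging: the frame lines in the glibc backtrace below have the form module(symbol+offset)[address]. The two in-binary return addresses (0x40df88 and 0x410949) could be resolved to source locations with addr2line if the binary carries debug symbols. A small illustrative parser for that frame format (my own sketch, not part of NPLM or Moses):

```python
import re

# glibc backtrace frames look like: module(symbol+0xoffset)[0xaddress]
# The (symbol+offset) part is absent when the frame has no exported symbol.
FRAME_RE = re.compile(
    r"^(?P<module>\S+?)(?:\((?P<sym>[^)]*)\))?\[(?P<addr>0x[0-9a-fA-F]+)\]$"
)

def parse_frame(line):
    """Return (module, symbol_or_None, return_address) for one backtrace line."""
    m = FRAME_RE.match(line.strip())
    if m is None:
        return None
    return m.group("module"), m.group("sym") or None, m.group("addr")

# Frames copied from the log below:
frames = [
    "/lib/x86_64-linux-gnu/libc.so.6(cfree+0x1a8)[0x7fd06d5a4698]",
    "/home/tahoar/slatetoolkit/src/trainNeuralNetwork[0x40df88]",
]
for module, sym, addr in map(parse_frame, frames):
    print(module, sym, addr)
```

The extracted in-binary addresses could then be passed to, e.g., `addr2line -f -C -e /home/tahoar/slatetoolkit/src/trainNeuralNetwork 0x40df88`, assuming a build with `-g`.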


train_nplm.py's log file:

Command line:
/home/tahoar/slatetoolkit/src/trainNeuralNetwork --train_file 
/home/tahoar/slatedesktop/var/TRAININGS/smt-blm-en-nl-test_ep15k/bitext.numberized
 
--num_epochs 10 --model_prefix 
/home/tahoar/slatedesktop/var/TRAININGS/smt-blm-en-nl-test_ep15k/en-nl-test_ep15k.model.nplm
 
--learning_rate 1 --minibatch_size 1000 --num_noise_samples 100 
--num_hidden 0 --input_embedding_dimension 150 
--output_embedding_dimension 750 --num_threads 7 --activation_function 
rectifier --ngram_size 14 --input_words_file 
/home/tahoar/slatedesktop/var/TRAININGS/smt-blm-en-nl-test_ep15k/vocab.source 
--output_words_file 
/home/tahoar/slatedesktop/var/TRAININGS/smt-blm-en-nl-test_ep15k/vocab.target
(required)  Training data (one numberized example per line). Value: 
/home/tahoar/slatedesktop/var/TRAININGS/smt-blm-en-nl-test_ep15k/bitext.numberized
Validation data (one numberized example per line). Value:
Vocabulary. Value: 
/home/tahoar/slatedesktop/var/TRAININGS/smt-blm-en-nl-test_ep15k/vocab.source
Vocabulary. Value: 
/home/tahoar/slatedesktop/var/TRAININGS/smt-blm-en-nl-test_ep15k/vocab.target
Prefix for output model files. Value: 
/home/tahoar/slatedesktop/var/TRAININGS/smt-blm-en-nl-test_ep15k/en-nl-test_ep15k.model.nplm
Size of n-grams. Default: auto. Value: 14
Vocabulary size. Default: auto. Value: 0
Vocabulary size. Default: auto. Value: 0
Use memory mapped files. This is useful if the entire data cannot fit in 
memory. prepareNeuralLM can generate memory mapped files Value: 0
Number of input embedding dimensions. Default: 50. Value: 150
Number of output embedding dimensions. Default: 50. Value: 750
Share input and output embeddings. 1 = yes, 0 = no. Default: 0. Value: 0
Number of hidden nodes. Default: 100. Value: 0
Activation function (identity, rectifier, tanh, hardtanh). Default: 
rectifier. Value: rectifier
Loss function (log, nce). Default: nce. Value: nce
Initialize parameters from a normal distribution. 1 = normal, 0 = 
uniform. Default: 0. Value: 0
Maximum (of uniform) or standard deviation (of normal) for 
initialization. Default: 0.01 Value: 0.01
Number of epochs. Default: 10. Value: 10
Minibatch size (for training). Default: 1000. Value: 1000
Learning rate for stochastic gradient ascent. Default: 1. Value: 1
L2 regularization strength (hidden layer weights only). Default: 0. Value: 0
Number of noise samples for noise-contrastive estimation. Default: 100. 
Value: 100
Learn individual normalization factors during training. 1 = yes, 0 = no. 
Default: 0. Value: 0
Use momentum (hidden layer weights only). 1 = yes, 0 = no. Default: 0. 
Value: 0
Number of threads. Default: maximum. Value: 7
Using 7 threads
Reading data from regular text file
Reading minibatches from file 
/home/tahoar/slatedesktop/var/TRAININGS/smt-blm-en-nl-test_ep15k/bitext.numberized:
 
done.
Number of training instances: 231813
Randomly shuffling data...
Reading word list from: 
/home/tahoar/slatedesktop/var/TRAININGS/smt-blm-en-nl-test_ep15k/vocab.source
Reading word list from: 
/home/tahoar/slatedesktop/var/TRAININGS/smt-blm-en-nl-test_ep15k/vocab.target
Number of training minibatches: 232
Epoch 1
Current learning rate: 1
Training minibatches: done.
Training NCE log-likelihood: -1.43319e+06
Writing model
Epoch 2
Current learning rate: 1
Training minibatches: done.
Training NCE log-likelihood: -1.19785e+06
Writing model
Epoch 3
Current learning rate: 1
Training minibatches: done.
Training NCE log-likelihood: -1.11136e+06
Writing model
Epoch 4
Current learning rate: 1
Training minibatches: done.
Training NCE log-likelihood: -1.0542e+06
Writing model
Epoch 5
Current learning rate: 1
Training minibatches: done.
Training NCE log-likelihood: -1.01097e+06
Writing model
Epoch 6
Current learning rate: 1
Training minibatches: done.
Training NCE log-likelihood: -975214
Writing model
Epoch 7
Current learning rate: 1
Training minibatches: done.
Training NCE log-likelihood: -942636
Writing model
Epoch 8
Current learning rate: 1
Training minibatches: done.
Training NCE log-likelihood: -915340
Writing model
Epoch 9
Current learning rate: 1
Training minibatches: done.
Training NCE log-likelihood: -890005
Writing model
Epoch 10
Current learning rate: 1
Training minibatches: done.
Training NCE log-likelihood: -869231
Writing model
Training output:
Return code: -6



Terminal output:

python 
/home/tahoar/slatetoolkit/scripts/training/bilingual-lm/train_nplm.py 
--working-dir=/home/tahoar/slatedesktop/var/TRAININGS/smt-blm-en-nl-test_ep15k 
--corpus=/home/tahoar/slatedesktop/var/TRAININGS/smt-tm-en-nl-test_ep15k/bitext 
--nplm-home=/home/tahoar/slatetoolkit --ngram-size=14 --hidden=0 
--epochs=10 --output-embedding=750 --threads=$(($(nproc) -1)) 
--model-stem=en-nl-test_ep15k 
--model-dir=/home/tahoar/slatedesktop/var/TRAININGS/smt-blm-en-nl-test_ep15k 
--input-words-file=/home/tahoar/slatedesktop/var/TRAININGS/smt-blm-en-nl-test_ep15k/vocab.source
 
--output-words-file=/home/tahoar/slatedesktop/var/TRAININGS/smt-blm-en-nl-test_ep15k/vocab.target
Train model command:
/home/tahoar/slatetoolkit/src/trainNeuralNetwork --train_file 
/home/tahoar/slatedesktop/var/TRAININGS/smt-blm-en-nl-test_ep15k/bitext.numberized
 
--num_epochs 10 --model_prefix 
/home/tahoar/slatedesktop/var/TRAININGS/smt-blm-en-nl-test_ep15k/en-nl-test_ep15k.model.nplm
 
--learning_rate 1 --minibatch_size 1000 --num_noise_samples 100 
--num_hidden 0 --input_embedding_dimension 150 
--output_embedding_dimension 750 --num_threads 7 --activation_function 
rectifier --ngram_size 14 --input_words_file 
/home/tahoar/slatedesktop/var/TRAININGS/smt-blm-en-nl-test_ep15k/vocab.source 
--output_words_file 
/home/tahoar/slatedesktop/var/TRAININGS/smt-blm-en-nl-test_ep15k/vocab.target
*** Error in `/home/tahoar/slatetoolkit/src/trainNeuralNetwork': 
munmap_chunk(): invalid pointer: 0x00007fd06b4d0010 ***
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x777e5)[0x7fd06d5977e5]
/lib/x86_64-linux-gnu/libc.so.6(cfree+0x1a8)[0x7fd06d5a4698]
/home/tahoar/slatetoolkit/src/trainNeuralNetwork[0x40df88]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7fd06d540830]
/home/tahoar/slatetoolkit/src/trainNeuralNetwork[0x410949]
======= Memory map: ========
00400000-00473000 r-xp 00000000 00:00 428625 
/home/tahoar/slatetoolkit/src/trainNeuralNetwork
00672000-00673000 r--p 00072000 00:00 428625 
/home/tahoar/slatetoolkit/src/trainNeuralNetwork
00673000-00674000 rw-p 00073000 00:00 428625 
/home/tahoar/slatetoolkit/src/trainNeuralNetwork
01ac6000-02802000 rw-p 00000000 00:00 0 [heap]
7fd034000000-7fd034065000 rw-p 00000000 00:00 0
7fd034065000-7fd038000000 ---p 00000000 00:00 0
7fd03c000000-7fd03c065000 rw-p 00000000 00:00 0
7fd03c065000-7fd040000000 ---p 00000000 00:00 0
7fd044000000-7fd044065000 rw-p 00000000 00:00 0
7fd044065000-7fd048000000 ---p 00000000 00:00 0
7fd04c000000-7fd04c065000 rw-p 00000000 00:00 0
7fd04c065000-7fd050000000 ---p 00000000 00:00 0
7fd054000000-7fd054065000 rw-p 00000000 00:00 0
7fd054065000-7fd058000000 ---p 00000000 00:00 0
7fd05c000000-7fd05c071000 rw-p 00000000 00:00 0
7fd05c071000-7fd060000000 ---p 00000000 00:00 0
7fd0608b0000-7fd0608b1000 ---p 00000000 00:00 0
7fd0608b1000-7fd0610b1000 rw-p 00000000 00:00 0
7fd0610c0000-7fd0610c1000 ---p 00000000 00:00 0
7fd0610c1000-7fd0618c1000 rw-p 00000000 00:00 0
7fd0618d0000-7fd0618d1000 ---p 00000000 00:00 0
7fd0618d1000-7fd0620d1000 rw-p 00000000 00:00 0
7fd0620e0000-7fd0620e1000 ---p 00000000 00:00 0
7fd0620e1000-7fd0628e1000 rw-p 00000000 00:00 0
7fd0628f0000-7fd0628f1000 ---p 00000000 00:00 0
7fd0628f1000-7fd0630f1000 rw-p 00000000 00:00 0
7fd063100000-7fd063101000 ---p 00000000 00:00 0
7fd063101000-7fd063901000 rw-p 00000000 00:00 0
7fd06b4d0000-7fd06c4d1000 rw-p 00000000 00:00 0
7fd06d310000-7fd06d313000 r-xp 00000000 00:00 174181 
/lib/x86_64-linux-gnu/libdl-2.23.so
7fd06d313000-7fd06d314000 ---p 00003000 00:00 174181 
/lib/x86_64-linux-gnu/libdl-2.23.so
7fd06d314000-7fd06d512000 ---p 00004000 00:00 174181 
/lib/x86_64-linux-gnu/libdl-2.23.so
7fd06d512000-7fd06d513000 r--p 00002000 00:00 174181 
/lib/x86_64-linux-gnu/libdl-2.23.so
7fd06d513000-7fd06d514000 rw-p 00003000 00:00 174181 
/lib/x86_64-linux-gnu/libdl-2.23.so
7fd06d520000-7fd06d6e0000 r-xp 00000000 00:00 174162 
/lib/x86_64-linux-gnu/libc-2.23.so
7fd06d6e0000-7fd06d6e9000 ---p 001c0000 00:00 174162 
/lib/x86_64-linux-gnu/libc-2.23.so
7fd06d6e9000-7fd06d8e0000 ---p 001c9000 00:00 174162 
/lib/x86_64-linux-gnu/libc-2.23.so
7fd06d8e0000-7fd06d8e4000 r--p 001c0000 00:00 174162 
/lib/x86_64-linux-gnu/libc-2.23.so
7fd06d8e4000-7fd06d8e6000 rw-p 001c4000 00:00 174162 
/lib/x86_64-linux-gnu/libc-2.23.so
7fd06d8e6000-7fd06d8ea000 rw-p 00000000 00:00 0
7fd06d8f0000-7fd06d908000 r-xp 00000000 00:00 174160 
/lib/x86_64-linux-gnu/libpthread-2.23.so
7fd06d908000-7fd06d912000 ---p 00018000 00:00 174160 
/lib/x86_64-linux-gnu/libpthread-2.23.so
7fd06d912000-7fd06db07000 ---p 00022000 00:00 174160 
/lib/x86_64-linux-gnu/libpthread-2.23.so
7fd06db07000-7fd06db08000 r--p 00017000 00:00 174160 
/lib/x86_64-linux-gnu/libpthread-2.23.so
7fd06db08000-7fd06db09000 rw-p 00018000 00:00 174160 
/lib/x86_64-linux-gnu/libpthread-2.23.so
7fd06db09000-7fd06db0d000 rw-p 00000000 00:00 0
7fd06db10000-7fd06db26000 r-xp 00000000 00:00 116184 
/lib/x86_64-linux-gnu/libgcc_s.so.1
7fd06db26000-7fd06dd25000 ---p 00016000 00:00 116184 
/lib/x86_64-linux-gnu/libgcc_s.so.1
7fd06dd25000-7fd06dd26000 rw-p 00015000 00:00 116184 
/lib/x86_64-linux-gnu/libgcc_s.so.1
7fd06dd30000-7fd06dd51000 r-xp 00000000 00:00 213006 
/usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0
7fd06dd51000-7fd06dd52000 ---p 00021000 00:00 213006 
/usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0
7fd06dd52000-7fd06df50000 ---p 00022000 00:00 213006 
/usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0
7fd06df50000-7fd06df51000 r--p 00020000 00:00 213006 
/usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0
7fd06df51000-7fd06df52000 rw-p 00021000 00:00 213006 
/usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0
7fd06df60000-7fd06e068000 r-xp 00000000 00:00 174151 
/lib/x86_64-linux-gnu/libm-2.23.so
7fd06e068000-7fd06e06a000 ---p 00108000 00:00 174151 
/lib/x86_64-linux-gnu/libm-2.23.so
7fd06e06a000-7fd06e267000 ---p 0010a000 00:00 174151 
/lib/x86_64-linux-gnu/libm-2.23.so
7fd06e267000-7fd06e268000 r--p 00107000 00:00 174151 
/lib/x86_64-linux-gnu/libm-2.23.so
7fd06e268000-7fd06e269000 rw-p 00108000 00:00 174151 
/lib/x86_64-linux-gnu/libm-2.23.so
7fd06e270000-7fd06e3e2000 r-xp 00000000 00:00 179192 
/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.21
7fd06e3e2000-7fd06e3ef000 ---p 00172000 00:00 179192 
/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.21
7fd06e3ef000-7fd06e5e2000 ---p 0017f000 00:00 179192 
/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.21
7fd06e5e2000-7fd06e5ec000 r--p 00172000 00:00 179192 
/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.21
7fd06e5ec000-7fd06e5ee000 rw-p 0017c000 00:00 179192 
/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.21
7fd06e5ee000-7fd06e5f2000 rw-p 00000000 00:00 0
7fd06e600000-7fd06e626000 r-xp 00000000 00:00 174159 
/lib/x86_64-linux-gnu/ld-2.23.so
7fd06e825000-7fd06e826000 r--p 00025000 00:00 174159 
/lib/x86_64-linux-gnu/ld-2.23.so
7fd06e826000-7fd06e827000 rw-p 00026000 00:00 174159 
/lib/x86_64-linux-gnu/ld-2.23.so
7fd06e827000-7fd06e828000 rw-p 00000000 00:00 0
7fd06e970000-7fd06e971000 rw-p 00000000 00:00 0
7fd06e980000-7fd06e981000 rw-p 00000000 00:00 0
7fd06e990000-7fd06e992000 rw-p 00000000 00:00 0
7fd06e9a0000-7fd06e9a1000 rw-p 00000000 00:00 0
7fd06e9b0000-7fd06e9b1000 rw-p 00000000 00:00 0
7fd06e9c0000-7fd06e9c1000 rw-p 00000000 00:00 0
7fd06e9d0000-7fd06e9d1000 rw-p 00000000 00:00 0
7fffc028a000-7fffc0a8a000 rw-p 00000000 00:00 0 [stack]
7fffc1199000-7fffc119a000 r-xp 00000000 00:00 0 [vdso]
Traceback (most recent call last):
   File 
"/home/tahoar/slatetoolkit/scripts/training/bilingual-lm/train_nplm.py", 
line 159, in <module>
     main(options)
   File 
"/home/tahoar/slatetoolkit/scripts/training/bilingual-lm/train_nplm.py", 
line 154, in main
     raise Exception("Training failed")
Exception: Training failed






_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
