Weird. Jeremy, I binarized your phrase table a couple of times with different commits (including the most recent one), and I cannot reproduce the error. Maybe try -threads 10 or 12. I can make the binarized versions available for download.
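As a rough sanity check on the tcmalloc figures in the quoted log below, this Python sketch converts the three requested sizes from the -T run into binary units and compares them with the machine's memory. It assumes that "1TB RAM" means 1 TiB; the helper name `human` is made up for illustration.

```python
# Byte counts copied from the tcmalloc lines of the -T run quoted below.
ALLOCS = [13808820224, 27617640448, 5187358422106112]

# Assumption: "1TB RAM" is read as 1 TiB (2**40 bytes).
RAM_BYTES = 2**40


def human(n):
    """Format a byte count using binary units (hypothetical helper)."""
    for unit in ("B", "KiB", "MiB", "GiB", "TiB", "PiB"):
        # Stop at the first unit where the value drops below 1024,
        # or at PiB, the largest unit handled here.
        if n < 1024 or unit == "PiB":
            return f"{n:.2f} {unit}"
        n /= 1024


for size in ALLOCS:
    verdict = "fits in RAM" if size <= RAM_BYTES else "exceeds RAM"
    print(f"{size:>18} bytes = {human(size):>12}  ({verdict})")
```

In this run the first two requests are in the tens of GiB, and the second is twice the first. The third is in the PiB range, so it could not be satisfied on this machine regardless of the thread count. The sketch says nothing definite about where that size comes from.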
On 02.02.2016 at 18:21, Marcin Junczys-Dowmunt wrote:
> Looks fine, I had no problems running it with 18 and more domain
> indicators. Your machine is certainly more than suitable. Just one
> remark, using more than 8-12 threads usually slows things down, but
> should not cause crashes. Any chance to have a look at that table?
>
> On 02.02.2016 at 18:16, Jeremy Gwinnup wrote:
>> Marcin,
>>
>> I was able to use -T with processLexicalTableMin successfully. I also tried
>> processPhraseTableMin using a local tmp dir with 200G free and it still
>> crashed at step 3 with the huge malloc message. Phrase table is nothing
>> fancy - just standard 4 scores and 3 domain indicator features. Here’s a
>> complete output with more info about the phrase table:
>>
>> Phrase table in question:
>>
>> -rw-rw-r-- 1 jgwinnup scream 2.2G Feb 1 23:58 phrase-table.1.gz
>>
>> Machine in question has 1TB RAM/32 cores - should be more than enough for
>> the job
>>
>> Moses git-rev ends with: 80572b4 (Jan. 27)
>>
>> 1tqoct1:model> $MOSES/bin/processPhraseTableMin -in phrase-table.1.gz -out
>> phrase-table.1 -threads all -nscores 7 -T /tmp_with_200G_free
>> WARNING: You are using a nonstandard number of scores (7) with PREnc. Set
>> the index of P(t|s) with -rankscore int if it is not 2.
>> Used options:
>> Text phrase table will be read from: phrase-table.1.gz
>> Output phrase table will be written to: phrase-table.1.minphr
>> Step size for source landmark phrases: 2^10=1024
>> Source phrase fingerprint size: 16 bits / P(fp)=1.52588e-05
>> Selected target phrase encoding: Huffman + PREnc
>> Maxiumum allowed rank for PREnc: 100
>> Number of score components in phrase table: 7
>> Single Huffman code set for score components: no
>> Using score quantization: no
>> Explicitly included alignment information: yes
>> Running with 32 threads
>>
>> Pass 1/3: Creating hash function for rank assignment
>> ..................................................[5000000]
>> ..................................................[10000000]
>> ..................................................[15000000]
>> ..................................................[20000000]
>> ..................................................[25000000]
>> ..................................................[30000000]
>> ..................................................[35000000]
>> ..................................................[40000000]
>> ..................................................[45000000]
>> ....
>>
>> Pass 2/3: Creating source phrase index + Encoding target phrases
>> ..................................................[5000000]
>> ..................................................[10000000]
>> ..................................................[15000000]
>> ..................................................[20000000]
>> ..................................................[25000000]
>> ..................................................[30000000]
>> ..................................................[35000000]
>> ..................................................[40000000]
>> ..................................................[45000000]
>> ....
>>
>> Intermezzo: Calculating Huffman code sets
>> Creating Huffman codes for 471366 target phrase symbols
>> tcmalloc: large alloc 13808820224 bytes == 0xb0592000 @
>> tcmalloc: large alloc 27617640448 bytes == 0x3e86b0000 @
>> tcmalloc: large alloc 5187358422106112 bytes == (nil) @
>> terminate called after throwing an instance of 'std::bad_alloc'
>> what(): std::bad_alloc
>>
>>> On Feb 2, 2016, at 10:21 AM, Jeremy Gwinnup <[email protected]> wrote:
>>>
>>> Hi,
>>>
>>> I’m having a problem using processPhraseTableMin to compress a phrase table
>>> with 7 scores - the program consistently coredumps at step 3 - command and
>>> relevant output below. Is there anything I’m doing glaringly wrong?
>>>
>>> Thanks!
>>> -Jeremy
>>>
>>> Command:
>>>
>>> 1tqoct1:model> $MOSES/bin/processPhraseTableMin -in phrase-table.1.gz -out
>>> phrase-table.1 -threads all -nscores 7
>>>
>>> Once we get to step 3:
>>>
>>> Intermezzo: Calculating Huffman code sets
>>> Creating Huffman codes for 471366 target phrase symbols
>>> tcmalloc: large alloc 13983629312 bytes == 0xb14ce000 @
>>> tcmalloc: large alloc 27967250432 bytes == 0x3f3ca4000 @
>>> tcmalloc: large alloc 15681406635450368 bytes == (nil) @
>>> terminate called after throwing an instance of 'std::bad_alloc'
>>> what(): std::bad_alloc
>>>
>>> Top looked like this when the program ran into trouble:
>>> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
>>> 27416 jgwinnup 20 0 45.9g 30g 4.0g R 10.6 3.0 1589:17 processPhraseTa
>> _______________________________________________
>> Moses-support mailing list
>> [email protected]
>> http://mailman.mit.edu/mailman/listinfo/moses-support
