Looks fine. I had no problems running it with 18 or more domain 
indicators, and your machine is certainly more than sufficient. One 
remark: using more than 8-12 threads usually slows things down, but it 
should not cause crashes. Any chance I could have a look at that table?
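
For reference, here is a quick sanity check on the three tcmalloc sizes in the output quoted below. It is only a sketch of the arithmetic: the byte counts are copied from the log, while the reading of "1TB RAM" as 10^12 bytes and the helper name describe are my own assumptions. The second request is exactly twice the first, but the third is several thousand TiB, which no machine could satisfy. That points to the requested size itself being wrong rather than to a shortage of memory, though the check cannot say why.

```python
# Sizes (in bytes) copied from the three "tcmalloc: large alloc" lines
# in the quoted output below.
ALLOCS = [13808820224, 27617640448, 5187358422106112]

GIB = 2 ** 30
TIB = 2 ** 40
# "1TB RAM" from the report; treating it as decimal terabytes is an assumption.
RAM_BYTES = 10 ** 12


def describe(sizes, ram_bytes):
    """Return (bytes, GiB, growth factor vs. previous request, fits in RAM)."""
    rows = []
    prev = None
    for size in sizes:
        factor = size / prev if prev else None
        rows.append((size, size / GIB, factor, size <= ram_bytes))
        prev = size
    return rows


rows = describe(ALLOCS, RAM_BYTES)
for size, gib, factor, fits in rows:
    growth = "-" if factor is None else f"x{factor:.1f}"
    print(f"{size:>18d} B  {gib:>12.1f} GiB  {growth:>10}  fits in RAM: {fits}")
```

The first two requests (about 13 and 26 GiB) fit easily in the reported RAM; only the last one does not.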

On 02.02.2016 at 18:16, Jeremy Gwinnup wrote:
> Marcin,
>
> I was able to use -T with processLexicalTableMin successfully. I also tried 
> processPhraseTableMin using a local tmp dir with 200G free and it still 
> crashed at step 3 with the huge malloc message. Phrase table is nothing fancy 
> - just standard 4 scores and 3 domain indicator features. Here’s a complete 
> output with more info about the phrase table:
>
> Phrase table in question:
>
> -rw-rw-r-- 1 jgwinnup scream 2.2G Feb  1 23:58 phrase-table.1.gz
>
> Machine in question has 1TB RAM/32 cores - should be more than enough for the 
> job
>
> Moses git-rev ends with: 80572b4 (Jan. 27)
>
> 1tqoct1:model> $MOSES/bin/processPhraseTableMin -in phrase-table.1.gz -out 
> phrase-table.1 -threads all -nscores 7 -T /tmp_with_200G_free
> WARNING: You are using a nonstandard number of scores (7) with PREnc. Set the 
> index of P(t|s) with  -rankscore int  if it is not 2.
> Used options:
>       Text phrase table will be read from: phrase-table.1.gz
>       Output phrase table will be written to: phrase-table.1.minphr
>       Step size for source landmark phrases: 2^10=1024
>       Source phrase fingerprint size: 16 bits / P(fp)=1.52588e-05
>       Selected target phrase encoding: Huffman + PREnc
>       Maxiumum allowed rank for PREnc: 100
>       Number of score components in phrase table: 7
>       Single Huffman code set for score components: no
>       Using score quantization: no
>       Explicitly included alignment information: yes
>       Running with 32 threads
>
> Pass 1/3: Creating hash function for rank assignment
> ..................................................[5000000]
> ..................................................[10000000]
> ..................................................[15000000]
> ..................................................[20000000]
> ..................................................[25000000]
> ..................................................[30000000]
> ..................................................[35000000]
> ..................................................[40000000]
> ..................................................[45000000]
> ....
>
> Pass 2/3: Creating source phrase index + Encoding target phrases
> ..................................................[5000000]
> ..................................................[10000000]
> ..................................................[15000000]
> ..................................................[20000000]
> ..................................................[25000000]
> ..................................................[30000000]
> ..................................................[35000000]
> ..................................................[40000000]
> ..................................................[45000000]
> ....
>
> Intermezzo: Calculating Huffman code sets
>       Creating Huffman codes for 471366 target phrase symbols
> tcmalloc: large alloc 13808820224 bytes == 0xb0592000 @
> tcmalloc: large alloc 27617640448 bytes == 0x3e86b0000 @
> tcmalloc: large alloc 5187358422106112 bytes == (nil) @
> terminate called after throwing an instance of 'std::bad_alloc'
>    what():  std::bad_alloc
>
>
>
>
>> On Feb 2, 2016, at 10:21 AM, Jeremy Gwinnup <jer...@gwinnup.org> wrote:
>>
>> Hi,
>>
>> I’m having a problem using processPhraseTableMin to compress a phrase table 
>> with 7 scores - the program consistently coredumps at step 3 - command and 
>> relevant output below. Is there anything I’m doing glaringly wrong?
>>
>> Thanks!
>> -Jeremy
>>
>> Command:
>>
>> 1tqoct1:model> $MOSES/bin/processPhraseTableMin -in phrase-table.1.gz -out 
>> phrase-table.1 -threads all -nscores 7
>>
>> Once we get to step 3:
>>
>> Intermezzo: Calculating Huffman code sets
>>      Creating Huffman codes for 471366 target phrase symbols
>> tcmalloc: large alloc 13983629312 bytes == 0xb14ce000 @
>> tcmalloc: large alloc 27967250432 bytes == 0x3f3ca4000 @
>> tcmalloc: large alloc 15681406635450368 bytes == (nil) @
>> terminate called after throwing an instance of 'std::bad_alloc'
>>   what():  std::bad_alloc
>>
>> Top looked like this when the program ran into trouble:
>>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>> 27416 jgwinnup  20   0 45.9g  30g 4.0g R 10.6  3.0   1589:17 processPhraseTa
>
> _______________________________________________
> Moses-support mailing list
> Moses-support@mit.edu
> http://mailman.mit.edu/mailman/listinfo/moses-support

