Weird.
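
To put the tcmalloc sizes from the quoted log in perspective, here is a quick calculation. It is only a sketch: the three sizes are copied from the run with -T below, and I am assuming "1TB RAM" means 2^40 bytes. The last request comes out at several thousand times the machine's memory, so my guess (not verified) is that the requested size itself is wrong rather than the machine being too small.

```shell
# Sizes copied from the tcmalloc lines in the quoted output (run with -T).
A1=13808820224
A2=27617640448
A3=5187358422106112
RAM=1099511627776   # assumption: "1TB RAM" taken as 2^40 bytes

# The second request is a multiple of the first; the third is far beyond RAM.
echo "alloc 2 / alloc 1:         $((A2 / A1))"
echo "alloc 3 / RAM (truncated): $((A3 / RAM))"
echo "alloc 3 exceeds RAM (1=yes): $((A3 > RAM))"
```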

Jeremy, I binarized your phrase table several times with different 
commits (including the most recent one), and I cannot reproduce the error. 
Maybe try -threads 10 or 12.
I can also make the binarized versions available for download.
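
For reference, the retry could look like the sketch below. It reuses the options from the quoted command and changes only -threads; the MOSES default path is a placeholder I made up, so point it at your own build. If the binary is not found, the script just prints the command instead of running it.

```shell
# Hypothetical default location; set MOSES to your own Moses build.
MOSES="${MOSES:-/opt/moses}"
THREADS=12   # 10 or 12, instead of "all" (32 threads on this machine)

# Same options as in the quoted report; only -threads differs.
CMD="$MOSES/bin/processPhraseTableMin -in phrase-table.1.gz -out phrase-table.1 -threads $THREADS -nscores 7 -T /tmp_with_200G_free"

if [ -x "$MOSES/bin/processPhraseTableMin" ]; then
    $CMD
else
    echo "would run: $CMD"
fi
```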

On 02.02.2016 at 18:21, Marcin Junczys-Dowmunt wrote:
> Looks fine. I had no problems running it with 18 or more domain
> indicators. Your machine is certainly more than suitable. Just one
> remark: using more than 8-12 threads usually slows things down, but it
> should not cause crashes. Any chance I could have a look at that table?
>
> On 02.02.2016 at 18:16, Jeremy Gwinnup wrote:
>> Marcin,
>>
>> I was able to use -T with processLexicalTableMin successfully. I also tried 
>> processPhraseTableMin using a local tmp dir with 200G free and it still 
>> crashed at step 3 with the huge malloc message. Phrase table is nothing 
>> fancy - just standard 4 scores and 3 domain indicator features. Here’s a 
>> complete output with more info about the phrase table:
>>
>> Phrase table in question:
>>
>> -rw-rw-r-- 1 jgwinnup scream 2.2G Feb  1 23:58 phrase-table.1.gz
>>
>> Machine in question has 1TB RAM/32 cores - should be more than enough for 
>> the job.
>>
>> Moses git-rev ends with: 80572b4 (Jan. 27)
>>
>> 1tqoct1:model> $MOSES/bin/processPhraseTableMin -in phrase-table.1.gz -out 
>> phrase-table.1 -threads all -nscores 7 -T /tmp_with_200G_free
>> WARNING: You are using a nonstandard number of scores (7) with PREnc. Set 
>> the index of P(t|s) with  -rankscore int  if it is not 2.
>> Used options:
>>      Text phrase table will be read from: phrase-table.1.gz
>>      Output phrase table will be written to: phrase-table.1.minphr
>>      Step size for source landmark phrases: 2^10=1024
>>      Source phrase fingerprint size: 16 bits / P(fp)=1.52588e-05
>>      Selected target phrase encoding: Huffman + PREnc
>>      Maxiumum allowed rank for PREnc: 100
>>      Number of score components in phrase table: 7
>>      Single Huffman code set for score components: no
>>      Using score quantization: no
>>      Explicitly included alignment information: yes
>>      Running with 32 threads
>>
>> Pass 1/3: Creating hash function for rank assignment
>> ..................................................[5000000]
>> ..................................................[10000000]
>> ..................................................[15000000]
>> ..................................................[20000000]
>> ..................................................[25000000]
>> ..................................................[30000000]
>> ..................................................[35000000]
>> ..................................................[40000000]
>> ..................................................[45000000]
>> ....
>>
>> Pass 2/3: Creating source phrase index + Encoding target phrases
>> ..................................................[5000000]
>> ..................................................[10000000]
>> ..................................................[15000000]
>> ..................................................[20000000]
>> ..................................................[25000000]
>> ..................................................[30000000]
>> ..................................................[35000000]
>> ..................................................[40000000]
>> ..................................................[45000000]
>> ....
>>
>> Intermezzo: Calculating Huffman code sets
>>      Creating Huffman codes for 471366 target phrase symbols
>> tcmalloc: large alloc 13808820224 bytes == 0xb0592000 @
>> tcmalloc: large alloc 27617640448 bytes == 0x3e86b0000 @
>> tcmalloc: large alloc 5187358422106112 bytes == (nil) @
>> terminate called after throwing an instance of 'std::bad_alloc'
>>     what():  std::bad_alloc
>>
>>
>>
>>
>>> On Feb 2, 2016, at 10:21 AM, Jeremy Gwinnup <[email protected]> wrote:
>>>
>>> Hi,
>>>
>>> I’m having a problem using processPhraseTableMin to compress a phrase table 
>>> with 7 scores - the program consistently coredumps at step 3 - command and 
>>> relevant output below. Is there anything I’m doing glaringly wrong?
>>>
>>> Thanks!
>>> -Jeremy
>>>
>>> Command:
>>>
>>> 1tqoct1:model> $MOSES/bin/processPhraseTableMin -in phrase-table.1.gz -out 
>>> phrase-table.1 -threads all -nscores 7
>>>
>>> Once we get to step 3:
>>>
>>> Intermezzo: Calculating Huffman code sets
>>>     Creating Huffman codes for 471366 target phrase symbols
>>> tcmalloc: large alloc 13983629312 bytes == 0xb14ce000 @
>>> tcmalloc: large alloc 27967250432 bytes == 0x3f3ca4000 @
>>> tcmalloc: large alloc 15681406635450368 bytes == (nil) @
>>> terminate called after throwing an instance of 'std::bad_alloc'
>>>    what():  std::bad_alloc
>>>
>>> Top looked like this when the program ran into trouble:
>>>    PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>>> 27416 jgwinnup  20   0 45.9g  30g 4.0g R 10.6  3.0   1589:17 processPhraseTa
>> _______________________________________________
>> Moses-support mailing list
>> [email protected]
>> http://mailman.mit.edu/mailman/listinfo/moses-support