The binary phrase-table implementation seems to become unusable with more
than 10-12 threads: speed drops as more threads are added until it nearly
deadlocks at around 30 threads. I am using a 48-core server with 512 GB of
RAM. Even copying the binary phrase tables to a ramdisk does not solve the
problem; the behavior stays the same. The in-memory version works fine
with 48 threads, but uses nearly all of our RAM.

Pruning is also not enough: our filtered phrase table still takes around
300 GB when loaded into memory, and I did not even dare to try loading the
unfiltered phrase table into memory :). But I will take a look at the
implementation from the marathon, thanks.

At the moment I am thinking about using a perfect hash function as an
index and keeping the target phrases as packed strings in memory. That
should use about as much memory as a gzipped phrase table on disk. It
will be slower, but probably still faster than the binary version.

18/9/2011, "Hieu Hoang" <[email protected]> wrote:

>phil williams was implementing a faster (& prob smaller) phrase table
>for the syntax/hierarchical model at the mt marathon
>
>http://statmt.org/mtm6/?n=Main.EfficientDecodingWithLabelledGrammarsInMoses
>
>if you want to use less memory, your best bet is to use the binary
>phrase-table. If it's too slow, 'encourage' the files into the OS cache with
>    cat *.binphr.* > /dev/null
>
>another way is to filter the phrase-table with the program in
>   /sigtest-filter/
>
>On 16/09/2011 20:24, Marcin Junczys-Dowmunt wrote:
>> Hi,
>> I am thinking about writing a phrase-table implementation that can be
>> fully kept in memory without using up hundreds of gigabytes like it is
>> the case for PhraseDictionaryMemory. Has something like that been done
>> earlier? In that case I would not duplicate the work :)
>>
>> I tried to browse through the branches, but could not find anything.
>> This brings me to my second question. Is there any place where I can
>> check what is done in the branches other than the commit messages in the
>> repository? It is not particularly easy to get a picture from those.
>>
>> Regards,
>> Marcin
>> _______________________________________________
>> Moses-support mailing list
>> [email protected]
>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>
>>
