I've implemented a multi-threaded Python wrapper that loads the Moses
decoder and pipes strings through the moses binary. It's similar to Ivan
Uemlianin's code from May 04, 2010 on this listserv, but it reaches
398% CPU load on a quad-core host while processing multiple documents
from a queue.
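
To give an idea of the wrapper's shape, here is a minimal sketch (not
the actual code). It assumes the decoder writes exactly one output line
per input line and flushes it promptly. The names LinePipe and
translate_documents are made up for illustration, and the command line
is passed in by the caller.

```python
import queue
import subprocess
from concurrent.futures import ThreadPoolExecutor


class LinePipe:
    """Keep one line-oriented child process open and pass strings through it."""

    def __init__(self, cmd):
        # cmd is the full argument list for the decoder (supplied by the caller).
        self.proc = subprocess.Popen(
            cmd,
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,  # an unread stderr pipe could fill up and block the child
            text=True,
            bufsize=1,  # line-buffered on our side
        )

    def translate(self, line):
        # Assumption: one line in, one line out.
        self.proc.stdin.write(line.rstrip("\n") + "\n")
        self.proc.stdin.flush()
        return self.proc.stdout.readline().rstrip("\n")

    def close(self):
        # Closes stdin, drains any remaining output, and waits for exit.
        self.proc.communicate()
        return self.proc.returncode


def translate_documents(cmd, documents, workers=4):
    """Translate each document (a list of sentences) using a pool of decoders."""
    pipes = queue.Queue()
    for _ in range(workers):
        pipes.put(LinePipe(cmd))

    def work(doc):
        pipe = pipes.get()  # check out a decoder; only one thread uses it at a time
        try:
            return [pipe.translate(sentence) for sentence in doc]
        finally:
            pipes.put(pipe)

    try:
        with ThreadPoolExecutor(max_workers=workers) as pool:
            return list(pool.map(work, documents))  # results keep document order
    finally:
        while not pipes.empty():
            pipes.get().close()
```

Each worker thread borrows one long-lived decoder process from the
queue, so the model is loaded once per process rather than once per
document.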

Here's the rub. The decoder and the wrapper run great for about 2
hours. Then they halt with an unknown error. It's difficult to trace
because it takes hours to reproduce. I can see that the Moses binary
doesn't generate an error exit code. There's no error message about a
"broken" pipe. When I restart the script on the file that was in
process at the time of the halt, it runs just fine and continues
processing. Since the error occurs consistently at the 2-hour mark, and
it's not the file causing the halt, I suspect that a cache or buffer
somewhere is overloaded. I've checked my Python code and don't believe
there are any buffer overruns there.
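
Since the halt takes hours to reproduce, one way to capture more
information when it happens is to put a timeout on each reply. The
following is only a sketch under the same one-line-in, one-line-out
assumption; WatchedPipe and the 60-second default are hypothetical
choices, not part of my wrapper.

```python
import queue
import subprocess
import threading
import time


class WatchedPipe:
    """Line-oriented pipe to a child process that reports stalls instead of blocking."""

    def __init__(self, cmd):
        self.proc = subprocess.Popen(
            cmd,
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
            bufsize=1,
        )
        self.replies = queue.Queue()
        # A background thread does the blocking reads so the caller can time out.
        threading.Thread(target=self._reader, daemon=True).start()

    def _reader(self):
        for line in self.proc.stdout:
            self.replies.put(line.rstrip("\n"))
        self.replies.put(None)  # EOF marker: the child closed its stdout

    def translate(self, line, timeout=60.0):
        self.proc.stdin.write(line.rstrip("\n") + "\n")
        self.proc.stdin.flush()
        try:
            reply = self.replies.get(timeout=timeout)
        except queue.Empty:
            # poll() is None while the child is still running, so a stall
            # with a live process can be told apart from a silent exit.
            raise TimeoutError(
                f"no reply after {timeout}s at {time.ctime()}; "
                f"exit code so far: {self.proc.poll()!r}; input: {line!r}"
            )
        if reply is None:
            raise RuntimeError(f"child closed stdout; exit code: {self.proc.wait()}")
        return reply
```

The exception message records the time, the input sentence, and whether
the child was still alive, which is the information missing when the
process simply stops.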

I'm hoping someone can review my comments and give me some pointers
about Moses' caches and how to verify and manage them. The Moses manual
describes three cache options:

        * "-clean-lm-cache: clean language model caches after N
translations (default N=1)" : If -clean-lm-cache defaults to cleaning
the lm cache after each translation, I don't think this is a problem. 

        * "-persistent-cache-size: maximum size of cache for translation
options (default 10,000 input phrases)" : Some of my files have 2,500
or more pages with 20-25 sentence lines each. This could exceed the
default 10,000 input phrase cache. Would it be better to bump up the
-persistent-cache-size value, or to manage the number of phrases I pass
to the input?

        * "-use-persistent-cache: cache translation options across
sentences (default true)" : Regarding caching across sentences (which
presumably applies to -use-persistent-cache), the manual says, "you
should also make sure that the cache is frequently cleared." How do I
clear the cache? Does this require forcing moses itself to unload and
then reloading it? Also, the -use-persistent-cache value defaults to
"true". What is the effect of changing this to "false"? Does it
effectively disable this cache and eliminate the requirement to clear
it?
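
If clearing the cache does require a reload, the wrapper side of it
could look roughly like the sketch below: the child process is
restarted after a fixed number of input lines. RecyclingPipe and
max_lines are hypothetical names, the threshold of 10,000 is only an
illustrative value, and the sketch again assumes one output line per
input line.

```python
import subprocess


class RecyclingPipe:
    """Line-oriented pipe that restarts its child process every max_lines inputs."""

    def __init__(self, cmd, max_lines=10000):
        self.cmd = list(cmd)        # full decoder command line, supplied by the caller
        self.max_lines = max_lines  # illustrative threshold, not a Moses setting
        self.restarts = 0
        self.sent = 0
        self.proc = None
        self._start()

    def _start(self):
        self.proc = subprocess.Popen(
            self.cmd,
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
            bufsize=1,
        )
        self.sent = 0

    def _stop(self):
        # Closes stdin, drains remaining output, and waits for the child to exit.
        self.proc.communicate()
        return self.proc.returncode

    def translate(self, line):
        if self.sent >= self.max_lines:
            # Unload and reload the decoder, discarding any in-process caches.
            self._stop()
            self._start()
            self.restarts += 1
        self.proc.stdin.write(line.rstrip("\n") + "\n")
        self.proc.stdin.flush()
        self.sent += 1
        return self.proc.stdout.readline().rstrip("\n")

    def close(self):
        return self._stop()
```

The obvious cost is that every restart reloads the models, so I would
prefer a way to clear the cache inside the running process if one
exists.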

Thanks,
Tom
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
