Hi Folks,

I have set out to learn more about the underlying Joshua language model serialization(s), e.g. the statistical n-gram model in ARPA format [0], and to run a Joshua server under JProfiler to better understand how objects are used and what runtime memory usage looks like for typical translation tasks. This has led me to think about the fundamental performance issues we experience when loading large LMs into memory in the first place, and about the efficiency of searching models regardless of whether they are cached in memory (e.g. a running Joshua server) or not.

Does anyone have detailed technical/journal documentation which would set me in the right direction on the above?

Thanks
Lewis
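For anyone following along, a quick sketch of what I mean by the ARPA serialization: the file opens with a \data\ section listing the n-gram counts per order, which is already enough to estimate the in-memory footprint before loading the whole model. This is just an illustrative snippet I put together (not Joshua's actual loader, and the class/method names are mine):

```java
import java.io.BufferedReader;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;

public class ArpaHeader {
    // Read the \data\ section of an ARPA file, i.e. lines of the form
    // "ngram N=count", and return a map of order -> number of n-grams.
    static Map<Integer, Long> readCounts(BufferedReader in) throws Exception {
        Map<Integer, Long> counts = new LinkedHashMap<>();
        boolean inData = false;
        String line;
        while ((line = in.readLine()) != null) {
            line = line.trim();
            if (line.equals("\\data\\")) { // start of the header section
                inData = true;
                continue;
            }
            if (inData) {
                if (line.startsWith("ngram ")) {
                    // e.g. "ngram 2=59481" -> order 2, count 59481
                    String[] kv = line.substring(6).split("=");
                    counts.put(Integer.parseInt(kv[0].trim()),
                               Long.parseLong(kv[1].trim()));
                } else if (!line.isEmpty()) {
                    // first "\N-grams:" section ends the header
                    break;
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) throws Exception {
        String sample = "\\data\\\nngram 1=3\nngram 2=5\n\n\\1-grams:\n";
        Map<Integer, Long> c =
            readCounts(new BufferedReader(new StringReader(sample)));
        System.out.println(c); // {1=3, 2=5}
    }
}
```

Even this header alone makes the scale of the loading problem obvious: each n-gram entry carries a probability (and usually a backoff weight), so the per-order counts multiply directly into heap usage when the whole table is materialized as objects.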
[0] http://cmusphinx.sourceforge.net/wiki/sphinx4:standardgrammarformats#statistical_n-gram_models_in_the_arpa_format

--
http://home.apache.org/~lewismc/
@hectorMcSpector
http://www.linkedin.com/in/lmcgibbney
