Nicola,

Our Python code is a wrapper/plugin that invokes SRILM, RandLM, or IRSTLM; it relies on our workflow manager for I/O and has much more overhead than you need. Plus, we never started the actual queue manager. I think it's just as easy to describe what we did here.
build-lm.sh runs two loops (not counting Hieu's updates): one for ngt and another for build-sublm.pl. Our code uses only one loop, serializing the ngt and build-sublm.pl subprocesses for each split. Using Hieu's approach, I think a shell script could do the same thing if build-lm.sh called a separate shell script that serializes ngt/build-sublm.pl (a bare-bones sketch of that serialize-per-split idea is at the bottom of this message). We thought about rewriting split-dict.pl and merge-sublm.pl to become part of the queue manager, but never looked into how much work was involved.

Regarding experimentation to find the right balance between the number of splits and memory size, our current serialized code estimates the host's free RAM using Linux's "free -b". It's not perfect, but it's close. Then we calculate:

  # of splits = (getsize(input) / (freeram() / (4 * cpu_count()))) + 1

This has worked well to keep the serialized splits within the available RAM without overflowing into swap space. I think a more accurate formula would also account for the increased memory demands of higher n-gram orders.
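For what it's worth, in rough Python terms the calculation looks something like the sketch below. This is a minimal illustration rather than our actual wrapper: the input file name is just an example, and the "free -b" parsing is simplified.

    import os
    import subprocess
    from multiprocessing import cpu_count

    def freeram():
        """Estimate free RAM in bytes by parsing the output of Linux's free -b."""
        out = subprocess.check_output(["free", "-b"]).decode()
        for line in out.splitlines():
            fields = line.split()
            if fields and fields[0].startswith("Mem"):
                return int(fields[3])  # the "free" column
        raise RuntimeError("could not parse the output of free -b")

    def num_splits(input_path):
        """# of splits = (getsize(input) / (freeram() / (4 * cpu_count()))) + 1"""
        per_split_budget = freeram() / (4 * cpu_count())
        return int(os.path.getsize(input_path) / per_split_budget) + 1

    if __name__ == "__main__":
        # Example only: "train.txt" stands in for the real input corpus.
        print(num_splits("train.txt"))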
Tom

On Wed, 11 Apr 2012 10:24:51 +0000, Nicola Bertoldi <[email protected]> wrote:

> Hi Tom
>
> I try to answer below.
>
>> On Apr 9, 2012, at 3:58 PM, Tom Hoar wrote:
>>
>> I sent this to the irstlm list, but also include it here in case
>> this team has some comments.
>>
>> Hieu recently checked in changes to the build-lm.sh script to run
>> the splits in parallel. About 6 months ago, we replaced IRSTLM's
>> shell script with a Python wrapper to give us more control in our
>> environment. We also prepared to multi-process the splits. We
>> stopped work because of concerns that parallel processing might
>> overload system RAM resources.
>
> I think that Hieu's change made the empirical assumption that the
> number of splits could not exceed the number of CPUs. And in any
> case, parallelization does not guarantee that we avoid running out
> of memory.
>
>> As we know, building LMs is memory-intensive. Without the parallel
>> processing, each serialized split can use 100% of the host's RAM,
>> but the extra CPU cores sit idle. Parallel processing uses all
>> CPUs, but each CPU competes for RAM resources.
>>
>> 1. Is the final result of a build identical if you build with one
>> chunk, or 3 splits, or 30 splits?
>
> YES, the regression test build-lm-sublm2 checks for that (1 split
> vs. 5 splits).
>
>> 2. Are there any advantages/disadvantages to using a large number
>> of splits with a queue manager, so as to only parallel-process up
>> to the max number of CPUs and reduce the RAM requirements with
>> more but smaller splits?
>
> The main rules to take into account are the following:
> - the smaller the splits, the lower the RAM requirement (for a
> single split)
> - the larger the number of splits, the longer the time for merging
> the results (though this is not a very big issue)
>
> Hence, I think that, if a queue manager like the one you are
> proposing is available, the best policy should be to use more but
> smaller splits.
>
> I am going to write such a manager, because I think it is a good
> enhancement of the IRSTLM toolkit.
> Do you already have something written in Python that I can mimic in
> my scripts?
>
> The best tradeoff between the number of splits (and hence RAM
> requirements) and computation time should be found by means of some
> experimentation, on different machines with different RAM sizes,
> different numbers of threads, and so on.
>
>> 3. Has anyone experimented with other ways to reduce the RAM
>> requirement for each process while still allowing them to run in
>> parallel?
>
> Not at FBK.
>
>> Tom
>
> best regards,
> Nicola Bertoldi
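P.S. You asked whether we have something in Python you could mimic. We never actually built the queue manager, so the following is only a bare-bones sketch of the policy discussed above: keep at most cpu_count() splits in flight, and within each split run ngt and then build-sublm.pl serially. The ngt and build-sublm.pl argument lists are placeholders; the real arguments would be whatever build-lm.sh currently passes.

    import subprocess
    from multiprocessing import cpu_count
    from multiprocessing.pool import ThreadPool

    def process_split(split_path):
        """Serialize the two per-split steps, as in the single-loop version."""
        # Placeholder argument lists: substitute the options build-lm.sh uses.
        subprocess.check_call(["ngt", split_path])
        subprocess.check_call(["build-sublm.pl", split_path])

    def process_all(split_paths):
        """Queue the splits, keeping at most cpu_count() of them running at once."""
        pool = ThreadPool(processes=cpu_count())
        try:
            pool.map(process_split, split_paths)
        finally:
            pool.close()
            pool.join()

    if __name__ == "__main__":
        # Example only: in practice the split file names come from split-dict.pl.
        process_all(["dict.000", "dict.001", "dict.002"])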
