Hi Dennis,
moses_chart maintains a lot of state information during rule lookup and if large numbers of rules can be applied at each span then the memory use can get pretty huge. Your re-ordering grammar does sound a likely candidate for triggering this. We're looking at ways to reduce memory use and improvements should trickle into SVN over the next few weeks. You could also try:
* reducing the number of threads: each thread translates a separate sentence, so it has its own rule lookup state (plus hypothesis stacks, etc.)
* upgrading to revision 4050 -- this should give at least a small improvement (I got a 4GB saving in a string-to-tree experiment that originally used 37GB: approx. 20GB static model storage + 17GB active decoding state)
* reducing the decoder's -max-chart-span limit for the non-glue grammars
* reducing the decoder's -cube-pruning-pop-limit
* [unsightly hack] changing this line in moses/src/TypeDef.h:
const size_t MAX_NUM_FACTORS = 4;
to:
const size_t MAX_NUM_FACTORS = 1;
and recompiling.
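To make the thread-count point concrete, here is a back-of-envelope sketch in Python. It assumes, as a simplification, that the static model storage is shared by all threads and that active decoding state grows roughly linearly with the number of threads. The function name and the example figures are hypothetical, not measurements from moses_chart.

```python
def estimate_total_gb(static_gb, active_per_thread_gb, threads):
    """Rough estimate: shared static storage plus per-thread decoding state.

    Assumption (not measured): active state scales linearly with threads.
    """
    if threads < 1:
        raise ValueError("threads must be at least 1")
    return static_gb + active_per_thread_gb * threads


if __name__ == "__main__":
    # Hypothetical figures, chosen only to illustrate the trend.
    for threads in (4, 2, 1):
        total = estimate_total_gb(8.0, 3.0, threads)
        print(f"{threads} thread(s): about {total:.1f} GB")
```

If the linear assumption roughly holds for your setup, comparing peak memory at 1 and 2 threads should give a usable estimate of the per-thread cost.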
Phil
On 28 Jun, 2011, at 11:34 PM, Dennis Mehay <[email protected]> wrote:
Hi all,
I am MERTing using multithreaded moses_chart on a machine with quad-core processors (4 threads). What I find amazing is that this MERT run is consuming 20G of RAM (yes, 20G!). The main rule table is binarized (and it only had 1,926,150 entries in it to begin with -- those are non-Continental commas, i.e., ~2 million). So I thought maybe it's the second rule table, which consists entirely of syntactic re-ordering rules (only ~12 thousand entries).
First, should multithreaded moses_chart be using so much memory? I gave it ttable limits of 50 (for the main rule table), 25 (for the purely syntactic table) and 1000 (for the glue grammar -- don't know why, just seemed like it wouldn't matter much).
Second, I tried to binarize (convert to the on-disk representation) the 12K-entry table, but CreateOnDiskPt isn't co-operating.
$ CreateOnDiskPt 1 1 2 25 1 reordering-table.gz binarized.reordering-table
Starting : [0] seconds
CreateOnDiskPt: PhraseNode.cpp:98: void OnDiskPt::PhraseNode::Save(OnDiskPt::OnDiskWrapper&, size_t, size_t): Assertion `!m_saved' failed.
Aborted
What's going on here? I told it that the index of p(e | f) is 1, because the first score is p(mother | e-children, f-children), which is as close to p(e | f) as we're going to get here. There are 2 scores in the table, so it shouldn't be the third parameter.
Here's a sample of what the (home brewed) reordering-table.gz file looks like:
-----------------------------------------------------------
...
[X][N_num_] [X] ||| [X][N_num_] [(S\NP)\((S\NP)/N_num_)] ||| 0.666666666667 1.0 ||| 0-0 ||| 3 2
[X][((S/S)\(S/S))/(S\NP)] [X][(S\NP)/N] [X] ||| [X][((S/S)\(S/S))/(S\NP)] [X][(S\NP)/N] [((S/S)\(S/S))/N] ||| 0.5 0.950152353227 ||| 0-0 1-1 ||| 8 4
[X][((S/S)\(S/S))/NP] [X][NP/N] [X] ||| [X][((S/S)\(S/S))/NP] [X][NP/N] [((S/S)\(S/S))/N] ||| 0.25 0.900064286132 ||| 0-0 1-1 ||| 8 2
[X][((S/S)\(S/S))/N] [X][N/N] [X] ||| [X][((S/S)\(S/S))/N] [X][N/N] [((S/S)\(S/S))/N] ||| 0.25 0.96050145793 ||| 0-0 1-1 ||| 8 2
[X][S_b_\(S_b_/NP)] [X][DOT] [X] ||| [X][S_b_\(S_b_/NP)] [X][DOT] [S_b_\(S_b_/NP)] ||| 0.0845295055821 0.975293919709 ||| 0-0 1-1 ||| 627 53
[X][(S\S)/S] [X][(S/S)/(S\NP)] [X] ||| [X][(S\S)/S] [X][(S/S)/(S\NP)] [((S\S)/S)/(S\NP)] ||| 0.41935483871 0.968638607969 ||| 0-0 1-1 ||| 31 13
[X][((S\S)/S)/(S_b_\NP)] [X][(S_b_\NP)/(S\NP)] [X] ||| [X][((S\S)/S)/(S_b_\NP)] [X][(S_b_\NP)/(S\NP)] [((S\S)/S)/(S\NP)] ||| 0.0322580645161 0.969105691057 ||| 0-0 1-1 ||| 31 1
...
-----------------------------------------------------------
As you can see, it's some CCG categories ||| then p(mother | children), then a smoothed probability estimate (no need to get into that) ||| then the alignment between the non-terminals ||| then denominator and numerator counts.
This should work, shouldn't it?
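For reference, the layout described above can be sanity-checked with a small Python sketch before handing the file to CreateOnDiskPt. It only verifies that each line has five `|||`-separated fields and the expected number of scores (2 here); it is not a diagnosis of the assertion failure. The function names `parse_rule_line` and `check_table` are hypothetical helpers, not part of Moses.

```python
import gzip


def parse_rule_line(line):
    """Split one rule-table line into its five '|||'-separated fields."""
    fields = [f.strip() for f in line.rstrip("\n").split("|||")]
    if len(fields) != 5:
        raise ValueError(f"expected 5 fields, got {len(fields)}")
    source, target, scores, alignment, counts = fields
    return {
        "source": source.split(),
        "target": target.split(),
        "scores": [float(s) for s in scores.split()],
        "alignment": [tuple(int(i) for i in pair.split("-"))
                      for pair in alignment.split()],
        "counts": [int(c) for c in counts.split()],
    }


def check_table(path, expected_scores=2):
    """Return the line numbers whose score count differs from expected_scores."""
    bad = []
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for number, line in enumerate(handle, start=1):
            if len(parse_rule_line(line)["scores"]) != expected_scores:
                bad.append(number)
    return bad


# First sample line from the table above; a raw string keeps the backslashes.
sample = (r"[X][N_num_] [X] ||| [X][N_num_] [(S\NP)\((S\NP)/N_num_)] "
          r"||| 0.666666666667 1.0 ||| 0-0 ||| 3 2")
rule = parse_rule_line(sample)
```

Calling `check_table("reordering-table.gz")` would list any lines whose score count disagrees with the value passed to CreateOnDiskPt; an empty list means the score field is at least consistent.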
Best,
D.N.
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
