Good stuff, and an interesting grammar.

Escaping [] is probably going to be necessary in most cases. Square
brackets are now used by most SCFG implementations to denote
non-terminals, so we might as well live with it.
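
For anyone who needs to do the same escaping, here is a minimal Python
sketch of the [feat] => _feat_ transformation described in the quoted
message below. It assumes that CCG features are lowercase alphabetic
strings attached directly to an uppercase category symbol (S[dcl],
NP[expl]). The function name and the regex are illustrative only and are
not part of Moses.

```python
import re

# Assumption: a CCG feature is a lowercase string in square brackets that
# directly follows an uppercase category letter, e.g. S[to] or NP[expl].
# Outer non-terminal brackets such as [X] or [X][Q] follow a space or "]",
# so the lookbehind leaves them untouched.
FEATURE = re.compile(r"(?<=[A-Z])\[([a-z]+)\]")


def escape_ccg_features(line):
    """Rewrite CCG-internal [feat] as _feat_ in one rule-table line."""
    return FEATURE.sub(r"_\1_", line)


if __name__ == "__main__":
    glue = r"[X][Q] [X][(S\NP[expl])/(S[b]\NP)] [X]"
    print(escape_ccg_features(glue))
```

The same function would have to be applied to the rule table, the glue
grammar and the unknown-lhs file, so that the escaped labels match
everywhere.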

If you want to add a few choice words to the website about what you
found, PM me and I'll put it up.

Btw, the on-disk implementation prunes the table at binarization time,
based on p(e|f) only. The in-memory implementation only prunes during
decoding.

This is to speed up decoding with the on-disk table, which would
otherwise spend most of its time reading from disk.

Therefore, you shouldn't binarize your glue rule table if you have a big
non-terminal set (it's unnecessary anyway, because the glue rule table is
so small). Or make the pruning parameter really big:
~/CreateOnDiskPt 1 1 5 9999999999 2 pt.txt pt.folder
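
To see why a small limit hurts a table with many labels, here is a toy
Python sketch of per-source pruning: keep only the top N rules for each
source side, ranked by a single score. This is not the CreateOnDiskPt
code. The grouping key, the "|||" field layout and the choice of score
column (score_index) are assumptions made for illustration.

```python
from collections import defaultdict


def prune_per_source(rules, limit, score_index=0):
    """Keep at most `limit` rules per source side, best score first."""
    by_source = defaultdict(list)
    for rule in rules:
        fields = [field.strip() for field in rule.split("|||")]
        source = fields[0]
        # Assumption: the third field holds space-separated scores.
        score = float(fields[2].split()[score_index])
        by_source[source].append((score, rule))

    kept = []
    for scored in by_source.values():
        scored.sort(key=lambda pair: pair[0], reverse=True)
        kept.extend(rule for _, rule in scored[:limit])
    return kept


# Hypothetical toy table: four rules share one source side and differ
# only in the target label.
toy_rules = [
    "a ||| x [A] ||| 0.4",
    "a ||| x [B] ||| 0.3",
    "a ||| x [C] ||| 0.2",
    "a ||| x [D] ||| 0.1",
]
```

With a limit of 2 the [C] and [D] rules are dropped before decoding ever
starts, while a very large limit keeps all four. If the binarizer groups
glue rules in a similar way, a glue table with thousands of labels would
lose most of its entries.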

On 24/06/2011 03:25, Dennis Mehay wrote:
> Hi again all,
>
> OK, I escaped all instances of [...] in all CCG categories with _..._
> (e.g., [dcl] => _dcl_), re-binarized (i.e., on-disk format
> transformation), re-ran the decoder and...[drum roll]:
>
> Translating: <s> 说了算 </s> ||| [0,0]=X (1) [0,1]=X (1) [0,2]=X (1)
> [1,1]=X (1) [1,2]=X (1) [2,2]=X (1)
>
> Q =1 (S\NP_expl_)/(S_to_\NP) =1 (S\NP_expl_)/(S_b_\NP) =1
> ((S\NP_expl_)/(S_to_\NP))/(S_adj_\NP) =1 Q =3 Q =1 Num of hypo = 10
> --- cells:
> 0 1 2
> 1 3 0
> 3 0
> 1
> BEST TRANSLATION: 8 Q </s> :0-0 : pC=0.000, c=-1.002 [0..2] 5
> [total=-5.893] <<-1.303, 0.000, -6.910, -9.087, -9.576, -5.349,
> -0.693, 1.000, 1.000>>
> is
>
> It works!
>
> Thanks for leading me to the answer, Hieu.
>
> Will this be a permanent feature of moses_chart (i.e., not liking
> [...] inside the categories), or do you anticipate it will be easy to
> work around?
>
> If the former, maybe a word or two on the syntax tutorial page
> (http://www.statmt.org/moses/?n=Moses.SyntaxTutorial) would be in order.
>
> Best,
> D.N.
>
> 2011/6/23 Dennis Mehay <[email protected] <mailto:[email protected]>>
>
>     Hi Hieu et al.,
>
>     Sorry for the barrage of emails, but here's what I get when I
>     comment out line 104 in ChartManager.cpp:
>
>     Using rule table (1) [from below -- i.e., with square brackets
>     inside of the CCG categories]:
>
>
>     Translating: <s> 说了算 </s> ||| [0,0]=X (1) [0,1]=X (1) [0,2]=X
>     (1) [1,1]=X (1) [1,2]=X (1) [2,2]=X (1)
>
>     Q =1 ((S\NP[expl])/(S[to]\NP))/(S[adj]\NP) =1
>     (S\NP[expl])/(S[to]\NP) =1 (S\NP[expl])/(S[b]\NP) =1 Q =3 Q =1 Num
>     of hypo = 10 --- cells:
>
>     0 1 2
>     1 3 0
>     3 0
>     1
>     BEST TRANSLATION: 8 Q </s> :0-0 : pC=0.000, c=-1.002 [0..2] 5
>     [total=-5.893] <<-1.303, 0.000, -6.910, -9.087, -9.576, -5.349,
>     -0.693, 1.000, 1.000>>
>     is
>
>     Good, good. The three CCG categories I would expect are there.
>
>     But when using the original (i.e., large), binarized rule table
>     (with square brackets in the CCG categories):
>
>
>     Translating: <s> 说了算 </s> ||| [0,0]=X (1) [0,1]=X (1) [0,2]=X
>     (1) [1,1]=X (1) [1,2]=X (1) [2,2]=X (1)
>
>     Q =1 expl])/(S[to]\NP))/(S[adj]\NP) =1 expl])/(S[to]\NP) =1
>     expl])/(S[b]\NP) =1 Num of hypo = 4 --- cells:
>
>     0 1 2
>     1 3 0
>     0 0
>     0
>     NO BEST TRANSLATION
>
>     Whaaaaa?! Seems that Moses is misprocessing the rule-table because
>     of the CCG-internal square brackets. It must have something to do
>     with what happened when it processed some *other* square bracketed
>     entries in the rule table. Or maybe the binarizer or other rule
>     table processing mechanisms are using square brackets as some kind
>     of landmark when loading, pruning, skipping ahead or something else.
>
>     So it seems to be the square brackets after all. I will escape all
>     CCG-internal square bracketings in my whole rule table, binarize
>     it and report back.
>
>     --D.N.
>
>
>     2011/6/23 Dennis Mehay <[email protected] <mailto:[email protected]>>
>
>         Hi Hieu (and others who might be interested),
>
>         So I created four rule tables:
>
>         (1) With CCG-internal brackets and which mentions the source
>         lexical item of interest:
>
>
>         说了算 [X] ||| is [((S\NP[expl])/(S[to]\NP))/(S[adj]\NP)] |||
>         0.000113126 6.94e-05 0.00475133 0.5 2.718 ||| ||| 126 3
>         说了算 [X] ||| is necessary [(S\NP[expl])/(S[to]\NP)] |||
>         0.000309866 6.94e-05 0.00475133 0.00028945 2.718 ||| ||| 46 3
>         说了算 [X] ||| is necessary to [(S\NP[expl])/(S[b]\NP)] |||
>         0.000208847 6.94e-05 0.00475133 1.07891e-05 2.718 ||| ||| 68.25 3
>
>         (2) Without CCG-internal brackets and which mentions the
>         source lexical item of interest:
>
>
>         说了算 [X] ||| is [((S\NP_expl_)/(S_to_\NP))/(S_adj_\NP)] |||
>         0.000113126 6.94e-05 0.00475133 0.5 2.718 ||| ||| 126 3
>         说了算 [X] ||| is necessary [(S\NP_expl_)/(S_to_\NP)] |||
>         0.000309866 6.94e-05 0.00475133 0.00028945 2.718 ||| ||| 46 3
>         说了算 [X] ||| is necessary to [(S\NP_expl_)/(S_b_\NP)] |||
>         0.000208847 6.94e-05 0.00475133 1.07891e-05 2.718 ||| ||| 68.25 3
>
>         (note that the [...] have been replaced with _..._)
>
>         (3) With CCG-internal brackets but *without* the lexical item
>         of interest (I just deleted some of the characters from the
>         Chinese "word"):
>
>
>         算 [X] ||| is [((S\NP[expl])/(S[to]\NP))/(S[adj]\NP)] |||
>         0.000113126 6.94e-05 0.00475133 0.5 2.718 ||| ||| 126 3
>         了算 [X] ||| is necessary [(S\NP[expl])/(S[to]\NP)] |||
>         0.000309866 6.94e-05 0.00475133 0.00028945 2.718 ||| ||| 46 3
>         算 [X] ||| is necessary to [(S\NP[expl])/(S[b]\NP)] |||
>         0.000208847 6.94e-05 0.00475133 1.07891e-05 2.718 ||| ||| 68.25 3
>
>         (4) Same as (3), but with no CCG-internal brackets.
>
>
>         算 [X] ||| is [((S\NP_expl_)/(S_to_\NP))/(S_adj_\NP)] |||
>         0.000113126 6.94e-05 0.00475133 0.5 2.718 ||| ||| 126 3
>         了算 [X] ||| is necessary [(S\NP_expl_)/(S_to_\NP)] |||
>         0.000309866 6.94e-05 0.00475133 0.00028945 2.718 ||| ||| 46 3
>         算 [X] ||| is necessary to [(S\NP_expl_)/(S_b_\NP)] |||
>         0.000208847 6.94e-05 0.00475133 1.07891e-05 2.718 ||| ||| 68.25 3
>
>         Now I tested each of these with an appropriate version of the
>         glue grammar (the whole thing with 4K+ rules in it) -- where
>         "appropriate" just means that it has CCG-internal brackets or
>         _..._ escaped CCG brackets.
>
>         All four of these settings produced a translation. Here are the
>         results:
>
>         ----------------------------------------------
>         Rule table (1) and original glue grammar (w/ 4K+ entries)
>
>
>         Translating: <s> 说了算 </s> ||| [0,0]=X (1) [0,1]=X (1)
>         [0,2]=X (1) [1,1]=X (1) [1,2]=X (1) [2,2]=X (1)
>
>         Num of hypo = 10 --- cells:
>
>         0 1 2
>         1 3 0
>         3 0
>         1
>         BEST TRANSLATION: 8 Q </s> :0-0 : pC=0.000, c=-1.002 [0..2] 5
>         [total=-5.893] <<-1.303, 0.000, -6.910, -9.087, -9.576,
>         -5.349, -0.693, 1.000, 1.000>>
>         is
>         ----------------------------------------------
>         Rule table (2) and glue grammar with [...] => _..._
>         transformation (w/ 4K+ entries)
>
>
>         Translating: <s> 说了算 </s> ||| [0,0]=X (1) [0,1]=X (1)
>         [0,2]=X (1) [1,1]=X (1) [1,2]=X (1) [2,2]=X (1)
>
>         Num of hypo = 10 --- cells:
>
>         0 1 2
>         1 3 0
>         3 0
>         1
>         BEST TRANSLATION: 8 Q </s> :0-0 : pC=0.000, c=-1.002 [0..2] 5
>         [total=-5.893] <<-1.303, 0.000, -6.910, -9.087, -9.576,
>         -5.349, -0.693, 1.000, 1.000>>
>         is
>         ----------------------------------------------
>         Rule table (3) and original glue grammar.
>
>
>         Translating: <s> 说了算 </s> ||| [0,0]=X (1) [0,1]=X (1)
>         [0,2]=X (1) [1,1]=X (1) [1,2]=X (1) [2,2]=X (1)
>
>         Num of hypo = 3222 --- cells:
>         0 1 2
>         1 1587 0
>         1 0
>         1
>         BEST TRANSLATION: 3176 Q </s> :0-0 : pC=0.000, c=-1.002 [0..2]
>         1589 [total=-22.789] <<-1.303, -1.940, -46.302, 0.000, 0.000,
>         0.000, 0.000, 0.000, 1.000>>
>         说了算
>         ----------------------------------------------
>         Rule table (4) and glue grammar with [...] => _..._
>         transformation.
>
>
>         Translating: <s> 说了算 </s> ||| [0,0]=X (1) [0,1]=X (1)
>         [0,2]=X (1) [1,1]=X (1) [1,2]=X (1) [2,2]=X (1)
>
>         Num of hypo = 2007 --- cells:
>         0 1 2
>         1 1587 0
>         1 0
>         1
>         BEST TRANSLATION: 1993 Q </s> :0-0 : pC=0.000, c=-1.002 [0..2]
>         1589 [total=-22.789] <<-1.303, -1.940, -46.302, 0.000, 0.000,
>         0.000, 0.000, 0.000, 1.000>>
>         说了算
>
>         ----------------------------------------------
>
>         So, it seems that escaping the CCG bracketings (for features)
>         does nothing, and I got what I expected in all four cases.
>         moses_chart must be pruning or dropping some rule-table or
>         glue grammar entries somewhere.
>
>         Could it have something to do with my having binarized the
>         original rule table? (Should've mentioned that earlier,
>         sorry.) I think I set the ttable-limit to 100 then...but I
>         thought that setting was a per-source-phrase limit, so it
>         shouldn't matter here, since we have just three entries for
>         the source "word" 说了算 in the original rule table (the large
>         one, that is).
>
>         [Scratches head.]
>
>         --D.N.
>
>
>         2011/6/23 Hieu Hoang <[email protected]
>         <mailto:[email protected]>>
>
>             hmm, strange. if you can send me the model, i'll look into it.
>
>             to get the categories in each cell, uncomment line 104 of
>             ChartManager.cpp
>             feel free to make it into a verbose flag option if you wish
>
>
>             On 23/06/2011 09:55, Dennis Mehay wrote:
>>             Hi Hieu,
>>
>>             with ttl's = 100 and 0
>>             --------------------------------------------------------
>>             Translating: <s> 说了算 </s> ||| [0,0]=X (1) [0,1]=X (1)
>>             [0,2]=X (1) [1,1]=X (1) [1,2]=X (1) [2,2]=X (1)
>>
>>             Num of hypo = 4 --- cells:
>>
>>             0 1 2
>>             1 3 0
>>             0 0
>>             0
>>             NO BEST TRANSLATION
>>             --------------------------------------------------------
>>
>>             and with ttl's 100 and 100000000
>>             --------------------------------------------------------
>>             Translating: <s> 说了算 </s> ||| [0,0]=X (1) [0,1]=X (1)
>>             [0,2]=X (1) [1,1]=X (1) [1,2]=X (1) [2,2]=X (1)
>>
>>             Num of hypo = 4 --- cells:
>>             0 1 2
>>             1 3 0
>>             0 0
>>             0
>>             NO BEST TRANSLATION
>>             --------------------------------------------------------
>>
>>             This is from a fresh svn checkout that I compiled just
>>             before running. The glue rules seem to be failing when
>>             trying to combine the chart cells that cover "<s> 说了算".
>>
>>             My glue grammar has 4666 entries in it, for what it's
>>             worth. I can send it to you if you want, but it might be
>>             too big to put up here on the forum.
>>
>>             Is there a quick-and-dirty way to see what categories are
>>             inserted into which cells when (some verbosity setting,
>>             perhaps)?
>>
>>             > I corrected this behaviour recently
>>             >
>>             
>> http://mosesdecoder.svn.sourceforge.net/viewvc/mosesdecoder/trunk/moses/src/ChartTranslationOptionCollection.cpp?r1=4004&r2=4003&pathrev=4004
>>             
>>
>>             Ah, yes. I named the binary ...19-june-2011, but I had
>>             copied it from a previous svn checkout, sorry. These
>>             things are still happening on the latest checkout, though.
>>
>>             --D.N.
>>
>>             2011/6/22 Hieu Hoang <[email protected]
>>             <mailto:[email protected]>>
>>
>>                 hi dennis
>>
>>                 You're right, it should be working. The entries in
>>                 the glue rules might be pruned. Can you try to change
>>                 the [ttable-limit] in the ini file to
>>                 [ttable-limit]
>>                 100
>>                 10000000
>>                 or
>>                 [ttable-limit]
>>                 100
>>                 0
>>
>>                 Each row corresponds to the table pruning limit for
>>                 one table. If you provide only 1 entry, then it
>>                 prunes every table uniformly.
>>                 See StaticData.cpp (line 894).
>>                 For a grammar with lots of non-terminals like yours,
>>                 the table limit may be cutting off some of the
>>                 entries in the glue rule table.
>>
>>                 Also, the decoder shouldn't be processing <s> and
>>                 </s> as unknown words, they should only be translated
>>                 by the glue rules. This is the reason you get 1587
>>                 translations of </s>.
>>
>>                 I corrected this behaviour recently
>>                 
>> http://mosesdecoder.svn.sourceforge.net/viewvc/mosesdecoder/trunk/moses/src/ChartTranslationOptionCollection.cpp?r1=4004&r2=4003&pathrev=4004
>>                 
>>
>>
>>
>>                 On 23/06/2011 06:14, Dennis Mehay wrote:
>>>                 Just in case it confuses anyone, both commands
>>>                 (below) were run in the same way, I just simplified
>>>                 it for expository purposes to " moses_chart -f
>>>                 moses.ini -cube-pruning-pop-limit 2000" in the first
>>>                 case, but not in the second.
>>>
>>>                 --D.N.
>>>
>>>                 2011/6/22 Dennis Mehay <[email protected]
>>>                 <mailto:[email protected]>>
>>>
>>>                     Hi Philipp,
>>>
>>>                     Thanks for the reply. I tracked some of the
>>>                     cases down to a *known* word (or
>>>                     whitespace-tokenized thingie, anyway -- I don't
>>>                     know much of what constitutes a word in written
>>>                     Chinese) by doing the following:
>>>
>>>                     
>>> ----------------------------------------------------------------------
>>>                     $ echo "说了算" | moses_chart -f moses.ini
>>>                     -cube-pruning-pop-limit 2000
>>>
>>>                     Translating: <s> 说了算 </s> ||| [0,0]=X (1)
>>>                     [0,1]=X (1) [0,2]=X (1) [1,1]=X (1) [1,2]=X (1)
>>>                     [2,2]=X (1)
>>>
>>>                     Num of hypo = 1591 --- cells:
>>>                     0 1 2
>>>                     1 3 1587
>>>                     0 0
>>>                     0
>>>                     NO BEST TRANSLATION
>>>                     
>>> ----------------------------------------------------------------------
>>>
>>>                     (An aside: 1587 is the number of categories in
>>>                     the unknown word list. Why does the last token,
>>>                     viz., "</s>", get that many cells? )
>>>
>>>                     Anyhow, sure enough, there are three entries for
>>>                     the middle token "说了算"
>>>
>>>                     
>>> ----------------------------------------------------------------------
>>>                     $ zless rule-table.gz
>>>                     ...
>>>                     说了算 [X] ||| is
>>>                     [((S\NP[expl])/(S[to]\NP))/(S[adj]\NP)] |||
>>>                     0.000113126 6.94e-05 0.00475133 0.5 2.718 |||
>>>                     ||| 126 3
>>>                     说了算 [X] ||| is necessary
>>>                     [(S\NP[expl])/(S[to]\NP)] ||| 0.000309866
>>>                     6.94e-05 0.00475133 0.00028945 2.718 ||| ||| 46 3
>>>                     说了算 [X] ||| is necessary to
>>>                     [(S\NP[expl])/(S[b]\NP)] ||| 0.000208847
>>>                     6.94e-05 0.00475133 1.07891e-05 2.718 ||| |||
>>>                     68.25 3
>>>                     ...
>>>                     
>>> ----------------------------------------------------------------------
>>>
>>>                     There are entries in the glue table for these
>>>                     three categories --
>>>                     ((S\NP[expl])/(S[to]\NP))/(S[adj]\NP),
>>>                     (S\NP[expl])/(S[to]\NP) and
>>>                     (S\NP[expl])/(S[b]\NP) --- so we should be able
>>>                     to hack together a translation using any of them.
>>>
>>>                     
>>> ----------------------------------------------------------------------
>>>                     <s> [X] ||| <s> [Q] ||| 1 |||
>>>                     ...
>>>                     [X][Q]
>>>                     [X][((S\NP[expl])/(S[to]\NP))/(S[adj]\NP)] [X]
>>>                     ||| [X][Q]
>>>                     [X][((S\NP[expl])/(S[to]\NP))/(S[adj]\NP)] [Q]
>>>                     ||| 2.718 ||| 0-0 1-1
>>>                     ...
>>>                     [X][Q] [X][(S\NP[expl])/(S[to]\NP)] [X] |||
>>>                     [X][Q] [X][(S\NP[expl])/(S[to]\NP)] [Q] |||
>>>                     2.718 ||| 0-0 1-1
>>>                     ...
>>>                     [X][Q] [X][(S\NP[expl])/(S[b]\NP)] [X] |||
>>>                     [X][Q] [X][(S\NP[expl])/(S[b]\NP)] [Q] ||| 2.718
>>>                     ||| 0-0 1-1
>>>                     ...
>>>                     
>>> ----------------------------------------------------------------------
>>>
>>>                     And just to be sure that it isn't an unknown
>>>                     word problem, let's mangle the token "说了算" by
>>>                     deleting the last character and see what happens:
>>>
>>>                     
>>> ----------------------------------------------------------------------
>>>                     $ echo "说了" |
>>>                     ../moses/bin/moses-chart-19-june-2011 -f
>>>                     dev-test/ZhEn/mert/run1.moses.ini
>>>                     -cube-pruning-pop-limit 2000
>>>                     Translating: <s> 说了 </s> ||| [0,0]=X (1)
>>>                     [0,1]=X (1) [0,2]=X (1) [1,1]=X (1) [1,2]=X (1)
>>>                     [2,2]=X (1)
>>>
>>>                     Num of hypo = 6396 --- cells:
>>>                     0 1 2
>>>                     1 1587 1587
>>>                     1 0
>>>                     1
>>>                     BEST TRANSLATION: 4763 Q </s> :0-0 : pC=0.000,
>>>                     c=-1.002 [0..2] 3176 [total=-22.789] <<-1.303,
>>>                     -1.940, -46.302, 0.000, 0.000, 0.000, 0.000,
>>>                     0.000, 1.000>>
>>>                     说了
>>>                     
>>> ----------------------------------------------------------------------
>>>
>>>                     The best "translation" is just a pass-through,
>>>                     as expected (and there are 1587 nodes for that
>>>                     unknown token -- just as many as there are
>>>                     unknown word lhs's in the unknown-lhs file).
>>>
>>>                     Strange. Very strange. Or am I missing the obvious?
>>>
>>>                     I'm at a loss here. Does anyone have any guesses
>>>                     as to what's going on here?
>>>
>>>                     --D.N.
>>>
>>>
>>>                     2011/6/22 Philipp Koehn <[email protected]
>>>                     <mailto:[email protected]>>
>>>
>>>                         Hi,
>>>
>>>                         there always should be a rule to combine a
>>>                         span to the left.
>>>
>>>                         Check what labels are chosen for the 13th
>>>                         word, and why there
>>>                         are no glue rules for it.
>>>
>>>                         If I would hazard a guess, I would suspect
>>>                         that this is an
>>>                         unknown word and a file with the likely
>>>                         labels for unknown words
>>>                         is used, but these do not match the glue
>>>                         grammar.
>>>
>>>                         -phi
>>>
>>>                         2011/6/22 Dennis Mehay <[email protected]
>>>                         <mailto:[email protected]>>:
>>>                         > Hi all,
>>>                         >
>>>                         > I posted this, but it bounced. My
>>>                         attachments were too big. I'm resending
>>>                         > without the larger attachment. Apologies
>>>                         for any duplicate posting.
>>>                         >
>>>                         > I'm running moses_chart to do some
>>>                         syntax-based MT experiments, and, during
>>>                         > tuning, I'm coming across some instances
>>>                         where the decoder can't produce a
>>>                         > translation (btw 32 and 38 in a 500
>>>                         sentence tuning set). This should not
>>>                         > be happening, so far as I can tell, since
>>>                         I have a glue grammar (where all
>>>                         > the nonterminals of the training set plus
>>>                         the [Q] nonterminal are accounted
>>>                         > for), and an 'unknown-lhs' list with the
>>>                         relative frequencies of all the
>>>                         > categories as they span only a single word
>>>                         in the training set (i.e., the
>>>                         > frequency of each category's spanning a
>>>                         single word in the rule table / the
>>>                         > total number of single-word instances in
>>>                         the rule table).
>>>                         >
>>>                         > Here is an example of a sentence that
>>>                         there was no translation for:
>>>                         >
>>>                         > ------------------------------
>>>                         >
>>>                         
>>> ---------------------------------------------------------
>>>                         > Translating: <s> 没有 规划 作 指导 , 就 可能 出现 谁 有 权 谁 说了算 , 谁 官 大 谁 说了算 . </s>
>>>                         > ...
>>>                         > Decoding:
>>>                         > Num of hypo = 84813 --- cells:
>>>                         > 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21
>>>                         > 1 100 77 93 83 99 99 100 100 85 99 43 85 3 99 85 18 100 85 3 14 1000
>>>                         > 40 960 278 717 916 857 976 276 396 952 958 150 0 0 919 74 402 802 0 0 12
>>>                         > 200 975 908 849 850 858 968 971 971 862 974 0 0 0 852 865 984 0 0 0
>>>                         > 200 940 849 889 763 715 990 962 979 905 0 0 0 0 864 984 0 0 0
>>>                         > 200 868 939 886 863 803 887 861 981 0 0 0 0 0 871 0 0 0
>>>                         > 200 828 910 801 838 796 722 870 0 0 0 0 0 0 0 0 0
>>>                         > 200 799 914 832 801 745 926 0 0 0 0 0 0 0 0 0
>>>                         > 200 756 819 901 693 692 0 0 0 0 0 0 0 0 0
>>>                         > 200 716 680 665 437 0 0 0 0 0 0 0 0 0
>>>                         > 200 683 527 929 0 0 0 0 0 0 0 0 0
>>>                         > 200 532 588 0 0 0 0 0 0 0 0 0
>>>                         > 200 580 0 0 0 0 0 0 0 0 0
>>>                         > 200 0 0 0 0 0 0 0 0 0
>>>                         > 0 0 0 0 0 0 0 0 0
>>>                         > 0 0 0 0 0 0 0 0
>>>                         > 0 0 0 0 0 0 0
>>>                         > 0 0 0 0 0 0
>>>                         > 0 0 0 0 0
>>>                         > 0 0 0 0
>>>                         > 0 0 0
>>>                         > 0 0
>>>                         > 0
>>>                         > NO BEST TRANSLATION
>>>                         >
>>>                         > Translation took 4.340 seconds
>>>                         >
>>>                         
>>> ---------------------------------------------------------------------------------------
>>>                         >
>>>                         > The ASCII-art chart's alignment may be a
>>>                         bit off, but, just eye-balling it,
>>>                         > it looks as if the 19th word (index 18)
>>>                         has a chart entry count above it,
>>>                         > but then this entry does not get combined
>>>                         with what's to the left using the
>>>                         > glue rules.
>>>                         >
>>>                         > Could this be a pruning or cutoff issue
>>>                         (i.e., stack size,
>>>                         > cube-pruning-pop-limit, maximum number of
>>>                         rules per span, etc.)? Or maybe
>>>                         > it has to do with the fact that my
>>>                         unknown-lhs file has *all* categories
>>>                         > that spanned a single word in the training
>>>                         set. Maybe I should prune it to
>>>                         > the top 10 or 20, or so. I'm really at a
>>>                         loss here. I thought the glue
>>>                         > grammar would make the decoder always
>>>                         return an answer, no matter how awful.
>>>                         >
>>>                         > Any insight?
>>>                         >
>>>                         > I have attached my moses.ini file in case
>>>                         anyone wants to have a look. I
>>>                         > can also send the glue rule file later,
>>>                         but, as I said, it seems to account
>>>                         > for all of the training set's categories
>>>                         (and it was produced automatically
>>>                         > using the -glue-grammar option).
>>>                         >
>>>                         > Best,
>>>                         > Dennis
>>>                         >
>>>                         _______________________________________________
>>>                         > Moses-support mailing list
>>>                         > [email protected]
>>>                         <mailto:[email protected]>
>>>                         >
>>>                         
>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>>                         >
>>>                         >
>>>
>>>
>>>
>>>
>>
>>
>
>
>
