Hey again
Problem is solved, thanks ;)
The moses.ini (created with the settings below) actually contained
[search-algorithm]
3
Setting it to 7 worked.
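In case it helps anyone hitting the same thing, here is a rough Python sketch for checking and patching that value in a moses.ini. It assumes the layout shown above (a `[search-algorithm]` header with the value on the next non-comment line); the function names are my own and not part of Moses.

```python
from typing import Optional


def get_search_algorithm(ini_text: str) -> Optional[int]:
    """Return the value listed under [search-algorithm], or None if absent."""
    lines = ini_text.splitlines()
    for idx, line in enumerate(lines):
        if line.strip() == "[search-algorithm]":
            for nxt in lines[idx + 1:]:
                s = nxt.strip()
                if not s or s.startswith("["):
                    break          # section ended without a value
                if s.startswith("#"):
                    continue       # skip comment lines
                return int(s)
    return None


def set_search_algorithm(ini_text: str, value: int = 7) -> str:
    """Return ini_text with the [search-algorithm] value replaced by `value`.

    If the section is missing, it is appended at the end.
    """
    lines = ini_text.splitlines()
    out = []
    i = 0
    found = False
    while i < len(lines):
        out.append(lines[i])
        if lines[i].strip() == "[search-algorithm]":
            found = True
            i += 1
            # Drop the old value line(s), keep any comments in the section.
            while (i < len(lines) and lines[i].strip()
                   and not lines[i].lstrip().startswith("[")):
                if lines[i].lstrip().startswith("#"):
                    out.append(lines[i])
                i += 1
            out.append(str(value))
            continue
        i += 1
    if not found:
        out += ["", "[search-algorithm]", str(value)]
    return "\n".join(out) + "\n"
```

Something like `set_search_algorithm(open("moses.ini").read())` would then give the patched text; writing it back to disk is left out on purpose so the original file is not overwritten by accident.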
Cheers
Annette
On 04/20/2016 03:18 PM, Annette Rios wrote:
I have used EMS, with these settings:
training-options = "-mgiza -mgiza-cpus 16 -sort-buffer-size 10G
-sort-compress gzip -cores 16 -alt-direct-rule-score-2 -score-command
score-stsg"
extract-settings = "--T2S --STSG --AllowUnary --MaxScope 1000
--MaxNodes 30 --MaxRuleDepth 7 --MaxRuleSize 7"
score-settings = " --GoodTuring --LowCountFeature
--MinCountHierarchical 2"
decoder-settings = "-search-algorithm 7 -feature-overwrite
'TranslationModel0 table-limit=200' -threads 8"
Am I missing something?
Cheers, Annette
On 04/20/2016 12:20 PM, Philip Williams wrote:
Yes, that sounds like the problem. For tree-to-string, you should
give the decoder the option -search-algorithm 7.
Phil
On 20 Apr 2016, at 10:29, Hieu Hoang <[email protected]> wrote:
Ah, what were the exact commands you used for the extraction and the
decoding? Can you also provide the moses.ini file you're using?
You might have stumbled upon an STSG extraction algorithm. That
requires telling the decoder that it's STSG rather than SCFG.
Hieu Hoang
http://www.hoang.co.uk/hieu
On 20 Apr 2016 10:57 am, "Annette Rios" <[email protected]> wrote:
The training data is the right format, and the rule extraction
works fine for most of it. There is a problem with this
particular structure (coordinated preposition). The part of the
tree that is relevant looks like this:
<tree label="S"><tree label="AQ">asumidos</tree><tree
label="cag"><tree label="sp"><tree label="SP">con</tree><tree
label="sn"><tree
label="NP">áfrica</tree></tree></tree><tree
label="conj"><tree label="CC">y</tree></tree><tree
label="SP">por</tree><tree label="sn"><tree
label="NP">áfrica</tree></tree></tree>
for which these rules are extracted (among others):
[S [AQ asumidos] [cag [sp [SP con] [sn [NP áfrica]]] [conj [CC y]] [SP] [sn [NP áfrica]]]] ||| und [X][X] Afrika [X] ||| 0.0874939 0.69856 0.174988 0.364444 0.606531 ||| 3-0 4-1 5-2 ||| 4 2 2 ||| |||
[S [AQ asumidos] [cag [sp [SP con] [sn [NP áfrica]]] [conj [CC y]] [SP] [sn [NP]]]] ||| und [X][X] [X][X] [X] ||| 0.00185172 0.838272 0.174988 0.865553 0.606531 ||| 3-0 4-1 5-2 ||| 189 2 2 ||| |||
[S [AQ asumidos] [cag [sp [SP con] [sn [NP áfrica]]] [conj [CC y]] [SP] [sn]]] ||| und [X][X] [X][X] [X] ||| 0.00185172 0.838272 0.174988 0.865553 0.606531 ||| 3-0 4-1 5-2 ||| 189 2 2 ||| |||
These rules give me the following error when reading the phrase
table:
Exception: moses/Phrase.cpp:214 in void
Moses::Phrase::CreateFromString(Moses::FactorDirection, const
std::vector<long unsigned int>&, const StringPiece&,
Moses::Word**) threw util::Exception because `nextPos ==
string::npos'.
Incorrect formatting of non-terminal. Should have 2 non-terms,
eg. [X][X]. Current string: [SP]
Thanks for the help.
On 04/20/2016 08:17 AM, Hieu Hoang wrote:
Your training data should be in a format that Moses
understands, e.g.
<tree label="NP"> <tree label="DET"> the </tree> <tree
label="NN"> cat </tree> </tree>
Currently, it looks like the training data is whatever came
out of the parser.
The syntax tutorial has a bit more information
http://www.statmt.org/moses/?n=Moses.SyntaxTutorial
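As a rough sanity check for that format, something like the following Python sketch counts whether every `<tree label="...">` has a matching `</tree>` on a line of training data. It only looks at tag balance and assumes the tags are written exactly as in the example above; the function name is just for illustration.

```python
import re

# Matches either an opening <tree label="..."> tag or a closing </tree> tag.
TAG = re.compile(r'<tree\s+label="[^"]*"\s*>|</tree>')


def unclosed_tree_tags(markup: str) -> int:
    """Return how many <tree> elements are still open at the end of `markup`.

    Raises ValueError if a </tree> appears with nothing left to close.
    """
    depth = 0
    for match in TAG.finditer(markup):
        if match.group(0) == "</tree>":
            if depth == 0:
                raise ValueError("closing </tree> without a matching open tag")
            depth -= 1
        else:
            depth += 1
    return depth


example = ('<tree label="NP"> <tree label="DET"> the </tree> '
           '<tree label="NN"> cat </tree> </tree>')
print(unclosed_tree_tags(example))  # 0
```

A result of 0 means the line is balanced; it says nothing about whether the labels or the tree shape make sense for rule extraction.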
On 18/04/2016 14:07, Annette Rios wrote:
Hi all
I'm trying to build a tree-to-string system, and I get this error from
moses_chart:
Exception: moses/Phrase.cpp:214 in void
Moses::Phrase::CreateFromString(Moses::FactorDirection, const
std::vector<long unsigned int>&, const StringPiece&, Moses::Word**)
threw util::Exception because `nextPos == string::npos'.
Incorrect formatting of non-terminal. Should have 2 non-terms, eg.
[X][X]. Current string: [SP]
The corresponding lines in the phrase table look like this:
[S [AQ asumidos] [cag [sp [SP con] [sn [NP áfrica]]] [conj [CC y]] [SP] [sn [NP áfrica]]]] ||| und [X][X] Afrika [X] ||| 0.0874939 0.69856 0.174988 0.364444 0.606531 ||| 3-0 4-1 5-2 ||| 4 2 2 ||| |||
[S [AQ asumidos] [cag [sp [SP con] [sn [NP áfrica]]] [conj [CC y]] [SP] [sn [NP]]]] ||| und [X][X] [X][X] [X] ||| 0.00185172 0.838272 0.174988 0.865553 0.606531 ||| 3-0 4-1 5-2 ||| 189 2 2 ||| |||
[S [AQ asumidos] [cag [sp [SP con] [sn [NP áfrica]]] [conj [CC y]] [SP] [sn]]] ||| und [X][X] [X][X] [X] ||| 0.00185172 0.838272 0.174988 0.865553 0.606531 ||| 3-0 4-1 5-2 ||| 189 2 2 ||| |||
extracted from this parse:
4 asumidos asumido a AQ gen=m|num=p|postype=qualificative|eagles=AQ0MPP 3 S _ _
5 con con s SP postype=preposition|eagles=SPS00 8 sp _ _
6 áfrica áfrica n NP postype=proper||eagles=NP00000 5 sn _ _
7 y y c CC postype=coordinating|eagles=CC 8 conj _ _
8 por por s SP postype=preposition|eagles=SPS00 4 cag _ _
9 áfrica áfrica n NP postype=proper||eagles=NP00000 8 sn _ _
converted to xml with conll2mosesxml.py:
<tree label="S">
<tree label="AQ">asumidos</tree>
<tree label="cag">
<tree label="sp">
<tree label="SP">con</tree>
<tree label="sn">
<tree label="NP">áfrica</tree>
</tree>
</tree>
<tree label="conj">
<tree label="CC">y</tree>
</tree>
<tree label="SP">por</tree>
<tree label="sn">
<tree label="NP">áfrica</tree>
</tree>
</tree>
Is there something wrong in my parse trees that causes this?
Best regards
Annette
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support