On 20 Apr 2016, at 10:29, Hieu Hoang <[email protected]> wrote:
Ah, what was the exact command you used to do the extraction and
the decoding? Can you also provide the moses.ini file you're using?
You might have stumbled upon an STSG extraction algorithm. That
will require telling the decoder that it's STSG rather than SCFG.
Hieu Hoang
http://www.hoang.co.uk/hieu
On 20 Apr 2016 10:57 am, "Annette Rios" <[email protected]> wrote:
The training data is the right format, and the rule extraction
works fine for most of it. There is a problem with this
particular structure (coordinated preposition). The part of
the tree that is relevant looks like this:
<tree label="S"><tree label="AQ">asumidos</tree><tree
label="cag"><tree label="sp"><tree label="SP">con</tree><tree
label="sn"><tree
label="NP">áfrica</tree></tree></tree><tree
label="conj"><tree label="CC">y</tree></tree><tree
label="SP">por</tree><tree label="sn"><tree
label="NP">áfrica</tree></tree></tree>
for which these rules are extracted (among others):
[S [AQ asumidos] [cag [sp [SP con] [sn [NP áfrica]]] [conj [CC
y]] [SP] [sn [NP áfrica]]]] ||| und [X][X] Afrika [X] |||
0.0874939 0.69856 0.174988 0.364444 0.606531 ||| 3-0 4-1 5-2
||| 4 2 2 ||| |||
[S [AQ asumidos] [cag [sp [SP con] [sn [NP áfrica]]] [conj [CC
y]] [SP] [sn [NP]]]] ||| und [X][X] [X][X] [X] ||| 0.00185172
0.838272 0.174988 0.865553 0.606531 ||| 3-0 4-1 5-2 ||| 189 2
2 ||| |||
[S [AQ asumidos] [cag [sp [SP con] [sn [NP áfrica]]] [conj [CC
y]] [SP] [sn]]] ||| und [X][X] [X][X] [X] ||| 0.00185172
0.838272 0.174988 0.865553 0.606531 ||| 3-0 4-1 5-2 ||| 189 2
2 ||| |||
These rules give me the following error when reading the
phrase table:
Exception: moses/Phrase.cpp:214 in void
Moses::Phrase::CreateFromString(Moses::FactorDirection, const
std::vector<long unsigned int>&, const StringPiece&,
Moses::Word**) threw util::Exception because `nextPos ==
string::npos'.
Incorrect formatting of non-terminal. Should have 2 non-terms,
eg. [X][X]. Current string: [SP]
Thanks for the help.
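
One way to locate every affected rule before loading the table is to scan the source side for tokens that look like the one in the error message. Below is a minimal Python sketch, assuming the rule table separates fields with `|||` and puts the source side first, as in the rules above. The function name `find_bare_nonterminals` and the regular expression are illustrative and not part of Moses; the check only mirrors the string the exception reports, not the decoder's actual parsing logic.

```python
import re

# A whitespace-delimited token that is exactly one bracketed label, e.g. "[SP]".
# This mirrors the string reported in the exception; it is a heuristic,
# not a reimplementation of Moses::Phrase::CreateFromString.
BARE_LABEL = re.compile(r"^\[[^\[\]\s]+\]$")


def find_bare_nonterminals(lines):
    """Yield (line_number, token) for each source-side token like "[SP]"."""
    for line_number, line in enumerate(lines, start=1):
        # Only the first field (the source side) is inspected.
        source_side = line.split("|||", 1)[0]
        for token in source_side.split():
            if BARE_LABEL.match(token):
                yield line_number, token


if __name__ == "__main__":
    import sys

    # Reads an uncompressed rule table from standard input.
    for line_number, token in find_bare_nonterminals(sys.stdin):
        print(f"line {line_number}: {token}")
```

On a plain hierarchical (SCFG) table the trailing left-hand-side label on the source side would also match, so this is only meaningful for tree-fragment source sides like the ones above.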
On 04/20/2016 08:17 AM, Hieu Hoang wrote:
your training data should be in a format that Moses
understands, e.g.
<tree label="NP"> <tree label="DET"> the </tree> <tree
label="NN"> cat </tree> </tree>
Currently, it looks like the training data is whatever
came out of the parser.
The syntax tutorial has a bit more information
http://www.statmt.org/moses/?n=Moses.SyntaxTutorial
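
As a quick sanity check on the input format described above, each training line can be parsed as XML and inspected. Below is a minimal Python sketch, assuming one sentence per line and that every element should be a `tree` with a `label` attribute, as in the example. The function name `check_tree_line` and the `<root>` wrapper are illustrative and not part of Moses.

```python
import xml.etree.ElementTree as ET


def check_tree_line(line):
    """Return a list of problems found in one line of tree-annotated text."""
    try:
        # Wrap in a dummy element so a line with several top-level
        # trees (or bare words) still parses as a single document.
        root = ET.fromstring("<root>" + line + "</root>")
    except ET.ParseError as err:
        return [f"not well-formed XML: {err}"]

    problems = []
    for element in root.iter():
        if element is root:
            continue  # skip the dummy wrapper
        if element.tag != "tree":
            problems.append(f"unexpected element <{element.tag}>")
        elif "label" not in element.attrib:
            problems.append("<tree> without a label attribute")
    return problems
```

An empty list means the line passed these checks; it does not guarantee that rule extraction will accept it. Words containing unescaped `&` or `<` will be reported as not well-formed.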
On 18/04/2016 14:07, Annette Rios wrote:
Hi all
I'm trying to build a tree-to-string system, and I get
this error from
moses_chart:
Exception: moses/Phrase.cpp:214 in void
Moses::Phrase::CreateFromString(Moses::FactorDirection, const
std::vector<long unsigned int>&, const StringPiece&,
Moses::Word**)
threw util::Exception because `nextPos == string::npos'.
Incorrect formatting of non-terminal. Should have 2
non-terms, eg.
[X][X]. Current string: [SP]
The corresponding lines in the phrase table look like
this:
[S [AQ asumidos] [cag [sp [SP con] [sn [NP áfrica]]]
[conj [CC y]] [SP]
[sn [NP áfrica]]]] ||| und [X][X] Afrika [X] |||
0.0874939 0.69856
0.174988 0.364444 0.606531 ||| 3-0 4-1 5-2 ||| 4 2 2
||| |||
[S [AQ asumidos] [cag [sp [SP con] [sn [NP áfrica]]]
[conj [CC y]] [SP]
[sn [NP]]]] ||| und [X][X] [X][X] [X] ||| 0.00185172
0.838272 0.174988
0.865553 0.606531 ||| 3-0 4-1 5-2 ||| 189 2 2 ||| |||
[S [AQ asumidos] [cag [sp [SP con] [sn [NP áfrica]]]
[conj [CC y]] [SP]
[sn]]] ||| und [X][X] [X][X] [X] ||| 0.00185172
0.838272 0.174988
0.865553 0.606531 ||| 3-0 4-1 5-2 ||| 189 2 2 ||| |||
extracted from this parse:
4 asumidos asumido a AQ gen=m|num=p|postype=qualificative|eagles=AQ0MPP 3 S _ _
5 con con s SP postype=preposition|eagles=SPS00 8 sp _ _
6 áfrica áfrica n NP postype=proper||eagles=NP00000 5 sn _ _
7 y y c CC postype=coordinating|eagles=CC 8 conj _ _
8 por por s SP postype=preposition|eagles=SPS00 4 cag _ _
9 áfrica áfrica n NP postype=proper||eagles=NP00000 8 sn _ _
converted to xml with conll2mosesxml.py:
<tree label="S">
<tree label="AQ">asumidos</tree>
<tree label="cag">
<tree label="sp">
<tree label="SP">con</tree>
<tree label="sn">
<tree label="NP">áfrica</tree>
</tree>
</tree>
<tree label="conj">
<tree label="CC">y</tree>
</tree>
<tree label="SP">por</tree>
<tree label="sn">
<tree label="NP">áfrica</tree>
</tree>
</tree>
Is there something wrong in my parse trees that causes
this?
Best regards
Annette
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support