Ah, what were the exact commands you used for the extraction and the decoding? Can you also provide the moses.ini file you're using?
You might have stumbled upon an STSG extraction algorithm. That will require telling the decoder that it's STSG rather than SCFG.

Hieu Hoang
http://www.hoang.co.uk/hieu

On 20 Apr 2016 10:57 am, "Annette Rios" <[email protected]> wrote:

> The training data is the right format, and the rule extraction works fine
> for most of it. There is a problem with this particular structure
> (coordinated preposition). The part of the tree that is relevant looks
> like this:
>
> <tree label="S"><tree label="AQ">asumidos</tree><tree label="cag"><tree label="sp"><tree label="SP">con</tree><tree label="sn"><tree label="NP">áfrica</tree></tree></tree><tree label="conj"><tree label="CC">y</tree></tree><tree label="SP">por</tree><tree label="sn"><tree label="NP">áfrica</tree></tree></tree>
>
> for which these rules are extracted (among others):
>
> [S [AQ asumidos] [cag [sp [SP con] [sn [NP áfrica]]] [conj [CC y]] [SP] [sn [NP áfrica]]]] ||| und [X][X] Afrika [X] ||| 0.0874939 0.69856 0.174988 0.364444 0.606531 ||| 3-0 4-1 5-2 ||| 4 2 2 ||| |||
>
> [S [AQ asumidos] [cag [sp [SP con] [sn [NP áfrica]]] [conj [CC y]] [SP] [sn [NP]]]] ||| und [X][X] [X][X] [X] ||| 0.00185172 0.838272 0.174988 0.865553 0.606531 ||| 3-0 4-1 5-2 ||| 189 2 2 ||| |||
>
> [S [AQ asumidos] [cag [sp [SP con] [sn [NP áfrica]]] [conj [CC y]] [SP] [sn]]] ||| und [X][X] [X][X] [X] ||| 0.00185172 0.838272 0.174988 0.865553 0.606531 ||| 3-0 4-1 5-2 ||| 189 2 2 ||| |||
>
>
> These rules give me the following error when reading the phrase table:
>
> Exception: moses/Phrase.cpp:214 in void
> Moses::Phrase::CreateFromString(Moses::FactorDirection, const
> std::vector<long unsigned int>&, const StringPiece&, Moses::Word**) threw
> util::Exception because `nextPos == string::npos'.
> Incorrect formatting of non-terminal. Should have 2 non-terms, eg. [X][X].
> Current string: [SP]
>
> Thanks for the help.
>
> On 04/20/2016 08:17 AM, Hieu Hoang wrote:
>
>> your training data should be in a format that Moses understands, eg.
>> <tree label="NP"> <tree label="DET"> the </tree> <tree label="NN"> cat </tree> </tree>
>> Currently, it looks like the training data is whatever came out of the
>> parser.
>>
>> The syntax tutorial has a bit more information
>> http://www.statmt.org/moses/?n=Moses.SyntaxTutorial
>>
>> On 18/04/2016 14:07, Annette Rios wrote:
>>
>>> Hi all
>>>
>>> I'm trying to build a tree-to-string system, and I get this error from
>>> moses_chart:
>>>
>>> Exception: moses/Phrase.cpp:214 in void
>>> Moses::Phrase::CreateFromString(Moses::FactorDirection, const
>>> std::vector<long unsigned int>&, const StringPiece&, Moses::Word**)
>>> threw util::Exception because `nextPos == string::npos'.
>>> Incorrect formatting of non-terminal. Should have 2 non-terms, eg.
>>> [X][X]. Current string: [SP]
>>>
>>> The corresponding lines in the phrase table look like this:
>>>
>>> [S [AQ asumidos] [cag [sp [SP con] [sn [NP áfrica]]] [conj [CC y]] [SP] [sn [NP áfrica]]]] ||| und [X][X] Afrika [X] ||| 0.0874939 0.69856 0.174988 0.364444 0.606531 ||| 3-0 4-1 5-2 ||| 4 2 2 ||| |||
>>> [S [AQ asumidos] [cag [sp [SP con] [sn [NP áfrica]]] [conj [CC y]] [SP] [sn [NP]]]] ||| und [X][X] [X][X] [X] ||| 0.00185172 0.838272 0.174988 0.865553 0.606531 ||| 3-0 4-1 5-2 ||| 189 2 2 ||| |||
>>> [S [AQ asumidos] [cag [sp [SP con] [sn [NP áfrica]]] [conj [CC y]] [SP] [sn]]] ||| und [X][X] [X][X] [X] ||| 0.00185172 0.838272 0.174988 0.865553 0.606531 ||| 3-0 4-1 5-2 ||| 189 2 2 ||| |||
>>>
>>>
>>> extracted from this parse:
>>>
>>> 4 asumidos asumido a AQ gen=m|num=p|postype=qualificative|eagles=AQ0MPP 3 S _ _
>>> 5 con con s SP postype=preposition|eagles=SPS00 8 sp _ _
>>> 6 áfrica áfrica n NP postype=proper||eagles=NP00000 5 sn _ _
>>> 7 y y c CC postype=coordinating|eagles=CC 8 conj _ _
>>> 8 por por s SP postype=preposition|eagles=SPS00 4 cag _ _
>>> 9 áfrica áfrica n NP postype=proper||eagles=NP00000 8 sn _ _
>>>
>>> converted to xml with conll2mosesxml.py:
>>>
>>> <tree label="S">
>>>   <tree label="AQ">asumidos</tree>
>>>   <tree label="cag">
>>>     <tree label="sp">
>>>       <tree label="SP">con</tree>
>>>       <tree label="sn">
>>>         <tree label="NP">áfrica</tree>
>>>       </tree>
>>>     </tree>
>>>     <tree label="conj">
>>>       <tree label="CC">y</tree>
>>>     </tree>
>>>     <tree label="SP">por</tree>
>>>     <tree label="sn">
>>>       <tree label="NP">áfrica</tree>
>>>     </tree>
>>>   </tree>
>>>
>>>
>>> Is there something wrong in my parse trees that causes this?
>>>
>>> Best regards
>>>
>>> Annette
>>>
>>> _______________________________________________
>>> Moses-support mailing list
>>> [email protected]
>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>>
>>
>>
>
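For anyone who lands on this thread with the same exception: a quick way to see how many rules are affected is to scan the phrase table for source-side tokens that are a lone bracketed label such as `[SP]`. The sketch below is only a diagnostic, not part of Moses. It assumes the table is plain text with `|||`-separated fields, that the last source token carries the rule's left-hand-side label, and that the decoder's check can be approximated by the regular expression shown, which is inferred from the error message alone. The names `bare_nonterminals` and `scan` are made up for illustration.

```python
import re

# A lone bracketed label such as "[SP]". A paired one such as "[X][X]"
# does not match, and neither do tree-fragment tokens like "[sn" or "y]]".
BARE_NT = re.compile(r"^\[[^\[\]\s]+\]$")


def bare_nonterminals(rule_line):
    """Return source-side tokens that look like a single bracketed label."""
    source_tokens = rule_line.split("|||")[0].split()
    # Assumption: the final source token holds the rule's left-hand side,
    # so it is skipped rather than reported.
    return [tok for tok in source_tokens[:-1] if BARE_NT.match(tok)]


def scan(lines):
    """Yield (line_number, offending_tokens) for every suspicious rule."""
    for lineno, line in enumerate(lines, start=1):
        bad = bare_nonterminals(line)
        if bad:
            yield lineno, bad
```

Passing an open file handle for the (uncompressed) phrase table to `scan` yields the line numbers of the rules to inspect. Each of the three rules quoted in this thread is reported with `[SP]` as the offending token.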
