Ah, what were the exact commands you used for the extraction and the
decoding? Can you also provide the moses.ini file you're using?

You might have stumbled upon an STSG extraction algorithm. That requires
telling the decoder that the rule table is STSG rather than SCFG.
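
To check which kind of rule table the extraction produced, a rough Python
sketch like the one below may help. It is only a heuristic, not anything
that ships with Moses: the function names are made up for illustration, and
the bracket test is inferred from the error message quoted below, not taken
from the loader's source. It treats a source side that opens with "[LABEL "
as a tree fragment and lists the tokens shaped like a single bracketed
label, which is the shape the error complains about.

```python
def split_fields(line):
    """Split a rule-table line on the '|||' field separator."""
    return [field.strip() for field in line.split("|||")]


def looks_like_tree_fragment(source):
    """Heuristic: a tree-fragment source side opens with '[LABEL' followed
    by more material, e.g. '[S [AQ asumidos] ...'. A flat source side
    starts with a word or a paired label such as '[X][X]'."""
    tokens = source.split()
    if not tokens:
        return False
    first = tokens[0]
    return first.startswith("[") and "]" not in first


def bare_labels(source):
    """Tokens such as '[SP]' that open and close with brackets but contain
    no '][' pair. Assumption: these are what a reader expecting '[X][X]'
    rejects."""
    return [tok for tok in source.split()
            if tok.startswith("[") and tok.endswith("]") and "][" not in tok]


def inspect_rule(line):
    """Return (kind, suspicious_tokens) for one rule-table line."""
    source = split_fields(line)[0]
    if looks_like_tree_fragment(source):
        return ("tree-fragment", bare_labels(source))
    # In a flat table the check is skipped, since this sketch does not
    # model the flat format beyond recognising it.
    return ("flat", [])


if __name__ == "__main__":
    # First rule from the report, with the score fields shortened.
    rule = ("[S [AQ asumidos] [cag [sp [SP con] [sn [NP áfrica]]] "
            "[conj [CC y]] [SP] [sn [NP áfrica]]]] "
            "||| und [X][X] Afrika [X] ||| 0.0874939 0.69856 "
            "||| 3-0 4-1 5-2 ||| 4 2 2 ||| |||")
    print(inspect_rule(rule))
```

If most lines come back as "tree-fragment", the table is probably not in the
flat format the decoder is currently being told to expect.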

Hieu Hoang
http://www.hoang.co.uk/hieu
On 20 Apr 2016 10:57 am, "Annette Rios" <[email protected]> wrote:

> The training data is the right format, and the rule extraction works fine
> for most of it. There is a problem with this particular structure
> (coordinated preposition). The part of  the tree that is relevant looks
> like this:
>
> <tree label="S"><tree label="AQ">asumidos</tree><tree label="cag"><tree
> label="sp"><tree label="SP">con</tree><tree label="sn"><tree
> label="NP">&#xE1;frica</tree></tree></tree><tree label="conj"><tree
> label="CC">y</tree></tree><tree label="SP">por</tree><tree label="sn"><tree
> label="NP">&#xE1;frica</tree></tree></tree>
>
> for which these rules are extracted (among others):
>
> [S [AQ asumidos] [cag [sp [SP con] [sn [NP áfrica]]] [conj [CC y]] [SP]
> [sn [NP áfrica]]]] ||| und [X][X] Afrika [X] ||| 0.0874939 0.69856 0.174988
> 0.364444 0.606531 ||| 3-0 4-1 5-2 ||| 4 2 2 ||| |||
>
> [S [AQ asumidos] [cag [sp [SP con] [sn [NP áfrica]]] [conj [CC y]] [SP]
> [sn [NP]]]] ||| und [X][X] [X][X] [X] ||| 0.00185172 0.838272 0.174988
> 0.865553 0.606531 ||| 3-0 4-1 5-2 ||| 189 2 2 ||| |||
>
> [S [AQ asumidos] [cag [sp [SP con] [sn [NP áfrica]]] [conj [CC y]] [SP]
> [sn]]] ||| und [X][X] [X][X] [X] ||| 0.00185172 0.838272 0.174988 0.865553
> 0.606531 ||| 3-0 4-1 5-2 ||| 189 2 2 ||| |||
>
>
> These rules give me the following error when reading the phrase table:
>
> Exception: moses/Phrase.cpp:214 in void
> Moses::Phrase::CreateFromString(Moses::FactorDirection, const
> std::vector<long unsigned int>&, const StringPiece&, Moses::Word**) threw
> util::Exception because `nextPos == string::npos'.
> Incorrect formatting of non-terminal. Should have 2 non-terms, eg. [X][X].
> Current string: [SP]
>
> Thanks for the help.
>
> On 04/20/2016 08:17 AM, Hieu Hoang wrote:
>
>> Your training data should be in a format that Moses understands, e.g.
>>     <tree label="NP"> <tree label="DET"> the </tree> <tree label="NN">
>> cat </tree> </tree>
>> Currently, it looks like the training data is whatever came out of the
>> parser.
>>
>> The syntax tutorial has a bit more information
>>    http://www.statmt.org/moses/?n=Moses.SyntaxTutorial
>>
>> On 18/04/2016 14:07, Annette Rios wrote:
>>
>>> Hi all
>>>
>>> I'm trying to build a tree-to-string system, and I get this error from
>>> moses_chart:
>>>
>>> Exception: moses/Phrase.cpp:214 in void
>>> Moses::Phrase::CreateFromString(Moses::FactorDirection, const
>>> std::vector<long unsigned int>&, const StringPiece&, Moses::Word**)
>>> threw util::Exception because `nextPos == string::npos'.
>>> Incorrect formatting of non-terminal. Should have 2 non-terms, eg.
>>> [X][X]. Current string: [SP]
>>>
>>> The corresponding lines in the phrase table look like this:
>>>
>>> [S [AQ asumidos] [cag [sp [SP con] [sn [NP áfrica]]] [conj [CC y]] [SP]
>>> [sn [NP áfrica]]]] ||| und [X][X] Afrika [X] ||| 0.0874939 0.69856
>>> 0.174988 0.364444 0.606531 ||| 3-0 4-1 5-2 ||| 4 2 2 ||| |||
>>> [S [AQ asumidos] [cag [sp [SP con] [sn [NP áfrica]]] [conj [CC y]] [SP]
>>> [sn [NP]]]] ||| und [X][X] [X][X] [X] ||| 0.00185172 0.838272 0.174988
>>> 0.865553 0.606531 ||| 3-0 4-1 5-2 ||| 189 2 2 ||| |||
>>> [S [AQ asumidos] [cag [sp [SP con] [sn [NP áfrica]]] [conj [CC y]] [SP]
>>> [sn]]] ||| und [X][X] [X][X] [X] ||| 0.00185172 0.838272 0.174988
>>> 0.865553 0.606531 ||| 3-0 4-1 5-2 ||| 189 2 2 ||| |||
>>>
>>>
>>> extracted from this parse:
>>>
>>> 4    asumidos    asumido    a    AQ
>>>    gen=m|num=p|postype=qualificative|eagles=AQ0MPP    3    S _    _
>>> 5    con    con    s    SP    postype=preposition|eagles=SPS00 8
>>>    sp    _    _
>>> 6    áfrica    áfrica    n    NP postype=proper||eagles=NP00000  5
>>>    sn    _    _
>>> 7    y    y    c    CC    postype=coordinating|eagles=CC    8 conj
>>>    _    _
>>> 8    por    por    s    SP    postype=preposition|eagles=SPS00 4
>>>    cag    _    _
>>> 9    áfrica    áfrica    n    NP postype=proper||eagles=NP00000  8
>>>    sn    _    _
>>>
>>> converted to xml with conll2mosesxml.py:
>>>
>>>             <tree label="S">
>>>               <tree label="AQ">asumidos</tree>
>>>               <tree label="cag">
>>>                 <tree label="sp">
>>>                   <tree label="SP">con</tree>
>>>                   <tree label="sn">
>>>                     <tree label="NP">&#xE1;frica</tree>
>>>                   </tree>
>>>                 </tree>
>>>                 <tree label="conj">
>>>                   <tree label="CC">y</tree>
>>>                 </tree>
>>>                 <tree label="SP">por</tree>
>>>                 <tree label="sn">
>>>                   <tree label="NP">&#xE1;frica</tree>
>>>                 </tree>
>>>               </tree>
>>>
>>>
>>> Is there something wrong in my parse trees that causes this?
>>>
>>> Best regards
>>>
>>> Annette
>>>
>>> _______________________________________________
>>> Moses-support mailing list
>>> [email protected]
>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>>
>>
>>
>
