Hi Vito,

Please git pull and try decoding again. I've just pushed a fix:

https://github.com/hieuhoang/mosesdecoder/commit/0005e98b2674906162ce7945c5edd6a42c9ca418

Basically, I've changed the behaviour of the pugixml call so that it
doesn't unescape the &apos; entities.
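
To illustrate why the unescaping matters, here is a minimal Python sketch.
It is not the Moses2 code: the toy phrase table and the translate() helper
are made up for illustration, and copying unknown tokens through unchanged
is an assumption about how the decoder treats out-of-vocabulary words.

```python
from html import unescape

# Hypothetical toy phrase table, keyed by escaped source tokens as they
# appear in tokenized training data (e.g. "d&apos;").
phrase_table = {
    "d&apos;": "of",
    "information": "information",
}


def translate(tokens, table):
    """Look each token up; copy unknown tokens through unchanged (assumed)."""
    return [table.get(tok, tok) for tok in tokens]


source = "d&apos; information".split()

# Input left escaped: every token matches a phrase-table key.
kept = translate(source, phrase_table)

# Input unescaped before lookup: "d&apos;" becomes "d'" and no longer matches.
unescaped = translate([unescape(tok) for tok in source], phrase_table)

print(kept)
print(unescaped)
```

In this sketch the unescaped token "d'" has no phrase-table entry and is
copied to the output, which would be consistent with the d' and n' seen in
the Moses2 output.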


Hieu Hoang
http://www.hoang.co.uk/hieu

On 28 September 2016 at 14:33, Hieu Hoang <[email protected]> wrote:

> Ah, OK. Do you have a moses.ini and an example input sentence to go with that?
>
> pugixml.cpp is used to parse the input sentence for XML markup for
> placeholders, forced translation, etc. You shouldn't change the pugixml
> code because it's an imported library that we don't control, and we may
> reimport it in future if there are new releases. The problem seems to be
> Moses2's use of the library, so it should be fixed in Moses2.
>
> Hieu Hoang
> http://www.hoang.co.uk/hieu
>
> On 28 September 2016 at 14:22, Vito Mandorino <vito.mandorino@linguacustodia.com> wrote:
>
>> We are able to replicate the issue with the probingPT version of this
>> phrase-table:
>>
>> &apos; ||| &apos; ||| 1 1 1 1 ||| 0-0 ||| 1 1 1 ||| |||
>> &amp; ||| &amp; ||| 1 1 1 1 ||| 0-0 ||| 1 1 1 ||| |||
>> &gt; ||| &gt; ||| 1 1 1 1 ||| 0-0 ||| 1 1 1 ||| |||
>> &lt; ||| &lt; ||| 1 1 1 1 ||| 0-0 ||| 1 1 1 ||| |||
>> &quot; ||| &quot; ||| 1 1 1 1 ||| 0-0 ||| 1 1 1 ||| |||
>> &nbsp; ||| &nbsp; ||| 1 1 1 1 ||| 0-0 ||| 1 1 1 ||| |||
>> &#160; ||| &#160; ||| 1 1 1 1 ||| 0-0 ||| 1 1 1 ||| |||
>>
>> If we understand correctly, the origin of the issue is the function
>> strconv_escape in ./contrib/moses2/pugixml.cpp, which replaces some of
>> these entities with the actual symbol. Commenting out that part seems to
>> fix the problem, but we wonder whether this may cause issues elsewhere,
>> since we don't know the purpose of the entity replacement.
>>
>> Best regards,
>> Vito
>>
>> 2016-09-28 11:19 GMT+02:00 Hieu Hoang <[email protected]>:
>>
>>> Can you make your model files available for download?
>>>
>>> Moses and Moses2 aren't guaranteed to give exactly the same answer.
>>> However, they should be of the same quality overall.
>>>
>>> Hieu Hoang
>>> http://www.hoang.co.uk/hieu
>>>
>>> On 28 September 2016 at 09:53, Vito Mandorino <
>>> [email protected]> wrote:
>>>
>>>> Hi,
>>>>
>>>> We are testing Moses2 and we find a decrease in quality which seems to
>>>> be related to apostrophes. For instance:
>>>>
>>>> Source segment 1:
>>>> mise à disposition des actionnaires des documents d&apos; information
>>>> relatifs à la sicav
>>>>
>>>> MT Moses:
>>>> provision shareholders of the briefing material for the sicav
>>>>
>>>> MT Moses2:
>>>> provision of shareholders documents d' information concerning the fund
>>>>
>>>>
>>>> Source segment 2:
>>>> tout titre qui deviendrait spéculatif à la suite d&apos; une
>>>> rétrogradation après son acquisition par le fonds ne sera pas liquidé , à
>>>> moins que le conseiller en investissement n&apos; estime qu&apos; il y va
>>>> de l&apos; intérêt des actionnaires .
>>>>
>>>> MT Moses:
>>>> any security that would become speculative following a downgrading
>>>> after its takeover by the fund will not be liquidated , unless the
>>>> investment adviser believes it is in the interest of shareholders .
>>>>
>>>> MT Moses2:
>>>> any security that would become speculative following a possible
>>>> downgrade d' by the fund after its acquisition will not be liquidated ,
>>>> unless the investment advisor believes n' stake qu' l' interest of
>>>> shareholders .
>>>>
>>>> It is actually strange that the raw MT output contains the apostrophe
>>>> symbol instead of the &apos; entity. What could the reason be?
>>>>
>>>> Best regards,
>>>> Vito
>>>>
>>>>
>>>> --
>>>> M. Vito MANDORINO -- Chief Scientist
>>>>
>>>> The Translation Trustee
>>>>
>>>> 1, Place Charles de Gaulle, 78180 Montigny-le-Bretonneux
>>>>
>>>> Tel : +33 1 30 44 04 23   Mobile : +33 6 84 65 68 89
>>>>
>>>> Email : [email protected]
>>>>
>>>> Website : www.linguacustodia.finance <http://www.linguacustodia.com/>
>>>>
>>>> _______________________________________________
>>>> Moses-support mailing list
>>>> [email protected]
>>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>>>
>>>>
>>>
>>
>
>