Hi Vito,

Please git pull and try decoding again. I've just pushed a fix:
https://github.com/hieuhoang/mosesdecoder/commit/0005e98b2674906162ce7945c5edd6a42c9ca418

Basically, I've changed the behaviour of the pugi call so that it doesn't
unescape the &apos; words.

Hieu Hoang
http://www.hoang.co.uk/hieu

On 28 September 2016 at 14:33, Hieu Hoang <[email protected]> wrote:

> Ah, OK. Do you have a moses.ini and an example input sentence to go with that?
>
> pugixml.cpp is used to parse the input sentence for XML markup for
> placeholders, forced translation, etc. You shouldn't change the code for
> pugixml 'cos it's an imported library that we don't control and we may
> reimport it in future if there are new releases. The problem seems to be
> Moses2's use of the library, so it should be fixed in Moses2.
>
> Hieu Hoang
> http://www.hoang.co.uk/hieu
>
> On 28 September 2016 at 14:22, Vito Mandorino
> <vito.mandorino@linguacustodia.com> wrote:
>
>> We are able to replicate the issue with the probingPT version of this
>> phrase-table:
>>
>> ' ||| ' ||| 1 1 1 1 ||| 0-0 ||| 1 1 1 ||| |||
>> & ||| & ||| 1 1 1 1 ||| 0-0 ||| 1 1 1 ||| |||
>> > ||| > ||| 1 1 1 1 ||| 0-0 ||| 1 1 1 ||| |||
>> < ||| < ||| 1 1 1 1 ||| 0-0 ||| 1 1 1 ||| |||
>> " ||| " ||| 1 1 1 1 ||| 0-0 ||| 1 1 1 ||| |||
>> ||| ||| 1 1 1 1 ||| 0-0 ||| 1 1 1 ||| |||
>>   |||   ||| 1 1 1 1 ||| 0-0 ||| 1 1 1 ||| |||
>>
>> If we understand correctly, the origin of the issue is in the function
>> strconv_escape in ./contrib/moses2/pugixml.cpp, which replaces some of
>> these entities with the actual symbol. Commenting out that part seems to
>> fix the problem, but we wonder if this may cause any issues elsewhere, since
>> we don't know the purpose of the entity replacement.
>>
>> Best regards,
>> Vito
>>
>> 2016-09-28 11:19 GMT+02:00 Hieu Hoang <[email protected]>:
>>
>>> Can you make your model files available for download?
>>>
>>> Moses and Moses2 aren't guaranteed to give exactly the same answer.
>>> However, they should be the same quality overall.
>>>
>>> Hieu Hoang
>>> http://www.hoang.co.uk/hieu
>>>
>>> On 28 September 2016 at 09:53, Vito Mandorino <[email protected]> wrote:
>>>
>>>> Hi,
>>>>
>>>> we are testing moses2 and we find a decrease in quality which seems to
>>>> be related to apostrophes. For instance:
>>>>
>>>> Source segment 1:
>>>> mise à disposition des actionnaires des documents d' information
>>>> relatifs à la sicav
>>>>
>>>> MT Moses:
>>>> provision shareholders of the briefing material for the sicav
>>>>
>>>> MT Moses2:
>>>> provision of shareholders documents d' information concerning the fund
>>>>
>>>>
>>>> Source segment 2:
>>>> tout titre qui deviendrait spéculatif à la suite d' une
>>>> rétrogradation après son acquisition par le fonds ne sera pas liquidé , à
>>>> moins que le conseiller en investissement n' estime qu' il y va
>>>> de l' intérêt des actionnaires .
>>>>
>>>> MT Moses:
>>>> any security that would become speculative following a downgrading
>>>> after its takeover by the fund will not be liquidated , unless the
>>>> investment adviser believes it is in the interest of shareholders .
>>>>
>>>> MT Moses2:
>>>> any security that would become speculative following a possible
>>>> downgrade d' by the fund after its acquisition will not be liquidated ,
>>>> unless the investment advisor believes n' stake qu' l' interest of
>>>> shareholders .
>>>>
>>>> It is actually strange that the raw MT output contains the apostrophe
>>>> symbol instead of the &apos; entity. What could the reason be?
>>>>
>>>> Best regards,
>>>> Vito
>>>>
>>>>
>>>> --
>>>> M. Vito MANDORINO -- Chief Scientist
>>>>
>>>> The Translation Trustee
>>>>
>>>> 1, Place Charles de Gaulle, 78180 Montigny-le-Bretonneux
>>>>
>>>> Tel : +33 1 30 44 04 23 Mobile : +33 6 84 65 68 89
>>>>
>>>> Email : [email protected]
>>>>
>>>> Website : www.linguacustodia.finance <http://www.linguacustodia.com/>
>>>>
>>>> _______________________________________________
>>>> Moses-support mailing list
>>>> [email protected]
>>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>>>
>>>
>>
>
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
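
For readers of the archive, here is a minimal sketch of the failure mode discussed in this thread. It assumes the phrase table stores Moses-escaped tokens such as d&apos; and that the input parser either unescapes entities before lookup or leaves them alone. The toy table, the function translate and the flag unescape_input are hypothetical illustrations in Python, not Moses2 or pugixml code.

```python
from xml.sax.saxutils import unescape

# Hypothetical toy phrase table, keyed by Moses-escaped source tokens.
PHRASE_TABLE = {
    "d&apos;": "of",
    "information": "information",
}

# Extra entities to expand (unescape() only handles &amp; &lt; &gt; by default).
EXTRA_ENTITIES = {"&apos;": "'", "&quot;": '"'}


def translate(tokens, unescape_input):
    """Look up each token; tokens missing from the table pass through unchanged."""
    output = []
    for token in tokens:
        # Unescaping turns "d&apos;" into "d'", which no longer matches the table key.
        key = unescape(token, EXTRA_ENTITIES) if unescape_input else token
        output.append(PHRASE_TABLE.get(key, key))
    return " ".join(output)


source = "d&apos; information".split()
print(translate(source, unescape_input=True))   # d' information
print(translate(source, unescape_input=False))  # of information
```

With unescaping switched on, the apostrophe token misses the table and is copied to the output as an unknown word, which resembles the d' and n' fragments reported in the Moses2 output above. Leaving the entity intact restores the match.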
