We haven't checked the speedup against probingpt + minlexr yet. However, we
have found some further differences in the output compared to the standard
Moses decoder.

Sometimes the placeholders are replaced with the actual numbers in the
wrong order. For instance:

moses2 output: as of december 2012 , 31
moses output: as of december 31 , 2012

moses2 output: à jour au 2013 février 15
moses output: à jour au 15 février 2013

Is this the expected behavior?
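
To find every affected sentence, one option is to compare the two output
files line by line and flag the lines where both decoders produced the same
numbers in a different order. The following is only a rough sketch: it
assumes whitespace-tokenized output with one sentence per line, and the
function names are hypothetical, not part of Moses.

```python
import re

# A token counts as a number if it is made of digits, optionally with
# "." or "," separators inside it (e.g. "2012", "1,000", "3.5").
NUMBER = re.compile(r"^\d+([.,]\d+)*$")


def numbers_in(line):
    # Return the numeric tokens of a whitespace-tokenized line, in order.
    return [tok for tok in line.split() if NUMBER.match(tok)]


def reordered_numbers(moses_lines, moses2_lines):
    # Yield (index, moses_line, moses2_line) for each pair of lines that
    # contain the same numbers but in a different order.
    for i, (a, b) in enumerate(zip(moses_lines, moses2_lines)):
        na, nb = numbers_in(a), numbers_in(b)
        if na != nb and sorted(na) == sorted(nb):
            yield i, a.rstrip("\n"), b.rstrip("\n")
```

Both arguments can be open file objects or lists of strings. Lines that
differ for any other reason are not reported.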

Another minor difference is the handling of the carriage return character
("\r"). It seems to be deleted by standard Moses and converted into a
newline by Moses2.
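
A possible workaround until both decoders agree is to remove the character
from the input before decoding, which mirrors what standard Moses appears
to do. This is a minimal sketch; the function name is hypothetical and not
part of Moses.

```python
def strip_carriage_returns(text):
    # Delete every carriage return: a Windows line ending (CR LF) becomes
    # a plain newline, and a bare CR inside a line simply disappears.
    return text.replace("\r", "")
```

Applied to the whole input file before it reaches either decoder, this
should make the line count of both outputs identical.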

Best,
Vito

2016-10-07 17:24 GMT+02:00 Hieu Hoang <hieuho...@gmail.com>:

> yep, it should give you a big speedup compared to probingpt + minlexr model
>
> Hieu Hoang
> http://www.hoang.co.uk/hieu
>
> On 7 October 2016 at 16:21, Vito Mandorino <vito.mandorino@linguacustodia.com> wrote:
>
>> Yes I modified the line in the moses.ini . My comparison was with respect
>> to probingPT + minlexr reordering model (rather than .gz reordering model)
>>
>> 2016-10-07 16:25 GMT+02:00 Hieu Hoang <hieuho...@gmail.com>:
>>
>>> weird. it should be a massive speedup (~500%). You have to change the
>>> moses.ini file slightly
>>>
>>>   [feature]
>>>   LexicalReordering … path=reordering-table.msd-bidirectional-fe.0.5.0-0.gz
>>> to
>>>   [feature]
>>>   LexicalReordering … property-index=0
>>>
>>>
>>> Hieu Hoang
>>> http://www.hoang.co.uk/hieu
>>>
>>> On 7 October 2016 at 15:02, Vito Mandorino <
>>> vito.mandor...@linguacustodia.com> wrote:
>>>
>>>> Yes, that worked for me as well, thank you. There is a small
>>>> improvement in speed, but not much (about 5% faster using 30
>>>> threads).
>>>>
>>>> 2016-10-04 11:44 GMT+02:00 Hieu Hoang <hieuho...@gmail.com>:
>>>>
>>>>> yes - the script expects the files to be gzipped.
>>>>> It runs ok for me. I executed this:
>>>>>
>>>>>     MOSES_DIR=~/workspace/github/mosesdecoder.perf
>>>>>
>>>>>     $MOSES_DIR/scripts/generic/binarize4moses2.perl \
>>>>>         --phrase-table=phrase-table.gz \
>>>>>         --lex-ro=reordering-table.wbe-msd-bidirectional-fe.gz \
>>>>>         --output-dir=integrated_phrase-reordering/ --num-lex-scores=6
>>>>>
>>>>> Got this:
>>>>>
>>>>>     Executing: gzip -dc phrase-table.gz | /home/hieu/workspace/github/mosesdecoder.perf/scripts/generic/../../contrib/sigtest-filter/filter-pt -n 0 | gzip -c > ./tmp.14373/pt.gz
>>>>>     ...
>>>>>     Reading phrase table finished, writing remaining files to disk.
>>>>>
>>>>> $ ll integrated_phrase-reordering/
>>>>> total 24688
>>>>> drwxrwxr-x 2 hieu hieu     4096 Oct  4 10:38 ./
>>>>> drwxrwxr-x 5 hieu hieu     4096 Oct  4 10:42 ../
>>>>> -rw-rw-r-- 1 hieu hieu   917861 Oct  4 10:42 Alignments.dat
>>>>> -rw-rw-r-- 1 hieu hieu  2267885 Oct  4 10:42 cache
>>>>> -rw-rw-r-- 1 hieu hieu       76 Oct  4 10:42 config
>>>>> -rw-rw-r-- 1 hieu hieu  3146720 Oct  4 10:42 probing_hash.dat
>>>>> -rw-rw-r-- 1 hieu hieu   333856 Oct  4 10:42 source_vocabids
>>>>> -rw-rw-r-- 1 hieu hieu 18429920 Oct  4 10:42 TargetColl.dat
>>>>> -rw-rw-r-- 1 hieu hieu   121401 Oct  4 10:42 TargetVocab.dat
>>>>>
>>>>>
>>>>> On 04/10/2016 09:06, Vito Mandorino wrote:
>>>>>
>>>>> The command was
>>>>>
>>>>> perl /home/Moses/mosesdecoder/scripts/generic/binarize4moses2.perl
>>>>> --phrase-table=/home/vito/phrase-table.sorted
>>>>> --lex-ro=/home/vito/reordering-table.sorted
>>>>> --output-dir=/home/vito/integrated_phrase-reordering/
>>>>> --num-lex-scores=6
>>>>>
>>>>> The tables in the command are sorted with LC_ALL. I attach them in
>>>>> .gz format. Should the .gz format also be used in the command above?
>>>>>
>>>>> Vito
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> *M**. Vito MANDORINO -- Chief Scientist*
>>>>
>>>>
>>>>  *The Translation Trustee*
>>>>
>>>> *1, Place Charles de Gaulle, **78180 Montigny-le-Bretonneux*
>>>>
>>>> *Tel : +33 1 30 44 04 23   Mobile : +33 6 84 65 68 89*
>>>>
>>>> *Email :*  *vito.mandor...@linguacustodia.com
>>>> <massinissa.ah...@linguacustodia.com>*
>>>>
>>>> *Website :*
>>>> *www.linguacustodia.finance <http://www.linguacustodia.com/>*
>>>>
>>>
>>>
>>
>>
>>
>
>


_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
