Re: [Moses-support] using factored models

Hieu Hoang Sat, 18 Oct 2014 10:01:29 -0700

The moses.ini looks OK. Did you clean your training data? Did you tokenize
it with the Moses tokenizer? Did you do anything to your phrase-table?
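
If it helps to narrow down which entry stops the loader, here is a rough sketch of a sanity check. It assumes, without having confirmed it on this data, that the problem is a malformed line: either too few ` ||| `-separated fields, or a token whose factor count is wrong (for --translation-factors 0-0,1 the source should have 1 factor per token and the target 2, joined by `|`). A stray `|` left in untokenized text would trip this check. The function names `malformed_tokens` and `check_phrase_table` are made up for illustration and are not part of Moses.

```python
def malformed_tokens(text, n_factors):
    """Return the space-separated tokens in `text` that are not made of
    exactly `n_factors` non-empty factors joined by '|'."""
    bad = []
    for token in text.split():
        factors = token.split("|")
        if len(factors) != n_factors or any(f == "" for f in factors):
            bad.append(token)
    return bad


def check_phrase_table(lines, src_factors=1, tgt_factors=2):
    """Yield (line_number, reason) for each suspicious phrase-table line.

    Assumes the plain-text format 'source ||| target ||| scores ...'.
    """
    for number, line in enumerate(lines, start=1):
        fields = line.rstrip("\n").split(" ||| ")
        if len(fields) < 3:
            yield number, "fewer than 3 '|||'-separated fields"
            continue
        source, target = fields[0], fields[1]
        if not source.strip() or not target.strip():
            yield number, "empty source or target phrase"
            continue
        bad = (malformed_tokens(source, src_factors)
               + malformed_tokens(target, tgt_factors))
        if bad:
            yield number, "unexpected factor count in: " + " ".join(bad)
```

For a gzipped table you could pass `gzip.open(path, "rt", encoding="utf-8")` as `lines` and print whatever the generator yields. Since the crash happens at about 3% of loading, the first few reported line numbers are the ones worth looking at.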


On 18 October 2014 17:49, Mohammad Salameh <[email protected]> wrote:

> Hi Hieu
> Please find the moses.ini file attached
> the exact commands are:
>
>
>
> ####TRAIN TM
> $SCRIPTS_ROOTDIR/training/train-model.perl -root-dir $WORK
> -external-bin-dir $MGIZA_HOME -corpus  $WORK/corpus/trn.fil -f en -e ar
> -alignment grow-diag-final-and -max-phrase-length 8 --translation-factors
> 0-0,1 --alignment-factors 0-1 -reordering msd-bidirectional-fe -mgiza -lm
> 0:5:$WORK/lm/ar_surf.lm &>$WORK/training.out
>
> ####TUNE
> mkdir $WORK/tuning/mertA
> $SCRIPTS_ROOTDIR/training/mert-moses.pl $WORK/tuning/dev.en
> $WORK/tuning/dev.ar $MOSES $WORK/model/moses.ini --working-dir $WORK/tuning/mertA
> --mertdir $MOSES_HOME/bin  --decoder-flags "-threads 11 -max-phrase-length
> 8" --threads 11 &> $WORK/tuning/mertA/mert.out
>
>
> Thanks,
> Mohammad
>
> On Sat, Oct 18, 2014 at 6:20 AM, Hieu Hoang <[email protected]> wrote:
>
>> hi mohammad
>>
>>
>> On 17 October 2014 21:45, Mohammad Salameh <[email protected]> wrote:
>>
>>> Thanks Hieu,
>>> I want to exclude the <s> because I want to translate chunks of source
>>> sentences with one model, and then add them and their scores as an extra
>>> feature to the phrase table of a different model.
>>> So I don't want the sentence boundaries to be involved in the
>>> translation.
>>>
>> I understand. Moses doesn't allow you to exclude <s>. However, if you
>> don't want the score for this, then maybe you should write a feature
>> function to subtract it from the score, or modify an existing language
>> model so that it does not score <s>.
>>
>>>
>>> Also, I trained a factored system with --translation-factors 0-0,1.
>>> The training process ended successfully and I do not see any errors in the
>>> training.out file.
>>> But tuning and decoding both end with a segmentation fault while
>>> loading the phrase table, when loading reaches 3%.
>>> I have attached the mert.out.
>>> Would it be possible to know the reason, or which phrases in the phrase
>>> table are causing the loading to fail?
>>>
>> Can you also send the moses.ini file you used, and the EXACT command you
>> executed.
>>
>>
>>> Thanks,
>>> Salameh
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Fri, Oct 17, 2014 at 12:57 PM, Hieu Hoang <[email protected]>
>>> wrote:
>>>
>>>>  sorry, must have missed your email. Answers below
>>>>
>>>> On 16/10/14 20:21, Mohammad Salameh wrote:
>>>>
>>>> Hi,
>>>> Any answers to the above questions?
>>>> Thanks,
>>>> Salameh
>>>>
>>>> On Fri, Oct 10, 2014 at 10:11 AM, Mohammad Salameh <
>>>> [email protected]> wrote:
>>>>
>>>>> Hi
>>>>> I have a few questions on how the Moses system works.
>>>>>
>>>>>  1) Would it be possible to do a factored translation where factors
>>>>> appear in the output but are not part of the translation process? For
>>>>> example, I have English surface forms on the source side and Arabic
>>>>> surface forms and their stems on the target side. I want to translate
>>>>> from English surface forms to Arabic surface forms, but also see the
>>>>> stems accompanying the surface forms in the output.
>>>>>  I have tried setting --translation-factors 0-0, but only ended up
>>>>> with the Arabic surface forms in the output.
>>>>>
>>>>   I'm not sure what you mean by 'not be part of the translation
>>>> process'. If you want to see the stem in the output but you don't want it
>>>> in the translation table, then there needs to be some process that
>>>> generates the stem, given the target word. Moses has a crude solution: it
>>>> is called the generation step.
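
To make the idea of a generation step concrete, here is a toy sketch of what it does conceptually: learn a surface-to-stem table from the factored target side of the corpus, then attach the most probable stem to each output word. This is not Moses code. The function names, the transliterated tokens in the usage note, and the fallback of copying the surface form for unseen words are all assumptions made for illustration. In Moses itself, as far as I recall from the factored-model tutorial, this is set up in train-model.perl with --generation-factors 0-1 and --decoding-steps t0,g0 alongside --translation-factors 0-0; please check that against the documentation.

```python
from collections import Counter, defaultdict


def build_generation_table(factored_sentences):
    """Estimate P(stem | surface) by relative frequency from sentences
    whose tokens look like 'surface|stem'."""
    counts = defaultdict(Counter)
    for sentence in factored_sentences:
        for token in sentence.split():
            surface, stem = token.split("|")
            counts[surface][stem] += 1
    table = {}
    for surface, stems in counts.items():
        total = sum(stems.values())
        table[surface] = {stem: n / total for stem, n in stems.items()}
    return table


def generate(surface_sentence, table):
    """Attach the most probable stem to every surface word.

    Unseen words get their own surface form as the stem; that fallback is
    an arbitrary choice for this sketch.
    """
    out = []
    for surface in surface_sentence.split():
        stems = table.get(surface)
        stem = max(stems, key=stems.get) if stems else surface
        out.append(surface + "|" + stem)
    return " ".join(out)
```

For example, a table built from the toy lines "ktb|ktb ktbt|ktb" and "ktbt|ktb" maps the surface form ktbt to the stem ktb with probability 1.0, so generate("ktbt", table) gives "ktbt|ktb".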
>>>>
>>>>
>>>>>
>>>>>  2) When translating sentences with Moses, I assume that Moses adds
>>>>> the sentence boundary markers <s> </s> automatically. Would it be possible
>>>>> to exclude these from the translation? I need to get translation scores
>>>>> for chunks of input sentences that do not involve scores generated from
>>>>> <s> and </s> by the LM or phrase table.
>>>>>
>>>>   Yes, it includes <s> </s>. No, you can't exclude these from the
>>>> translation process.
>>>>
>>>> I'm curious to know why you want to exclude these.
>>>>
>>>>
>>>>>  3) I added additional phrases to the phrase table. Should the phrase
>>>>> table be sorted again, and is it enough to run "LC_ALL=C sort" on the PT
>>>>> for it to be used properly?
>>>>>
>>>>   Yes, it needs to be sorted again. You must also make sure that the
>>>> new phrases are not duplicates of existing phrases
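
A minimal sketch of both checks, in case it is useful. It assumes the plain-text phrase-table format with ` ||| ` between fields and treats two entries as duplicates when their source and target fields match. The function names are made up for illustration. Sorting on the raw UTF-8 bytes of each line is intended to give the same order as "LC_ALL=C sort".

```python
def c_locale_sort(lines):
    """Sort lines by their raw UTF-8 bytes (newline excluded), which is
    the byte-wise order that 'LC_ALL=C sort' uses."""
    return sorted(lines, key=lambda line: line.rstrip("\n").encode("utf-8"))


def duplicate_pairs(lines):
    """Return the set of (source, target) pairs that occur more than once."""
    seen = set()
    dupes = set()
    for line in lines:
        fields = line.rstrip("\n").split(" ||| ")
        if len(fields) < 2:
            continue  # not a phrase-table entry; ignore it here
        key = (fields[0].strip(), fields[1].strip())
        if key in seen:
            dupes.add(key)
        seen.add(key)
    return dupes
```

If duplicate_pairs returns anything after the new phrases are appended, those entries need to be merged or dropped before the table is sorted and used.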
>>>>
>>>>
>>>>>  Thanks
>>>>>
>>>>> _______________________________________________
>>>>> Moses-support mailing list
>>>>> [email protected]
>>>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>
>>
>> --
>> Hieu Hoang
>> Research Associate
>> University of Edinburgh
>> http://www.hoang.co.uk/hieu
>>
>>
>


-- 
Hieu Hoang
Research Associate
University of Edinburgh
http://www.hoang.co.uk/hieu

