The moses.ini looks OK. Did you clean your training data? Did you tokenize it with the Moses tokenizer? Did you do anything to your phrase table?
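To narrow down which phrase-table entries might be upsetting the loader, a rough sanity check along these lines may help. This is only a sketch, not part of Moses: the names `open_table`, `check_phrase_table` and `report` are made up for illustration, and it assumes the usual `source ||| target ||| scores ||| ...` text layout with target tokens written as `surface|stem` (two factors, as in your 0-0,1 setup). It flags short lines, target tokens with the wrong number of factors (for example an unescaped `|` in the data), non-numeric or inconsistent score counts, lines that are out of byte order, and adjacent duplicates.

```python
import gzip


def open_table(path):
    """Open a plain-text or gzip-compressed phrase table for reading."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt", encoding="utf-8", errors="replace")
    return open(path, encoding="utf-8", errors="replace")


def check_phrase_table(lines, target_factors=2):
    """Yield (line_number, message) for each suspicious phrase-table line.

    Assumes 'source ||| target ||| scores ||| ...' and that every target
    token carries `target_factors` factors separated by '|'.
    """
    expected_scores = None
    prev_line = None
    prev_key = None
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        fields = [f.strip() for f in line.split("|||")]
        if len(fields) < 3:
            yield number, "fewer than 3 '|||'-separated fields"
            continue
        source, target, scores = fields[0], fields[1], fields[2].split()
        if not source or not target:
            yield number, "empty source or target phrase"
        # Every target token should have exactly the expected, non-empty factors.
        for token in target.split():
            parts = token.split("|")
            if len(parts) != target_factors or "" in parts:
                yield number, "target token %r does not have %d factors" % (
                    token, target_factors)
                break
        # Scores must be numeric and equally many on every line.
        try:
            [float(s) for s in scores]
        except ValueError:
            yield number, "non-numeric score"
        if expected_scores is None:
            expected_scores = len(scores)
        elif len(scores) != expected_scores:
            yield number, "expected %d scores, found %d" % (
                expected_scores, len(scores))
        # For valid UTF-8, str comparison matches the byte order that
        # 'LC_ALL=C sort' produces.
        if prev_line is not None and line < prev_line:
            yield number, "out of order relative to the previous line"
        # Only adjacent duplicates are caught, so sort the table first.
        key = (source, target)
        if key == prev_key:
            yield number, "duplicate of the previous source/target pair"
        prev_line, prev_key = line, key


def report(path, target_factors=2, limit=20):
    """Print the first `limit` problems found in the table at `path`."""
    with open_table(path) as handle:
        for count, (number, message) in enumerate(
                check_phrase_table(handle, target_factors)):
            if count >= limit:
                break
            print("line %d: %s" % (number, message))
```

Calling `report()` on whatever phrase-table path your moses.ini points at prints the first few suspect lines. A clean report does not prove the table is fine, it only rules out the obvious formatting problems.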

On 18 October 2014 17:49, Mohammad Salameh <[email protected]> wrote:

> Hi Hieu
> Please find the moses.ini file attached
> the exact commands are:
>
> ####TRAIN TM
> $SCRIPTS_ROOTDIR/training/train-model.perl -root-dir $WORK
> -external-bin-dir $MGIZA_HOME -corpus $WORK/corpus/trn.fil -f en -e ar
> -alignment grow-diag-final-and -max-phrase-length 8 --translation-factors
> 0-0,1 --alignment-factors 0-1 -reordering msd-bidirectional-fe -mgiza -lm
> 0:5:$WORK/lm/ar_surf.lm &>$WORK/training.out
>
> ####TUNE
> mkdir $WORK/tuning/mertA
> $SCRIPTS_ROOTDIR/training/mert-moses.pl $WORK/tuning/dev.en
> $WORK/tuning/dev.ar $MOSES $WORK/model/moses.ini --working-dir
> $WORK/tuning/mertA --mertdir $MOSES_HOME/bin --decoder-flags "-threads 11
> -max-phrase-length 8" --threads 11 &> $WORK/tuning/mertA/mert.out
>
> Thanks,
> Mohammad
>
> On Sat, Oct 18, 2014 at 6:20 AM, Hieu Hoang <[email protected]> wrote:
>
>> hi mohammad
>>
>> On 17 October 2014 21:45, Mohammad Salameh <[email protected]> wrote:
>>
>>> Thanks Hieu,
>>> I want to exclude the <s> because I want to translate chunks of source
>>> sentences with one model, and then add them and their score as an extra
>>> feature to a phrase table of a different model.
>>> So I don't want the sentence boundaries to be involved in the
>>> translation.
>>>
>> I understand. Moses doesn't allow you to exclude <s>. However, if you
>> don't want the score for this, then maybe you should write a feature
>> function to subtract it from the score. Or modify an existing language
>> model to not score <s>
>>
>>> Also, I trained a factored system with --translation-factors 0-0,1.
>>> The training process ended successfully and I do not see any error in the
>>> training.out file.
>>> But tuning and decoding both end with a Segmentation Fault error
>>> while loading the phrase table, when loading reaches 3%.
>>> I have attached the mert.out.
>>> Would it be possible to know the reason, or which phrases in the phrase
>>> table are causing the interruption in loading?
>>>
>> Can you also send the moses.ini file you used, and the EXACT command you
>> executed.
>>
>>> Thanks,
>>> Salameh
>>>
>>> On Fri, Oct 17, 2014 at 12:57 PM, Hieu Hoang <[email protected]> wrote:
>>>
>>>> sorry, must have missed your email. Answers below
>>>>
>>>> On 16/10/14 20:21, Mohammad Salameh wrote:
>>>>
>>>> Hi,
>>>> any answer to the above questions?
>>>> Thanks,
>>>> Salameh
>>>>
>>>> On Fri, Oct 10, 2014 at 10:11 AM, Mohammad Salameh <[email protected]> wrote:
>>>>
>>>>> Hi
>>>>> I have a few questions on how the Moses system works
>>>>>
>>>>> 1) would it be possible to do a factored translation where factors
>>>>> appear in the output but are not part of the translation process? For
>>>>> example, I have English surface forms on the source side and Arabic
>>>>> surface forms and their stems on the target side. I want to translate
>>>>> from English surface form to Arabic surface, but also see the stems
>>>>> accompanying the surface forms in the output.
>>>>> I have tried setting --translation-factors 0-0 , but only ended up
>>>>> with the Arabic surface forms in the output.
>>>>>
>>>> I'm not sure what you mean by 'not be part of the translation
>>>> process'. If you want to see the stem in the output but you don't want it
>>>> in the translation table, then there needs to be some process that
>>>> generates the stem, given the target word. Moses has a crude solution - it
>>>> is called the generation step.
>>>>
>>>>> 2) when translating sentences with moses , I assume that moses adds
>>>>> the sentence boundary markers <s> </s> automatically. Would it be possible
>>>>> to exclude these from the translation? I need to get translation scores
>>>>> for chunks of input sentences that do not involve scores generated based
>>>>> on <s> and </s> from the LM or phrase table.
>>>>>
>>>> Yes, it includes <s> </s>. No, you can't exclude these from the
>>>> translation process.
>>>>
>>>> I'm curious to know why you want to exclude these
>>>>
>>>>> 3) I added additional phrases to the phrase table. Should the phrase
>>>>> table be sorted again and is it enough to do "LC_ALL=C sort " on the PT
>>>>> for it to be used properly ?
>>>>>
>>>> Yes, it needs to be sorted again. You must also make sure that the
>>>> new phrases are not duplicates of existing phrases
>>>>
>>>>> Thanks
>>>>>
>>
>> --
>> Hieu Hoang
>> Research Associate
>> University of Edinburgh
>> http://www.hoang.co.uk/hieu

--
Hieu Hoang
Research Associate
University of Edinburgh
http://www.hoang.co.uk/hieu
_______________________________________________ Moses-support mailing list [email protected] http://mailman.mit.edu/mailman/listinfo/moses-support
