The train-model.perl script from Beta 0.91, configured for MGIZA++, failed at step 2.1b (the reverse direction) with the error below. I suspect the corpus was not cleaned adequately. Can anyone confirm this or suggest another cause? Thanks.
m5p0 = -1 (fixed value for parameter p_0 in IBM-5 (if negative then it is determined in training))
manlexfactor1 = 0 ()
manlexfactor2 = 0 ()
manlexmaxmultiplicity = 20 ()
maxfertility = 10 (maximal fertility for fertility models)
ncpus = 1 (Number of threads to be executed, use 0 if you just want all CPUs to be used)
p0 = 0.999 (fixed value for parameter p_0 in IBM-3/4 (if negative then it is determined in training))
pegging = 0 (0: no pegging; 1: do pegging)
reading vocabulary files
Reading vocabulary file from:/opt/domy/TRAININGS/alignments/align-dell2_full-en-es/giza.classes/en.vcb
Reading vocabulary file from:/opt/domy/TRAININGS/alignments/align-dell2_full-en-es/giza.classes/es.vcb
Source vocabulary list has 85970 unique tokens
Target vocabulary list has 84643 unique tokens
Calculating vocabulary frequencies from corpus /opt/domy/TRAININGS/alignments/align-dell2_full-en-es/giza.classes/es-en-int-train.snt
Reading more sentence pairs into memory ...
ERROR: target word 118049 is not in the vocabulary list
Exit code: 255
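
One way to narrow this down is to check every word ID in the .snt file against the two vocabulary files. The script below is only a rough sketch, not part of Moses or MGIZA++. It assumes the usual GIZA++ layouts: each .vcb line is "id word count", and each sentence pair in the .snt file takes three lines (a count, then the two sentences as space-separated IDs). The function names are made up for illustration, and which .vcb file belongs to which sentence line in the reverse direction is something to verify against your own files.

```python
def read_vocab_ids(path):
    """Collect the integer word IDs from a .vcb file (assumed 'id word count' per line)."""
    ids = set()
    with open(path, encoding="utf-8", errors="replace") as f:
        for line in f:
            fields = line.split()
            if fields:
                ids.add(int(fields[0]))
    return ids


def find_unknown_ids(snt_path, first_vocab, second_vocab):
    """Return (line_number, side, word_id) for each ID in the .snt file
    that is missing from the vocabulary assumed to cover that line."""
    problems = []
    with open(snt_path, encoding="utf-8", errors="replace") as f:
        for line_no, line in enumerate(f, start=1):
            # Assumed layout: 0 = pair count, 1 = first sentence, 2 = second sentence
            position = (line_no - 1) % 3
            if position == 0:
                continue
            vocab = first_vocab if position == 1 else second_vocab
            side = "first" if position == 1 else "second"
            for token in line.split():
                word_id = int(token)
                if word_id not in vocab:
                    problems.append((line_no, side, word_id))
    return problems
```

Calling find_unknown_ids with the es-en-int-train.snt path and the ID sets read from the two .vcb files in the log should list each line where an ID such as 118049 appears, which may help tell a cleaning problem apart from .snt and .vcb files that were generated in different runs.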
