You are right!
Here it is:
TRAINING_run-giza.1.STDERR:ERROR: reading vocabulary; 403704 !!!!!!!!!! 2
TRAINING_run-giza-inverse.1.STDERR:ERROR: reading vocabulary; 403704 !!!!!!!!!! 2

I see the problem now: there is a character (I guess a non-printing ASCII
control character) that produces this line in ar.vcb:
403704  2
where the character itself is missing.
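
A minimal sketch for locating entries like the one above. It assumes each
line of the .vcb file has the form "id token count" separated by single
spaces, and that a token made of whitespace or control characters is what
the vocabulary reader trips over. The function names are hypothetical and
not part of Moses or GIZA++.

```python
import unicodedata


def is_suspicious(token):
    """True if the token is empty or contains whitespace/control characters."""
    if token == "":
        return True
    return any(ch.isspace() or unicodedata.category(ch) == "Cc" for ch in token)


def malformed_vcb_lines(lines):
    """Yield (line_number, line) for vocabulary entries that look broken.

    Assumption: a well-formed entry is exactly "id token count",
    separated by single spaces.
    """
    for number, line in enumerate(lines, start=1):
        entry = line.rstrip("\n")
        fields = entry.split(" ")
        if len(fields) != 3 or is_suspicious(fields[1]):
            yield number, entry


# Example use (newline="\n" keeps a stray "\r" from being read as a line break):
# with open("ar.vcb", encoding="utf-8", newline="\n") as handle:
#     for number, entry in malformed_vcb_lines(handle):
#         print(number, repr(entry))
```

Printing with repr() makes the invisible character visible as an escape
sequence, so it can then be searched for in the training corpus.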

Thanks a lot
Marco

On Mon, Sep 5, 2011 at 3:11 PM, Philipp Koehn <[email protected]> wrote:

> Hi,
>
> if you do
>
> grep -i 'error' TRAINING_run-giza*.STDERR
>
> then you get the line that triggered experiment.perl
> to stop.
>
> -phi
>
> On Mon, Sep 5, 2011 at 2:05 PM, marco turchi <[email protected]>
> wrote:
> > Hi
> > both the digest files contain only this message:
> > error
> >
> > I can check in experiment.perl where it is printed out.
> >
> > Thanks a lot
> > Marco
> >
> > On Mon, Sep 5, 2011 at 3:01 PM, Philipp Koehn <[email protected]> wrote:
> >>
> >> Hi,
> >>
> >> can you look at the *.STDERR.digest files? They should
> >> contain some patterns that indicate errors that were found
> >> in the *.STDERR output.
> >>
> >> If the *.STDERR.digest are empty then indeed no errors were
> >> found, but then experiment.perl should not have stopped.
> >>
> >> -phi
> >>
> >>
> >> On Mon, Sep 5, 2011 at 9:21 AM, marco turchi <[email protected]>
> >> wrote:
> >> > Hi Philipp,
> >> > I have checked the STDERR log files, but there are no errors. GIZA++
> >> > generated both A3.final.gz files, and then experiment.perl crashed.
> >> >
> >> > Training data has been cleaned using the clean perl script
> >> >
> >> > Thanks a lot
> >> > Marco
> >> >
> >> > On Sat, Sep 3, 2011 at 2:54 AM, Philipp Koehn <[email protected]>
> >> > wrote:
> >> >>
> >> >> Hi,
> >> >>
> >> >> GIZA++ operates by default with a sentence length limit of 100,
> >> >> which may be a problem here.
> >> >>
> >> >> But you have to look at the steps/1/TRAINING_run-giza*.1.STDERR
> >> >> files to see what triggered experiment.perl to conclude that there is
> >> >> an error.
> >> >>
> >> >> -phi
> >> >>
> >> >> On Wed, Aug 31, 2011 at 8:47 AM, marco turchi <[email protected]> wrote:
> >> >> > Dear All,
> >> >> > I'm training a model using 6M en-ar parallel sentences. Mgiza produces
> >> >> > the two alignment files A3.final.gz, then the experimental environment
> >> >> > crashes, reporting only these messages:
> >> >> > number of steps doable or running: 2
> >> >> >         executing /nfs/staging/turchmo/WorkingFolder/TranslationModels/en-ar/Descr//steps/1/TRAINING_run-giza.1 via sh (1)
> >> >> >         executing /nfs/staging/turchmo/WorkingFolder/TranslationModels/en-ar/Descr//steps/1/TRAINING_run-giza-inverse.1 via sh (2)
> >> >> > step TRAINING:run-giza-inverse crashed
> >> >> > number of steps doable or running: 1
> >> >> > step TRAINING:run-giza crashed
> >> >> > number of steps doable or running: 0
> >> >> > .
> >> >> >
> >> >> > I checked:
> >> >> > 1) all the log files, but no error is reported;
> >> >> > 2) the data, looking for "|", but it is not there;
> >> >> > 3) the maximum length of a sentence, and it is less than 999;
> >> >> > 4) possible DOS characters;
> >> >> > 5) both training files were cleaned with the Moses script.
> >> >> > I have trained other models without any problems. I do not understand
> >> >> > what the problem is. Any advice?
> >> >> >
> >> >> > Thanks a lot
> >> >> > Marco
> >> >> >
> >> >> > _______________________________________________
> >> >> > Moses-support mailing list
> >> >> > [email protected]
> >> >> > http://mailman.mit.edu/mailman/listinfo/moses-support
> >> >> >
> >> >> >
> >> >
> >> >
> >
> >
>