This is as far as I've got. If possible, I would appreciate it if we could
move the conversation over to the Jira ticket:
https://issues.apache.org/jira/browse/JOSHUA-304
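
In case it helps anyone reproducing this, here is a minimal sketch of the
check I would run on the training data before re-running either aligner,
given the sentence mismatch errors GIZA reported. check_parallel is a
hypothetical helper (it is not part of Joshua), and it assumes that each
--corpus prefix expands to PREFIX.es and PREFIX.en, which I have not
confirmed against pipeline.pl.

```shell
# Hypothetical helper, not part of Joshua: compare the line counts of the
# two sides of a parallel corpus.
# Usage: check_parallel PREFIX SRC_EXT TGT_EXT
# Assumption: the corpus lives in PREFIX.SRC_EXT and PREFIX.TGT_EXT.
check_parallel() {
    prefix=$1
    src=$2
    tgt=$3

    # Both sides must exist before their lengths can be compared.
    for f in "$prefix.$src" "$prefix.$tgt"; do
        if [ ! -f "$f" ]; then
            echo "MISSING $f"
            return 2
        fi
    done

    # Read from stdin so wc prints only the count; strip any padding.
    n_src=$(wc -l < "$prefix.$src" | tr -d ' ')
    n_tgt=$(wc -l < "$prefix.$tgt" | tr -d ' ')

    if [ "$n_src" -eq "$n_tgt" ]; then
        echo "OK $prefix ($n_src lines)"
    else
        echo "MISMATCH $prefix ($src=$n_src, $tgt=$n_tgt)"
        return 1
    fi
}

# Example, using the prefixes from the pipeline invocation quoted below:
#   check_parallel "$SPANISH/corpus/asr/callhome_train" es en
#   check_parallel "$SPANISH/corpus/asr/fisher_train" es en
```

A MISMATCH here would point at the input data and not at the aligner. An OK
result only rules out unequal file lengths, not empty or badly paired lines.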
Lewis

On Tue, Aug 23, 2016 at 8:58 PM, lewis john mcgibbney <lewi...@apache.org>
wrote:

> Hi dev@,
> I ran into a bit of bother whilst attempting to complete the example at
> [0].
> Joshua master is installed correctly.
> The problem I am having is almost exactly described at [1]
>
> I attempted to build the model using the following parameters
>
> $JOSHUA/bin/pipeline.pl --type hiero --rundir 1 --readme "Baseline Hiero
> run" --source es --target en --witten-bell --corpus
> $SPANISH/corpus/asr/callhome_train --corpus $SPANISH/corpus/asr/fisher_train
> --tune  $SPANISH/corpus/asr/fisher_dev --test  
> $SPANISH/corpus/asr/callhome_devtest
> --lm-order 3
>
> It seems that the initial stages of the pipeline run and complete
> successfully, with the following output
>
> [source-numlines] retrieved cached result =>   151810
>
> However, when the pipeline progresses to alignment with GIZA, the generated
> log [2] indicates a fatal error that I am not familiar with and have never
> seen before.
> As you can see, there are many sentence mismatch errors, and the
> final alignment phase ends with the following log output
>
> ERROR: Can't generate symmetrized alignment file
>
> I then tried switching the aligner to berkeley (and the language model to
> berkeleylm), as suggested in [1] and in line with some advice given by Matt
> in a more recent thread, as follows
>
> $JOSHUA/bin/pipeline.pl --type hiero --rundir 3 --readme "Baseline Hiero
> run 3" --source es --target en --lm-gen berkeleylm --lm berkeleylm
> --aligner berkeley --corpus $SPANISH/corpus/asr/callhome_train --corpus
> $SPANISH/corpus/asr/fisher_train --tune  $SPANISH/corpus/asr/fisher_dev
> --test  $SPANISH/corpus/asr/callhome_devtest --lm-order 3
>
> However, this results in the following output during the early stages of
> the pipeline
>
> [source-numlines] retrieved cached result =>   151810
> [berkeley-aligner-chunk-0] rebuilding...
>   dep=alignments/0/word-align.conf [CHANGED]
>   dep=/usr/local/incubator-joshua/experiments/fisher_callhome_experiment/4/data/train/splits/corpus.es.0 [CHANGED]
>   dep=/usr/local/incubator-joshua/experiments/fisher_callhome_experiment/4/data/train/splits/corpus.en.0 [CHANGED]
>   dep=alignments/0/training.align [NOT FOUND]
>   cmd=java -d64 -Xmx10g -jar /usr/local/incubator-joshua/lib/berkeleyaligner.jar ++alignments/0/word-align.conf
>   JOB FAILED (return code 1)
> [aligner-combine] rebuilding...
>   dep=alignments/0/training.align [NOT FOUND]
>   dep=alignments/training.align [NOT FOUND]
>   cmd=cat alignments/0/training.align > alignments/training.align
>   JOB FAILED (return code 1)
> cat: alignments/0/training.align: No such file or directory
>
> It turns out, of course, that the file passed to the aligner as
> '++alignments/0/word-align.conf' is not present. I am looking for that bug
> in the codebase right now and will try to submit a PR.
>
> Lewis
>
> [0] https://github.com/apache/incubator-joshua/tree/master/examples#building-a-spanish----english-translation-model-using-the-fisher-spanish-callhome-corpus
> [1] https://groups.google.com/forum/#!topic/joshua_support/CvNjIRboixc
> [2] https://paste.apache.org/wjm9
>
> --
> http://home.apache.org/~lewismc/
> @hectorMcSpector
> http://www.linkedin.com/in/lmcgibbney
>



-- 
http://home.apache.org/~lewismc/
@hectorMcSpector
http://www.linkedin.com/in/lmcgibbney
