Hi dev@,

I ran into a bit of bother whilst attempting to complete the example at [0]. Joshua master is installed correctly. The problem I am having is almost exactly the one described at [1].

I attempted to build the model using the following parameters:

    $JOSHUA/bin/pipeline.pl --type hiero --rundir 1 \
      --readme "Baseline Hiero run" --source es --target en --witten-bell \
      --corpus $SPANISH/corpus/asr/callhome_train \
      --corpus $SPANISH/corpus/asr/fisher_train \
      --tune $SPANISH/corpus/asr/fisher_dev \
      --test $SPANISH/corpus/asr/callhome_devtest \
      --lm-order 3

The initial stages of the pipeline seem to run and complete fine, with the following output:

    [source-numlines] retrieved cached result => 151810

However, when the pipeline progresses to alignment with GIZA, the generated log [2] shows a fatal error that I am not familiar with and have never seen before. As you can see, there are a great many sentence mismatch errors, and the final alignment phase ends with the following log output:

    ERROR: Can't generate symmetrized alignment file

I then tried switching the aligner to berkeley (and the language model to berkeleylm), as suggested in [1] and also based on some advice given by Matt in a more recent thread, as follows:

    $JOSHUA/bin/pipeline.pl --type hiero --rundir 3 \
      --readme "Baseline Hiero run 3" --source es --target en \
      --lm-gen berkeleylm --lm berkeleylm --aligner berkeley \
      --corpus $SPANISH/corpus/asr/callhome_train \
      --corpus $SPANISH/corpus/asr/fisher_train \
      --tune $SPANISH/corpus/asr/fisher_dev \
      --test $SPANISH/corpus/asr/callhome_devtest \
      --lm-order 3

However, this results in the following output early in the pipeline:

    [source-numlines] retrieved cached result => 151810
    [berkeley-aligner-chunk-0] rebuilding...
      dep=alignments/0/word-align.conf [CHANGED]
      dep=/usr/local/incubator-joshua/experiments/fisher_callhome_experiment/4/data/train/splits/corpus.es.0 [CHANGED]
      dep=/usr/local/incubator-joshua/experiments/fisher_callhome_experiment/4/data/train/splits/corpus.en.0 [CHANGED]
      dep=alignments/0/training.align [NOT FOUND]
      cmd=java -d64 -Xmx10g -jar /usr/local/incubator-joshua/lib/berkeleyaligner.jar ++alignments/0/word-align.conf
      JOB FAILED (return code 1)
    [aligner-combine] rebuilding...
      dep=alignments/0/training.align [NOT FOUND]
      dep=alignments/training.align [NOT FOUND]
      cmd=cat alignments/0/training.align > alignments/training.align
      JOB FAILED (return code 1)
    cat: alignments/0/training.align: No such file or directory

It turns out, of course, that '++alignments/0/word-align.conf' is not present. I am looking for that bug in the codebase right now and will try to submit a PR.

Lewis

[0] https://github.com/apache/incubator-joshua/tree/master/examples#building-a-spanish----english-translation-model-using-the-fisher-spanish-callhome-corpus
[1] https://groups.google.com/forum/#!topic/joshua_support/CvNjIRboixc
[2] https://paste.apache.org/wjm9

--
http://home.apache.org/~lewismc/
@hectorMcSpector
http://www.linkedin.com/in/lmcgibbney
