[
https://issues.apache.org/jira/browse/JOSHUA-304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15446643#comment-15446643
]
Lewis John McGibbney commented on JOSHUA-304:
---------------------------------------------
Hi [~post]
What new steps did you actually add?
I've wiped everything that was generated by Joshua and rebuilt the JOSHUA-304
branch. I'm getting the following:
{code}
$JOSHUA/bin/pipeline.pl --type hiero --rundir
/usr/local/jpl/xdata/joshua_experiments/fisher_callhome_experiment/0 --readme
"Baseline Hiero run 0 --lm-gen berkeleylm --lm berkeleylm --aligner berkeley
JOSHUA-304" --source es --target en --lm-gen berkeleylm --lm berkeleylm
--aligner berkeley --corpus $SPANISH/corpus/asr/callhome_train --corpus
$SPANISH/corpus/asr/fisher_train --tune $SPANISH/corpus/asr/fisher_dev --test
$SPANISH/corpus/asr/callhome_devtest
...
snip
...
[test-vocab-es] rebuilding...
dep=/usr/local/jpl/xdata/joshua_experiments/fisher_callhome_experiment/0/data/test/corpus.es
[CHANGED]
dep=/usr/local/jpl/xdata/joshua_experiments/fisher_callhome_experiment/0/data/test/vocab.es
[NOT FOUND]
cmd=cat
/usr/local/jpl/xdata/joshua_experiments/fisher_callhome_experiment/0/data/test/corpus.es
| /usr/local/incubator-joshua/scripts/training/build-vocab.pl >
/usr/local/jpl/xdata/joshua_experiments/fisher_callhome_experiment/0/data/test/vocab.es
took 0 seconds (0s)
[test-vocab-en] rebuilding...
dep=/usr/local/jpl/xdata/joshua_experiments/fisher_callhome_experiment/0/data/test/corpus.en
[CHANGED]
dep=/usr/local/jpl/xdata/joshua_experiments/fisher_callhome_experiment/0/data/test/vocab.en
[NOT FOUND]
cmd=cat
/usr/local/jpl/xdata/joshua_experiments/fisher_callhome_experiment/0/data/test/corpus.en
| /usr/local/incubator-joshua/scripts/training/build-vocab.pl >
/usr/local/jpl/xdata/joshua_experiments/fisher_callhome_experiment/0/data/test/vocab.en
took 0 seconds (0s)
[source-numlines] rebuilding...
dep=/usr/local/jpl/xdata/joshua_experiments/fisher_callhome_experiment/0/data/train/corpus.es
[CHANGED]
cmd=cat
/usr/local/jpl/xdata/joshua_experiments/fisher_callhome_experiment/0/data/train/corpus.es
| wc -l
took 0 seconds (0s)
[source-numlines] retrieved cached result => 151810
[berkeley-aligner-chunk-0] rebuilding...
dep=alignments/0/word-align.conf [CHANGED]
dep=/usr/local/jpl/xdata/joshua_experiments/fisher_callhome_experiment/0/data/train/splits/corpus.es.0
[NOT FOUND]
dep=/usr/local/jpl/xdata/joshua_experiments/fisher_callhome_experiment/0/data/train/splits/corpus.en.0
[NOT FOUND]
dep=alignments/0/training.align [NOT FOUND]
cmd=java -d64 -Xmx10g -jar
/usr/local/incubator-joshua/ext/berkeleyaligner/distribution/berkeleyaligner.jar
++alignments/0/word-align.conf
JOB FAILED (return code 1)
[aligner-combine] rebuilding...
dep=alignments/0/training.en-es.align [NOT FOUND]
dep=alignments/training.align [NOT FOUND]
cmd=cat alignments/0/training.en-es.align > alignments/training.align
JOB FAILED (return code 1)
cat: alignments/0/training.en-es.align: No such file or directory
{code}
> word-align.conf alignment template file not compatible with berkeley aligner
> ----------------------------------------------------------------------------
>
> Key: JOSHUA-304
> URL: https://issues.apache.org/jira/browse/JOSHUA-304
> Project: Joshua
> Issue Type: Bug
> Components: alignment, berkeley, templates
> Affects Versions: 6.0.5
> Reporter: Lewis John McGibbney
> Priority: Blocker
> Fix For: 6.1
>
>
> It took me quite some time to debug what was going on and why pipelines
> were failing when using the Berkeley aligner.
> It turns out that the word-align.conf template provided at
> https://github.com/apache/incubator-joshua/blob/master/scripts/training/templates/alignment/word-align.conf
> is not compatible with the Berkeley aligner.
> In particular, the following lines are incompatible:
> https://github.com/apache/incubator-joshua/blob/master/scripts/training/templates/alignment/word-align.conf#L12-L15
> Evidence of this is provided below:
> {code}
> lmcgibbn@LMC-032857 /usr/local/incubator-joshua/lib(master) $ java -d64
> -Xmx10g -jar /usr/local/incubator-joshua/lib/berkeleyaligner.jar
> ++/usr/local/incubator-joshua/experiments/fisher_callhome_experiment/6/alignments/0/word-align.conf
> Invalid enum: 'MODEL1 HMM'; valid choices: MODEL1|MODEL2|HMM|SYNTACTIC|NONE
> lmcgibbn@LMC-032857 /usr/local/incubator-joshua/lib(master) $ java -d64
> -Xmx10g -jar /usr/local/incubator-joshua/lib/berkeleyaligner.jar
> ++/usr/local/incubator-joshua/experiments/fisher_callhome_experiment/6/alignments/0/word-align.conf
> Invalid enum: 'MODEL1, HMM'; valid choices: MODEL1|MODEL2|HMM|SYNTACTIC|NONE
> lmcgibbn@LMC-032857 /usr/local/incubator-joshua/lib(master) $ java -d64
> -Xmx10g -jar /usr/local/incubator-joshua/lib/berkeleyaligner.jar
> ++/usr/local/incubator-joshua/experiments/fisher_callhome_experiment/6/alignments/0/word-align.conf
> Invalid enum: 'MODEL1 HMM'; valid choices: MODEL1|MODEL2|HMM|SYNTACTIC|NONE
> lmcgibbn@LMC-032857 /usr/local/incubator-joshua/lib(master) $ java -d64
> -Xmx10g -jar /usr/local/incubator-joshua/lib/berkeleyaligner.jar
> ++/usr/local/incubator-joshua/experiments/fisher_callhome_experiment/6/alignments/0/word-align.conf
> Invalid enum: 'JOINT JOINT'; valid choices: FORWARD|REVERSE|BOTH_INDEP|JOINT
> lmcgibbn@LMC-032857 /usr/local/incubator-joshua/lib(master) $ java -d64
> -Xmx10g -jar /usr/local/incubator-joshua/lib/berkeleyaligner.jar
> ++/usr/local/incubator-joshua/experiments/fisher_callhome_experiment/6/alignments/0/word-align.conf
> Exception in thread "main" java.lang.NumberFormatException: For input string:
> "5 5"
> at
> java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
> at java.lang.Integer.parseInt(Integer.java:580)
> at java.lang.Integer.parseInt(Integer.java:615)
> at
> edu.berkeley.nlp.fig.basic.OptInfo.interpretValue(OptionsParser.java:143)
> at
> edu.berkeley.nlp.fig.basic.OptInfo.interpretValue(OptionsParser.java:240)
> at edu.berkeley.nlp.fig.basic.OptInfo.set(OptionsParser.java:294)
> at
> edu.berkeley.nlp.fig.basic.OptionsParser.readOptionsFile(OptionsParser.java:555)
> at
> edu.berkeley.nlp.fig.basic.OptionsParser.doParse(OptionsParser.java:604)
> at edu.berkeley.nlp.fig.exec.Execution.init(Execution.java:293)
> at edu.berkeley.nlp.wordAlignment.Main.main(Main.java:149)
> lmcgibbn@LMC-032857 /usr/local/incubator-joshua/lib(master) $ java -d64
> -Xmx10g -jar /usr/local/incubator-joshua/lib/berkeleyaligner.jar
> ++/usr/local/incubator-joshua/experiments/fisher_callhome_experiment/6/alignments/0/word-align.conf
> Cannot create directory: alignments/0
> {code}
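
The errors quoted above all have the same shape: an option value containing two tokens ('MODEL1 HMM', 'JOINT JOINT', "5 5") is handed to a parser that appears to expect exactly one. Below is a minimal Python sketch of that check, assuming single-valued options. The function names are hypothetical, the choice sets are copied from the error messages above, and this is not the aligner's actual parser (which is Java), only an illustration of why those values are rejected.

```python
import re

# Valid choices copied verbatim from the aligner's error messages above.
MODEL_CHOICES = {"MODEL1", "MODEL2", "HMM", "SYNTACTIC", "NONE"}
MODE_CHOICES = {"FORWARD", "REVERSE", "BOTH_INDEP", "JOINT"}


def check_enum(value, choices):
    """Return None if value is exactly one valid choice, else an error string.

    Hypothetical helper: assumes the option accepts a single enum value.
    """
    if value in choices:
        return None
    return "Invalid enum: %r; valid choices: %s" % (value, "|".join(sorted(choices)))


def check_int(value):
    """Return None if value is a single integer literal, else an error string.

    Hypothetical helper: a strict pattern is used so that "5 5" is rejected,
    as it was in the stack trace above.
    """
    if re.fullmatch(r"[+-]?\d+", value):
        return None
    return "Not an integer: %r" % value


if __name__ == "__main__":
    # Values taken from the failing runs quoted above.
    for value, choices in [("MODEL1 HMM", MODEL_CHOICES),
                           ("MODEL1, HMM", MODEL_CHOICES),
                           ("JOINT JOINT", MODE_CHOICES)]:
        print(check_enum(value, choices))
    print(check_int("5 5"))
```

Under that assumption, any value in the template that carries more than one token fails the check, whether the tokens are separated by a space or a comma.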