[
https://issues.apache.org/jira/browse/JOSHUA-304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15435687#comment-15435687
]
Lewis John McGibbney commented on JOSHUA-304:
---------------------------------------------
[~post] unfortunately my local tests are still failing at the Berkeley aligner
step. Pipeline output follows.
{code}
lmcgibbn@LMC-032857 /usr/local/incubator-joshua(JOSHUA-304) $
$JOSHUA/bin/pipeline.pl --type hiero --rundir 8 --readme "Baseline Hiero run 8
--lm-gen berkeleylm --lm berkeleylm --aligner berkeley proposed bug fixed in
../../scripts/training/paralign.pl" --source es --target en --lm-gen berkeleylm
--lm berkeleylm --aligner berkeley --corpus $SPANISH/corpus/asr/callhome_train
--corpus $SPANISH/corpus/asr/fisher_train --tune
$SPANISH/corpus/asr/fisher_dev --test $SPANISH/corpus/asr/callhome_devtest
[train-copy-and-filter] cached, skipping...
[train-tokenize-es] cached, skipping...
[train-tokenize-en] cached, skipping...
[train-trim] cached, skipping...
[train-lowercase-es] cached, skipping...
[train-lowercase-en] cached, skipping...
[train-vocab-es] cached, skipping...
[train-vocab-en] cached, skipping...
[tune-copy-and-filter] cached, skipping...
[tune-tokenize-es] cached, skipping...
[tune-tokenize-en.0] cached, skipping...
[tune-tokenize-en.1] cached, skipping...
[tune-tokenize-en.2] cached, skipping...
[tune-tokenize-en.3] cached, skipping...
[tune-lowercase-es] cached, skipping...
[tune-lowercase-en.0] cached, skipping...
[tune-lowercase-en.1] cached, skipping...
[tune-lowercase-en.2] cached, skipping...
[tune-lowercase-en.3] cached, skipping...
[tune-vocab-es] cached, skipping...
[tune-vocab-en.0] cached, skipping...
[tune-vocab-en.1] cached, skipping...
[tune-vocab-en.2] cached, skipping...
[tune-vocab-en.3] cached, skipping...
[test-copy-and-filter] cached, skipping...
[test-tokenize-es] cached, skipping...
[test-tokenize-en] cached, skipping...
[test-lowercase-es] cached, skipping...
[test-lowercase-en] cached, skipping...
[test-vocab-es] cached, skipping...
[test-vocab-en] cached, skipping...
[source-numlines] cached, skipping...
[source-numlines] retrieved cached result => 151810
[berkeley-aligner-chunk-0] rebuilding...
dep=alignments/0/word-align.conf [CHANGED]
dep=/usr/local/incubator-joshua/8/data/train/splits/corpus.es.0 [NOT FOUND]
dep=/usr/local/incubator-joshua/8/data/train/splits/corpus.en.0 [NOT FOUND]
dep=alignments/0/training.align [NOT FOUND]
cmd=java -d64 -Xmx10g -jar
/usr/local/incubator-joshua/ext/berkeleyaligner/distribution/berkeleyaligner.jar
++alignments/0/word-align.conf
JOB FAILED (return code 1)
[aligner-combine] rebuilding...
dep=alignments/0/training.en-es.align [NOT FOUND]
dep=alignments/training.align [CHANGED]
cmd=cat alignments/0/training.en-es.align > alignments/training.align
JOB FAILED (return code 1)
cat: alignments/0/training.en-es.align: No such file or directory
{code}
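The [NOT FOUND] entries above suggest confirming which inputs of the berkeley-aligner-chunk-0 step actually exist on disk before re-running. A minimal sketch, assuming the paths printed in the log; `missing_deps` is a hypothetical helper and not part of Joshua, and the run directory is inferred from the absolute paths in the log:

```python
import os

# Inputs of [berkeley-aligner-chunk-0] as printed in the log above.
# RUN_DIR is an assumption inferred from the absolute paths in that log.
RUN_DIR = "/usr/local/incubator-joshua/8"
DEPS = [
    "alignments/0/word-align.conf",
    "/usr/local/incubator-joshua/8/data/train/splits/corpus.es.0",
    "/usr/local/incubator-joshua/8/data/train/splits/corpus.en.0",
]

def missing_deps(paths, base_dir):
    """Return the paths that do not exist; relative ones are resolved against base_dir."""
    missing = []
    for path in paths:
        full = path if os.path.isabs(path) else os.path.join(base_dir, path)
        if not os.path.exists(full):
            missing.append(path)
    return missing

if __name__ == "__main__":
    for path in missing_deps(DEPS, RUN_DIR):
        print("NOT FOUND:", path)
```

If the corpus splits are reported missing here as well, the aligner is failing on its inputs, and the later aligner-combine failure is only a consequence of that.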
> word-align.conf alignment template file not compatible with berkeley aligner
> ----------------------------------------------------------------------------
>
> Key: JOSHUA-304
> URL: https://issues.apache.org/jira/browse/JOSHUA-304
> Project: Joshua
> Issue Type: Bug
> Components: alignment, berkeley, templates
> Affects Versions: 6.0.5
> Reporter: Lewis John McGibbney
> Priority: Blocker
> Fix For: 6.1
>
>
> It took me quite some time to work out why pipelines were failing when using
> the Berkeley aligner.
> It turns out that the word-align.conf template provided at
> https://github.com/apache/incubator-joshua/blob/master/scripts/training/templates/alignment/word-align.conf
> is not compatible with the Berkeley aligner.
> In particular, the following lines are incompatible:
> https://github.com/apache/incubator-joshua/blob/master/scripts/training/templates/alignment/word-align.conf#L12-L15
> Evidence of this is provided below:
> {code}
> lmcgibbn@LMC-032857 /usr/local/incubator-joshua/lib(master) $ java -d64
> -Xmx10g -jar /usr/local/incubator-joshua/lib/berkeleyaligner.jar
> ++/usr/local/incubator-joshua/experiments/fisher_callhome_experiment/6/alignments/0/word-align.conf
> Invalid enum: 'MODEL1 HMM'; valid choices: MODEL1|MODEL2|HMM|SYNTACTIC|NONE
> lmcgibbn@LMC-032857 /usr/local/incubator-joshua/lib(master) $ java -d64
> -Xmx10g -jar /usr/local/incubator-joshua/lib/berkeleyaligner.jar
> ++/usr/local/incubator-joshua/experiments/fisher_callhome_experiment/6/alignments/0/word-align.conf
> Invalid enum: 'MODEL1, HMM'; valid choices: MODEL1|MODEL2|HMM|SYNTACTIC|NONE
> lmcgibbn@LMC-032857 /usr/local/incubator-joshua/lib(master) $ java -d64
> -Xmx10g -jar /usr/local/incubator-joshua/lib/berkeleyaligner.jar
> ++/usr/local/incubator-joshua/experiments/fisher_callhome_experiment/6/alignments/0/word-align.conf
> Invalid enum: 'MODEL1 HMM'; valid choices: MODEL1|MODEL2|HMM|SYNTACTIC|NONE
> lmcgibbn@LMC-032857 /usr/local/incubator-joshua/lib(master) $ java -d64
> -Xmx10g -jar /usr/local/incubator-joshua/lib/berkeleyaligner.jar
> ++/usr/local/incubator-joshua/experiments/fisher_callhome_experiment/6/alignments/0/word-align.conf
> Invalid enum: 'JOINT JOINT'; valid choices: FORWARD|REVERSE|BOTH_INDEP|JOINT
> lmcgibbn@LMC-032857 /usr/local/incubator-joshua/lib(master) $ java -d64
> -Xmx10g -jar /usr/local/incubator-joshua/lib/berkeleyaligner.jar
> ++/usr/local/incubator-joshua/experiments/fisher_callhome_experiment/6/alignments/0/word-align.conf
> Exception in thread "main" java.lang.NumberFormatException: For input string:
> "5 5"
> at
> java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
> at java.lang.Integer.parseInt(Integer.java:580)
> at java.lang.Integer.parseInt(Integer.java:615)
> at
> edu.berkeley.nlp.fig.basic.OptInfo.interpretValue(OptionsParser.java:143)
> at
> edu.berkeley.nlp.fig.basic.OptInfo.interpretValue(OptionsParser.java:240)
> at edu.berkeley.nlp.fig.basic.OptInfo.set(OptionsParser.java:294)
> at
> edu.berkeley.nlp.fig.basic.OptionsParser.readOptionsFile(OptionsParser.java:555)
> at
> edu.berkeley.nlp.fig.basic.OptionsParser.doParse(OptionsParser.java:604)
> at edu.berkeley.nlp.fig.exec.Execution.init(Execution.java:293)
> at edu.berkeley.nlp.wordAlignment.Main.main(Main.java:149)
> lmcgibbn@LMC-032857 /usr/local/incubator-joshua/lib(master) $ java -d64
> -Xmx10g -jar /usr/local/incubator-joshua/lib/berkeleyaligner.jar
> ++/usr/local/incubator-joshua/experiments/fisher_callhome_experiment/6/alignments/0/word-align.conf
> Cannot create directory: alignments/0
> {code}
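The errors above share a pattern: each rejected value contains two tokens ('MODEL1 HMM', 'JOINT JOINT', "5 5") where the aligner's option parser accepts exactly one. A minimal sketch of that check, using only the valid choices printed in the error messages; the set names and helper functions are hypothetical and do not reflect the aligner's actual option names or parser:

```python
# Valid choices copied from the "Invalid enum" messages above.
# The set names are illustrative labels, not the aligner's option names.
MODEL_CHOICES = {"MODEL1", "MODEL2", "HMM", "SYNTACTIC", "NONE"}
REGIMEN_CHOICES = {"FORWARD", "REVERSE", "BOTH_INDEP", "JOINT"}

def is_valid_enum(value, choices):
    """True only if value is exactly one of the allowed choices."""
    return value in choices

def is_valid_int(value):
    """True only if value parses as a single integer."""
    try:
        int(value)
        return True
    except ValueError:
        return False

if __name__ == "__main__":
    # Values taken from the error output above, next to single-token variants.
    for value in ("MODEL1 HMM", "MODEL1, HMM", "MODEL1", "HMM"):
        print(repr(value), is_valid_enum(value, MODEL_CHOICES))
    for value in ("JOINT JOINT", "JOINT"):
        print(repr(value), is_valid_enum(value, REGIMEN_CHOICES))
    for value in ("5 5", "5"):
        print(repr(value), is_valid_int(value))
```

Under this reading, any template line that supplies two space-separated or comma-separated values for a single option would be rejected, which matches each of the failures shown.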