[
https://issues.apache.org/jira/browse/JOSHUA-304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15434164#comment-15434164
]
Lewis John McGibbney edited comment on JOSHUA-304 at 8/24/16 4:42 AM:
----------------------------------------------------------------------
It should be noted that, in order to get past the exceptions shown in the issue
description, my template ended up looking like the following. N.B. the changed
values for the forwardModels, reverseModels, mode and iters keys.
{code}
## word-align.conf
## ----------------------
## This is an example training script for the Berkeley
## word aligner. In this configuration it uses two HMM
## alignment models trained jointly and then decoded
## using the competitive thresholding heuristic.
##########################################
# Training: Defines the training regimen
##########################################
forwardModels HMM
reverseModels HMM
mode JOINT
iters 5
###############################################
# Execution: Controls output and program flow
###############################################
execDir alignments/0
create
saveParams false
numThreads 1
msPerLine 10000
alignTraining
#################
# Language/Data
#################
foreignSuffix es.0
englishSuffix en.0
# Choose the training sources, which can either be directories or files that list files/directories
trainSources /usr/local/incubator-joshua/experiments/fisher_callhome_experiment/6/data/train/splits/corpus
sentences MAX
testSources /dev/null
overwriteExecDir true
#################
# 1-best output
#################
competitiveThresholding
{code}
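For anyone hitting the same errors, a minimal sketch of a pre-flight check for these four keys follows. The allowed values are copied from the aligner's own error messages quoted in the issue description. `parse_conf` and `check_conf` are hypothetical helpers, not part of Joshua or the Berkeley aligner, and the sketch assumes each option line is a key followed by whitespace and the rest of the line as its value.

```python
# Allowed values copied from the "valid choices" lists in the aligner's
# error messages; the helper names below are hypothetical.
MODEL_CHOICES = {"MODEL1", "MODEL2", "HMM", "SYNTACTIC", "NONE"}
MODE_CHOICES = {"FORWARD", "REVERSE", "BOTH_INDEP", "JOINT"}


def parse_conf(text):
    """Map each key to the raw remainder of its line (assumed 'key value' layout)."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)
        options[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return options


def check_conf(text):
    """Return a list of problems found in the four keys discussed above."""
    options = parse_conf(text)
    problems = []
    for key in ("forwardModels", "reverseModels"):
        value = options.get(key)
        if value is not None and value not in MODEL_CHOICES:
            problems.append(f"{key}: invalid value {value!r}")
    mode = options.get("mode")
    if mode is not None and mode not in MODE_CHOICES:
        problems.append(f"mode: invalid value {mode!r}")
    iters = options.get("iters")
    if iters is not None and not iters.isdigit():
        problems.append(f"iters: not a single integer: {iters!r}")
    return problems


# Values that were rejected in the issue description. Which key carried which
# value is inferred from the error messages, so treat it as an assumption.
broken = "forwardModels MODEL1 HMM\nreverseModels MODEL1 HMM\nmode JOINT JOINT\niters 5 5\n"
fixed = "forwardModels HMM\nreverseModels HMM\nmode JOINT\niters 5\n"

print(check_conf(broken))
print(check_conf(fixed))
```

The check treats the whole remainder of the line as one value, which matches how the errors report 'MODEL1 HMM' and '5 5' as single rejected strings.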
> word-align.conf alignment template file not compatible with berkeley aligner
> ----------------------------------------------------------------------------
>
> Key: JOSHUA-304
> URL: https://issues.apache.org/jira/browse/JOSHUA-304
> Project: Joshua
> Issue Type: Bug
> Components: alignment, berkeley, templates
> Affects Versions: 6.0.5
> Reporter: Lewis John McGibbney
> Priority: Blocker
> Fix For: 6.1
>
>
> It took me quite some time to debug what was going on and why pipelines
> were failing when using the Berkeley aligner.
> It turns out that the word-align.conf template provided at
> https://github.com/apache/incubator-joshua/blob/master/scripts/training/templates/alignment/word-align.conf
> is not compatible with the berkeley aligner.
> In particular, the following lines are incompatible:
> https://github.com/apache/incubator-joshua/blob/master/scripts/training/templates/alignment/word-align.conf#L12-L15
> Evidence of this is provided below:
> {code}
> lmcgibbn@LMC-032857 /usr/local/incubator-joshua/lib(master) $ java -d64 -Xmx10g -jar /usr/local/incubator-joshua/lib/berkeleyaligner.jar ++/usr/local/incubator-joshua/experiments/fisher_callhome_experiment/6/alignments/0/word-align.conf
> Invalid enum: 'MODEL1 HMM'; valid choices: MODEL1|MODEL2|HMM|SYNTACTIC|NONE
> lmcgibbn@LMC-032857 /usr/local/incubator-joshua/lib(master) $ java -d64 -Xmx10g -jar /usr/local/incubator-joshua/lib/berkeleyaligner.jar ++/usr/local/incubator-joshua/experiments/fisher_callhome_experiment/6/alignments/0/word-align.conf
> Invalid enum: 'MODEL1, HMM'; valid choices: MODEL1|MODEL2|HMM|SYNTACTIC|NONE
> lmcgibbn@LMC-032857 /usr/local/incubator-joshua/lib(master) $ java -d64 -Xmx10g -jar /usr/local/incubator-joshua/lib/berkeleyaligner.jar ++/usr/local/incubator-joshua/experiments/fisher_callhome_experiment/6/alignments/0/word-align.conf
> Invalid enum: 'MODEL1 HMM'; valid choices: MODEL1|MODEL2|HMM|SYNTACTIC|NONE
> lmcgibbn@LMC-032857 /usr/local/incubator-joshua/lib(master) $ java -d64 -Xmx10g -jar /usr/local/incubator-joshua/lib/berkeleyaligner.jar ++/usr/local/incubator-joshua/experiments/fisher_callhome_experiment/6/alignments/0/word-align.conf
> Invalid enum: 'JOINT JOINT'; valid choices: FORWARD|REVERSE|BOTH_INDEP|JOINT
> lmcgibbn@LMC-032857 /usr/local/incubator-joshua/lib(master) $ java -d64 -Xmx10g -jar /usr/local/incubator-joshua/lib/berkeleyaligner.jar ++/usr/local/incubator-joshua/experiments/fisher_callhome_experiment/6/alignments/0/word-align.conf
> Exception in thread "main" java.lang.NumberFormatException: For input string: "5 5"
> at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
> at java.lang.Integer.parseInt(Integer.java:580)
> at java.lang.Integer.parseInt(Integer.java:615)
> at edu.berkeley.nlp.fig.basic.OptInfo.interpretValue(OptionsParser.java:143)
> at edu.berkeley.nlp.fig.basic.OptInfo.interpretValue(OptionsParser.java:240)
> at edu.berkeley.nlp.fig.basic.OptInfo.set(OptionsParser.java:294)
> at edu.berkeley.nlp.fig.basic.OptionsParser.readOptionsFile(OptionsParser.java:555)
> at edu.berkeley.nlp.fig.basic.OptionsParser.doParse(OptionsParser.java:604)
> at edu.berkeley.nlp.fig.exec.Execution.init(Execution.java:293)
> at edu.berkeley.nlp.wordAlignment.Main.main(Main.java:149)
> lmcgibbn@LMC-032857 /usr/local/incubator-joshua/lib(master) $ java -d64 -Xmx10g -jar /usr/local/incubator-joshua/lib/berkeleyaligner.jar ++/usr/local/incubator-joshua/experiments/fisher_callhome_experiment/6/alignments/0/word-align.conf
> Cannot create directory: alignments/0
> {code}