updated pipeline doc for 6.0.3 release
Project: http://git-wip-us.apache.org/repos/asf/incubator-joshua-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-joshua-site/commit/f956df75
Tree: http://git-wip-us.apache.org/repos/asf/incubator-joshua-site/tree/f956df75
Diff: http://git-wip-us.apache.org/repos/asf/incubator-joshua-site/diff/f956df75

Branch: refs/heads/master
Commit: f956df750815d8170e368fa5b8cce359d792a800
Parents: 81748f5
Author: Matt Post <[email protected]>
Authored: Mon Jun 1 11:42:51 2015 -0600
Committer: Matt Post <[email protected]>
Committed: Mon Jun 1 11:42:51 2015 -0600

----------------------------------------------------------------------
 6.0/pipeline.md   | 42 +++++++++++++++++-------------------------
 _data/joshua.yaml |  4 ++--
 2 files changed, 19 insertions(+), 27 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/incubator-joshua-site/blob/f956df75/6.0/pipeline.md
----------------------------------------------------------------------
diff --git a/6.0/pipeline.md b/6.0/pipeline.md
index 5c34237..f35f618 100644
--- a/6.0/pipeline.md
+++ b/6.0/pipeline.md
@@ -4,6 +4,9 @@ category: links
 title: The Joshua Pipeline
 ---
 
+*Please note that the Joshua 6.0.3 included some big changes to directory organization of the
+ pipeline's files.*
+
 This page describes the Joshua pipeline script, which manages the complexity of training and
 evaluating machine translation systems.  The pipeline eases the pain of two related tasks in
 statistical machine translation (SMT) research:
@@ -164,13 +167,13 @@ generated by the individual sub-steps).
             corpus.en
             thrax-input-file
         tune/
-            tune.tok.lc.ur
-            tune.tok.lc.en
+            corpus.ur -> tune.tok.lc.ur
+            corpus.en -> tune.tok.lc.en
             grammar.filtered.gz
             grammar.glue
         test/
-            test.tok.lc.ur
-            test.tok.lc.en
+            corpus.ur -> test.tok.lc.ur
+            corpus.en -> test.tok.lc.en
             grammar.filtered.gz
             grammar.glue
     alignments/
@@ -182,14 +185,14 @@ generated by the individual sub-steps).
     grammar.gz
     lm.gz
     tune/
-        1/
-            decoder_command
-            joshua.config
-            params.txt
-            joshua.log
-            mert.log
-            joshua.config.ZMERT.final
-            final-bleu
+        decoder_command
+        model/
+            [model files]
+        params.txt
+        joshua.log
+        mert.log
+        joshua.config.final
+        final-bleu
 
 These files will be described in more detail in subsequent sections of this tutorial.
 
@@ -554,17 +557,11 @@ memory specification (passed to its `-Xmx` flag).
 Two optimizers are provided with Joshua: MERT and PRO (`--tuner {mert,pro}`).  If Moses is
 installed, you can also use Cherry & Foster's k-best batch MIRA (`--tuner mira`, recommended).
 
-Tuning is run till convergence in the `$RUNDIR/tune/N` directory, where N is the tuning instance.
-By default, tuning is run just once, but the pipeline supports running the optimizer an arbitrary
-number of times due to [recent work](http://www.youtube.com/watch?v=BOa3XDkgf0Y) pointing out the
-variance of tuning procedures in machine translation, in particular MERT.  This can be activated
-with `--optimizer-runs N`.  Each run can be found in a directory `$RUNDIR/tune/N`.
+Tuning is run till convergence in the `$RUNDIR/tune` directory.
 
 When tuning is finished, each final configuration file can be found at either
 
-    $RUNDIR/tune/N/joshua.config.final
-
-where N varies from 1..`--optimizer-runs`.
+    $RUNDIR/tune/joshua.config.final
 
 ## <a id="testing" /> 7. Testing
 
@@ -583,11 +580,6 @@ number of arguments:
 
   This tells the decoder to start at the test step.
 
-- `--name NAME`
-
-  A name is needed to distinguish this test set from the previous ones.  Output for this test run
-  will be stored at `$RUNDIR/test/NAME`.
-
 - `--joshua-config CONFIG`
 
   A tuned parameter file is required.  This file will be the output of some prior tuning run.
http://git-wip-us.apache.org/repos/asf/incubator-joshua-site/blob/f956df75/_data/joshua.yaml
----------------------------------------------------------------------
diff --git a/_data/joshua.yaml b/_data/joshua.yaml
index 23c55d6..595f48d 100644
--- a/_data/joshua.yaml
+++ b/_data/joshua.yaml
@@ -1,2 +1,2 @@
-release_version: 6.0.2
-release_date: April 10, 2015
+release_version: 6.0.3
+release_date: June 1, 2015
