fixed corpus name
Project: http://git-wip-us.apache.org/repos/asf/incubator-joshua-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-joshua-site/commit/ac9e2c97
Tree: http://git-wip-us.apache.org/repos/asf/incubator-joshua-site/tree/ac9e2c97
Diff: http://git-wip-us.apache.org/repos/asf/incubator-joshua-site/diff/ac9e2c97

Branch: refs/heads/asf-site
Commit: ac9e2c979011e20c37af1188cb3fc93a59f59463
Parents: 899bd5f
Author: Matt Post <[email protected]>
Authored: Mon Jun 15 17:10:17 2015 -0400
Committer: Matt Post <[email protected]>
Committed: Mon Jun 15 17:10:17 2015 -0400

----------------------------------------------------------------------
 6               | 1 +
 6.0/tutorial.md | 9 +++++----
 2 files changed, 6 insertions(+), 4 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/incubator-joshua-site/blob/ac9e2c97/6
----------------------------------------------------------------------
diff --git a/6 b/6
new file mode 120000
index 0000000..5049538
--- /dev/null
+++ b/6
@@ -0,0 +1 @@
+6.0
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/incubator-joshua-site/blob/ac9e2c97/6.0/tutorial.md
----------------------------------------------------------------------
diff --git a/6.0/tutorial.md b/6.0/tutorial.md
index d167cdc..482162f 100644
--- a/6.0/tutorial.md
+++ b/6.0/tutorial.md
@@ -34,12 +34,12 @@ data was collected by translating transcribed speech from previous LDC releases.
 Download the data and install it somewhere:
 
     cd ~/data
-    wget --no-check -O fisher-spanish-corpus.zip https://github.com/joshua-decoder/fisher-callhome-corpus/archive/master.zip
-    unzip fisher-spanish-corpus.zip
+    wget --no-check -O fisher-callhome-corpus.zip https://github.com/joshua-decoder/fisher-callhome-corpus/archive/master.zip
+    unzip fisher-callhome-corpus.zip
 
 Then define the environment variable `$FISHER` to point to it:
 
-    cd ~/data/fisher-spanish-corpus-master
+    cd ~/data/fisher-callhome-corpus-master
     export FISHER=$(pwd)
 
 ### Preparing the data
@@ -76,7 +76,8 @@ this **not** be inside your `$JOSHUA` directory*.
 
 We will now create the baseline run, using a particular directory structure for experiments that
 will allow us to take advantage of scripts provided with Joshua for displaying the results of many
-related experiments. Because this can take quite some time to run, we are going to add a crippling
+related experiments. Because this can take quite some time to run, we are going to reduce the model
+by quite a bit by
 restriction: Joshua will only use sentences in the training sets with ten or fewer words on either
 side (Spanish or English):
